This is the mail archive of the
mailing list for the GCC project.
Re: Fix three inliner issues
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: gcc-patches at gcc dot gnu dot org, rguenther at suse dot de, matz at suse dot de
- Date: Mon, 19 Apr 2010 14:46:06 -0700
- Subject: Re: Fix three inliner issues
- References: <20100413160951.GB31756@kam.mff.cuni.cz>
On Tue, Apr 13, 2010 at 9:09 AM, Jan Hubicka <firstname.lastname@example.org> wrote:
> this patch fixes several problems in inliner:
> ?1) Michael and Richi noticed that when we optimize out offline function, we account
> ? ? it twice in code size computations.
> ?2) update_caller_keys got wrong updating of fibheap, so sometimes the
> ? ? key in heap was not updated to new cost.
> ?3) small function inlining did not update cost of the calls of the offline
> ? ? copy. ?This is wrong, as inliner is trying to prioritize functions that
> ? ? gets fully inlined by making the cost of every call depend on the estimated
> ? ? growth after inlining everything. ?When one of calls gets inlined, all
> ? ? the other calls gets cheaper.
> I also added a dumping facility to cgraph_edge_badness that helps to track down
> the reasons for inliner (mis)decisions. ?This needs extra recoputation of the
> cost since the costs are determined when functions are inserted into heap and
> updated several times before function becomes minimum of the heap. ?Dumping
> reasons at that time makes dumps very difficult to parse, so I am dumping at
> the time we are removing function from heap only.
> Fixing 1) leads to regression in tramp3d and few other bechmarks, where this
> double accounting happens particularly often and thus we now give up before we
> inline everything important. ?I did some testing here last two weeks and this
> can be fixed by bumping unit growth from 30% to 50%. ?Everything except for
> tramp3d is fixed by bumping to 40% and code size is still at average smaller.
> All these three fixes together makes tramp3d happy (10% smaller, 7% faster)
> than unpatched mainline. ?So I am going to commit the patch once testing on
> x86_64-linux is complette. Because all the changes are fixes of quite obvious
> I did some of analysis of tramp3d. ?The main problem here is that the unit size
> growth cutoff happens at quite arbitrary place. ?The badness computation gives
> result in very small range (0-400) that is sort of correct. ?All the functions
> are very small and inlining them is locally good idea. ?Just we can't inline
> them all.
> We can account more thoroughly the loop levels (frequencies), but I guess we
> also should take into account whether the function being inlined contains other
> calls that are candidates for inlining or not thus attempting to produce more
> leaf functions (at least leaf in the compilation unit POV. The actual leaf
> functions are already prioritized both in early and late inlining). ?I will
> experiment with this on pretty-ipa.
> ? ? ? ?* ipa-inline.c (cgraph_mark_inline_edge): Avoid double accounting
> ? ? ? ?of optimized out static functions.
> ? ? ? ?(cgraph_edge_badness): Add DUMP parameter and dump reasons for the
> ? ? ? ?cost computation. ?Also sanity check for overflows.
> ? ? ? ?(update_caller_keys): Update cgraph_edge_badness call; properly
> ? ? ? ?update fibheap and sanity check that it is up to date.
> ? ? ? ?(add_new_edges_to_heap): Update cgraph_edge_badness.
> ? ? ? ?(cgraph_decide_inlining_of_small_function): Likewise;
> ? ? ? ?add sanity checking that badness in heap is up to date;
> ? ? ? ?improve dumping of reason; Update badness of calls to the
> ? ? ? ?offline copy of function currently inlined; dump badness
> ? ? ? ?of functions not inlined because of unit growth limits.