This is the mail archive of the mailing list for the GCC project.

Re: LTO inliner -- sensitivity to increasing register pressure

> Honza,
>   Seeing your recent patches relating to inliner heuristics for LTO,
> I thought I should mention some related work I'm doing.
> By way of introduction, I've recently joined the IBM LTC's PPC
> Toolchain team, working on gcc performance.
> We have not generally seen good results using LTO on IBM power
> processors and one of the problems seems to be excessive inlining
> that results in the generation of excessive spill code. So, I have
> set out to tackle this by doing some analysis at the time of the
> inliner pass to compute something analogous to register pressure,
> which is then used to shut down inlining of routines that have a lot
> of pressure.

This is interesting.  I had planned to add register pressure logic, but
always thought it would be hard to do at the GIMPLE level in a way that
works for all CPUs.

> The analysis is basically a liveness analysis on the SSA names per
> basic block and looking for the maximum number live in any block.
> I've been using "liveness pressure" as a shorthand name for this.

I believe this is usually called width.
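
To make the quoted description concrete, here is a minimal sketch of a
per-block liveness computation that reports the maximum number of names
live at any point. It is not the prototype discussed in this thread and
it does not use GCC's data structures: the block/statement encoding and
the names block_use_def, live_out_sets and max_live are all hypothetical,
chosen only for illustration. It runs a standard backward dataflow to a
fixed point, then walks each block in reverse to find the widest point.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding: a statement is (defined_name_or_None, [used_names]),
# and a basic block is a list of such statements.
Stmt = Tuple[Optional[str], List[str]]


def block_use_def(stmts: List[Stmt]) -> Tuple[Set[str], Set[str]]:
    """Names read before being written in the block, and names written."""
    use: Set[str] = set()
    defs: Set[str] = set()
    for d, uses in stmts:
        use |= {u for u in uses if u not in defs}
        if d is not None:
            defs.add(d)
    return use, defs


def live_out_sets(blocks: Dict[str, List[Stmt]],
                  succs: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Backward liveness dataflow, iterated until nothing changes."""
    use_def = {b: block_use_def(s) for b, s in blocks.items()}
    live_in: Dict[str, Set[str]] = {b: set() for b in blocks}
    live_out: Dict[str, Set[str]] = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            out: Set[str] = set()
            for s in succs.get(b, []):
                out |= live_in[s]
            use, defs = use_def[b]
            new_in = use | (out - defs)
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_out


def max_live(blocks: Dict[str, List[Stmt]],
             succs: Dict[str, List[str]]) -> int:
    """Largest number of simultaneously live names at any program point."""
    live_out = live_out_sets(blocks, succs)
    worst = 0
    for b, stmts in blocks.items():
        live = set(live_out[b])
        worst = max(worst, len(live))
        # Walk the block bottom-up: a definition kills the name,
        # a use makes it live above the statement.
        for d, uses in reversed(stmts):
            if d is not None:
                live.discard(d)
            live |= set(uses)
            worst = max(worst, len(live))
    return worst
```

Calling max_live on a caller and on a callee gives two numbers that a
heuristic could compare before inlining. How well such a count predicts
spilling for a particular register class is exactly the open question
raised above.
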
> This can then be used in two ways.
> 1) want_inline_function_to_all_callers_p at present always says to
> inline things that have only one call site without regard to size or
> what this may do to the register allocator downstream. In
> particular, BZ2_decompress in bzip2 gets inlined and this causes the
> pressure reported downstream for the int register class to increase
> 10x. Looking at some combination of pressure in caller/callee may
> help avoid this kind of situation.
> 2) I also want to experiment with adding the liveness pressure in
> the callee into the badness calculation in edge_badness used by
> inline_small_functions. The idea here is to try to inline functions
> that are less likely to cause register allocator difficulty
> downstream first.

Sounds interesting.  I am very curious whether you can get consistent
improvements with this.  I only implemented logic for large stack frames,
but in C++ code it often seems to do more harm than good.
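
As a rough illustration of the second idea quoted above, the sketch below
folds a callee pressure estimate into an ordering key so that candidates
expected to strain the register allocator are considered later. This is
not GCC's edge_badness: the Edge record, the additive penalty, and the
reg_budget and weight parameters are assumptions made up for this
example, and it assumes that a lower key means "inline first".

```python
import heapq
from typing import List, NamedTuple


class Edge(NamedTuple):
    """Hypothetical call-edge record used only for this sketch."""
    callee: str
    base_badness: float   # lower means more attractive to inline
    callee_pressure: int  # e.g. maximum live names in the callee


def adjusted_badness(edge: Edge, reg_budget: int = 32,
                     weight: float = 1.0) -> float:
    """Add a penalty only for pressure above the assumed register budget."""
    excess = max(0, edge.callee_pressure - reg_budget)
    return edge.base_badness + weight * excess


def inline_order(edges: List[Edge], reg_budget: int = 32,
                 weight: float = 1.0) -> List[str]:
    """Return callee names in the order they would be considered."""
    heap = [(adjusted_badness(e, reg_budget, weight), e.callee)
            for e in edges]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

An additive penalty is used here so that the adjustment behaves the same
way whether the base value is positive or negative. With weight set to 0
the ordering falls back to the base badness alone, which makes it easy to
compare runs with and without the pressure term.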

If you find examples of bad inlining, can you also file them in Bugzilla?
Perhaps the individual cases could be handled better by improving IRA.

> I am just at the point of getting a prototype working, I will get a
> patch you could take a look at posted next week. In the meantime, do
> you have any comments or feedback?
> Thanks,
>    Aaron
> -- 
> Aaron Sawdey, Ph.D.
> 050-2/C113  (507) 253-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
