This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Whole program optimization and functions-only-called-once.


Peter Bergner wrote:
On Sun, 2009-11-22 at 12:41 +0100, Toon Moene wrote:
[ This might be wrong-headed ] Is it so that every pseudo *has* to (possibly) conflict with *every* other one (i.e., that the conflict table has as many elements as the square of the number of pseudo's in a function) ?

Intuitively one would think that no pseudo lives over the entire range of such a large function, so there might be "pockets" of conflict, but not the whole function ...

This is exactly what I did for the 4.3 (ie, pre IRA) compiler and I saw some good reductions in the size of the interference graph, so no, you're not wrong-headed. :)

http://gcc.gnu.org/ml/gcc-patches/2007-09/msg00529.html

Basically, if you ever ask whether two pseudos interfere, then you need to
reserve space in the interference graph to hold their interference info.
The way Chaitin/Briggs and GCC's (4.3) implementations work, we only "ask"
whether two pseudos conflict if they are live simultaneously (actually,
due to copy coalescing, we have to relax this to being live in the same
instruction).  The patch above which is in the FSF 4.3 sources, eliminated
the space for live ranges that live in separate blocks.

For SPEC2000, we roughly use 84% less space than the old square bit
matirx and 68% less space than a conventional triangular bit matrix.
In some case like 177.mesa/get.c:gl_GetBooleanv (it's basically just a
huge switch statement), we use a LOT less (99.6%).  I'll note that the
-fdump-rtl-greg output displays how much space the old square matrix would
use, how much space a conventional triangular bit matrix would use and how
much we used for our compressed triangular bit matrix.  It'd be interesting
to see how much space is used compiling with the 4.3 compiler.

The patch above does not help with single basic block functions (like the
old SPEC92 fpppp routine) or where all live ranges are global.  I'll note
I do have code to help the fppp case, but it didn't make the 4.3 timeframe
and now that we've switched to IRA, it's moot.

Peter, the analogous code you mentioned is actually in IRA. Your approach has been even improved because it works not on BB base but on live ranges. Therefore it helps even fpppp or for compression of global pseudos conflicts.
I'll also note that I have completely ignored the fact that by switching
to a triangular bit matrix (compressed or conventional), we had to
introduce an adjacency list which does take some additional space, but
allows much faster access to ones neighbors.


It has been done in IRA too. IRA decides what is the best representation for adjacent list.

The problem with IRA regional allocation is that #allocnos might be much more #pseudos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]