Use separate sections to stream non-trivial constructors

Richard Biener rguenther@suse.de
Fri Jul 11 12:00:00 GMT 2014


On Fri, 11 Jul 2014, Jan Hubicka wrote:

> > On Fri, 11 Jul 2014, Jan Hubicka wrote:
> > 
> > > Hi,
> > > since we both agreed offlining constructors from global decl stream is a good
> > > idea, I went ahead and implemented it.  I would like to follow up with
> > > cleanups; for example the sections are still tagged as function sections, but I
> > > would like to do it incrementally. There is quite some ugliness in the way we
> > > handle function sections and the patch started to snowball very quickly.
> > > 
> > > The patch conceptually copies what we do for functions and re-uses most of
> > > infrastructure. varpool_get_constructor is the counterpart of cgraph_get_body
> > > (i.e. the means of getting the body in) and it is used by the output machinery,
> > > by ipa-visibility while rewriting the constructor and by ctor_for_folding (which
> > > makes us load the ctor whenever it is needed by ipa-cp or ipa-devirt).
> > > 
> > > I kept get_symbol_initial_value as the authority to decide whether we want to
> > > encode a given constructor or not.  The section itself for a trivial ctor is about
> > > 25 bytes and with the header it is probably close to double that. Currently the
> > > heuristic is to offline only constructors that are CONSTRUCTOR and keep simple
> > > expressions inline.  We may want to tweak it.
> > 
> > Hmm, so what about artificial testcase with gazillions of
> > 
> > struct X { int i; };
> > 
> > struct X a0001 = { 1 };
> > struct X a0002 = { 2 };
> > ....
> > 
> > how does it explode LTO IL size and streaming time (compile-out and
> > LTRANS in)?  I suppose it still helps WPA stage.
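
A testcase of this shape is easy to generate mechanically. The following is a minimal sketch, not part of the patch; the helper name emit_testcase and the four-digit zero padding are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Write COUNT trivially initialized globals to OUT, following the
   pattern from the mail:  struct X aNNNN = { N };
   Returns the number of definitions written, or -1 on I/O error.
   (Hypothetical helper, for illustration only.)  */
static long
emit_testcase (FILE *out, long count)
{
  assert (out != NULL && count >= 0);

  if (fprintf (out, "struct X { int i; };\n\n") < 0)
    return -1;

  for (long n = 1; n <= count; n++)
    if (fprintf (out, "struct X a%04ld = { %ld };\n", n, n) < 0)
      return -1;

  return count;
}
```

Writing the output to a file and compiling it with and without the patch (for example with gcc -O2 -flto -c) would then allow comparing object size and streaming time for a growing count.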
> 
> Well, nothing really artificial, except that gazillions of static variables
> called a0001 to a000gazillion are ugly :))
> 
> I just put the CONSTRUCTOR bits in the initial version to not have the path unused
> at all.  Either we can base our decision on the size of the variable or do a simple
> walk to see if it needs more than, say, 8 trees.

Hum, probably not worth special-casing.
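
For reference, the "simple walk" idea quoted above could look roughly like the sketch below. It is only an illustration under stated assumptions: struct node is a hypothetical stand-in and not GCC's tree representation, and the names count_nodes and worth_offlining are invented here. The walk stops as soon as the limit is exceeded, so its cost is bounded by the limit rather than by the size of the initializer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an initializer expression node.
   This is NOT GCC's tree type.  */
struct node
{
  size_t num_ops;      /* number of operands */
  struct node **ops;   /* operand array, may be NULL if num_ops == 0 */
};

/* Count nodes reachable from N, starting from SEEN, and give up once
   the count exceeds LIMIT.  Shared subtrees are counted once per
   reference; no visited set is kept.  */
static size_t
count_nodes (const struct node *n, size_t limit, size_t seen)
{
  if (n == NULL || seen > limit)
    return seen;
  assert (n->num_ops == 0 || n->ops != NULL);
  seen++;
  for (size_t i = 0; i < n->num_ops && seen <= limit; i++)
    seen = count_nodes (n->ops[i], limit, seen);
  return seen;
}

/* Return true if INIT needs more than LIMIT nodes, i.e. it would be
   a candidate for its own section under this heuristic.  */
static bool
worth_offlining (const struct node *init, size_t limit)
{
  return count_nodes (init, limit, 0) > limit;
}
```

With a limit of 8, an initializer consisting of one node with a single operand stays inline, while one node with eight operands (nine nodes in total) would be offlined.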

> I will play with this incrementally after cleaning up the headers (as those
> account for the overhead)
> > 
> > Also what we desperately miss is to put CONST_DECLs into the symbol 
> > table (and thus eventually move the constant pool to symtab).  That
> > and no longer allowing STRING_CSTs in the IL but only CONST_DECLs
> > with STRING_CST initializers (to fix PR50199).
> 
> Yep, I have a patch for putting CONST_DECLs into the symbol table. It however
> does not help partitionability because at the moment the output machinery does
> not expect const decls to have visibilities.

Well, just make them regular (anonymous) VAR_DECLs then ... (the fact
that a CONST_DECL is anonymous is probably the only real difference - 
and that they are mergeable by content).

> I will push out that change (and LABEL_DECL, too) after Martin's renaming
> patches land on mainline.

Thanks.

> > 
> > > The patch does not bring miraculous savings to firefox WPA, but it does some:
> > > 
> > > GGC memory after global stream is read goes from 1376898k to 1250533k
> > > overall GGC allocations from 4156478 kB to 4012462 kB
> > > read 11006599 SCCs of average size 1.907692 -> read 9119433 SCCs of average size 2.037867
> > > 20997206 tree bodies read in total -> 18584194 tree bodies read in total
> > > Size of mmap'd section decls: 299540188 bytes -> Size of mmap'd section decls: 271557265 bytes
> > > Size of mmap'd section function_body: 5711078 bytes -> Size of mmap'd section function_body: 7548680 bytes 
> > > 
> > > Things would be better if ipa-visibility and ipa-devirt did not load most of
> > > the virtual tables into memory (still better than loading each into memory 20
> > > times on average).  I will work on that incrementally. We load 10311 ctors into
> > > memory at WPA time.
> > > 
> > > Note that firefox seems to feature a really huge data segment these days.
> > > http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-2-firefox.html
> > > 
> > > Bootstrapped/regtested x86_64-linux, tested with firefox, lto bootstrap 
> > > in progress, OK?
> > 
> > The patch looks ok to me.  How about simply doing 
> > s/LTO_section_function_body/LTO_section_symbol_content/ instead of
> > adding LTO_section_variable_initializer?
> 
> Yeah, I was thinking about it, too.
> I think variable and constructor sections may differ in their headers, however,
> since we do not need a CFG stream for variables.
> 
> Thanks!
> Honza
> > 
> > Thanks,
> > Richard.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


