[RFC] Old school parallelization of WPA streaming

Jan Hubicka hubicka@ucw.cz
Fri Feb 21 01:58:00 GMT 2014


> > I plan to commit it shortly (i am just slowly progressing through the
> > bugreports and TODOs cumulated)
> > - indeed for bigger apps and edit/relink cycle it is an life saver ;)
> 
> I haven't tested exactly around this, but I see a ~10s (~5%) improved kernel
> LTO build time going from 4.9-20140209 to 20140220

Good to know! It also improves the Firefox build noticeably.
I added some comments to the PR itself. I do not see how it can create too many
WPA processes, given that it does not fork without an explicit -flto=N argument,
which the bootstrap-lto.mk config does not pass. So perhaps something is
wrong with parsing the command-line arguments and setting lto_parallelism?
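
To make the expected behaviour concrete, here is a minimal sketch of the
fork-with-a-cap scheme, not the actual lto.c code. The helper names
(parse_parallelism, stream_one_partition, stream_partitions) are hypothetical,
and for simplicity the sketch treats the jobserver case as "do not fork".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map the value given to -flto=N to a worker count.
   No value means "do not fork" (1); "jobserver" is reported as -1.  */
static int
parse_parallelism (const char *arg)
{
  if (arg == NULL || *arg == '\0')
    return 1;
  if (strcmp (arg, "jobserver") == 0)
    return -1;
  int n = atoi (arg);
  return n > 0 ? n : 1;   /* Anything unparsable falls back to 1.  */
}

/* Placeholder for the real work of streaming out one partition.  */
static void
stream_one_partition (int part)
{
  (void) part;
}

/* Stream N_PARTS partitions with at most PARALLELISM children alive at once.
   Returns the peak number of simultaneously live children (0 when nothing
   was forked), or -1 if fork or wait failed.  */
static int
stream_partitions (int n_parts, int parallelism)
{
  int live = 0, peak = 0;

  assert (n_parts >= 0);
  for (int i = 0; i < n_parts; i++)
    {
      if (parallelism <= 1)
        {
          /* No explicit -flto=N (or jobserver in this sketch): stay
             in-process.  */
          stream_one_partition (i);
          continue;
        }
      /* All slots busy: reap one child before forking another.  */
      if (live == parallelism)
        {
          if (wait (NULL) < 0)
            return -1;
          live--;
        }
      pid_t pid = fork ();
      if (pid < 0)
        return -1;
      if (pid == 0)
        {
          stream_one_partition (i);
          _exit (0);
        }
      live++;
      if (live > peak)
        peak = live;
    }
  /* Reap the remaining children.  */
  while (live > 0)
    {
      if (wait (NULL) < 0)
        return -1;
      live--;
    }
  return peak;
}
```

With this shape, stream_partitions (n, parse_parallelism (NULL)) never forks,
so seeing many WPA processes would point at the parsed value rather than at
the forking loop.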

Also, for all builds I have tested so far, memory use is now dominated not by
WPA streaming but by the subsequent ltrans stages.  Things are different with
-fprofile-generate, which adds a lot of extra data structures to stream.
Generally I am trying to convince people to profile without LTO, as it is much
faster. I will try to reproduce the problem, but I am running ltobootstraps and
profiled-ltobootstrap regularly and have never seen too many WPA processes at
once.

Honza


