This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Old school parallelization of WPA streaming
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Andi Kleen <ak at linux dot intel dot com>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, Markus Trippelsdorf <markus at trippelsdorf dot de>, Richard Biener <rguenther at suse dot de>, Michael Matz <matz at suse dot de>, gcc-patches at gcc dot gnu dot org, dnovillo at google dot com, dmalcolm at redhat dot com, jakub at redhat dot com
- Date: Fri, 21 Feb 2014 02:58:44 +0100
- Subject: Re: [RFC] Old school parallelization of WPA streaming
> > I plan to commit it shortly (I am just slowly progressing through the
> > accumulated bug reports and TODOs)
> > - indeed for bigger apps and the edit/relink cycle it is a life saver ;)
>
> I haven't tested exactly around this, but I see a ~10s (~5%) improved kernel
> LTO build time going from 4.9-20140209 to 20140220
Good to know! It also improves the Firefox build noticeably.
I added some comments to the PR itself. I do not see how it can create too many
WPA processes, given that it does not fork without an explicit -flto=N argument,
which is not passed by the bootstrap-lto.mk config. So perhaps something is
wrong with parsing the command line arguments and setting lto_parallelizm?
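
To make the expected behaviour concrete, here is a minimal sketch in C of the
rule described above: only an explicit numeric N given to -flto= should enable
forking. The helper name parse_lto_parallelism, the return convention (0 means
"do not fork") and the cap of 1024 are assumptions for illustration only. This
is not the code in the GCC tree, and for simplicity it treats non-numeric
values such as "jobserver" as "no forking".

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, NOT GCC's actual implementation: derive the
   number of parallel WPA streaming workers from the text following
   "-flto=".  A return value of 0 means "stream serially, never fork".  */
static int
parse_lto_parallelism (const char *flto_value)
{
  char *end;
  long n;

  /* Plain -flto (no "=N"): no explicit request, so do not fork.  */
  if (flto_value == NULL || *flto_value == '\0')
    return 0;

  n = strtol (flto_value, &end, 10);

  /* Trailing garbage or a non-numeric value: this sketch treats it as
     "no parallelism" (an assumption made for simplicity).  */
  if (*end != '\0')
    return 0;

  /* N <= 1 gives nothing to parallelize, so do not fork either.  */
  if (n <= 1)
    return 0;

  /* Arbitrary illustrative cap so a bogus huge N cannot spawn an
     unbounded number of processes.  */
  if (n > 1024)
    n = 1024;

  return (int) n;
}
```

If a parser along these lines returned a positive value even though no -flto=N
was passed, that would match the symptom of unexpected WPA processes.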
Also, for all builds I have tested so far, memory use is no longer dominated by
WPA streaming but by the subsequent ltrans-es. Things are different with
-fprofile-generate, which adds a lot of extra data structures to stream. Generally
I am trying to convince people to profile without LTO, as it is much faster.
I will try to reproduce the problem, but I run ltobootstraps and
profiled-ltobootstraps regularly and have never seen too many WPA processes at
once.
Honza