Re: [RFC] Old school parallelization of WPA streaming

> >One risk is if someone streams to a spinning disk it may add more seeks
> >with the parallel IO. But I think it's a reasonable tradeoff.
> It'll also wreck all WPA dump files.

We do not dump anything during the main streaming.  If we now stream 2GB for Firefox,
I think we can hope to mostly fit in cache with the whole machinery.

We will need to flush the cgraph file prior to forking and close it in the forked
process.  IMO it is the only one that remains open across the fork boundary.
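
As a rough sketch of the flush-before-fork point, in plain C stdio rather than
GCC's actual streamer code: the helper name stream_with_fork and both file
paths are made up for illustration.  The parent flushes the shared stream
before fork(), and the child closes its inherited copy before writing its own
partition file, so no buffered data can reach the shared file twice.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not a GCC function: write to a shared stream,
   then fork a child that writes one partition file.
   Returns 0 on success, -1 on failure.  */
int
stream_with_fork (const char *shared_path, const char *part_path)
{
  FILE *shared = fopen (shared_path, "w");
  if (!shared)
    return -1;
  fputs ("header\n", shared);

  /* Flush before forking.  Otherwise the unwritten stdio buffer is
     duplicated into the child and could reach the file twice.  */
  if (fflush (shared) != 0)
    {
      fclose (shared);
      return -1;
    }

  pid_t pid = fork ();
  if (pid < 0)
    {
      fclose (shared);
      return -1;
    }
  if (pid == 0)
    {
      /* Child: drop the inherited stream first.  Its buffer is empty,
         so closing it writes nothing.  */
      fclose (shared);
      FILE *part = fopen (part_path, "w");
      if (!part)
        _exit (1);
      fputs ("partition body\n", part);
      _exit (fclose (part) == 0 ? 0 : 1);
    }

  /* Parent: keep using the shared stream, then reap the child.  */
  fputs ("trailer\n", shared);
  int ok = fclose (shared) == 0;
  int status = 0;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return ok && WIFEXITED (status) && WEXITSTATUS (status) == 0 ? 0 : -1;
}
```

The child uses _exit rather than exit so that it does not run the parent's
atexit handlers or flush any other inherited stdio buffers.
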
> >We should also use a faster compressor
> And we should avoid uncompressing the function sections...

Yep, we also need to avoid carrying the whole tree stream of the original source
unit whenever we stream out a function from it.  I think function sections should
have two parts: the references to global trees, which are uncompressed and
translated during WPA streaming, plus a compressed binary blob with the body that
is copied over.
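
To make the two-part idea concrete, here is a minimal sketch.  The layout and
all names (struct fn_section, fn_section_stream_out, the remap table) are
hypothetical and do not match the actual LTO section format.  It only shows
the intended split: the reference table is rewritten entry by entry, while
the compressed body is copied byte for byte without being decompressed.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout, not the real LTO function section.  */
struct fn_section
{
  uint32_t n_refs;       /* number of global tree references */
  uint32_t *refs;        /* uncompressed indices into the unit's tree table */
  size_t body_len;       /* length of the compressed body in bytes */
  unsigned char *body;   /* compressed blob, treated as opaque */
};

void
fn_section_free (struct fn_section *s)
{
  if (!s)
    return;
  free (s->refs);
  free (s->body);
  free (s);
}

/* Build a copy of IN for the output partition.  Each reference is
   translated through REMAP (old index -> new index); the body is
   copied untouched.  Returns NULL on allocation failure or when a
   reference is out of range.  */
struct fn_section *
fn_section_stream_out (const struct fn_section *in,
                       const uint32_t *remap, uint32_t remap_len)
{
  struct fn_section *out = calloc (1, sizeof *out);
  if (!out)
    return NULL;

  /* Allocate at least one element so a NULL result always means failure.  */
  out->refs = malloc ((in->n_refs ? in->n_refs : 1) * sizeof *out->refs);
  out->body = malloc (in->body_len ? in->body_len : 1);
  if (!out->refs || !out->body)
    {
      fn_section_free (out);
      return NULL;
    }

  for (uint32_t i = 0; i < in->n_refs; i++)
    {
      if (in->refs[i] >= remap_len)
        {
          fn_section_free (out);
          return NULL;
        }
      out->refs[i] = remap[in->refs[i]];
    }
  out->n_refs = in->n_refs;

  /* The body is never decompressed, only copied.  */
  if (in->body_len)
    memcpy (out->body, in->body, in->body_len);
  out->body_len = in->body_len;
  return out;
}
```
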
> That said, the patch is enough of a hack that I don't ever want to debug a bug in it....
> I also fail to see why threads should not work here.  Maybe simply annotate gcc with openmp?

It means pushing the global state of lto-streamer into a context variable and either
moving it out of GGC or making GGC thread safe.  I would hope that David Malcolm would
be interested in doing this, but it is a bit more than I have time for right now during
the labs conference.

To be honest I fail to see how a bug in an OpenMP-annotated program would be easier
to debug than one in the fork variant.


