-j8 is given to it, but it creates ~130 lto1 processes. See this downstream issue for details: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=265254
Hmm: https://gcc.gnu.org/onlinedocs/gcc-10.4.0/gcc/Optimize-Options.html#index-flto Maybe the automatic detection of the jobserver is not working correctly on FreeBSD.
That's the WPA LTRANS file generation, which does not use the jobserver but parallelizes for I/O. There is --param lto-max-streaming-parallelism you can use to limit things, but its default is 32 (and that is per parallel-invoked LTO link step, so with -j8 it's 8 * 32 when there are 8 link steps running in parallel). See gcc/lto/lto.{c,cc}:stream_out_partitions, which indeed says

    #ifdef HAVE_WORKING_FORK
    ...
          /* Do not run more than LTO_PARALLELISM streamings
             FIXME: we ignore limits on jobserver.  */
          if (lto_parallelism > 0 && nruns >= lto_parallelism)
            {
              wait_for_child ();
    ...
          if (!cpid)
            {
              setproctitle ("lto1-wpa-streaming");

so "confirmed" - it doesn't honor the jobserver. Note that without -flto=auto or -flto=jobserver it would all be serial; the above also does not honor a limit placed via -flto=8, I think.
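For readers who want to see the throttle in isolation, here is a minimal sketch of the same pattern, not the GCC code itself. The names stream_out_sketch and stream_partition_sketch are hypothetical, and the child does no real work. It only shows that a fixed cap bounds the children of one process: with 8 such processes running side by side, each one applies its own cap independently.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for writing one LTRANS partition to disk.  */
static void
stream_partition_sketch (int part)
{
  (void) part;  /* No real work in this sketch.  */
}

/* Fork one child per partition, but never keep more than MAX_PARALLEL
   children alive at once.  Returns the peak number of live children,
   or -1 if fork or wait fails.  */
int
stream_out_sketch (int n_partitions, int max_parallel)
{
  int nruns = 0;
  int peak = 0;

  assert (max_parallel > 0);

  for (int i = 0; i < n_partitions; i++)
    {
      /* Same shape as the quoted check: once the cap is reached,
         block until one child has finished.  */
      if (nruns >= max_parallel)
        {
          if (wait (NULL) < 0)
            return -1;
          nruns--;
        }

      pid_t cpid = fork ();
      if (cpid < 0)
        return -1;
      if (cpid == 0)
        {
          stream_partition_sketch (i);
          _exit (0);
        }

      nruns++;
      if (nruns > peak)
        peak = nruns;
    }

  /* Reap the children that are still running.  */
  while (nruns > 0)
    {
      if (wait (NULL) < 0)
        return -1;
      nruns--;
    }
  return peak;
}
```

Nothing in this loop knows about sibling processes, which is why the cap multiplies with the number of link steps that make starts in parallel.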
Just to clarify, GCC's WPA stage fork()s to write out LTRANS IL object files in parallel - those processes are not controlled by make, but GCC could request tokens from make's jobserver if it got a handle on a connection (which isn't trivial because make does not provide the necessary open FDs to all sibling processes it creates).
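As background on what "requesting tokens" means: with the pipe-based GNU make jobserver, a client takes a slot by reading one byte from the pipe and gives it back by writing that same byte. The following is a rough sketch of that exchange on a private pipe that stands in for make's. All function names are hypothetical, and the non-blocking read end is an assumption made so that the example cannot hang; it is not something to set on a pipe actually shared with make.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Create a private stand-in for the jobserver pipe and preload it
   with N_TOKENS one-byte tokens.  */
bool
make_jobserver_sketch (int pipefd[2], int n_tokens)
{
  if (pipe (pipefd) != 0)
    return false;

  /* Assumption for this sketch only: a non-blocking read end, so an
     empty pool is reported instead of blocking the caller.  */
  int flags = fcntl (pipefd[0], F_GETFL);
  if (flags < 0 || fcntl (pipefd[0], F_SETFL, flags | O_NONBLOCK) < 0)
    return false;

  for (int i = 0; i < n_tokens; i++)
    if (write (pipefd[1], "+", 1) != 1)
      return false;
  return true;
}

/* Take one token; returns false when none is available.  */
bool
get_token_sketch (const int pipefd[2], char *token)
{
  return read (pipefd[0], token, 1) == 1;
}

/* Hand a token back by writing the byte that was read earlier.  */
bool
return_token_sketch (const int pipefd[2], char token)
{
  return write (pipefd[1], &token, 1) == 1;
}
```

A forking loop would call get_token_sketch before each extra child and return_token_sketch once that child has been reaped.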
If it is possible to query the '-j=N' setting from make via the jobserver connection then it might be practical to auto-tune lto-max-streaming-parallelism to that setting at least (that's still prone to N*N sub-processes).
I've got a patch candidate..
The master branch has been updated by Martin Liska <marxin@gcc.gnu.org>:

https://gcc.gnu.org/g:fed766af32ed6cd371016cc24e931131e19b4eb1

commit r13-2012-gfed766af32ed6cd371016cc24e931131e19b4eb1
Author: Martin Liska <mliska@suse.cz>
Date:   Tue Aug 9 13:59:39 2022 +0200

    lto: respect jobserver in parallel WPA streaming

            PR lto/106328

    gcc/ChangeLog:

            * opts-jobserver.h (struct jobserver_info): Add pipefd.
            (jobserver_info::connect): New.
            (jobserver_info::disconnect): Likewise.
            (jobserver_info::get_token): Likewise.
            (jobserver_info::return_token): Likewise.
            * opts-common.cc: Implement the new functions.

    gcc/lto/ChangeLog:

            * lto.cc (wait_for_child): Decrement nruns once a process finishes.
            (stream_out_partitions): Use job server if active.
            (do_whole_program_analysis): Likewise.
Implemented for GCC 13.
On Wed, 10 Aug 2022, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106328
>
> Martin Liška <marxin at gcc dot gnu.org> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>          Resolution|---                         |FIXED
>              Status|ASSIGNED                    |RESOLVED
>
> --- Comment #7 from Martin Liška <marxin at gcc dot gnu.org> ---
> Implemented for GCC 13.

Magically only with recent GNU make, otherwise needs proper prefixed rules in the lto-wrapper generated makefile which I don't think we do. Without either it's now slow by default?
> Magically only with recent GNU make, otherwise needs proper prefixed
> rules in the lto-wrapper generated makefile which I don't think we do.

Wait, the cooperation works with older GNU make if a Makefile uses prefixed (+) rules. WPA does not emit any artificial Makefile for WPA streaming. It's a Makefile we emit for the LTRANS run, e.g.:

marxin@marxinbox:/dev/shm/objdir> cat /tmp/ccuhgkQs.mk
./a.ltrans0.ltrans.o:
        @g++ '-xlto' '-c' '-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection=none' '-mtune=generic' '-march=x86-64' '-O2' '-save-temps' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a.' '-dumpbase' './a.ltrans0.ltrans' '-fltrans' '-o' './a.ltrans0.ltrans.o' './a.ltrans0.o'
./a.ltrans1.ltrans.o:
        @g++ '-xlto' '-c' '-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection=none' '-mtune=generic' '-march=x86-64' '-O2' '-save-temps' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a.' '-dumpbase' './a.ltrans1.ltrans' '-fltrans' '-o' './a.ltrans1.ltrans.o' './a.ltrans1.o'
...

So what can go wrong is jobserver detection on BSD, which can fail for some reason, but it should work fine apart from that. Or am I missing something?
On Wed, 10 Aug 2022, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106328
>
> --- Comment #9 from Martin Liška <marxin at gcc dot gnu.org> ---
> > Magically only with recent GNU make, otherwise needs proper prefixed
> > rules in the lto-wrapper generated makefile which I don't think we do.
>
> Wait, the cooperation works with older GNU make if a Makefile uses prefixed (+)
> rules. WPA does not emit any artificial Makefile for WPA streaming. It's a
> Makefile we emit for the LTRANS run, e.g.:
>
> marxin@marxinbox:/dev/shm/objdir> cat /tmp/ccuhgkQs.mk
> ./a.ltrans0.ltrans.o:
>         @g++ '-xlto' '-c' '-fno-openmp' '-fno-openacc' '-fno-pie'
> '-fcf-protection=none' '-mtune=generic' '-march=x86-64' '-O2' '-save-temps'
> '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a.'
> '-dumpbase' './a.ltrans0.ltrans' '-fltrans' '-o' './a.ltrans0.ltrans.o'
> './a.ltrans0.o'
> ./a.ltrans1.ltrans.o:
>         @g++ '-xlto' '-c' '-fno-openmp' '-fno-openacc' '-fno-pie'
> '-fcf-protection=none' '-mtune=generic' '-march=x86-64' '-O2' '-save-temps'
> '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a.'
> '-dumpbase' './a.ltrans1.ltrans' '-fltrans' '-o' './a.ltrans1.ltrans.o'
> './a.ltrans1.o'
> ...
>
> So what can go wrong is jobserver detection on BSD, which can fail for some
> reason, but it should work fine apart from that. Or am I missing something?

Ah, indeed. That still leaves the question whether we execute the WPA stage with the FDs open - I suppose you checked? And whether pex_* "properly" does this for all host OSs (how does make jobserver work on mingw/cygwin?). I wonder because Honza once said he didn't implement jobserver support because it would require more fiddling to get it to actually work.

And IIRC BSD 'make' is not GNU make but I think gmake is available from the ports repo. The documentation about -flto=jobserver mentions that already.
> Ah, indeed. That still leaves the question whether we execute the
> WPA stage with the FDs open - I suppose you checked?

Yes, I can confirm it works correctly on Linux with FDs provided by the jobserver.

> And whether
> pex_* "properly" does this for all host OSs (how does make jobserver
> work on mingw/cygwin?).

Based on Tamar's testing, it also uses --jobserver-auth=3,4 on MinGW.

> I wonder because Honza once said he didn't
> implement jobserver support because it would require more fiddling
> to get it actually work.

I asked him and he didn't remember any details.

>
> And IIRC BSD 'make' is not GNU make but I think gmake is available
> from the ports repo. The documentation about -flto=jobserver mentions
> that already.
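To make the --jobserver-auth=3,4 form concrete: a client typically finds the two descriptor numbers by scanning MAKEFLAGS. Below is a small sketch of such a scan, assuming only the numeric R,W form discussed in this thread. The function name is hypothetical, this is not the detection code GCC uses, and other forms (such as the named-pipe variant in newer GNU make) are simply rejected here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Look for "--jobserver-auth=R,W" in MAKEFLAGS and store the two
   descriptor numbers.  If the option appears more than once, the last
   occurrence is used.  Returns false when no usable pair is found.  */
bool
parse_jobserver_auth_sketch (const char *makeflags, int *rfd, int *wfd)
{
  static const char key[] = "--jobserver-auth=";
  const char *value = NULL;

  if (makeflags == NULL)
    return false;

  /* Remember the text following the last occurrence of the key.  */
  for (const char *q = makeflags; (q = strstr (q, key)) != NULL;
       q += sizeof key - 1)
    value = q + sizeof key - 1;

  if (value == NULL)
    return false;

  int r, w;
  if (sscanf (value, "%d,%d", &r, &w) != 2 || r < 0 || w < 0)
    return false;

  *rfd = r;
  *wfd = w;
  return true;
}
```

In a real client the argument would come from getenv ("MAKEFLAGS"), and the descriptors would still need to be checked for validity (for example with fcntl) before use, since make may not have left them open for this particular child.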