Created attachment 53460 [details] bad.log.xz This week's gcc snapshot fails bootstrap builds in parallel mode surprisingly frequently. The symptom is the same: build fails while .libs/liblto_plugin.so is being linked. I attached successful good.log.xz (-j1) and failing bad.log.xz (-j16) builds. Can you help me understand why it fails? Failure snippet is: checking for fgets_unlocked... /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/bin/bash ./libtool --tag=CC --tag=disable-static --mode=link /build/build/./prev-gcc/xgcc -B/build/build/./prev-gcc/ -B/nix/store/v06bn3lc2s0yjci9px8l829mbks695fm-gfortran-13.0.0/x86_64-unknown-linux-gnu/bin/ -O2 -I/nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include -B/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ -idirafter /nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include -idirafter /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib/gcc/x86_64-unknown-linux-gnu/8.3.0/include-fixed -Wl,-rpath,/nix/store/m3wi1gn0309l15zrha95yv9mw39972db-gfortran-13.0.0-lib/lib -Wl,-L/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-rpath -Wl,/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-dynamic-linker=/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ld-linux-x86-64.so.2 --sysroot=/ -fno-checking -Wall -fcf-protection -DBASE_VERSION='"13.0.0"' -O2 -I/nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include -B/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ -idirafter /nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include -idirafter /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib/gcc/x86_64-unknown-linux-gnu/8.3.0/include-fixed -Wl,-rpath,/nix/store/m3wi1gn0309l15zrha95yv9mw39972db-gfortran-13.0.0-lib/lib -Wl,-L/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-rpath -Wl,/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib 
-Wl,-dynamic-linker=/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ld-linux-x86-64.so.2 -fno-checking -gtoggle -Wc,-static-libgcc -pthread -module -avoid-version -bindir /nix/store/v06bn3lc2s0yjci9px8l829mbks695fm-gfortran-13.0.0/libexec/gcc/x86_64-unknown-linux-gnu/13.0.0 -Wl,--version-script=../../gcc-13-20220814/lto-plugin/lto-plugin.map -Xcompiler '-static-libstdc++' -Xcompiler '-static-libgcc' '-O2' '-I/nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include' -Xcompiler '-B/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/' '-idirafter' '/nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include' '-idirafter' '/nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib/gcc/x86_64-unknown-linux-gnu/8.3.0/include-fixed' '-Wl,-rpath,/nix/store/m3wi1gn0309l15zrha95yv9mw39972db-gfortran-13.0.0-lib/lib' '-Wl,-L/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib' '-Wl,-rpath' '-Wl,/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib' '-Wl,-dynamic-linker=/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ld-linux-x86-64.so.2' -o liblto_plugin.la -rpath /nix/store/v06bn3lc2s0yjci9px8l829mbks695fm-gfortran-13.0.0/libexec/gcc/x86_64-unknown-linux-gnu/13.0.0 lto-plugin.lo -Wc,../libiberty/pic/libiberty.a yes checking for fileno_unlocked... 
libtool: link: /build/build/./prev-gcc/xgcc -B/build/build/./prev-gcc/ -B/nix/store/v06bn3lc2s0yjci9px8l829mbks695fm-gfortran-13.0.0/x86_64-unknown-linux-gnu/bin/ -O2 -I/nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include -B/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ -idirafter /nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include -idirafter /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib/gcc/x86_64-unknown-linux-gnu/8.3.0/include-fixed -Wl,-rpath,/nix/store/m3wi1gn0309l15zrha95yv9mw39972db-gfortran-13.0.0-lib/lib -Wl,-L/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-rpath -Wl,/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-dynamic-linker=/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ld-linux-x86-64.so.2 --sysroot=/ -fno-checking -shared -fPIC -DPIC .libs/lto-plugin.o -Wl,-rpath -Wl,/nix/store/m3wi1gn0309l15zrha95yv9mw39972db-gfortran-13.0.0-lib/lib -Wl,-L/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-rpath -Wl,/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-dynamic-linker=/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ld-linux-x86-64.so.2 -Wl,-rpath -Wl,/nix/store/m3wi1gn0309l15zrha95yv9mw39972db-gfortran-13.0.0-lib/lib -Wl,-L/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-rpath -Wl,/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-dynamic-linker=/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ld-linux-x86-64.so.2 -static-libgcc -pthread -Wl,--version-script=../../gcc-13-20220814/lto-plugin/lto-plugin.map -static-libstdc++ -static-libgcc -B/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ -Wl,-rpath -Wl,/nix/store/m3wi1gn0309l15zrha95yv9mw39972db-gfortran-13.0.0-lib/lib -Wl,-L/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib -Wl,-rpath -Wl,/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib 
-Wl,-dynamic-linker=/nix/store/z1as323dfsk12agzlp9ia35p5801isd7-glibc-2.35-163/lib/ld-linux-x86-64.so.2 ../libiberty/pic/libiberty.a -pthread -Wl,-soname -Wl,liblto_plugin.so -o .libs/liblto_plugin.so xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address compilation terminated. make[4]: *** [Makefile:472: liblto_plugin.la] Error 1 make[4]: Leaving directory '/build/build/lto-plugin' make[3]: *** [Makefile:383: all] Error 2 make[3]: Leaving directory '/build/build/lto-plugin' make[2]: *** [Makefile:15579: all-stage2-lto-plugin] Error 2 make[2]: *** Waiting for unfinished jobs....
Created attachment 53461 [details] good.log.xz
Used configure options: configure flags: --prefix=/nix/store/fx45rjgwi61c5xx6xyxz9lih1bkyv374-gfortran-13.0.0 --with-gmp-include=/nix/store/gyr707p3ac6ss8pcmf14g0hx041vj9xf-gmp-with-cxx-stage3-6.2.1-dev/include --with-gmp-lib=/nix/store/lcnnbhzzvknkfnlm5qh89xn4in9jm035-gmp-with-cxx-stage3-6.2.1/lib --with-mpfr-include=/nix/store/nfxamp6dnv1jhydhjndnln3maixsw22d-mpfr-stage3-4.1.0-dev/include --with-mpfr-lib=/nix/store/gwrfldp0x95sgsd6kqi2ms52kp68qrk7-mpfr-stage3-4.1.0/lib --with-mpc=/nix/store/3m2bgmj266d857m3x4sfzcbx0rpsqyfd-libmpc-stage3-1.2.1 --with-libelf=/nix/store/7gv2c6bfr8gzzikkp04l5py3yd6w5w13-libelf-0.8.13 --with-native-system-header-dir=/nix/store/q7l8qdpbvm594q4ayf4xr8wfqknc0nmg-glibc-2.35-163-dev/include --with-build-sysroot=/ --program-prefix= --enable-lto --disable-libstdcxx-pch --without-included-gettext --with-system-zlib --enable-checking=release --enable-static --enable-languages=fortran --disable-multilib --enable-plugin --with-isl=/nix/store/vcik6gi61dpw72ygd7lqv8g074m5p4cw-isl-stage3-0.20 --build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu --target=x86_64-unknown-linux-gnu
Works for me (with system dependencies); tried with g:e236d671d460dd47262accdea2e9d1d80820ae88.
Bisected locally down to:

53e3b2bf16a486c15c20991c6095f7be09012b55 is the first bad commit
commit 53e3b2bf16a486c15c20991c6095f7be09012b55
Author: Martin Liska <mliska@suse.cz>
Date:   Tue Aug 9 13:59:36 2022 +0200

    lto: support --jobserver-style=fifo for recent GNU make

    gcc/ChangeLog:

            * opts-jobserver.h: Add one member.
            * opts-common.cc (jobserver_info::jobserver_info): Parse FIFO
            format of --jobserver-auth.

 gcc/opts-common.cc   | 17 +++++++++++++++--
 gcc/opts-jobserver.h |  2 ++
 2 files changed, 17 insertions(+), 2 deletions(-)

Which makes some sense as I locally run GNU make with --shuffle enabled by default: https://savannah.gnu.org/bugs/index.php?62100

It should generate environment something like 'MAKEFLAGS= -j2 --jobserver-auth=3,4 --shuffle=1660054175'.
(In reply to Sergei Trofimovich from comment #4)
> Bisected locally down to:
>
> 53e3b2bf16a486c15c20991c6095f7be09012b55 is the first bad commit
> commit 53e3b2bf16a486c15c20991c6095f7be09012b55
> Author: Martin Liska <mliska@suse.cz>
> Date: Tue Aug 9 13:59:36 2022 +0200
>
> lto: support --jobserver-style=fifo for recent GNU make
>
> gcc/ChangeLog:
>
> * opts-jobserver.h: Add one member.
> * opts-common.cc (jobserver_info::jobserver_info): Parse FIFO
> format of --jobserver-auth.
>
> gcc/opts-common.cc | 17 +++++++++++++++--
> gcc/opts-jobserver.h | 2 ++
> 2 files changed, 17 insertions(+), 2 deletions(-)

Funny.

> Which makes some sense as I locally run GNU make with --shuffle enabled by
> default: https://savannah.gnu.org/bugs/index.php?62100

Well, it's more likely caused by the fact that recent GNU make uses the newly added fifo style for the jobserver. Let me try reproducing it with the current make master.

> It should generate environment something like 'MAKEFLAGS= -j2
> --jobserver-auth=3,4 --shuffle=1660054175'.
> Which makes some sense as I locally run GNU make with --shuffle enabled by
> default: https://savannah.gnu.org/bugs/index.php?62100

Do you have a special patch on top of that? Which exact revision of make do you use?
I'm using GNU make from the https://git.savannah.gnu.org/cgit/make.git/commit/?id=621d3196fae94e9006a7e9c5ffdaf5ec209bf832 commit (from around 22 June, before FIFO support). On top of that I apply --shuffle=random by default:

--- a/src/main.c
+++ b/src/main.c
@@ -1513,6 +1513,10 @@ main (int argc, char **argv, char **envp)
       arg_job_slots = env_slots;
     }
 
+  /* Set less conservative default. */
+  if (! shuffle_mode)
+    shuffle_mode= xstrdup ("random");
+
   /* Handle shuffle mode argument. */
   if (shuffle_mode)
     {

But I think I also see crashes with GNU make-4.2.1.

I don't yet see anything wrong with the `lto: support --jobserver-style=fifo for recent GNU make` patch. I'll keep digging into what's wrong with my environment.
I think I understand now why it's such a mysterious failure. gcc uses putenv() incorrectly!

I think the real bug was introduced in commit 1270ccda70ca09f7d4 "Factor out jobserver_active_p.". Its gist is the change from `xputenv (concat ("MAKEFLAGS=", dup, NULL));` to `xputenv (jinfo.skipped_makeflags.c_str ());`.

The difference here is what happens with the memory handed to putenv(). putenv() is an odd API as it does not copy data, it just interns the pointer:

// $ cat a.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char arr[1000] = "FOO=1234";

int main() {
    putenv(arr);
    printf("getenv(FOO)='%s'\n", getenv("FOO"));
    sprintf(arr + strlen("FOO="), "!!!!");
    printf("getenv(FOO)='%s'\n", getenv("FOO"));
}

Thus the environment entry installed by `xputenv (jinfo.skipped_makeflags.c_str ());` gets clobbered with garbage as soon as the string is freed and its memory is reused.

I think commit 53e3b2bf16a486c "lto: support --jobserver-style=fifo for recent GNU make" only happens to tickle string reallocation as it does things with more std::strings.

As a hack it looks like the following is enough to build a gcc for me:

--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -9182,7 +9182,7 @@ driver::detect_jobserver () const
 {
   jobserver_info jinfo;
   if (!jinfo.is_active && !jinfo.skipped_makeflags.empty ())
-    xputenv (jinfo.skipped_makeflags.c_str ());
+    xputenv (xstrdup(jinfo.skipped_makeflags.c_str ()));
 }
 
 /* Determine what the exit code of the driver should be. */

Not sure what should be used instead for proper memory management.
Thanks for finding out! To be honest, I did check the path leading to env_manager::xput, but it copies the string only if m_can_restore is set. The patch is fine, please send it to gcc-patches as obvious!
Let's declare it a driver bug. Proposed the patch as: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599799.html
The master branch has been updated by Sergei Trofimovich <slyfox@gcc.gnu.org>:

https://gcc.gnu.org/g:2b403297b111c990c331b5bbb6165b061ad2259b

commit r13-2075-g2b403297b111c990c331b5bbb6165b061ad2259b
Author: Sergei Trofimovich <siarheit@google.com>
Date:   Tue Aug 16 12:35:07 2022 +0100

    driver: fix environ corruption after putenv() [PR106624]

    The bug appeared after r13-2010-g1270ccda70ca09 "Factor out
    jobserver_active_p" slightly changed `putenv()` use from allocating
    to non-allocating:

        -xputenv (concat ("MAKEFLAGS=", dup, NULL));
        +xputenv (jinfo.skipped_makeflags.c_str ());

    `xputenv()` (and `putenv()`) don't copy strings and only store the
    pointer in the `environ` global table. As a result `environ` got
    corrupted as soon as `jinfo.skipped_makeflags` store got deallocated.

    This started causing bootstrap crashes in `execv()` calls:

        xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address

    The change restores memory allocation for `xputenv()` argument.

    gcc/

            PR driver/106624
            * gcc.cc (driver::detect_jobserver): Allocate storage xputenv()
            argument using xstrdup().
Should be fixed now.
The releases/gcc-12 branch has been updated by Martin Liska <marxin@gcc.gnu.org>:

https://gcc.gnu.org/g:193f7e62815b4089dfaed4c2bd34fd4f10209e27

commit r12-9061-g193f7e62815b4089dfaed4c2bd34fd4f10209e27
Author: Sergei Trofimovich <siarheit@google.com>
Date:   Tue Aug 16 12:35:07 2022 +0100

    driver: fix environ corruption after putenv() [PR106624]

    The bug appeared after r13-2010-g1270ccda70ca09 "Factor out
    jobserver_active_p" slightly changed `putenv()` use from allocating
    to non-allocating:

        -xputenv (concat ("MAKEFLAGS=", dup, NULL));
        +xputenv (jinfo.skipped_makeflags.c_str ());

    `xputenv()` (and `putenv()`) don't copy strings and only store the
    pointer in the `environ` global table. As a result `environ` got
    corrupted as soon as `jinfo.skipped_makeflags` store got deallocated.

    This started causing bootstrap crashes in `execv()` calls:

        xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address

    The change restores memory allocation for `xputenv()` argument.

    gcc/

            PR driver/106624
            * gcc.cc (driver::detect_jobserver): Allocate storage xputenv()
            argument using xstrdup().

    (cherry picked from commit 2b403297b111c990c331b5bbb6165b061ad2259b)
The releases/gcc-11 branch has been updated by Martin Liska <marxin@gcc.gnu.org>:

https://gcc.gnu.org/g:9d21cc4edd94f8f2b1a3241fab5cf75649003226

commit r11-10479-g9d21cc4edd94f8f2b1a3241fab5cf75649003226
Author: Sergei Trofimovich <siarheit@google.com>
Date:   Tue Aug 16 12:35:07 2022 +0100

    driver: fix environ corruption after putenv() [PR106624]

    The bug appeared after r13-2010-g1270ccda70ca09 "Factor out
    jobserver_active_p" slightly changed `putenv()` use from allocating
    to non-allocating:

        -xputenv (concat ("MAKEFLAGS=", dup, NULL));
        +xputenv (jinfo.skipped_makeflags.c_str ());

    `xputenv()` (and `putenv()`) don't copy strings and only store the
    pointer in the `environ` global table. As a result `environ` got
    corrupted as soon as `jinfo.skipped_makeflags` store got deallocated.

    This started causing bootstrap crashes in `execv()` calls:

        xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address

    The change restores memory allocation for `xputenv()` argument.

    gcc/

            PR driver/106624
            * gcc.c (driver::detect_jobserver): Allocate storage xputenv()
            argument using xstrdup().

    (cherry picked from commit 2b403297b111c990c331b5bbb6165b061ad2259b)
The releases/gcc-10 branch has been updated by Martin Liska <marxin@gcc.gnu.org>:

https://gcc.gnu.org/g:6ced00d53d91ea429948b34e6600b4633f962030

commit r10-11172-g6ced00d53d91ea429948b34e6600b4633f962030
Author: Sergei Trofimovich <siarheit@google.com>
Date:   Tue Aug 16 12:35:07 2022 +0100

    driver: fix environ corruption after putenv() [PR106624]

    The bug appeared after r13-2010-g1270ccda70ca09 "Factor out
    jobserver_active_p" slightly changed `putenv()` use from allocating
    to non-allocating:

        -xputenv (concat ("MAKEFLAGS=", dup, NULL));
        +xputenv (jinfo.skipped_makeflags.c_str ());

    `xputenv()` (and `putenv()`) don't copy strings and only store the
    pointer in the `environ` global table. As a result `environ` got
    corrupted as soon as `jinfo.skipped_makeflags` store got deallocated.

    This started causing bootstrap crashes in `execv()` calls:

        xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address

    The change restores memory allocation for `xputenv()` argument.

    gcc/

            PR driver/106624
            * gcc.c (driver::detect_jobserver): Allocate storage xputenv()
            argument using xstrdup().

    (cherry picked from commit 2b403297b111c990c331b5bbb6165b061ad2259b)