Bug 98784 - [11/12/13/14 Regression] problematic build of uClibc with -fPIC
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 10.2.0
Importance: P4 normal
Target Milestone: 11.5
Assignee: Not yet assigned to anyone
 
Reported: 2021-01-21 22:13 UTC by Romain Naour
Modified: 2023-08-08 09:51 UTC

Target: sparc-linux
Known to work: 8.3.0, 9.2.0
Known to fail: 8.4.0, 9.3.0
Last reconfirmed: 2021-01-21 00:00:00


Attachments
working uClibc build (224.71 KB, application/x-sharedlib)
2021-06-28 16:40 UTC, YannSionneau
non-working uClibc-ng build (224.34 KB, application/x-sharedlib)
2021-06-28 16:50 UTC, YannSionneau

Description Romain Naour 2021-01-21 22:13:09 UTC
Hello,

When building a rootfs for sparcv8 using the Buildroot defconfig qemu_sparc_ss10_defconfig, the system produces illegal instruction messages.

gcc 8.3 and 9.2 are the latest working gcc versions.
A git bisect between gcc 8.3 and 8.4 identified the commit that introduced the regression:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0a83f1a441d7aaadecb368c237b6ee70bd7b91d6

The commit was introduced to fix the following bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92095

It has been backported to gcc 8.4 and 9.3.

Reverting this patch produces a working rootfs.

This issue can be reproduced using the following steps:

$ git clone git://git.busybox.net/buildroot
$ cd buildroot/
$ git checkout 2020.11.1
$ make qemu_sparc_ss10_defconfig
$ make
$ ./output/images/start-qemu.sh

The kernel boots correctly but the login program (busybox) crashes while trying to log in:

[...]
Starting syslogd: 
Welcome to Buildroot
buildroot login: root

Welcome to Buildroot
buildroot login: root


For now, it's just a basic test that reproduces the issue.

We can use a shell instead of the init program, but even with a simple command such as 'ls' the system crashes:

sh-5.0# ls

CPU: 0 PID: 1 Comm: sh Not tainted 4.19.16 #1
[f0022fbc : do_exit+0x948/0x968 ]
[f000afec : do_signal+0x5f8/0x79c ]
[f000b4b4 : do_notify_resume+0x48/0x58 ]
[f0008c88 : signal_p+0x14/0x24 ]
[0004b874 : 0x4b874 ]

Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
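The exitcode in the panic line can be read as a wait-style status. Below is a minimal C sketch of that decoding, assuming the kernel prints the same encoding that wait() returns. The helper name terminating_signal is hypothetical and not part of any library.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical helper: return the signal that terminated a process, given a
   wait()-style status, or 0 if the status does not describe death by signal.
   Assumption: the kernel's "exitcode" uses the same encoding as wait(). */
int terminating_signal(int status)
{
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, terminating_signal(0x00000004) evaluates to 4, which is SIGILL. That is consistent with the illegal instruction messages mentioned at the top of this report.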

Best regards,
Romain
Comment 1 Eric Botcazou 2021-01-21 23:01:04 UTC
Thanks for reporting the problem, although it's rather elusive at this point.
What exactly is miscompiled: the kernel or the userland?  The change is supposed to make a difference only in PIC/PIE mode, so why does this mode matter here?
Comment 2 Romain Naour 2021-01-21 23:26:59 UTC
Hello,

The kernel and userland are built with the same toolchain, but it is the userspace program (such as busybox) that crashes.

Busybox is built with the following flags:

Toolchain wrapper executing:
    'qemu_sparc_ss10_defconfig/host/bin/sparc-buildroot-linux-uclibc-gcc.br_real'
    '--sysroot'
    'qemu_sparc_ss10_defconfig/host/sparc-buildroot-linux-uclibc/sysroot'
    '-Wp,-MD,libbb/.vfork_daemon_rexec.o.d'
    '-std=gnu99'
    '-Iinclude'
    '-Ilibbb'
    '-include'
    'include/autoconf.h'
    '-D_GNU_SOURCE'
    '-DNDEBUG'
    '-D_LARGEFILE_SOURCE'
    '-D_LARGEFILE64_SOURCE'
    '-D_FILE_OFFSET_BITS=64'
    '-DBB_VER="1.33.0"'
    '-D_LARGEFILE_SOURCE'
    '-D_LARGEFILE64_SOURCE'
    '-D_FILE_OFFSET_BITS=64'
    '-Os'
    '-Wall'
    '-Wshadow'
    '-Wwrite-strings'
    '-Wundef'
    '-Wstrict-prototypes'
    '-Wunused'
    '-Wunused-parameter'
    '-Wunused-function'
    '-Wunused-value'
    '-Wmissing-prototypes'
    '-Wmissing-declarations'
    '-Wno-format-security'
    '-Wdeclaration-after-statement'
    '-Wold-style-definition'
    '-finline-limit=0'
    '-fno-builtin-strlen'
    '-fomit-frame-pointer'
    '-ffunction-sections'
    '-fdata-sections'
    '-fno-guess-branch-probability'
    '-funsigned-char'
    '-static-libgcc'
    '-falign-functions=1'
    '-falign-jumps=1'
    '-falign-labels=1'
    '-falign-loops=1'
    '-fno-unwind-tables'
    '-fno-asynchronous-unwind-tables'
    '-fno-builtin-printf'
    '-Os'
    '-DKBUILD_BASENAME="vfork_daemon_rexec"'
    '-DKBUILD_MODNAME="vfork_daemon_rexec"'
    '-c'
    '-o'
    'libbb/vfork_daemon_rexec.o'
    'libbb/vfork_daemon_rexec.c'

So -fPIE is not used here but there is a side effect when the patch is applied.

Note: This is an initial report, I don't have any clue about the real issue.

Best regards,
Romain
Comment 3 Eric Botcazou 2021-01-21 23:43:23 UTC
> So -fPIE is not used here but there is a side effect when the patch is applied.

You need to look at the output of 'gcc -v' to be sure of that.
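As a complement to reading the 'gcc -v' output, the compiler's predefined macros can also show whether a given translation unit is built as PIC or PIE. The following is a minimal sketch, assuming it is compiled with exactly the same flags as the object under suspicion. The helper names pic_level and pie_level are hypothetical.

```c
/* Hypothetical helper: report the PIC level this translation unit was
   compiled with. GCC documents __PIC__ as 1 for -fpic and 2 for -fPIC;
   it is left undefined when no PIC mode is active. */
int pic_level(void)
{
#ifdef __PIC__
    return __PIC__;
#else
    return 0;
#endif
}

/* Same idea for PIE: GCC documents __PIE__ as 1 for -fpie and 2 for -fPIE. */
int pie_level(void)
{
#ifdef __PIE__
    return __PIE__;
#else
    return 0;
#endif
}
```

If both helpers return 0 for the flags in question, neither mode is active for that compilation, whether requested on the command line or enabled by a configure-time default.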
Comment 4 Romain Naour 2021-01-23 11:43:23 UTC
(In reply to Eric Botcazou from comment #3)
> > So -fPIE is not used here but there is a side effect when the patch is applied.
> 
> You need to look at the output of 'gcc -v' to be sure of that.

output/host/bin/sparc-buildroot-linux-uclibc-gcc -v

Using built-in specs.
COLLECT_GCC=/output/host/bin/sparc-buildroot-linux-uclibc-gcc.br_real
COLLECT_LTO_WRAPPER=/output/host/libexec/gcc/sparc-buildroot-linux-uclibc/10.2.0/lto-wrapper
Target: sparc-buildroot-linux-uclibc
Configured with: ./configure --prefix=/output/host --sysconfdir=/output/host/etc --enable-static -q --target=sparc-buildroot-linux-uclibc --with-sysroot=/output/host/sparc-buildroot-linux-uclibc/sysroot --enable-__cxa_atexit --with-gnu-ld --disable-libssp --disable-multilib --disable-decimal-float --with-gmp=output/host --with-mpc=output/host --with-mpfr=output/host --with-pkgversion='Buildroot 2020.11-999-g57d61a3986' --with-bugurl=http://bugs.buildroot.net/ --without-zstd --disable-libitm --disable-libquadmath --disable-libquadmath-support --disable-libsanitizer --disable-libsanitizer --enable-tls --enable-threads --without-isl --without-cloog --with-cpu=v8 --enable-languages=c,c++ --with-build-time-tools=/output/host/sparc-buildroot-linux-uclibc/bin --enable-shared --disable-libgomp --silent
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (Buildroot 2020.11-999-g57d61a3986)

I can't give more debug info since all programs segfault, even gdb and gdbserver.
Comment 5 Eric Botcazou 2021-01-23 17:33:29 UTC
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 10.2.0 (Buildroot 2020.11-999-g57d61a3986)
> 
> I can't give more debug info since all program segfault, even gdb or
> gdbserver.

Thanks, definitely puzzling.  I don't see how the patch can change anything in a compilation not involving PIC/PIE or TLS.  If it does, then something really weird might be going on but, at the same time, nobody has been testing --with-cpu=v8 for a decade or two, I think, so this is plausible.  I'm going to give it a try.
Comment 6 Romain Naour 2021-01-23 20:07:59 UTC
Hello,

Thanks for the help,

The previous gcc command line was from the busybox build (without -fPIC), but it is not busybox that crashes... it is the libc.

See how the libc (uClibc) was built:

output/host/bin/sparc-buildroot-linux-uclibc-gcc -c libc/stdlib/atoll.c -o libc/stdlib/atoll.os -Wall -Wstrict-prototypes -Wstrict-aliasing -Wno-nonnull-compare -funsigned-char -fno-builtin -fcommon -fno-asm -fmerge-all-constants -std=gnu99 -mcpu=v8 -fno-stack-protector -nostdinc -I./include -I./include -include libc-symbols.h -I./libc/sysdeps/linux/sparc -I./libc/sysdeps/linux -I./ldso/ldso/sparc -I./ldso/include -I. -Os -fstrict-aliasing -D__USE_STDIO_FUTEXES__ -DHAVE_FORCED_UNWIND -D_LIBC_REENTRANT -I./libpthread/nptl -I./libpthread/nptl -I./libpthread/nptl/sysdeps/unix/sysv/linux/sparc -I./libpthread/nptl/sysdeps/sparc -I./libpthread/nptl/sysdeps/sparc -I./libpthread/nptl/sysdeps/unix/sysv/linux -I./libpthread/nptl/sysdeps/unix/sysv/linux -I./libpthread/nptl/sysdeps/pthread -I./libpthread/nptl/sysdeps/pthread/bits -I./libpthread/nptl/sysdeps/generic -I./libc/sysdeps/linux/common -isystem output/host/lib/gcc/sparc-buildroot-linux-uclibc/10.2.0/include-fixed -isystem output/host/lib/gcc/sparc-buildroot-linux-uclibc/10.2.0/include -Ioutput/build/linux-headers-5.4.88/usr/include/ -DNDEBUG -DIN_LIB=libc -fPIC -MT libc/stdlib/atoll.os -MD -MP -MF libc/stdlib/.atoll.os.dep

Indeed we have "-fPIC"

The system boots correctly if I replace the libc library with a working one.

I'm not familiar with gcc internals, but I tried removing "!optimize" from the if clause [1], which leaves:

"if (!flag_pic || !crtl->uses_pic_offset_table)"

It seems to work (ok probably not the correct fix).
Is the issue related to the optimization level (-Os vs -O1)?

[1] https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/sparc/sparc.c;h=aefced85fe142885b1b31fa878a0ff0dfd4e921a;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l13097
Comment 7 Eric Botcazou 2021-01-23 21:56:49 UTC
> The previous gcc command line was from the busybox build (without -fPIC) but
> this is not busybox that crash... this is the libc.
> 
> See how the libc (uClibc) was built:
> 
> output/host/bin/sparc-buildroot-linux-uclibc-gcc -c libc/stdlib/atoll.c -o
> libc/stdlib/atoll.os -Wall -Wstrict-prototypes -Wstrict-aliasing
> -Wno-nonnull-compare -funsigned-char -fno-builtin -fcommon -fno-asm
> -fmerge-all-constants -std=gnu99 -mcpu=v8 -fno-stack-protector -nostdinc
> -I./include -I./include -include libc-symbols.h -I./libc/sysdeps/linux/sparc
> -I./libc/sysdeps/linux -I./ldso/ldso/sparc -I./ldso/include -I. -Os
> -fstrict-aliasing -D__USE_STDIO_FUTEXES__ -DHAVE_FORCED_UNWIND
> -D_LIBC_REENTRANT -I./libpthread/nptl -I./libpthread/nptl
> -I./libpthread/nptl/sysdeps/unix/sysv/linux/sparc
> -I./libpthread/nptl/sysdeps/sparc -I./libpthread/nptl/sysdeps/sparc
> -I./libpthread/nptl/sysdeps/unix/sysv/linux
> -I./libpthread/nptl/sysdeps/unix/sysv/linux
> -I./libpthread/nptl/sysdeps/pthread -I./libpthread/nptl/sysdeps/pthread/bits
> -I./libpthread/nptl/sysdeps/generic -I./libc/sysdeps/linux/common -isystem
> output/host/lib/gcc/sparc-buildroot-linux-uclibc/10.2.0/include-fixed
> -isystem output/host/lib/gcc/sparc-buildroot-linux-uclibc/10.2.0/include
> -Ioutput/build/linux-headers-5.4.88/usr/include/ -DNDEBUG -DIN_LIB=libc
> -fPIC -MT libc/stdlib/atoll.os -MD -MP -MF libc/stdlib/.atoll.os.dep
> 
> Indeed we have "-fPIC"

OK, this makes sense now and this looks like a bootstrap problem, e.g. the code setting up _GLOBAL_OFFSET_TABLE_ in the libc might be trying to access it or something along this line.

Can you find out which module of uClibc sets up _GLOBAL_OFFSET_TABLE_ and confirm that it is compiled with -fPIC as well?  If so, would it be possible *not* to compile with -fPIC?
Comment 8 Eric Botcazou 2021-01-24 09:17:38 UTC
> OK, this makes sense now and this looks like a bootstrap problem, e.g. the
> code setting up _GLOBAL_OFFSET_TABLE_ in the libc might be trying to access
> it or something along this line.

I misremembered: the code loading the GOT register is eliminated if not used in the end, but it can block the leaf register optimization, i.e. a register window is allocated although it is not needed.  So does uClibc depend on the fact that a register window is not allocated in some specific spot?
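One way to look into the register window question would be to compile a pair of small leaf functions to assembly and check whether the prologue contains a save instruction. The file below is a hypothetical test input, not taken from uClibc. It could be compiled with flags from the uClibc command line quoted above plus -S (for example -Os -fPIC -mcpu=v8 -S), once with a working compiler and once with a failing one, and the two outputs compared.

```c
/* Hypothetical test input, not taken from uClibc. */

/* Global data: in PIC code, access to it normally goes through the GOT. */
int counter;

/* Leaf function that touches no global data, so it should not need the
   GOT pointer at all. */
int add_one(int x)
{
    return x + 1;
}

/* Leaf function that reads a global, so PIC code needs the GOT pointer. */
int read_counter(void)
{
    return counter;
}
```

Whether either function gets a register window depends on the compiler version and flags, so no expected assembly is given here.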
Comment 9 Romain Naour 2021-01-26 21:24:00 UTC
(In reply to Eric Botcazou from comment #7)
> > The previous gcc command line was from the busybox build (without -fPIC) but
> > this is not busybox that crash... this is the libc.
> > 
> > See how the libc (uClibc) was built:
> > 
> > output/host/bin/sparc-buildroot-linux-uclibc-gcc -c libc/stdlib/atoll.c -o
> > libc/stdlib/atoll.os -Wall -Wstrict-prototypes -Wstrict-aliasing
> > -Wno-nonnull-compare -funsigned-char -fno-builtin -fcommon -fno-asm
> > -fmerge-all-constants -std=gnu99 -mcpu=v8 -fno-stack-protector -nostdinc
> > -I./include -I./include -include libc-symbols.h -I./libc/sysdeps/linux/sparc
> > -I./libc/sysdeps/linux -I./ldso/ldso/sparc -I./ldso/include -I. -Os
> > -fstrict-aliasing -D__USE_STDIO_FUTEXES__ -DHAVE_FORCED_UNWIND
> > -D_LIBC_REENTRANT -I./libpthread/nptl -I./libpthread/nptl
> > -I./libpthread/nptl/sysdeps/unix/sysv/linux/sparc
> > -I./libpthread/nptl/sysdeps/sparc -I./libpthread/nptl/sysdeps/sparc
> > -I./libpthread/nptl/sysdeps/unix/sysv/linux
> > -I./libpthread/nptl/sysdeps/unix/sysv/linux
> > -I./libpthread/nptl/sysdeps/pthread -I./libpthread/nptl/sysdeps/pthread/bits
> > -I./libpthread/nptl/sysdeps/generic -I./libc/sysdeps/linux/common -isystem
> > output/host/lib/gcc/sparc-buildroot-linux-uclibc/10.2.0/include-fixed
> > -isystem output/host/lib/gcc/sparc-buildroot-linux-uclibc/10.2.0/include
> > -Ioutput/build/linux-headers-5.4.88/usr/include/ -DNDEBUG -DIN_LIB=libc
> > -fPIC -MT libc/stdlib/atoll.os -MD -MP -MF libc/stdlib/.atoll.os.dep
> > 
> > Indeed we have "-fPIC"
> 
> OK, this makes sense now and this looks like a bootstrap problem, e.g. the
> code setting up _GLOBAL_OFFSET_TABLE_ in the libc might be trying to access
> it or something along this line.
> 
> Can you find out which module of uClibc sets up _GLOBAL_OFFSET_TABLE_ and
> confirm that it is compiled with -fPIC as well?  If so, would it be possible
> *not* to compile with -fPIC?

There is an option [1] to build all of uClibc as PIC objects, but some other parts are built unconditionally as PIC objects. This option is always set in Buildroot's uClibc configuration. Disabling it doesn't make any difference. Removing -fPIC from the Makefiles produces a non-working libc.

[1] https://cgit.uclibc-ng.org/cgi/cgit/uclibc-ng.git/tree/extra/Configs/Config.in?id=ab1dd83bec59c9e65c31efd6e887182948f627be#n296
Comment 10 Romain Naour 2021-01-26 22:29:08 UTC
(In reply to Eric Botcazou from comment #8)
> > OK, this makes sense now and this looks like a bootstrap problem, e.g. the
> > code setting up _GLOBAL_OFFSET_TABLE_ in the libc might be trying to access
> > it or something along this line.
> 
> I misremembered: the code loading the GOT register is eliminated if not used
> in the end, but it can block the leaf register optimization, i.e. a register
> window is allocated although it is not needed.  So does uClibc depend on the
> fact that a register window is not allocated in some specific spot?

Since some parts of the uClibc code come from glibc, I'm trying to compare with glibc 2.30... but there are some differences.

For example, there is no SETUP_PIC_REG_LEAF definition for sparc32 in uClibc:
(SETUP_PIC_REG_LEAF uses _GLOBAL_OFFSET_TABLE_ internally)

https://cgit.uclibc-ng.org/cgi/cgit/uclibc-ng.git/tree/libc/sysdeps/linux/sparc/sysdep.h?id=ab1dd83bec59c9e65c31efd6e887182948f627be

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/sparc/sysdep.h;h=31a8addebcbeec2f60ece377677bc2be137b3664;hb=d811d240c06a8191db88ad4f1e60e1b672e4cc66

The uClibc code doesn't seem to be up to date with the glibc version...
But I can't try to reproduce the issue with glibc, since sparc support was removed from Buildroot a long time ago and sparcv8 support was removed from glibc in 2.31: https://lwn.net/Articles/811275/

Resyncing the sparc port of uClibc with glibc would require a lot of work.

Best regards,
Romain
Comment 11 Jakub Jelinek 2021-05-14 09:54:20 UTC
GCC 8 branch is being closed.
Comment 12 Richard Biener 2021-06-01 08:19:24 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 13 YannSionneau 2021-06-28 16:40:19 UTC
Created attachment 51073 [details]
working uClibc build

This is a working libc build that makes it possible to boot Linux, run init and the daemons, and get to the usual shell prompt.
Comment 14 YannSionneau 2021-06-28 16:50:07 UTC
Created attachment 51074 [details]
non-working uClibc-ng build

This is a build of uClibc-ng that does not allow normal Linux booting.
You can boot to a shell with init=/bin/sh, but any command you type (like ls) runs and then crashes the system, because init (in fact the shell, the parent of the forked command) gets killed.

This might be related to an issue in signal handling, since the parent is supposed to receive a SIGCHLD signal when its child exits.

The only difference from the previous attachment is that the following Buildroot patch is not applied to gcc: https://git.buildroot.net/buildroot/commit/?id=4d16e6f5324f0285f51bfbb5a3503584f3b3ad12
Comment 15 Richard Biener 2022-05-27 09:44:16 UTC
GCC 9 branch is being closed
Comment 16 Jakub Jelinek 2022-06-28 10:43:13 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 17 Richard Biener 2023-07-07 10:38:54 UTC
GCC 10 branch is being closed.
Comment 18 Waldemar Brodkorb 2023-08-08 09:51:43 UTC
Hi,

This still happens with gcc 13.2.0.
You can boot a shell and then in strace you see a segfault error:

[pid    28] fstat64(3, {st_mode=S_IFDIR|S_ISVTX|0777, st_size=400, ...}) = 0
[pid    28] brk(0x154000)               = 0x154000
[pid    28] getdents64(3, 0xefb11b80 /* 20 entries */, 4096) = 496
[pid    28] brk(0x155000)               = 0x155000
[pid    28] lstat64("./init", {st_mode=S_IFLNK|0777, st_size=10, ...}) = 0
[pid    28] lstat64("./var", {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
[pid    28] lstat64("./usr", {st_mode=S_IFDIR|0755, st_size=120, ...}) = 0
[pid    28] lstat64("./tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=40, ...}) = 0
[pid    28] lstat64("./sys", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
[pid    28] lstat64("./sbin", {st_mode=S_IFDIR|0755, st_size=1420, ...}) = 0
[pid    28] lstat64("./run", {st_mode=S_IFDIR|0777, st_size=40, ...}) = 0
[pid    28] lstat64("./root", {st_mode=S_IFDIR|0755, st_size=60, ...}) = 0
[pid    28] lstat64("./proc", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
[pid    28] lstat64("./mnt", {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
[pid    28] lstat64("./media", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=40, ...}) = 0
[pid    28] lstat64("./linuxrc", {st_mode=S_IFLNK|0777, st_size=11, ...}) = 0
[pid    28] lstat64("./lib", {st_mode=S_IFDIR|0755, st_size=260, ...}) = 0
[pid    28] lstat64("./etc", {st_mode=S_IFDIR|0755, st_size=640, ...}) = 0
[pid    28] lstat64("./dev", {st_mode=S_IFDIR|0755, st_size=640, ...}) = 0
[pid    28] lstat64("./boot", {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
[pid    28] lstat64("./bin", {st_mode=S_IFDIR|0755, st_size=1920, ...}) = 0
[pid    28] getdents64(3, 0xefb11b80 /* 0 entries */, 4096) = 0
[pid    28] close(3)                    = 0
[pid    28] write(1, "\33[1;34mbin\33[m      \33[1;34metc\33[m"..., 109bin      etc      linuxrc  proc     sbin     usr
) = 109
[pid    28] write(1, "\33[1;34mboot\33[m     \33[1;36minit\33["..., 109boot     init     media    root     sys      var
) = 109
[pid    28] write(1, "\33[1;34mdev\33[m      \33[1;34mlib\33[m"..., 90dev      lib      mnt      run      tmp
) = 90
[pid    28] exit_group(0)               = ?
[pid    28] +++ exited with 0 +++
<... rt_sigsuspend resumed>)            = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=28, si_uid=0, si_status=0, si_utime=0, si_stime=3 /* 0.03 s */} ---
getrusage(RUSAGE_CHILDREN, {ru_utime={tv_sec=0, tv_usec=0}, ru_stime={tv_sec=0, tv_usec=0}, ...}) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 28
getrusage(RUSAGE_CHILDREN, {ru_utime={tv_sec=0, tv_usec=8000}, ru_stime={tv_sec=0, tv_usec=32000}, ...}) = 0
wait4(-1, 0xefbe678c, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)
sigreturn({mask=[INT RT_1 RT_8 RT_15 RT_21 RT_23 RT_31]}) = -1 (errno 629)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=NULL} ---
+++ killed by SIGSEGV +++

Any tips on how to debug this?

best regards
 Waldemar