At revision 182975 on x86_64-apple-darwin10, the test gcc.dg/tree-prof/pr44777.c fails with a segmentation fault:

Program received signal SIGSEGV, Segmentation fault.
0x900949d9 in __findenv () from /usr/lib/libSystem.B.dylib
(gdb) bt
#0  0x900949d9 in __findenv () from /usr/lib/libSystem.B.dylib
#1  0x90094971 in getenv () from /usr/lib/libSystem.B.dylib
#2  0x00002640 in gcov_exit () at ../../../../work/libgcc/libgcov.c:334
#3  0x900b1c0a in __cxa_finalize () from /usr/lib/libSystem.B.dylib
#4  0x900b1b14 in exit () from /usr/lib/libSystem.B.dylib
#5  0x00001cc8 in main () at /opt/gcc/work/gcc/testsuite/gcc.dg/tree-prof/pr44777.c:42

Although the test was introduced at revision 182920, I have marked this as a regression since there is no segmentation fault at revision 182587.
Looks like an OS bug, not a GCC bug, if getenv segfaults...
> Looks like an OS bug, not GCC bug, if getenv segfaults...

Well, revision 182587 uses the same OS and does not segfault. Also 'exit (0)' is used in many tests that pass.
Created attachment 26269 [details]
assembler for r182980 with -m32

Compiled with -O0 -m32 -fprofile-generate

[macbook] f90/bug% /opt/gcc/gcc4.7a/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/opt/gcc/gcc4.7a/bin/gcc
COLLECT_LTO_WRAPPER=/opt/gcc/gcc4.7a/libexec/gcc/x86_64-apple-darwin10.8.0/4.7.0/lto-wrapper
Target: x86_64-apple-darwin10.8.0
Configured with: ../work/configure --prefix=/opt/gcc/gcc4.7a --enable-languages=c,c++,fortran,ada,lto --with-gmp=/opt/mp --with-system-zlib --enable-checking=release --with-cloog=/opt/mp --enable-cloog-backend=isl --enable-lto
Thread model: posix
gcc version 4.7.0 20120107 (experimental) [trunk revision 182980p4] (GCC)
Created attachment 26270 [details]
assembler for r182587 with -m32

Compiled with -O0 -m32 -fprofile-generate

[macbook] f90/bug% /opt/gcc/gcc4.7a-182587/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/opt/gcc/gcc4.7a-182587/bin/gcc
COLLECT_LTO_WRAPPER=/opt/gcc/gcc4.7a-182587/bin/../libexec/gcc/x86_64-apple-darwin10.8.0/4.7.0/lto-wrapper
Target: x86_64-apple-darwin10.8.0
Configured with: ../work/configure --prefix=/opt/gcc/gcc4.7a --enable-languages=c,c++,ada,lto --with-gmp=/opt/mp --with-system-zlib --enable-checking=release --with-cloog=/opt/mp --enable-cloog-backend=isl --enable-lto
Thread model: posix
gcc version 4.7.0 20111221 (experimental) [trunk revision 182587p3] (GCC)
Created attachment 26271 [details]
assembler for r182980 with -m64 (default)

Compiled with -O0 -fprofile-generate
Created attachment 26272 [details]
assembler for r182587 with -m64 (default)

Compiled with -O0 -fprofile-generate
Created attachment 26273 [details]
preprocessed file

compiled with -O0 -fprofile-generate
The main difference between r182587 and r182980 with -m32 is

@@ -50,18 +50,26 @@ L3:
        movl    4(%eax), %esp
        jmp     *%edx
 L2:
+       leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %eax
+       movl    28(%eax), %edx
+       movl    24(%eax), %eax
+       addl    $1, %eax
+       adcl    $0, %edx
+       leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %esi
+       movl    %eax, 24(%esi)
+       movl    %edx, 28(%esi)
        movl    8(%ebp), %eax
        subl    $1, %eax
        movl    %eax, (%esp)
        call    _y.1704
        leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %eax
-       movl    28(%eax), %edx
-       movl    24(%eax), %eax
+       movl    36(%eax), %edx
+       movl    32(%eax), %eax
        addl    $1, %eax
        adcl    $0, %edx
        leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %ecx
-       movl    %eax, 24(%ecx)
-       movl    %edx, 28(%ecx)
+       movl    %eax, 32(%ecx)
+       movl    %edx, 36(%ecx)
        leal    -8(%ebp), %esp
        popl    %ebx
 LCFI3:
Dominique, please see why __findenv segfaults.
(In reply to comment #8)
> The main difference between r182587 and r182980 with -m32 is
>
> @@ -50,18 +50,26 @@ L3:
>         movl    4(%eax), %esp
>         jmp     *%edx
>  L2:
> +       leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %eax
> +       movl    28(%eax), %edx
> +       movl    24(%eax), %eax
> +       addl    $1, %eax
> +       adcl    $0, %edx
> +       leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %esi
> +       movl    %eax, 24(%esi)
> +       movl    %edx, 28(%esi)
>         movl    8(%ebp), %eax
>         subl    $1, %eax
>         movl    %eax, (%esp)
>         call    _y.1704
>         leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %eax
> -       movl    28(%eax), %edx
> -       movl    24(%eax), %eax
> +       movl    36(%eax), %edx
> +       movl    32(%eax), %eax
>         addl    $1, %eax
>         adcl    $0, %edx
>         leal    ___gcov0_y.1704-L00000000001$pb(%ebx), %ecx
> -       movl    %eax, 24(%ecx)
> -       movl    %edx, 28(%ecx)
> +       movl    %eax, 32(%ecx)
> +       movl    %edx, 36(%ecx)
>         leal    -8(%ebp), %esp
>         popl    %ebx
>  LCFI3:

There is no bug in that: we record one more counter starting with 182920, but there is also the additional space reserved for it in the array.
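For readers less used to -m32 code, each addl/adcl pair in the diff is a 64-bit counter increment done on two 32-bit halves. The following is only a minimal C sketch of that arithmetic; the struct and function names are made up for illustration and are not libgcov's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit counter as -m32 code sees it:
   two 32-bit halves, e.g. at offsets 24 (low) and 28 (high). */
struct counter_halves {
    uint32_t lo;
    uint32_t hi;
};

/* Equivalent of:  addl $1, %eax ; adcl $0, %edx  */
static void counter_increment(struct counter_halves *c)
{
    uint32_t old_lo;

    assert(c != NULL);
    old_lo = c->lo;
    c->lo = old_lo + 1u;      /* addl $1: wraps modulo 2^32 */
    if (c->lo < old_lo)       /* a wrap means the add produced a carry */
        c->hi += 1u;          /* adcl $0: add that carry to the high half */
}

/* Reassemble the 64-bit counter value from its halves. */
static uint64_t counter_value(const struct counter_halves *c)
{
    assert(c != NULL);
    return ((uint64_t)c->hi << 32) | c->lo;
}
```

In the diff, the new increment uses offsets 24/28 and the existing one moves to 32/36, which is consistent with one extra 8-byte counter having been added to the array.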
The test passes if I revert r182920.

> Dominique, please see why __findenv segfaults.

(1) My knowledge of gdb (and C) is very shallow, (2) I don't have access to the source for __findenv, (3) I don't know what I would look for.

Stepping through __findenv I reach

Program received signal SIGSEGV, Segmentation fault.
0x900949d9 in __findenv () from /usr/lib/libSystem.B.dylib
(gdb) info registers
eax            0x0          0
ecx            0x9009497b   -1878439557
edx            0x11         17
ebx            0x236a       9066
esp            0xbfffd798   0xbfffd798
ebp            0xbfffd7a8   0xbfffd7a8
esi            0xc000d9e0   -1073686048
edi            0x369b       13979
eip            0x900949d9   0x900949d9 <__findenv+85>
eflags         0x10306      [ PF TF IF RF ]
cs             0x17         23
ss             0x9009497b   -1878439557
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55

With r182920 reverted I see

(gdb) stepi
0x900949d9 in __findenv () from /usr/lib/libSystem.B.dylib
(gdb) info registers
eax            0x0          0
ecx            0x9009497b   -1878439557
edx            0x11         17
ebx            0x236a       9066
esp            0xbfffd798   0xbfffd798
ebp            0xbfffd7a8   0xbfffd7a8
esi            0xbfffd9e0   -1073751584
edi            0x369b       13979
eip            0x900949d9   0x900949d9 <__findenv+85>
eflags         0x306        [ PF TF IF ]
cs             0x17         23
ss             0x9009497b   -1878439557
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55
(gdb) stepi
0x900949db in __findenv () from /usr/lib/libSystem.B.dylib
(gdb) info registers
eax            0x0          0
ecx            0x9009497b   -1878439557
edx            0x11         17
ebx            0xbfffdb80   -1073751168
esp            0xbfffd798   0xbfffd798
ebp            0xbfffd7a8   0xbfffd7a8
esi            0xbfffd9e0   -1073751584
edi            0x369b       13979
eip            0x900949db   0x900949db <__findenv+87>
eflags         0x306        [ PF TF IF ]
cs             0x17         23
ss             0x9009497b   -1878439557
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55

Note that the test also fails on powerpc-apple-darwin9, with both -m32

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00003f50
0x00001ee0 in x () at /opt/gcc/work/gcc/testsuite/gcc.dg/tree-prof/pr44777.c:29
29        y (a);
(gdb) bt
#0  0x00001ee0 in x () at /opt/gcc/work/gcc/testsuite/gcc.dg/tree-prof/pr44777.c:29
#1  0x00002044 in main () at /opt/gcc/work/gcc/testsuite/gcc.dg/tree-prof/pr44777.c:39

and -m64

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000200003028
0x0000000100001848 in gcov_exit () at ../../../../work/libgcc/libgcov.c:302
302               if (gfi_ptr && gfi_ptr->key != gi_ptr)
(gdb) bt
#0  0x0000000100001848 in gcov_exit () at ../../../../work/libgcc/libgcov.c:302
#1  0x000000010000174c in gcov_exit () at ../../../../work/libgcc/libgcov.c:274
Previous frame identical to this frame (gdb could not unwind past this frame)

for gcc 4.4.6, 4.5.3, 4.6.2, and trunk. It passes with -m32 for 4.3.4 20090511 for GNAT GPL 2009 (20090511).

The failures seem different: should I open another PR?
We have the source: it looks like stdlib inherited from FreeBSD.

http://www.opensource.apple.com/source/Libc/Libc-498.1.7/stdlib/getenv-fbsd.c

However, more debugging is going to be needed. I guess we're going to have to look at what is set up for at exit, and whether that's getting broken (or whether this is revealing a lingering bug).
Created attachment 26318 [details]
test findenv

Test of findenv found in http://www.opensource.apple.com/source/Libc/Libc-498.1.7/stdlib/getenv-fbsd.c .

Debugging session:

(gdb) b my_findenv
Breakpoint 1 at 0x187b: file pr51784_1.c, line 62.
(gdb) run
Starting program: /Users/dominiq/Documents/Fortran/g95bench/win/f90/bug/a.out

Breakpoint 1, my_findenv (name=0x356c "GCOV_PREFIX_STRIP", offset=0xbfffd93c, environ=0xbfffd9b4) at pr51784_1.c:62
62      }
(gdb) p/x _NSGetEnviron()
$1 = 0x40a8
(gdb) p/x *_NSGetEnviron()
$2 = 0xbfffd9b4
(gdb) p/x **_NSGetEnviron()
Cannot access memory at address 0xbfffd9b4
(gdb) p *environ
$3 = 0xbfffdb58 "PATH=/opt/gcc/gcc4.7a/bin/:/sw64/bin:/sw64/sbin:/opt/gcc/gcc4.7a/bin/:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/texbin:/usr/X11/bin:/usr/X11R6/bin:/Users/dominiq/geant4.9/bin/Darwin-g++:/usr/t"...
(gdb) p environ[0]
$10 = 0xbfffdb58 "PATH=/opt/gcc/gcc4.7a/bin/:/sw64/bin:/sw64/sbin:/opt/gcc/gcc4.7a/bin/:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/texbin:/usr/X11/bin:/usr/X11R6/bin:/Users/dominiq/geant4.9/bin/Darwin-g++:/usr/t"...
...
(gdb) p environ[69]
$79 = 0xbfffed25 "GCOV_PREFIX_STRIP=foo"
(gdb) c
Continuing.
foo

Program exited normally.
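For reference, a findenv-style lookup amounts to a linear walk over the NULL-terminated environ array. The sketch below is not the attachment and not the Libc source; `sketch_findenv` and its signature are hypothetical, written only to show why the lookup depends on the environ pointer itself being valid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a findenv-style lookup (not the Libc code).
   Walks a NULL-terminated array of "NAME=value" strings, stores the index
   of the match in *offset and returns a pointer to the value part. */
static const char *sketch_findenv(const char *name, char *const *env,
                                  int *offset)
{
    size_t len;
    int i;

    assert(name != NULL && offset != NULL);
    if (env == NULL)
        return NULL;

    len = strlen(name);
    /* Every iteration dereferences env[i]; if env points at unmapped
       memory, the very first read faults. */
    for (i = 0; env[i] != NULL; ++i) {
        if (strncmp(env[i], name, len) == 0 && env[i][len] == '=') {
            *offset = i;
            return env[i] + len + 1;   /* skip "NAME=" */
        }
    }
    return NULL;
}
```

With an array such as { "PATH=/usr/bin", "GCOV_PREFIX_STRIP=foo", NULL }, looking up "GCOV_PREFIX_STRIP" yields the string "foo" and index 1.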
Created attachment 26319 [details]
patch for libgcc/libgcov.c to debug findenv

Patch to use the findenv in http://www.opensource.apple.com/source/Libc/Libc-498.1.7/stdlib/getenv-fbsd.c .

Debugging session

[macbook] f90/bug% /opt/gcc/gcc4.7p/bin/gcc pr44777_db.c -fprofile-generate -D_PROFILE_GENERATE -m32 -g -save-temps
[macbook] f90/bug% gdb a.out
...
(gdb) b 25
Breakpoint 1 at 0x28bb: file pr44777_db.c, line 25.
(gdb) run
Starting program: /Users/dominiq/Documents/Fortran/g95bench/win/f90/bug/a.out

Breakpoint 1, y (a=0) at pr44777_db.c:25
25          goto xlab;
(gdb) p/x _NSGetEnviron()
$1 = 0x50a8
(gdb) p/x *_NSGetEnviron()
$2 = 0xbfffd9b4
(gdb) p/x **_NSGetEnviron()
Cannot access memory at address 0xbfffd9b4
(gdb) stepi
0x000028bd      25          goto xlab;
(gdb) stepi
0x000028c3      25          goto xlab;
(gdb) stepi
0x000028c5 in y (a=-1881144004) at pr44777_db.c:25
25          goto xlab;
(gdb) stepi
0x000028c8      25          goto xlab;
(gdb) stepi
0x000029a2 in x (a=-1881144004) at pr44777_db.c:29
29        y (a);
(gdb) stepi
0x000029a5 in x (a=1) at pr44777_db.c:29
29        y (a);
(gdb) stepi
0x000029ab      29        y (a);
(gdb) stepi
0x000029ae      29        y (a);
(gdb) stepi
0x000029b1      29        y (a);
(gdb) stepi
0x000029b4      29        y (a);
(gdb) stepi
0x000029b7      29        y (a);
(gdb) stepi
0x000029bd      29        y (a);
(gdb) p/x _NSGetEnviron()
$3 = 0x50a8
(gdb) p/x *_NSGetEnviron()
$4 = 0xbfffd9b4
(gdb) p/x **_NSGetEnviron()
Cannot access memory at address 0xbfffd9b4
(gdb) x/x 0x000029bd
0x29bd <x+162>: 0x89084189
(gdb) stepi
0x000029c0      29        y (a);
(gdb) p/x _NSGetEnviron()
$5 = 0x50a8
(gdb) p/x *_NSGetEnviron()
$6 = 0xc000d9b4   <----- address changed from 0xbfffd9b4 to 0xc000d9b4
(gdb) p/x **_NSGetEnviron()
Cannot access memory at address 0xc000d9b4
(gdb) x/x 0x000029c0
0x29c0 <x+165>: 0x8b0c5189
(gdb) stepi
31        return a;
(gdb) stepi
0x000029c6      31        return a;
(gdb) x/x 0x000029c6
0x29c6 <x+171>: 0x2857838d
(gdb) stepi
0x000029cc      31        return a;
(gdb) x/x 0x000029cc
0x29cc <x+177>: 0x8b14508b
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00003198 in my_findenv (name=0x45cf "GCOV_PREFIX_STRIP", offset=0xbfffd79c, environ=0xc000d9b4) at ../../../../p_work/libgcc/libgcov.c:296
296             for (p = environ; (cp = *p) != NULL; ++p) {
(gdb) c
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.

pr44777_db.c is the original test with '#define DEPTH 1000' replaced with '#define DEPTH 1'.

If I am not mistaken, findenv crashes because the address for environ has been changed from 0xbfffd9b4 to 0xc000d9b4 at the end of the proc 'x'.
(In reply to comment #14)
> (gdb) stepi
> 0x000029bd 29 y (a);
> (gdb) p/x _NSGetEnviron()
> $3 = 0x50a8
> (gdb) p/x *_NSGetEnviron()
> $4 = 0xbfffd9b4
> (gdb) p/x **_NSGetEnviron()
> Cannot access memory at address 0xbfffd9b4
> (gdb) x/x 0x000029bd
> 0x29bd <x+162>: 0x89084189
> (gdb) stepi
> 0x000029c0 29 y (a);
> (gdb) p/x _NSGetEnviron()
> $5 = 0x50a8
> (gdb) p/x *_NSGetEnviron()
> $6 = 0xc000d9b4 <----- address changed from 0xbfffd9b4 to 0xc000d9b4

disas x

would be interesting here to find out what insn is at 0x29c0 and what is around that.

> If I am not mistaken, findenv crashes because the address for environ has been
> changed from 0xbfffd9b4 to 0xc000d9b4 at the end of the proc 'x'.
> disas x
>
> would be interesting here to find out what insn is at 0x29c0 and what is around
> that.

   0x0000291b <+0>:     push   %ebp
   0x0000291c <+1>:     mov    %esp,%ebp
   0x0000291e <+3>:     push   %edi
   0x0000291f <+4>:     push   %esi
   0x00002920 <+5>:     push   %ebx
   0x00002921 <+6>:     sub    $0x3c,%esp
   0x00002924 <+9>:     call   0x2660
   0x00002929 <+14>:    lea    -0x18(%ebp),%eax
   0x0000292c <+17>:    mov    %eax,-0x24(%ebp)
   0x0000292f <+20>:    mov    %esp,-0x20(%ebp)
   0x00002932 <+23>:    lea    0x2833(%ebx),%eax
   0x00002938 <+29>:    mov    (%eax),%eax
   0x0000293a <+31>:    lea    0x282f(%ebx),%edx
   0x00002940 <+37>:    mov    (%edx),%edx
   0x00002942 <+39>:    mov    %edx,0x10(%esp)
   0x00002946 <+43>:    lea    -0xe(%ebx),%edx
   0x0000294c <+49>:    mov    %edx,0xc(%esp)
   0x00002950 <+53>:    movl   $0x0,0x4(%esp)
   0x00002958 <+61>:    movl   $0x0,0x8(%esp)
   0x00002960 <+69>:    mov    %eax,(%esp)
   0x00002963 <+72>:    call   0x4420 <__gcov_indirect_call_profiler>
   0x00002968 <+77>:    lea    0x282f(%ebx),%eax
   0x0000296e <+83>:    movl   $0x0,(%eax)
   0x00002974 <+89>:    lea    0x2857(%ebx),%eax
   0x0000297a <+95>:    mov    0x4(%eax),%edx
   0x0000297d <+98>:    mov    (%eax),%eax
   0x0000297f <+100>:   add    $0x1,%eax
   0x00002982 <+103>:   adc    $0x0,%edx
   0x00002985 <+106>:   lea    0x2857(%ebx),%ecx
   0x0000298b <+112>:   mov    %eax,(%ecx)
   0x0000298d <+114>:   mov    %edx,0x4(%ecx)
   0x00002990 <+117>:   lea    -0x24(%ebp),%eax
   0x00002993 <+120>:   mov    0x8(%ebp),%edx
   0x00002996 <+123>:   mov    %edx,(%esp)
   0x00002999 <+126>:   mov    %eax,%ecx
   0x0000299b <+128>:   call   0x283e <y>
   0x000029a0 <+133>:   jmp    0x29c3 <x+168>
   0x000029a2 <+135>:   lea    0x18(%ebp),%ebp
=> 0x000029a5 <+138>:   lea    0x2857(%ebx),%eax
   0x000029ab <+144>:   mov    0xc(%eax),%edx
   0x000029ae <+147>:   mov    0x8(%eax),%eax
   0x000029b1 <+150>:   add    $0x1,%eax
   0x000029b4 <+153>:   adc    $0x0,%edx
   0x000029b7 <+156>:   lea    0x2857(%ebx),%ecx
   0x000029bd <+162>:   mov    %eax,0x8(%ecx)
   0x000029c0 <+165>:   mov    %edx,0xc(%ecx)
   0x000029c3 <+168>:   mov    0x8(%ebp),%esi
   0x000029c6 <+171>:   lea    0x2857(%ebx),%eax
   0x000029cc <+177>:   mov    0x14(%eax),%edx
   0x000029cf <+180>:   mov    0x10(%eax),%eax
   0x000029d2 <+183>:   add    $0x1,%eax
   0x000029d5 <+186>:   adc    $0x0,%edx
   0x000029d8 <+189>:   lea    0x2857(%ebx),%ecx
   0x000029de <+195>:   mov    %eax,0x10(%ecx)
   0x000029e1 <+198>:   mov    %edx,0x14(%ecx)
   0x000029e4 <+201>:   mov    %esi,%eax
   0x000029e6 <+203>:   add    $0x3c,%esp
   0x000029e9 <+206>:   pop    %ebx
   0x000029ea <+207>:   pop    %esi
   0x000029eb <+208>:   pop    %edi
   0x000029ec <+209>:   pop    %ebp
   0x000029ed <+210>:   ret

and the evolution of the registers is

(gdb) stepi
0x000029b4      29        y (a);
(gdb) info registers
eax            0xdb52c000   -615333888
ecx            0xbfffd934   -1073751756
edx            0xbfff       49151
ebx            0x284b       10315
esp            0xbfffd910   0xbfffd910
ebp            0xbfffd958   0xbfffd958
esi            0x51a0       20896
edi            0x0          0
eip            0x29b4       0x29b4 <x+153>
eflags         0x396        [ PF AF SF TF IF ]
cs             0x17         23
ss             0xbfffd934   -1073751756
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55
(gdb) stepi
0x000029b7      29        y (a);
(gdb) info registers
eax            0xdb52c000   -615333888
ecx            0xbfffd934   -1073751756
edx            0xbfff       49151
ebx            0x284b       10315
esp            0xbfffd910   0xbfffd910
ebp            0xbfffd958   0xbfffd958
esi            0x51a0       20896
edi            0x0          0
eip            0x29b7       0x29b7 <x+156>
eflags         0x306        [ PF TF IF ]
cs             0x17         23
ss             0xbfffd934   -1073751756
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55
(gdb) stepi
0x000029bd      29        y (a);
(gdb) info registers
eax            0xdb52c000   -615333888
ecx            0x50a2       20642
edx            0xbfff       49151
ebx            0x284b       10315
esp            0xbfffd910   0xbfffd910
ebp            0xbfffd958   0xbfffd958
esi            0x51a0       20896
edi            0x0          0
eip            0x29bd       0x29bd <x+162>
eflags         0x306        [ PF TF IF ]
cs             0x17         23
ss             0x50a2       20642
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55
(gdb) stepi
0x000029c0      29        y (a);
(gdb) info registers
eax            0xdb52c000   -615333888
ecx            0x50a2       20642
edx            0xbfff       49151
ebx            0x284b       10315
esp            0xbfffd910   0xbfffd910
ebp            0xbfffd958   0xbfffd958
esi            0x51a0       20896
edi            0x0          0
eip            0x29c0       0x29c0 <x+165>
eflags         0x306        [ PF TF IF ]
cs             0x17         23
ss             0x50a2       20642
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55
(gdb) stepi
31        return a;
(gdb) info registers
eax            0xdb52c000   -615333888
ecx            0x50a2       20642
edx            0xbfff       49151
ebx            0x284b       10315
esp            0xbfffd910   0xbfffd910
ebp            0xbfffd958   0xbfffd958
esi            0x51a0       20896
edi            0x0          0
eip            0x29c3       0x29c3 <x+168>
eflags         0x306        [ PF TF IF ]
cs             0x17         23
ss             0x50a2       20642
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55
(gdb) stepi
0x000029c6      31        return a;
(gdb) info registers
eax            0xdb52c000   -615333888
ecx            0x50a2       20642
edx            0xbfff       49151
ebx            0x284b       10315
esp            0xbfffd910   0xbfffd910
ebp            0xbfffd958   0xbfffd958
esi            0x1          1
edi            0x0          0
eip            0x29c6       0x29c6 <x+171>
eflags         0x306        [ PF TF IF ]
cs             0x17         23
ss             0x50a2       20642
ds             0x1f         31
es             0x1f         31
fs             0x0          0
gs             0x37         55
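The numbers in the dumps line up with the environ corruption reported earlier, assuming both sessions come from the same binary: the store at 0x29bd writes %eax (0xdb52c000) to 0x8(%ecx) = 0x50aa, which overlaps the upper two bytes of the 4-byte pointer that _NSGetEnviron() places at 0x50a8. Here is a small, endian-independent C model of that one store; the helper names are hypothetical and the byte array merely stands in for the memory around 0x50a8.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BASE 0x50a0u   /* first modelled address */
#define MEM_SIZE 16u       /* model covers 0x50a0..0x50af */

/* Store a 32-bit value in little-endian byte order, as i386 does. */
static void store32_le(uint8_t *mem, uint32_t addr, uint32_t value)
{
    uint32_t off = addr - MEM_BASE;
    uint32_t i;

    assert(addr >= MEM_BASE && off + 4u <= MEM_SIZE);
    for (i = 0; i < 4u; ++i)
        mem[off + i] = (uint8_t)(value >> (8u * i));
}

/* Load a 32-bit little-endian value. */
static uint32_t load32_le(const uint8_t *mem, uint32_t addr)
{
    uint32_t off = addr - MEM_BASE;
    uint32_t value = 0;
    uint32_t i;

    assert(addr >= MEM_BASE && off + 4u <= MEM_SIZE);
    for (i = 0; i < 4u; ++i)
        value |= (uint32_t)mem[off + i] << (8u * i);
    return value;
}

/* Replays "mov %eax,0x8(%ecx)" with the values from the register dump. */
static uint32_t environ_after_stray_store(void)
{
    uint8_t mem[MEM_SIZE] = { 0 };
    const uint32_t environ_slot = 0x50a8u;  /* value of _NSGetEnviron() */
    const uint32_t ecx = 0x50a2u;           /* from the dump at 0x29bd */
    const uint32_t eax = 0xdb52c000u;       /* from the dump at 0x29bd */

    store32_le(mem, environ_slot, 0xbfffd9b4u);  /* environ before */
    store32_le(mem, ecx + 8u, eax);              /* stray counter store */
    return load32_le(mem, environ_slot);         /* environ after */
}
```

Under these assumptions the modelled pointer changes from 0xbfffd9b4 to 0xc000d9b4, the same value gdb printed after stepping past 0x29bd.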
Sounds like an i?86-darwin target bug. The %ebx value is wrong: it should be 0x2929 (i.e. the point after the call to __x86.get_pc_thunk.bx in x), but most probably it is the point after the call to __x86.get_pc_thunk.bx in y instead. It seems that on Darwin %ebx (or whatever the PIC register is) contains the address after the call to the get_pc_thunk, so each function has a different value, and therefore a non-local goto on i?86-darwin must restore %ebx, but clearly it doesn't. On i?86-linux this bug doesn't exist, because %ebx there is the address of _GLOBAL_OFFSET_TABLE_: a nonlocal goto is only possible within the same CU and thus will have the same _GOT_ value, so nothing needs to be restored. Similarly on x86_64-linux, where there is no PIC pointer, but %rip addressing is used.

Doesn't look like a regression to me though: if the above is the case, then probably darwin never supported nonlocal gotos in pic code (which is the default there).
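The effect of the stale %ebx can be checked with plain address arithmetic. This is only a sketch using the values quoted in this thread (displacement 0x2857 from the disassembly of x, expected base 0x2929, observed base 0x284b); the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement used in x:  lea 0x2857(%ebx),%ecx */
#define COUNTER_DISP 0x2857u

/* Address the counter code computes for a given PIC base in %ebx. */
static uint32_t counter_address(uint32_t pic_base)
{
    return pic_base + COUNTER_DISP;
}

/* How far the stores land from the intended counters.  The model only
   covers the case seen here, where the stale base is the lower one. */
static uint32_t stray_offset(uint32_t expected_base, uint32_t actual_base)
{
    assert(expected_base >= actual_base);
    return counter_address(expected_base) - counter_address(actual_base);
}
```

With the expected base 0x2929 the counters sit at 0x5180; with the observed %ebx of 0x284b the same instruction yields 0x50a2, which is the %ecx value in the register dump, 0xde bytes short of the real array.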
Probably for TARGET_MACHO && !TARGET_64BIT && flag_pic you want to define a "nonlocal_goto_receiver" pattern that would compute the right PIC pointer value at that point (not sure if the assembler/linker would be happy with it), by doing something like

        call ___x86.get_pc_thunk.bx
L1:
        add $(L00000000002$pb-L1), %ebx

in the nonlocal goto receiver.
(In reply to comment #18)
> Probably for TARGET_MACHO && !TARGET_64BIT && flag_pic you want to define
> "nonlocal_goto_receiver" pattern that would compute the right PIC pointer value
> at that point (not sure if the assembler/linker would be happy by doing
> something like
> call ___x86.get_pc_thunk.bx
> L1:
> add $(L00000000002$pb-L1), %ebx
> in the nonlocal goto receiver.

I concur with the observation that we are not restoring the PIC reg (and we need to). [with a very quick look, we seem to be trampling on it anyway in the nested func, even without gcov]

Thus changing to target bug.

I suppose the subtraction in your example should be feasible with a scattered reloc on ia32, but I'd have to check. (On x86-64 this is not relevant, since we use %rip-relative addressing as on Linux.)

Have to add it to the TODO - unless Mike has any immediate ideas?
Doesn't powerpc-darwin have the same problem? Not sure if it defaults to -fpic, perhaps just with explicit -fpic? Are you ok with dropping the Regression tag (or, just xfailing the testcase on *-darwin* for now with a reference to this PR) - as it really doesn't seem to be a regression, if the testcase ever didn't crash, it was by pure luck that it clobbered something that didn't result in a visible failure immediately.
(In reply to comment #20)
> Doesn't powerpc-darwin have the same problem? Not sure if it defaults to
> -fpic, perhaps just with explicit -fpic?

powerpc makes provision (in the ABI) to save the pic reg in nested functions - so it might have a different/or no problem.

> Are you ok with dropping the Regression tag

I don't think this is a regression - I think it's been there for(ever/long time).

> (or, just xfailing the testcase on
> *-darwin* for now with a reference to this PR)

I guess xfailing is the thing to do for now - maybe give Mike a day or two in case there's a really easy fix. (it's not as simple as just doing that subtraction, as there is no guarantee that x uses or even saves the PIC reg as things stand) ...

I am not familiar with how the non-local goto system works, so there's some reading for me to do before even cooking up a test...
I suppose that, for a start, if a function contains a non-local-label, then the pro/epilogue should save/restore all call-saved regs?
How would that help? With nonlocal goto, you need to recompute the PIC register (if different from the function doing nonlocal goto) in the nonlocal goto receiver.

Consider:

extern void baz (void (*) (void));
volatile int z, z1;
static volatile int z2, z3;
int
foo (void)
{
  __label__ l;
  void bar ()
  {
    goto l;
  }
  baz (bar);
  return z1 + z3;
l:
  return z + z2;
}

If you attempt to restore the PIC register in bar before doing the jump, you'd restore it to baz PIC register rather than foo PIC register.

md.texi clearly hints it:

@cindex @code{nonlocal_goto_receiver} instruction pattern
@item @samp{nonlocal_goto_receiver}
This pattern, if defined, contains code needed at the target of a
nonlocal goto after the code already generated by GCC@. You will not
normally need to define this pattern. A typical reason why you might
need this pattern is if some value, such as a pointer to a global table,
must be restored when the frame pointer is restored. Note that a nonlocal
goto only occurs within a unit-of-translation, so a global table pointer
that is shared by all functions of a given module need not be restored.
There are no arguments.

darwin clearly doesn't have a PIC pointer shared by all functions of a given module, therefore it needs to be restored.
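The point that only the receiver can put the right value back can be modelled in standard C. This is an analogy, not the GCC mechanism: setjmp/longjmp stand in for the nonlocal goto, a global variable stands in for the PIC register (so that longjmp does not restore it), and the base values and `*_model` names are invented for illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Invented per-function PIC bases; on Darwin each function has its own. */
enum { FOO_PIC_BASE = 0x1000, BAR_PIC_BASE = 0x2000 };

static volatile int sim_pic_reg;   /* stands in for %ebx */
static jmp_buf receiver;           /* stands in for the label "l" */

static void bar_model(void)
{
    sim_pic_reg = BAR_PIC_BASE;    /* bar's prologue loads its own base */
    longjmp(receiver, 1);          /* stands in for "goto l" */
}

static void baz_model(void (*fn)(void))
{
    assert(fn != NULL);
    fn();
}

/* Returns the simulated PIC register as seen by code after the label. */
static int foo_model(int restore_in_receiver)
{
    sim_pic_reg = FOO_PIC_BASE;    /* foo's prologue loads its base */
    if (setjmp(receiver) == 0) {
        baz_model(bar_model);
        return -1;                 /* not reached in this model */
    }
    /* Nonlocal goto receiver: the register still holds bar's base
       unless the receiver itself recomputes foo's value. */
    if (restore_in_receiver)
        sim_pic_reg = FOO_PIC_BASE;
    return sim_pic_reg;
}
```

In this model foo_model(0) comes back with bar's base and foo_model(1) with foo's, which is the job a nonlocal_goto_receiver pattern would do on a target where the PIC base differs per function.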
(In reply to comment #23)
> How would that help?

well, I wasn't suggesting that it was a complete solution (and I get that we need to provide the nonlocal_goto_receiver).

My point is that, at the moment, 'foo' from your example below is not saving the PIC register if it doesn't use it, so there is no place to restore it from (and no local label to subtract to correct its value).

I was figuring the nonlocal_goto_receiver would need to restore in the case that foo does not use the PIC reg, and correct the value by subtracting a local offset if it does.

> With nonlocal goto, you need to recompute the PIC
> register (if different from the function doing nonlocal goto) in the nonlocal
> goto receiver.
> Consider:
> extern void baz (void (*) (void));
> volatile int z, z1;
> static volatile int z2, z3;
> int
> foo (void)
> {
> __label__ l;
> void bar ()
> {
> goto l;
> }
> baz (bar);
> return z1 + z3;
> l:
> return z + z2;
> }
> If you attempt to restore the PIC register in bar before doing the jump,
> you'd restore it to baz PIC register rather than foo PIC register.
> md.texi clearly hints it:
> @cindex @code{nonlocal_goto_receiver} instruction pattern
> @item @samp{nonlocal_goto_receiver}
> This pattern, if defined, contains code needed at the target of a
> nonlocal goto after the code already generated by GCC@. You will not
> normally need to define this pattern. A typical reason why you might
> need this pattern is if some value, such as a pointer to a global table,
> must be restored when the frame pointer is restored. Note that a nonlocal
> goto only occurs within a unit-of-translation, so a global table pointer
> that is shared by all functions of a given module need not be restored.
> There are no arguments.
>
> darwin clearly doesn't have a PIC pointer shared by all functions of a given
> module, therefore it needs to be restored.
> I don't think this is a regression - I think it's been there for(ever/long
> time).

I don't want to waste time arguing about the regression tag, but gcc.dg/tree-prof/pr44777.c and its avatars gcc.c-torture/execute/comp-goto-2.c and gcc.c-torture/execute/920501-7.c pass (I do not say work) with -fprofile-generate -D_PROFILE_GENERATE -m32 for gcc 4.6.2 or for trunk with r182920 reverted on x86_64-apple-darwin10 (i.e. no pr44777 on this platform).

> I guess xfailing is the thing to do for now ...

I hate xfail in this kind of situation: I see it as a hypocritical way to say "won't fix". Note the test passes with -fno-pic, so I'd prefer an additional option -fno-pic for darwin.

BTW what happens on i?86-linux-* with -fpic?
(In reply to comment #25)
> > I guess xfailing is the thing to do for now ...
>
> I hate xfail in this kind of situation: I see it as a hypocritical way to say
> "won't fix". Note the test passes with -fno-pic, so I'ld prefer an additional
> option -fno-pic for darwin.

well, I tend to agree about xfails hiding problems away - not a fan either.

However, this problem is not directly related to the test-case (it just happens to reveal it). I think we will try to fix this bug - it seems quite serious ... so we could also decide to live with the test-suite noise?

> BTW what happens on i?86-linux-* with -fpic?

I think, as Jakub mentioned, the _GOT solution will not exhibit this problem.
(In reply to comment #25)
> BTW what happens on i?86-linux-* with -fpic?

You haven't read #c17, right? But if you want even further details:

Both x and y there have:

        call __x86.get_pc_thunk.bx
        addl $_GLOBAL_OFFSET_TABLE_, %ebx

i.e. %ebx after these two insns doesn't contain some address within the function, but the address of the _GLOBAL_OFFSET_TABLE_ symbol in the current shared library or binary. So, there is no need to restore anything, as a nonlocal goto can only be from a nested function to a function within the same translation unit and thus the same shared library or binary, so they have the same %ebx value.
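A small C sketch of why the two-instruction ELF sequence gives every function the same %ebx, while a Darwin-style base does not. The addresses are hypothetical, and the second step models the assembler turning the `_GLOBAL_OFFSET_TABLE_` operand into a PC-relative distance to the GOT, which is an assumption about the toolchain rather than something shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* ELF-style: thunk result plus the distance from that point to the GOT. */
static uint32_t elf_pic_reg(uint32_t thunk_return_addr, uint32_t got_addr)
{
    uint32_t ebx;

    assert(got_addr != 0u);                 /* a GOT must exist in the model */
    ebx = thunk_return_addr;                /* call __x86.get_pc_thunk.bx */
    ebx += got_addr - thunk_return_addr;    /* addl $_GLOBAL_OFFSET_TABLE_, %ebx */
    return ebx;
}

/* Darwin-style: the PIC base is the function's own post-thunk address. */
static uint32_t darwin_pic_reg(uint32_t thunk_return_addr)
{
    return thunk_return_addr;
}
```

For two functions with thunk return addresses 0x1010 and 0x1234 in a module whose GOT is at 0x4000, elf_pic_reg gives 0x4000 for both, whereas darwin_pic_reg gives two different values, which is why only the latter needs a restore after a nonlocal goto.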
Created attachment 26324 [details] a first attempt at a fix this is pretty much the first ever RTL I've written .... ... so comments welcome ... I've had a quick look at the output on a couple of test-cases and it seems to DTRT .. but it's "hardly tested" so far.
never mind, it doesn't bootstrap...
(In reply to comment #28)
> Created attachment 26324 [details]
> a first attempt at a fix
>
> this is pretty much the first ever RTL I've written ....
> ... so comments welcome ...
>
> I've had a quick look at the output on a couple of test-cases and it seems to
> DTRT .. but it's "hardly tested" so far.

The #if TARGET_MACHO and the if (TARGET_MACHO) are unneeded, the condition already guards it. If it was using some darwin specific functions or macros, you'd just surround the body in #if TARGET_MACHO.

Furthermore, you don't know during expansion whether the PIC pointer will be emitted or not, therefore the nonlocal goto receiver (with the condition you've used) should probably initially be an instruction with unspec_volatile UNSPECV_NONLOCAL_GOTO_RECEIVER or so, and only be split after the prologue is emitted, either into nothing (if the PIC register doesn't need to be restored), or into the actual instructions.
(In reply to comment #20) > Doesn't powerpc-darwin have the same problem? Yes that was recorded in PR 10901 :)
Created attachment 26329 [details]
test code

firstly, apropos comment #22 and #23.

If you build this test case under linux (or darwin), -fpic -O0 .. you will see that the prologue of x does not save ebx. However it is used in y ... and to quote function.h "the exit block is reachable directly from a nonlocal label". So I think my comment #22 stands.

[if you build the testcase -O0 -fpic -DUSEPIC, then x uses the pic reg and all is OK].

this could be fixed thus (but you might well have a better place/way to fix it):

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c      (revision 183180)
+++ gcc/config/i386/i386.c      (working copy)
@@ -8698,7 +8698,8 @@ ix86_save_reg (unsigned int regno, bool maybe_eh_r
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
          || crtl->profile
          || crtl->calls_eh_return
-         || crtl->uses_const_pool))
+         || crtl->uses_const_pool
+         || cfun->has_nonlocal_label))
     return ix86_select_alt_pic_regnum () == INVALID_REGNUM;
 
   if (crtl->calls_eh_return && maybe_eh_return)
Created attachment 26330 [details] second (non working) go I'm finding this moderately hard. I understand what you're suggesting that I should do - but between gccint.pdf, looking at other .md files and so on there is really rather scarce information -- trial & error is not a productive way forward .. so I'm still a bit stuck. The attached does the right thing with the testcase - it inserts the restore when PIC is used and not when it isn't. But the compiler fails to bootstrap because eh_personality.cc in the c++ library seems to cause the insert of the nonlocal_rx (for eh) but then, somehow the initial use of the pic register is optimized away .. and thus we end up with a load of "L000003$pd" can't be undefined... ... so ... how can I ensure that the test of "uses_pic_" is only done after optimization? (or any other help would be welcome).
(In reply to comment #33)

1) the define_expand isn't needed, just name the define_insn_and_split pattern "nonlocal_goto_receiver".
2) it should always split, either to nothing if nothing is needed, or to the set_got_* plus adjust, so use "#" as the pattern
3) a length attribute doesn't make sense for an always-split insn
4) the split condition should be && epilogue_completed
5) to determine if you need to load the pic register or not, you should match what the prologue expansion does, try
   (pic_offset_table_rtx
    && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
        || crtl->profile))
   probably also anded with && !current_function_is_leaf - a non-local goto receiver in leaf functions doesn't make much sense and certainly doesn't need to restore the PIC pointer, plus it will simplify it (for leaf functions we sometimes decide to use a different register as pic pointer instead of %ebx).
Created attachment 26366 [details]
initial fix.

there are a few wrinkles;

1/ the use of epilogue_completed has to be conditional on optimization because otherwise there's no post-epilogue split pass.

2/ When there's a non-local label in nested code it looks like this:

        bx = got load
        ....
        jmp xxxx
nonlocal:
        do the fixup
xxxx:

which is fine.

However, in except.c we have:

void
expand_dw2_landing_pad_for_region (eh_region region)
{
#ifdef HAVE_exception_receiver
  if (HAVE_exception_receiver)
    emit_insn (gen_exception_receiver ());
  else
#endif
#ifdef HAVE_nonlocal_goto_receiver
  if (HAVE_nonlocal_goto_receiver)
    emit_insn (gen_nonlocal_goto_receiver ());
  else

which causes sequences like this:

        bx = load got (emits pic base label).
        ...
        ...
        exception_return = > forces in a got restore via the insert of nonlocal goto rx.
        bx = got restore (got correction - uses pic base label)
        ...

the optimizer figures that the first (got load) is not needed because nothing touches bx in the meantime -- so it drops the got load. Unfortunately, it can't see that the got load is what emits the pic-base label needed by subsequent pic code (and the correction).

The nice solution would be to carry the pic-base label in per function visible RTL and to make all the pic handling "open" in the md ... but that's not going to happen any time soon (well, not in stage 4 anyways).

So .. the solution I've put in the patch is that we always try to do a pic load - and we notice (per function) if we've already output the pic-base label. If so, we don't try to do it again.

This works (there's a marginal inefficiency in that, once in a while, in exception handling code you will get a zero correction made), but that's only on exception non-local jump branches .. so we can probably live with it for now.

A similar solution works also for PPC (the pic code is a lot more in the md there - so it's a bit more involved).
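As one reading of the "notice (per function) if we've already output the pic-base label" idea, here is a tiny C model. It is not code from the patch: `fn_state`, `pic_emit` and `next_pic_load` are hypothetical names, and the model only captures the emit-once decision, not any RTL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-function state; the patch's real bookkeeping may differ. */
struct fn_state {
    int pic_label_emitted;   /* nonzero once the pic-base label is output */
};

enum pic_emit {
    PIC_EMIT_LOAD_AND_LABEL, /* first pic load: also defines the label */
    PIC_EMIT_LOAD_ONLY       /* later loads: reuse the existing label */
};

/* Decide what a pic load at the current point has to emit. */
static enum pic_emit next_pic_load(struct fn_state *fn)
{
    assert(fn != NULL);
    if (!fn->pic_label_emitted) {
        fn->pic_label_emitted = 1;
        return PIC_EMIT_LOAD_AND_LABEL;
    }
    return PIC_EMIT_LOAD_ONLY;
}
```

Whichever pic load comes first in a function defines the label, so dropping the original got load no longer leaves later corrections referring to an undefined symbol; if the receiver's own load happens to be that first one, its correction against the label it just defined would be the zero correction mentioned above.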
(In reply to comment #34)
> 5) to determine if you need to load the pic register or not, you should match
> what
> the prologue expansion does, try
> (pic_offset_table_rtx
> && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
> || crtl->profile))

nothing seems to use df_regs_ever_live_p in the md files, and the function is not visible. I also wonder if it is updated during RTL optimization? (I tried making a function in i386.c that performed these tests and was visible from the md - but it didn't work).

For now, I've used crtl->uses_pic_offset_table which seems to work. Is there any other suggestion?

> probably also anded with && !current_function_is_leaf - non-local goto receiver
> in leaf functions doesn't make much sense and certainly doesn't need to restore
> PIC pointer, plus it will simplify it (for leaf functions we sometimes decide
> to use a different register as pic pointer instead of %ebx).

hopefully using pic_offset_table_rtx will pick up the current one?
Regstrapped with the patch in comment #35. The patch fixes this PR without regression (down to 2 failures with some pending patches) and the tests for pr10901 pass with the different options I have tried. Thanks.
If the insn pattern is "#", then if no split pass splits it before final, during final it will be split anyway. So no idea why you play games with !optimize vs. optimize.
(In reply to comment #38)
> If the insn pattern is "#", then if no split pass splits it before final,
> during final it will be split anyway. So no idea why you play games with
> !optimize vs. optimize.

hm. well, without that I was hitting the 'unreachable' here ...

final.c:2715
        if (new_rtx == insn && PATTERN (new_rtx) == body)
          fatal_insn ("could not split insn", insn);

#ifdef HAVE_ATTR_length
        /* This instruction should have been split in shorten_branches,
           to ensure that we would have valid length info for the
           splitees.  */
        gcc_unreachable ();
#endif
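Reading only the excerpt above, the possible outcomes for a "#" pattern that reaches final can be sketched as below. This is a toy model based on the quoted lines alone; the enum, the function and its parameters are invented for illustration and the rest of final.c is not represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical outcome labels for the toy model.  */
enum toy_final_outcome {
  EMIT_DIRECTLY,          /* template is not "#", nothing to split */
  SPLIT_AT_FINAL,         /* split succeeded, no length attribute in use */
  FATAL_COULD_NOT_SPLIT,  /* fatal_insn ("could not split insn", ...) */
  HIT_UNREACHABLE         /* gcc_unreachable () under HAVE_ATTR_length */
};

/* TEMPL is the insn's output template.  SPLIT_CHANGES_INSN says whether
   the split attempt produced something different from the original insn.
   HAVE_ATTR_LENGTH models the #ifdef in the excerpt.  */
static enum toy_final_outcome
toy_final_scan(const char *templ, bool split_changes_insn,
               bool have_attr_length)
{
  assert(templ);
  if (strcmp(templ, "#") != 0)
    return EMIT_DIRECTLY;
  if (!split_changes_insn)
    return FATAL_COULD_NOT_SPLIT;
  if (have_attr_length)
    return HIT_UNREACHABLE;
  return SPLIT_AT_FINAL;
}
```

So on a target with a length attribute, even a split that succeeds at this point ends in the unreachable, which matches what I was seeing.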
GCC 4.7.0 is being released, adjusting target milestone.
*** Bug 52444 has been marked as a duplicate of this bug. ***
Created attachment 29363 [details] updated patch Thanks to my colleague Bernds who identified my mistake (a missing DONE in the splitter). Hopefully this new version addresses Jakub's concerns. folks, please test it across the x86 Darwin versions you have (I've bootstrapped all langs, incl Ada, on x86 Darwin9 where it appears to DTRT). Will update my tree and do a full regtest now.
(In reply to comment #42)
On x86_64-apple-darwin12, while the proposed patch from Comment 42 bootstraps fine, it does produce a new regression at -m64...

Executing on host: /sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/xgcc -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/ -fno-diagnostics-show-caret -O2 -mcmodel=large -c -m64 -o pr49866.o /sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130206/gcc/testsuite/gcc.target/i386/pr49866.c (timeout = 300)
/var/tmp//ccgCyUt7.s:11:junk `@PLTOFF' after expression
/var/tmp//ccgCyUt7.s:14:junk `@PLTOFF' after expression
/var/tmp//ccgCyUt7.s:17:junk `@PLTOFF' after expression
/var/tmp//ccgCyUt7.s:48:junk `@PLTOFF' after expression
compiler exited with status 1
output is:
/var/tmp//ccgCyUt7.s:11:junk `@PLTOFF' after expression
/var/tmp//ccgCyUt7.s:14:junk `@PLTOFF' after expression
/var/tmp//ccgCyUt7.s:17:junk `@PLTOFF' after expression
/var/tmp//ccgCyUt7.s:48:junk `@PLTOFF' after expression

FAIL: gcc.target/i386/pr49866.c (test for excess errors)
Created attachment 29385 [details]
assembly file for failing gcc.target/i386/pr49866.c compilation at -m64

Compiled with...

/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/xgcc -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/ -fno-diagnostics-show-caret -O2 -mcmodel=large -c -m64 -o pr49866.o /sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130206/gcc/testsuite/gcc.target/i386/pr49866.c --save-temps
(In reply to comment #43)
> On x86_64-apple-darwin12, while the proposed patch from Comment 42 bootstraps
> fine, it does produce a new regression at -m64...

This is pr50077 and it has nothing to do with the patch (look at your tests, e.g., http://gcc.gnu.org/ml/gcc-testresults/2013-02/msg00522.html).

BTW the patch did not cause any new failure on x86_64-apple-darwin10 (nor did the previous version).
(In reply to comment #42) Full regression test results on x86_64-apple-darwin12 are at... http://gcc.gnu.org/ml/gcc-testresults/2013-02/msg00745.html
Fixed by revision 201086

Author: iains
Date: Sat Jul 20 16:22:59 2013 UTC (6 days ago)
Changed paths: 5
Log Message:

gcc/

    PR target/51784
    * config/i386/i386.c (output_set_got) [TARGET_MACHO]: Adjust to emit a
    second label for nonlocal goto receivers. Don't output pic base labels
    unless we're producing PIC; mark that action unreachable().
    (ix86_save_reg): If the function contains a nonlocal label, save the
    PIC base reg.
    * config/darwin-protos.h (machopic_should_output_picbase_label): New.
    * gcc/config/darwin.c (emitted_pic_label_num): New GTY.
    (update_pic_label_number_if_needed): New.
    (machopic_output_function_base_name): Adjust for nonlocal receiver
    case.
    (machopic_should_output_picbase_label): New.
    * config/i386/i386.md (enum unspecv): UNSPECV_NLGR: New.
    (nonlocal_goto_receiver): New insn and split.

Thanks for the fix.
Reopened as the test gcc.c-torture/execute/pr51447.c still fails on powerpc-apple-darwin9 (see http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00136.html ). The test succeeds with the patch for pr10901 at http://gcc.gnu.org/bugzilla/attachment.cgi?id=26370 .
(In reply to Dominique d'Humieres from comment #48)
> Reopened as the test gcc.c-torture/execute/pr51447.c still fails on
> powerpc-apple-darwin9 (see
> http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00136.html ). The test
> succeeds with the patch for pr10901 at
> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26370 .

As Andrew noted above, that was recorded in PR 10901 :)

[I have a PPC patch-in-progress - add 10901 to your list, and I'll post to there when ready to test].
Author: iains
Date: Sun Sep 1 15:39:28 2013
New Revision: 202147

URL: http://gcc.gnu.org/viewcvs?rev=202147&root=gcc&view=rev
Log:
gcc:

    Backport from mainline:
    2013-07-22 Uros Bizjak <ubizjak@gmail.com>

    * config/i386/i386.md (nonlocal_goto_receiver): Delete insn if
    it is not needed after split.

    2013-07-20 Iain Sandoe <iain@codesourcery.com>

    PR target/51784
    * config/i386/i386.c (output_set_got) [TARGET_MACHO]: Adjust to emit a
    second label for nonlocal goto receivers. Don't output pic base labels
    unless we're producing PIC; mark that action unreachable().
    (ix86_save_reg): If the function contains a nonlocal label, save the
    PIC base reg.
    * config/darwin-protos.h (machopic_should_output_picbase_label): New.
    * gcc/config/darwin.c (emitted_pic_label_num): New GTY.
    (update_pic_label_number_if_needed): New.
    (machopic_output_function_base_name): Adjust for nonlocal receiver
    case.
    (machopic_should_output_picbase_label): New.
    * config/i386/i386.md (enum unspecv): UNSPECV_NLGR: New.
    (nonlocal_goto_receiver): New insn and split.

Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/config/darwin-protos.h
    branches/gcc-4_7-branch/gcc/config/darwin.c
    branches/gcc-4_7-branch/gcc/config/i386/i386.c
    branches/gcc-4_7-branch/gcc/config/i386/i386.md