gcc -g -O2 -march=pentium4 -mfpmath=sse -c movapd_align_bug.c

objdump -dS movapd_align_bug.o:

    xfp->m[0].x = -sin_lon;
  42:   f2 0f 10 05 00 00 00    movsd  0x0,%xmm0
  49:   00
  4a:   f2 0f 10 4d b0          movsd  0xffffffb0(%ebp),%xmm1
  4f:   66 0f 29 45 88          movapd %xmm0,0xffffff88(%ebp)   <- ILLEGAL
  54:   66 0f 57 c8             xorpd  %xmm0,%xmm1
    xfp->m[1].x = cos_lon;
    xfp->m[2].x = 0.0;
Created attachment 6012 [details] test case C file
Created attachment 6013 [details] objdump -dS movapd_align_bug.o
Confirmed on the mainline (almost the same asm as given).
Also on 3.3.3:
        movsd   -104(%ebp), %xmm4
Also 3.2.3:
        movsd   -120(%ebp), %xmm0
Um..., no. Those are ok. movsd only needs 8 byte alignment. The original bug (movapd) stands, though. And, I confirmed it on 3.5.
Actually 3.5.0 does not produce any at all on i686-pc-linux; maybe this is a cygwin bug only.
  7d:   66 0f 28 cb             movapd %xmm3,%xmm1
  bb:   66 0f 28 d3             movapd %xmm3,%xmm2
Cygwin has -malign-double as the default. Try that on Linux. BTW, why did you remove the "Known to fail"?
Can you provide the output of gcc -v? I still cannot reproduce it with: gcc pr14776.c -O2 -march=pentium4 -mfpmath=sse -c -g -malign-double -mstack-arg-probe -mfp-ret-in-387 -mieee-fp on linux.
Reading specs from /home/ford/local2/lib/gcc/i686-pc-cygwin/3.4.0/specs
Configured with: ../sources/configure --enable-languages=c --prefix=/home/ford/local2 --with-local-prefix=/home/ford/local2/include --disable-gdbtk : (reconfigured) : (reconfigured) : (reconfigured)
Thread model: single
gcc version 3.4.0 20040329 (prerelease)

I haven't tried Linux; I just thought that -malign-double might be the difference.
Subject: Re: New: -mfpmath=sse causes movapd from non-16-byte aligned address

> gcc -g -O2 -march=pentium4 -mfpmath=sse -c movapd_align_bug.c
>
> objdump -dS movapd_align_bug.o:
>
>     xfp->m[0].x = -sin_lon;
>   42:   f2 0f 10 05 00 00 00    movsd  0x0,%xmm0
>   49:   00
>   4a:   f2 0f 10 4d b0          movsd  0xffffffb0(%ebp),%xmm1
>   4f:   66 0f 29 45 88          movapd %xmm0,0xffffff88(%ebp)   <- ILLEGAL
>   54:   66 0f 57 c8             xorpd  %xmm0,%xmm1

GCC decides to spill the temporary negative zero that is used to expand negations, and then it hits the usual problem of the stack frame being misaligned in main on cygwin and a few other runtimes. I am quite surprised that the register allocator didn't rematerialize the constant, though... It is a perfect candidate for that.

Honza

>     xfp->m[1].x = cos_lon;
>     xfp->m[2].x = 0.0;
>
> --
>            Summary: -mfpmath=sse causes movapd from non-16-byte aligned
>                     address
>            Product: gcc
>            Version: 3.4.0
>             Status: UNCONFIRMED
>           Severity: normal
>           Priority: P2
>          Component: optimization
>         AssignedTo: unassigned at gcc dot gnu dot org
>         ReportedBy: ford at vss dot fsi dot com
>                 CC: gcc-bugs at gcc dot gnu dot org
>  GCC build triplet: i686-pc-cygwin
>   GCC host triplet: i686-pc-cygwin
> GCC target triplet: i686-pc-cygwin
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14776
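For readers unfamiliar with the "negative zero used to expand negations": with SSE math, a double is commonly negated by XORing it with a mask in which only the sign bit is set, which is the bit pattern of -0.0, and that mask is the constant being spilled here. The following is only a sketch of the idea in portable C, assuming IEEE-754 64-bit doubles; the function name negate_via_sign_mask is made up for illustration and is not anything GCC emits.

```c
#include <stdint.h>
#include <string.h>

/* Negate a double by flipping its sign bit.  This mirrors what an
   xorpd against a -0.0 mask does in an SSE register.
   Assumption: double is an IEEE-754 64-bit value. */
static double negate_via_sign_mask(double x)
{
    const double neg_zero = -0.0;   /* only the sign bit is set */
    uint64_t bits, mask;

    memcpy(&bits, &x, sizeof bits);         /* reinterpret without aliasing issues */
    memcpy(&mask, &neg_zero, sizeof mask);
    bits ^= mask;                           /* flip the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Calling negate_via_sign_mask(sin_lon) gives the same value as -sin_lon. The mask is a compile-time constant, which is why reloading it from the constant pool would avoid the stack spill altogether.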
Created attachment 6653 [details]
Bugexample for linux, use multiple threads

It is easy to get hit by this bug on an ordinary linux distribution (e.g. SuSE-8.2); just use a multithreaded application (see attached example file). The bug appears with gcc-3.4.0 and gcc-3.4.1-20040625, but not with gcc-3.3.3.

~/test/gcc_bug_14776> /opt/gcc-3.4.1-20040625/bin/gcc -v -Wall -march=pentium4 -O2 -save-temps -g -pthread movapd_align_bug_pthread.c -lm
Reading specs from /opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../gcc-3.4.1-20040625/configure --prefix=/opt/gcc-3.4.1-20040625 --enable-threads=posix --enable-languages=c,c++,java
Thread model: posix
gcc version 3.4.1 20040625 (prerelease)
 /opt/gcc-3.4.1-20040625/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -E -quiet -v -D_REENTRANT movapd_align_bug_pthread.c -march=pentium4 -Wall -fworking-directory -O2 -o movapd_align_bug_pthread.i
ignoring nonexistent directory "/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /opt/gcc-3.4.1-20040625/include
 /opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/include
 /usr/include
End of search list.
 /opt/gcc-3.4.1-20040625/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -fpreprocessed movapd_align_bug_pthread.i -quiet -dumpbase movapd_align_bug_pthread.c -march=pentium4 -auxbase movapd_align_bug_pthread -g -O2 -Wall -version -o movapd_align_bug_pthread.s
GNU C version 3.4.1 20040625 (prerelease) (i686-pc-linux-gnu)
        compiled by GNU C version 3.4.1 20040625 (prerelease).
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 as -V -Qy -o movapd_align_bug_pthread.o movapd_align_bug_pthread.s GNU assembler version 2.13.90.0.18 (i486-suse-linux) using BFD version 2.13.90.0.18 20030121 (SuSE Linux) /opt/gcc-3.4.1-20040625/libexec/gcc/i686-pc-linux-gnu/3.4.1/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/crtbegin.o -L/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1 -L/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/../../.. movapd_align_bug_pthread.o -lm -lgcc -lgcc_eh -lpthread -lc -lgcc -lgcc_eh /opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/crtend.o /usr/lib/crtn.o ~/test/gcc_bug_14776> gdb ./a.out GNU gdb 6.1 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1". (gdb) run Starting program: /home/seiderer/test/gcc_bug_14776/a.out [Thread debugging using libthread_db enabled] [New Thread 16384 (LWP 5968)] [New Thread 32769 (LWP 5970)] [New Thread 16386 (LWP 5971)] [New Thread 32771 (LWP 5972)] Program received signal SIGSEGV, Segmentation fault. 
[Switching to Thread 16386 (LWP 5971)] create_geo_to_topo (xfp=0xbf7ffa6c, lat=1, lon=1) at movapd_align_bug_pthread.c:41 41 xfp->m[0].x = -sin_lon; (gdb) info reg eax 0xbf7ff9fc -1082131972 ecx 0x401b7d60 1075543392 edx 0x6 6 ebx 0xbf7ffa6c -1082131860 esp 0xbf7ff9bc 0xbf7ff9bc ebp 0xbf7ffa44 0xbf7ffa44 esi 0xbf7ffbe0 -1082131488 edi 0x0 0 eip 0x8048541 0x8048541 eflags 0x10246 66118 cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x0 0 (gdb) disassemble Dump of assembler code for function create_geo_to_topo: 0x0804848f <create_geo_to_topo+0>: push %ebp 0x08048490 <create_geo_to_topo+1>: mov %esp,%ebp 0x08048492 <create_geo_to_topo+3>: push %ebx 0x08048493 <create_geo_to_topo+4>: sub $0x84,%esp 0x08048499 <create_geo_to_topo+10>: fldl 0x14(%ebp) 0x0804849c <create_geo_to_topo+13>: mov 0x8(%ebp),%ebx 0x0804849f <create_geo_to_topo+16>: fstpl (%esp) 0x080484a2 <create_geo_to_topo+19>: call 0x80483ac <sin> 0x080484a7 <create_geo_to_topo+24>: fldl 0x14(%ebp) 0x080484aa <create_geo_to_topo+27>: fstpl (%esp) 0x080484ad <create_geo_to_topo+30>: fstpl 0xffffffb0(%ebp) 0x080484b0 <create_geo_to_topo+33>: call 0x804836c <cos> 0x080484b5 <create_geo_to_topo+38>: fstpl 0xffffffa8(%ebp) 0x080484b8 <create_geo_to_topo+41>: fldl 0xc(%ebp) 0x080484bb <create_geo_to_topo+44>: fstpl (%esp) 0x080484be <create_geo_to_topo+47>: call 0x80483ac <sin> 0x080484c3 <create_geo_to_topo+52>: fldl 0xc(%ebp) 0x080484c6 <create_geo_to_topo+55>: fstpl (%esp) 0x080484c9 <create_geo_to_topo+58>: fstpl 0xffffffa0(%ebp) 0x080484cc <create_geo_to_topo+61>: call 0x804836c <cos> 0x080484d1 <create_geo_to_topo+66>: fldl 0xffffffa8(%ebp) 0x080484d4 <create_geo_to_topo+69>: fldl 0xffffffb0(%ebp) 0x080484d7 <create_geo_to_topo+72>: fxch %st(1) 0x080484d9 <create_geo_to_topo+74>: fstl 0x18(%ebx) 0x080484dc <create_geo_to_topo+77>: fchs 0x080484de <create_geo_to_topo+79>: fxch %st(1) 0x080484e0 <create_geo_to_topo+81>: fchs 0x080484e2 <create_geo_to_topo+83>: fxch %st(1) 0x080484e4 
<create_geo_to_topo+85>: fmull 0xffffffa0(%ebp) 0x080484e7 <create_geo_to_topo+88>: fxch %st(1) 0x080484e9 <create_geo_to_topo+90>: fstl (%ebx) 0x080484eb <create_geo_to_topo+92>: fxch %st(2) 0x080484ed <create_geo_to_topo+94>: fstl 0x38(%ebx) 0x080484f0 <create_geo_to_topo+97>: fxch %st(1) 0x080484f2 <create_geo_to_topo+99>: fstpl 0x8(%ebx) 0x080484f5 <create_geo_to_topo+102>: fldl 0xffffffa8(%ebp) 0x080484f8 <create_geo_to_topo+105>: fldz 0x080484fa <create_geo_to_topo+107>: fxch %st(1) 0x080484fc <create_geo_to_topo+109>: fmul %st(2),%st 0x080484fe <create_geo_to_topo+111>: fxch %st(3) 0x08048500 <create_geo_to_topo+113>: fmull 0xffffffa0(%ebp) 0x08048503 <create_geo_to_topo+116>: fxch %st(3) 0x08048505 <create_geo_to_topo+118>: lea 0xffffffd8(%ebp),%eax 0x08048508 <create_geo_to_topo+121>: fstpl 0x10(%ebx) 0x0804850b <create_geo_to_topo+124>: fldl 0xffffffb0(%ebp) 0x0804850e <create_geo_to_topo+127>: fxch %st(1) 0x08048510 <create_geo_to_topo+129>: fstl 0x30(%ebx) 0x08048513 <create_geo_to_topo+132>: fxch %st(1) 0x08048515 <create_geo_to_topo+134>: fmulp %st,%st(2) 0x08048517 <create_geo_to_topo+136>: fldl 0xffffffa0(%ebp) 0x0804851a <create_geo_to_topo+139>: fxch %st(3) 0x0804851c <create_geo_to_topo+141>: fstpl 0x20(%ebx) 0x0804851f <create_geo_to_topo+144>: fxch %st(1) 0x08048521 <create_geo_to_topo+146>: fstpl 0x28(%ebx) 0x08048524 <create_geo_to_topo+149>: fxch %st(1) 0x08048526 <create_geo_to_topo+151>: fstpl 0x40(%ebx) 0x08048529 <create_geo_to_topo+154>: fldl 0xc(%ebp) 0x0804852c <create_geo_to_topo+157>: fldl 0x14(%ebp) 0x0804852f <create_geo_to_topo+160>: mov %eax,0x4(%esp) 0x08048533 <create_geo_to_topo+164>: movsd 0x80486c0,%xmm0 0x0804853b <create_geo_to_topo+172>: lea 0xffffffb8(%ebp),%eax 0x0804853e <create_geo_to_topo+175>: fstpl 0xffffffc0(%ebp) 0x08048541 <create_geo_to_topo+178>: movapd %xmm0,0xffffff88(%ebp) 0x08048546 <create_geo_to_topo+183>: fstpl 0xffffffb8(%ebp) 0x08048549 <create_geo_to_topo+186>: fstpl 0xffffffc8(%ebp) 0x0804854c 
<create_geo_to_topo+189>: mov %eax,(%esp) 0x0804854f <create_geo_to_topo+192>: call 0x8048474 <geo_lla_xyz> 0x08048554 <create_geo_to_topo+197>: fldl 0xffffffd8(%ebp) 0x08048557 <create_geo_to_topo+200>: fchs 0x08048559 <create_geo_to_topo+202>: fstpl 0x48(%ebx) 0x0804855c <create_geo_to_topo+205>: add $0x84,%esp 0x08048562 <create_geo_to_topo+211>: pop %ebx 0x08048563 <create_geo_to_topo+212>: pop %ebp 0x08048564 <create_geo_to_topo+213>: ret End of assembler dump. (gdb) quit The program is running. Exit anyway? (y or n) ~/test/gcc_bug_14776> /lib/libc.so.6 GNU C Library stable release version 2.3.2, by Roland McGrath et al. Copyright (C) 2003 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Compiled by GNU CC version 3.3 20030226 (prerelease) (SuSE Linux). Compiled on a Linux 2.4.20 system on 2003-03-13. Available extensions: GNU libio by Per Bothner crypt add-on version 2.1 by Michael Glad and others linuxthreads-0.10 by Xavier Leroy NoVersion patch for broken glibc 2.0 binaries BIND-8.2.3-T5B libthread_db work sponsored by Alpha Processor Inc NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk Report bugs using the `glibcbug' script to <bugs@gnu.org>.
The previously attached example program, compiled with gcc-3.4.1, gives a segmentation fault too (the same as with 3.4.0 and gcc-3.4.1-20040625, but not with gcc-3.3.3). This happens because of a misaligned 'movapd %xmm0,0xffffff88(%ebp)'.

Reading specs from /opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../gcc-3.4.1/configure --prefix=/opt/gcc-3.4.1 --enable-threads=posix --enable-languages=c,c++,java
Thread model: posix
gcc version 3.4.1
 /opt/gcc-3.4.1/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -E -quiet -v -D_REENTRANT movapd_align_bug_pthread.c -march=pentium4 -Wall -fworking-directory -O2 -o movapd_align_bug_pthread.i
ignoring nonexistent directory "/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /opt/gcc-3.4.1/include
 /opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/include
 /usr/include
End of search list.
 /opt/gcc-3.4.1/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -fpreprocessed movapd_align_bug_pthread.i -quiet -dumpbase movapd_align_bug_pthread.c -march=pentium4 -auxbase movapd_align_bug_pthread -g -O2 -Wall -version -o movapd_align_bug_pthread.s
GNU C version 3.4.1 (i686-pc-linux-gnu)
        compiled by GNU C version 3.4.1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
 as -V -Qy -o movapd_align_bug_pthread.o movapd_align_bug_pthread.s
GNU assembler version 2.13.90.0.18 (i486-suse-linux) using BFD version 2.13.90.0.18 20030121 (SuSE Linux)
 /opt/gcc-3.4.1/libexec/gcc/i686-pc-linux-gnu/3.4.1/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/crtbegin.o -L/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1 -L/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/../../..
movapd_align_bug_pthread.o -lm -lgcc -lgcc_eh -lpthread -lc -lgcc -lgcc_eh /opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/crtend.o /usr/lib/crtn.o GNU gdb 6.1 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1". (gdb) run Starting program: /home/seiderer/test/gcc_bug_14776/a.out [Thread debugging using libthread_db enabled] [New Thread 16384 (LWP 11430)] [New Thread 32769 (LWP 11432)] [New Thread 16386 (LWP 11433)] [New Thread 32771 (LWP 11434)] Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 16386 (LWP 11433)] create_geo_to_topo (xfp=0xbf7ffa6c, lat=1, lon=1) at movapd_align_bug_pthread.c:41 41 xfp->m[0].x = -sin_lon; (gdb) info reg eax 0xbf7ff9fc -1082131972 ecx 0x401b7d60 1075543392 edx 0x6 6 ebx 0xbf7ffa6c -1082131860 esp 0xbf7ff9bc 0xbf7ff9bc ebp 0xbf7ffa44 0xbf7ffa44 esi 0xbf7ffbe0 -1082131488 edi 0x0 0 eip 0x8048541 0x8048541 eflags 0x10246 66118 cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x0 0 (gdb) disassemble Dump of assembler code for function create_geo_to_topo: 0x0804848f <create_geo_to_topo+0>: push %ebp 0x08048490 <create_geo_to_topo+1>: mov %esp,%ebp 0x08048492 <create_geo_to_topo+3>: push %ebx 0x08048493 <create_geo_to_topo+4>: sub $0x84,%esp 0x08048499 <create_geo_to_topo+10>: fldl 0x14(%ebp) 0x0804849c <create_geo_to_topo+13>: mov 0x8(%ebp),%ebx 0x0804849f <create_geo_to_topo+16>: fstpl (%esp) 0x080484a2 <create_geo_to_topo+19>: call 0x80483ac <sin> 0x080484a7 <create_geo_to_topo+24>: fldl 0x14(%ebp) 0x080484aa <create_geo_to_topo+27>: fstpl (%esp) 0x080484ad <create_geo_to_topo+30>: fstpl 0xffffffb0(%ebp) 0x080484b0 <create_geo_to_topo+33>: 
call 0x804836c <cos> 0x080484b5 <create_geo_to_topo+38>: fstpl 0xffffffa8(%ebp) 0x080484b8 <create_geo_to_topo+41>: fldl 0xc(%ebp) 0x080484bb <create_geo_to_topo+44>: fstpl (%esp) 0x080484be <create_geo_to_topo+47>: call 0x80483ac <sin> 0x080484c3 <create_geo_to_topo+52>: fldl 0xc(%ebp) 0x080484c6 <create_geo_to_topo+55>: fstpl (%esp) 0x080484c9 <create_geo_to_topo+58>: fstpl 0xffffffa0(%ebp) 0x080484cc <create_geo_to_topo+61>: call 0x804836c <cos> 0x080484d1 <create_geo_to_topo+66>: fldl 0xffffffa8(%ebp) 0x080484d4 <create_geo_to_topo+69>: fldl 0xffffffb0(%ebp) 0x080484d7 <create_geo_to_topo+72>: fxch %st(1) 0x080484d9 <create_geo_to_topo+74>: fstl 0x18(%ebx) 0x080484dc <create_geo_to_topo+77>: fchs 0x080484de <create_geo_to_topo+79>: fxch %st(1) 0x080484e0 <create_geo_to_topo+81>: fchs 0x080484e2 <create_geo_to_topo+83>: fxch %st(1) 0x080484e4 <create_geo_to_topo+85>: fmull 0xffffffa0(%ebp) 0x080484e7 <create_geo_to_topo+88>: fxch %st(1) 0x080484e9 <create_geo_to_topo+90>: fstl (%ebx) 0x080484eb <create_geo_to_topo+92>: fxch %st(2) 0x080484ed <create_geo_to_topo+94>: fstl 0x38(%ebx) 0x080484f0 <create_geo_to_topo+97>: fxch %st(1) 0x080484f2 <create_geo_to_topo+99>: fstpl 0x8(%ebx) 0x080484f5 <create_geo_to_topo+102>: fldl 0xffffffa8(%ebp) 0x080484f8 <create_geo_to_topo+105>: fldz 0x080484fa <create_geo_to_topo+107>: fxch %st(1) 0x080484fc <create_geo_to_topo+109>: fmul %st(2),%st 0x080484fe <create_geo_to_topo+111>: fxch %st(3) 0x08048500 <create_geo_to_topo+113>: fmull 0xffffffa0(%ebp) 0x08048503 <create_geo_to_topo+116>: fxch %st(3) 0x08048505 <create_geo_to_topo+118>: lea 0xffffffd8(%ebp),%eax 0x08048508 <create_geo_to_topo+121>: fstpl 0x10(%ebx) 0x0804850b <create_geo_to_topo+124>: fldl 0xffffffb0(%ebp) 0x0804850e <create_geo_to_topo+127>: fxch %st(1) 0x08048510 <create_geo_to_topo+129>: fstl 0x30(%ebx) 0x08048513 <create_geo_to_topo+132>: fxch %st(1) 0x08048515 <create_geo_to_topo+134>: fmulp %st,%st(2) 0x08048517 <create_geo_to_topo+136>: fldl 
0xffffffa0(%ebp) 0x0804851a <create_geo_to_topo+139>: fxch %st(3) 0x0804851c <create_geo_to_topo+141>: fstpl 0x20(%ebx) 0x0804851f <create_geo_to_topo+144>: fxch %st(1) 0x08048521 <create_geo_to_topo+146>: fstpl 0x28(%ebx) 0x08048524 <create_geo_to_topo+149>: fxch %st(1) 0x08048526 <create_geo_to_topo+151>: fstpl 0x40(%ebx) 0x08048529 <create_geo_to_topo+154>: fldl 0xc(%ebp) 0x0804852c <create_geo_to_topo+157>: fldl 0x14(%ebp) 0x0804852f <create_geo_to_topo+160>: mov %eax,0x4(%esp) 0x08048533 <create_geo_to_topo+164>: movsd 0x80486c0,%xmm0 0x0804853b <create_geo_to_topo+172>: lea 0xffffffb8(%ebp),%eax 0x0804853e <create_geo_to_topo+175>: fstpl 0xffffffc0(%ebp) 0x08048541 <create_geo_to_topo+178>: movapd %xmm0,0xffffff88(%ebp) 0x08048546 <create_geo_to_topo+183>: fstpl 0xffffffb8(%ebp) 0x08048549 <create_geo_to_topo+186>: fstpl 0xffffffc8(%ebp) 0x0804854c <create_geo_to_topo+189>: mov %eax,(%esp) 0x0804854f <create_geo_to_topo+192>: call 0x8048474 <geo_lla_xyz> 0x08048554 <create_geo_to_topo+197>: fldl 0xffffffd8(%ebp) 0x08048557 <create_geo_to_topo+200>: fchs 0x08048559 <create_geo_to_topo+202>: fstpl 0x48(%ebx) 0x0804855c <create_geo_to_topo+205>: add $0x84,%esp 0x08048562 <create_geo_to_topo+211>: pop %ebx 0x08048563 <create_geo_to_topo+212>: pop %ebp 0x08048564 <create_geo_to_topo+213>: ret End of assembler dump. (gdb) quit The program is running. Exit anyway? (y or n)
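The misalignment can be read directly off the register dump: the faulting store goes to 0xffffff88(%ebp), and ebp is 0xbf7ffa44. The sketch below just redoes that arithmetic in C; the helper names are made up for illustration, and the only inputs are the two values from the gdb session.

```c
#include <stdint.h>

/* Effective address of disp(%ebp) on a 32-bit target.  The displacement
   is a signed 32-bit value, so 0xffffff88 means -0x78; unsigned 32-bit
   wraparound yields the same result as the signed addition. */
static uint32_t effective_address(uint32_t ebp, uint32_t disp)
{
    return ebp + disp;
}

/* movapd requires a memory operand aligned to 16 bytes, i.e. the low
   four address bits must all be zero. */
static int is_16_byte_aligned(uint32_t addr)
{
    return (addr & 0xfu) == 0;
}
```

With ebp = 0xbf7ffa44 and disp = 0xffffff88, effective_address returns 0xbf7ff9cc, whose low nibble is 0xc, so is_16_byte_aligned reports 0 and the movapd store faults.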
Fixed in Cygwin by: http://www.cygwin.com/ml/cygwin-cvs/2004-q2/msg00124.html for single threaded executables, and by: http://www.cygwin.com/ml/cygwin-cvs/2004-q2/msg00108.html for multi threaded ones. Win32 callbacks would still be an issue, though.
*** Bug 17930 has been marked as a duplicate of this bug. ***
(In reply to comment #13) > *** Bug 17930 has been marked as a duplicate of this bug. *** However, Bug 17930 might still be worth looking at, since it includes a fortran source file with which you can easily reproduce this bug, even on non-cygwin targets.
Created attachment 7324 [details] fortran source for a different test case This is the Fortran77 source for a different test case which reproduces this bug with gcc-3.4.1 and gcc-3.4.2 on target i486-pc-linux-gnu. (At least bugzilla says it's the same bug, so I'm posting this here, too.) However, I was just able to check with a gcc-3.5 snapshot; for this testcase, the bug seems to be resolved. Details: frank:gccbug> gfortran -O -msse2 -mfpmath=sse -g -Wall gccbug.f -v Driving: gfortran -O -msse2 -mfpmath=sse -g -Wall gccbug.f -v -lgfortranbegin -lgfortran -lm -shared-libgcc Reading specs from /usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/specs Configured with: ../src/configure -v --enable-languages=c,c++,java,f95,objc --prefix=/usr/lib/gcc-snapshot --enable-shared --with-system-zlib --enable-nls --enable-threads=posix --without-included-gettext --disable-werror --enable-__cxa_atexit --enable-libstdcxx-allocator=mt --enable-clocale=gnu --enable-libstdcxx-debug --enable-java-gc=boehm --enable-java-awt=gtk i486-linux-gnu Thread model: posix gcc version 3.5.0 20040717 (experimental) /usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/3.5.0/f951 gccbug.f -ffixed-form -quiet -dumpbase gccbug.f -msse2 -mfpmath=sse -mtune=i486 -auxbase gccbug -g -O -Wall -version -o /tmp/cc16CTBA.s GNU F95 version 3.5.0 20040717 (experimental) (i486-linux-gnu) compiled by GNU C version 3.5.0 20040717 (experimental). 
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 In file gccbug.f:25 return 1 Warning: Extension: RETURN statement in main program at (1) gccbug.f: In function `choleskyzhp': gccbug.f:53: warning: 'y' may be used uninitialized in this function as -V -Qy -o /tmp/cciP9Rce.o /tmp/cc16CTBA.s GNU assembler version 2.15 (i386-linux) using BFD version 2.15 /usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/3.5.0/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/crtbegin.o -L/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0 -L/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/../../.. /tmp/cciP9Rce.o -lgfortranbegin -lgfortran -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/crtend.o /usr/lib/crtn.o With gcc-3.5, my program runs correctly.
Here is a small testcase that segfaults in a movapd instruction when compiled with the 3.4 branch:
----------------------------
      subroutine choleskyzhp (a)
      integer i,j,k
      complex*16 a(5,5),x
      do j = 1,5
         do i = j,5
            do k = 1,j-1
               x = a(i,k)*dconjg(a(j,k))
            enddo
            a(i,j) = x
         enddo
      enddo
      return
      end

      program test
      complex*16 work(5,5,2)
      call choleskyzhp(work(1,1,2))
      return
      end
--------------------------

g/x> /home/bangerth/bin/gcc-3.4.*-pre/bin/g77 -O -msse2 -g x.f ; ./a.out
g/x> /home/bangerth/bin/gcc-3.4.*-pre/bin/g77 -O -msse2 -mfpmath=sse -g x.f ; ./a.out
Segmentation fault

(gdb) r
`/home/bangerth/tmp/g/x/a.out' has changed; re-reading symbols.
Starting program: /home/bangerth/tmp/g/x/a.out

Program received signal SIGSEGV, Segmentation fault.
0x08048636 in choleskyzhp_ (a=0xbfffe920) at x.f:12
(gdb) disass
Dump of assembler code for function choleskyzhp_:
[...]
0x08048636 <choleskyzhp_+34>:   movapd %xmm0,0xffffffc8(%ebp)

Given that g77 doesn't exist any more on mainline, I don't know how to reproduce this bug there. Maybe some of the other dups listed in PR 17990 can help. I'll take a look.

W.
I should probably say that my installation is a little dusty already: gcc version 3.4.3 20041015 (prerelease) Maybe someone with a newer version can try to reproduce this. W.
In case someone wondered about the mapping of the rank-3 array to the rank-2 array in my previous testcase, here is something even simpler:
==========================
      subroutine choleskyzhp ()
      integer i,j,k
      complex*16 a(500),x
      do i = 1,5
         do j = 1,5
            do k = 1,j
               x = a(i*5+j)*dconjg(a(j*5+j))
            enddo
            a(i*5+k) = x
         enddo
      enddo
      return
      end

      program test
      call choleskyzhp
      return
      end
============================

It fails in the same way as my previous testcase, i.e. in a movapd instruction.

W.
According to comment #12, this was a cygwin bug with respect to the initial alignment of the main and thread stacks. Even the test program in comment #10, which is supposed to reproduce the problem on linux, does not reproduce it here with Fedora Core 3. I can only assume that SuSE 8.2 ships a broken thread library.

Marking PR17930 as a duplicate was a mistake. While the symptom is the same, this one has not been shown to be a gcc bug at all.

If you find a new test case that you think shows this problem again, the first thing to do is to validate that the stack is correctly aligned by the system. Note that the return address for the function should be at address 12 modulo 16. If that's true, then file another bug; this one's a bit disjointed already.
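The "12 modulo 16" rule can be checked from a frame pointer value taken in a debugger. The following is a sketch only, assuming an i386 function with the usual "push %ebp; mov %esp,%ebp" prologue (as in the disassembly in comment #10), so that the return address sits at 4(%ebp); the helper names are hypothetical.

```c
#include <stdint.h>

/* With the standard i386 prologue, the return address is stored
   4 bytes above the saved frame pointer, i.e. at 4(%ebp). */
static uint32_t return_address_slot(uint32_t ebp)
{
    return ebp + 4u;
}

/* The rule stated above: if the caller kept the stack 16-byte aligned,
   the return address slot is at an address that is 12 modulo 16. */
static int entry_stack_is_aligned(uint32_t ebp)
{
    return (return_address_slot(ebp) % 16u) == 12u;
}
```

For the ebp value 0xbf7ffa44 shown in comment #10, the return address slot is 0xbf7ffa48, which is 8 modulo 16, so entry_stack_is_aligned returns 0. That points at the thread stack handed to the function rather than at the frame layout GCC chose.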
My testcase was actually for x86 linux, but since you fixed it in that other PR today, I assume that the same bug triggered both PRs, so closing this one should be ok.

W.