Bug 14776 - -mfpmath=sse causes movapd from non-16-byte aligned address
Summary: -mfpmath=sse causes movapd from non-16-byte aligned address
Status: RESOLVED WORKSFORME
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.4.0
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2004-03-29 20:01 UTC by Brian Ford
Modified: 2004-12-14 03:56 UTC
CC List: 3 users

See Also:
Host:
Target: i686-pc-cygwin
Build:
Known to work:
Known to fail: 3.4.3
Last reconfirmed: 2004-03-29 23:07:51


Attachments
test case C file (390 bytes, text/plain)
2004-03-29 20:03 UTC, Brian Ford
objdump -dS movapd_align_bug.o (1.49 KB, text/plain)
2004-03-29 20:03 UTC, Brian Ford
Bugexample for linux, use multiple threads (558 bytes, text/plain)
2004-06-29 18:47 UTC, Peter Seiderer
fortran source for a different test case (1.31 KB, text/plain)
2004-10-11 12:29 UTC, Frank Otto

Description Brian Ford 2004-03-29 20:01:40 UTC
gcc -g -O2 -march=pentium4 -mfpmath=sse -c movapd_align_bug.c

objdump -dS movapd_align_bug.o:

        xfp->m[0].x = -sin_lon;
  42:   f2 0f 10 05 00 00 00    movsd  0x0,%xmm0
  49:   00 
  4a:   f2 0f 10 4d b0          movsd  0xffffffb0(%ebp),%xmm1
  4f:   66 0f 29 45 88          movapd %xmm0,0xffffff88(%ebp) <- ILLEGAL
  54:   66 0f 57 c8             xorpd  %xmm0,%xmm1
        xfp->m[1].x =  cos_lon;
        xfp->m[2].x = 0.0;
Comment 1 Brian Ford 2004-03-29 20:03:11 UTC
Created attachment 6012 [details]
test case C file
Comment 2 Brian Ford 2004-03-29 20:03:57 UTC
Created attachment 6013 [details]
objdump -dS movapd_align_bug.o
Comment 3 Andrew Pinski 2004-03-29 23:07:50 UTC
Confirmed on the mainline (almost the same asm as given)

Also on 3.3.3:
        movsd   -104(%ebp), %xmm4

Also 3.2.3:
        movsd   -120(%ebp), %xmm0
Comment 4 Brian Ford 2004-03-29 23:32:42 UTC
Um..., no.

Those are ok.  movsd only needs 8 byte alignment.

The original bug (movapd) stands, though.  And, I confirmed it on 3.5.
Comment 5 Andrew Pinski 2004-03-29 23:39:04 UTC
Actually, 3.5.0 does not produce any movapd with a memory operand at all on i686-pc-linux; maybe this is a cygwin-only bug. The only movapd instructions are register-to-register:
  7d:   66 0f 28 cb             movapd %xmm3,%xmm1
  bb:   66 0f 28 d3             movapd %xmm3,%xmm2
Comment 6 Brian Ford 2004-03-29 23:43:22 UTC
Cygwin has -malign-double as the default.  Try that in Linux.

BTW, why did you remove the "Known to fail"?
Comment 7 Andrew Pinski 2004-03-29 23:55:33 UTC
Can you provide the output of gcc -v?
I still cannot reproduce it with:
gcc pr14776.c -O2 -march=pentium4 -mfpmath=sse -c -g -malign-double -mstack-arg-probe 
-mfp-ret-in-387 -mieee-fp
on linux.
Comment 8 Brian Ford 2004-03-29 23:59:48 UTC
Reading specs from /home/ford/local2/lib/gcc/i686-pc-cygwin/3.4.0/specs
Configured with: ../sources/configure --enable-languages=c
--prefix=/home/ford/local2 --with-local-prefix=/home/ford/local2/include
--disable-gdbtk : (reconfigured)  : (reconfigured)  : (reconfigured) 
Thread model: single
gcc version 3.4.0 20040329 (prerelease)

I haven't tried Linux, I just thought the -malign-double might be the difference.  
Comment 9 Jan Hubicka 2004-03-30 14:51:05 UTC
Subject: Re:  New: -mfpmath=sse causes movapd from non-16-byte aligned address

> gcc -g -O2 -march=pentium4 -mfpmath=sse -c movapd_align_bug.c
> 
> objdump -dS movapd_align_bug.o:
> 
>         xfp->m[0].x = -sin_lon;
>   42:   f2 0f 10 05 00 00 00    movsd  0x0,%xmm0
>   49:   00 
>   4a:   f2 0f 10 4d b0          movsd  0xffffffb0(%ebp),%xmm1
>   4f:   66 0f 29 45 88          movapd %xmm0,0xffffff88(%ebp) <- ILLEGAL
>   54:   66 0f 57 c8             xorpd  %xmm0,%xmm1

GCC decides to spill the temporary negative zero used to
expand negations, and then it hits the usual problem of the stack frame being
misaligned in main on cygwin and a few other runtimes.

I am quite surprised that the register allocator didn't rematerialize the
constant, though...  It is a perfect candidate for that.

Honza
>         xfp->m[1].x =  cos_lon;
>         xfp->m[2].x = 0.0;
> 
> -- 
>            Summary: -mfpmath=sse causes movapd from non-16-byte aligned
>                     address
>            Product: gcc
>            Version: 3.4.0
>             Status: UNCONFIRMED
>           Severity: normal
>           Priority: P2
>          Component: optimization
>         AssignedTo: unassigned at gcc dot gnu dot org
>         ReportedBy: ford at vss dot fsi dot com
>                 CC: gcc-bugs at gcc dot gnu dot org
>  GCC build triplet: i686-pc-cygwin
>   GCC host triplet: i686-pc-cygwin
> GCC target triplet: i686-pc-cygwin
> 
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14776
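
Editorial note on the "negative zero used to expand negations" mentioned in comment 9: with -mfpmath=sse a floating-point negation is typically done by XORing the value with a constant whose only set bit is the sign bit, which is what the xorpd in the listing does. The following is a minimal sketch of that bit-level operation in portable C, assuming IEEE 754 64-bit doubles. The helper name negate_via_xor is hypothetical and is not taken from GCC or from the attached test case.

```c
#include <stdint.h>
#include <string.h>

/* Negate a double by flipping its sign bit, the same bit-level effect
 * as XORing with a constant that holds -0.0.
 * Assumes IEEE 754 binary64 doubles; negate_via_xor is a hypothetical name. */
double negate_via_xor(double x)
{
    const double neg_zero = -0.0;   /* only the sign bit is set */
    uint64_t value_bits, mask_bits;

    memcpy(&value_bits, &x, sizeof value_bits);
    memcpy(&mask_bits, &neg_zero, sizeof mask_bits);
    value_bits ^= mask_bits;        /* flip the sign bit */
    memcpy(&x, &value_bits, sizeof x);
    return x;
}
```

The constant has to live somewhere while it is in use. If the register holding it is spilled, the spill is a full 16-byte store, which is where the alignment of the stack slot starts to matter.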
Comment 10 Peter Seiderer 2004-06-29 18:47:54 UTC
Created attachment 6653 [details]
Bugexample for linux, use multiple threads

It is easy to get hit by this bug on an ordinary linux distribution
(e.g. SuSE-8.2), just use a multithreaded application (see attached
example file). The bug appears with gcc-3.4.0 and gcc-3.4.1-20040625,
but not with gcc-3.3.3.

~/test/gcc_bug_14776> /opt/gcc-3.4.1-20040625/bin/gcc -v -Wall -march=pentium4
-O2 -save-temps -g -pthread movapd_align_bug_pthread.c -lm
Reading specs from
/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../gcc-3.4.1-20040625/configure
--prefix=/opt/gcc-3.4.1-20040625 --enable-threads=posix
--enable-languages=c,c++,java
Thread model: posix
gcc version 3.4.1 20040625 (prerelease)
 /opt/gcc-3.4.1-20040625/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -E -quiet -v
-D_REENTRANT movapd_align_bug_pthread.c -march=pentium4 -Wall
-fworking-directory -O2 -o movapd_align_bug_pthread.i
ignoring nonexistent directory
"/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"

#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /opt/gcc-3.4.1-20040625/include
 /opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/include
 /usr/include
End of search list.
 /opt/gcc-3.4.1-20040625/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -fpreprocessed
movapd_align_bug_pthread.i -quiet -dumpbase movapd_align_bug_pthread.c
-march=pentium4 -auxbase movapd_align_bug_pthread -g -O2 -Wall -version -o
movapd_align_bug_pthread.s
GNU C version 3.4.1 20040625 (prerelease) (i686-pc-linux-gnu)
	compiled by GNU C version 3.4.1 20040625 (prerelease).
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
 as -V -Qy -o movapd_align_bug_pthread.o movapd_align_bug_pthread.s
GNU assembler version 2.13.90.0.18 (i486-suse-linux) using BFD version
2.13.90.0.18 20030121 (SuSE Linux)
 /opt/gcc-3.4.1-20040625/libexec/gcc/i686-pc-linux-gnu/3.4.1/collect2
--eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o
/usr/lib/crti.o
/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/crtbegin.o
-L/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1
-L/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/../../..
movapd_align_bug_pthread.o -lm -lgcc -lgcc_eh -lpthread -lc -lgcc -lgcc_eh
/opt/gcc-3.4.1-20040625/lib/gcc/i686-pc-linux-gnu/3.4.1/crtend.o
/usr/lib/crtn.o

~/test/gcc_bug_14776> gdb ./a.out
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db
library "/lib/libthread_db.so.1".

(gdb) run
Starting program: /home/seiderer/test/gcc_bug_14776/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 5968)]
[New Thread 32769 (LWP 5970)]
[New Thread 16386 (LWP 5971)]
[New Thread 32771 (LWP 5972)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16386 (LWP 5971)]
create_geo_to_topo (xfp=0xbf7ffa6c, lat=1, lon=1)
    at movapd_align_bug_pthread.c:41
41		xfp->m[0].x = -sin_lon;
(gdb) info reg
eax	       0xbf7ff9fc	-1082131972
ecx	       0x401b7d60	1075543392
edx	       0x6	6
ebx	       0xbf7ffa6c	-1082131860
esp	       0xbf7ff9bc	0xbf7ff9bc
ebp	       0xbf7ffa44	0xbf7ffa44
esi	       0xbf7ffbe0	-1082131488
edi	       0x0	0
eip	       0x8048541	0x8048541
eflags	       0x10246	66118
cs	       0x23	35
ss	       0x2b	43
ds	       0x2b	43
es	       0x2b	43
fs	       0x0	0
gs	       0x0	0
(gdb) disassemble 
Dump of assembler code for function create_geo_to_topo:
0x0804848f <create_geo_to_topo+0>:	push   %ebp
0x08048490 <create_geo_to_topo+1>:	mov    %esp,%ebp
0x08048492 <create_geo_to_topo+3>:	push   %ebx
0x08048493 <create_geo_to_topo+4>:	sub    $0x84,%esp
0x08048499 <create_geo_to_topo+10>:	fldl   0x14(%ebp)
0x0804849c <create_geo_to_topo+13>:	mov    0x8(%ebp),%ebx
0x0804849f <create_geo_to_topo+16>:	fstpl  (%esp)
0x080484a2 <create_geo_to_topo+19>:	call   0x80483ac <sin>
0x080484a7 <create_geo_to_topo+24>:	fldl   0x14(%ebp)
0x080484aa <create_geo_to_topo+27>:	fstpl  (%esp)
0x080484ad <create_geo_to_topo+30>:	fstpl  0xffffffb0(%ebp)
0x080484b0 <create_geo_to_topo+33>:	call   0x804836c <cos>
0x080484b5 <create_geo_to_topo+38>:	fstpl  0xffffffa8(%ebp)
0x080484b8 <create_geo_to_topo+41>:	fldl   0xc(%ebp)
0x080484bb <create_geo_to_topo+44>:	fstpl  (%esp)
0x080484be <create_geo_to_topo+47>:	call   0x80483ac <sin>
0x080484c3 <create_geo_to_topo+52>:	fldl   0xc(%ebp)
0x080484c6 <create_geo_to_topo+55>:	fstpl  (%esp)
0x080484c9 <create_geo_to_topo+58>:	fstpl  0xffffffa0(%ebp)
0x080484cc <create_geo_to_topo+61>:	call   0x804836c <cos>
0x080484d1 <create_geo_to_topo+66>:	fldl   0xffffffa8(%ebp)
0x080484d4 <create_geo_to_topo+69>:	fldl   0xffffffb0(%ebp)
0x080484d7 <create_geo_to_topo+72>:	fxch   %st(1)
0x080484d9 <create_geo_to_topo+74>:	fstl   0x18(%ebx)
0x080484dc <create_geo_to_topo+77>:	fchs   
0x080484de <create_geo_to_topo+79>:	fxch   %st(1)
0x080484e0 <create_geo_to_topo+81>:	fchs   
0x080484e2 <create_geo_to_topo+83>:	fxch   %st(1)
0x080484e4 <create_geo_to_topo+85>:	fmull  0xffffffa0(%ebp)
0x080484e7 <create_geo_to_topo+88>:	fxch   %st(1)
0x080484e9 <create_geo_to_topo+90>:	fstl   (%ebx)
0x080484eb <create_geo_to_topo+92>:	fxch   %st(2)
0x080484ed <create_geo_to_topo+94>:	fstl   0x38(%ebx)
0x080484f0 <create_geo_to_topo+97>:	fxch   %st(1)
0x080484f2 <create_geo_to_topo+99>:	fstpl  0x8(%ebx)
0x080484f5 <create_geo_to_topo+102>:	fldl   0xffffffa8(%ebp)
0x080484f8 <create_geo_to_topo+105>:	fldz   
0x080484fa <create_geo_to_topo+107>:	fxch   %st(1)
0x080484fc <create_geo_to_topo+109>:	fmul   %st(2),%st
0x080484fe <create_geo_to_topo+111>:	fxch   %st(3)
0x08048500 <create_geo_to_topo+113>:	fmull  0xffffffa0(%ebp)
0x08048503 <create_geo_to_topo+116>:	fxch   %st(3)
0x08048505 <create_geo_to_topo+118>:	lea    0xffffffd8(%ebp),%eax
0x08048508 <create_geo_to_topo+121>:	fstpl  0x10(%ebx)
0x0804850b <create_geo_to_topo+124>:	fldl   0xffffffb0(%ebp)
0x0804850e <create_geo_to_topo+127>:	fxch   %st(1)
0x08048510 <create_geo_to_topo+129>:	fstl   0x30(%ebx)
0x08048513 <create_geo_to_topo+132>:	fxch   %st(1)
0x08048515 <create_geo_to_topo+134>:	fmulp  %st,%st(2)
0x08048517 <create_geo_to_topo+136>:	fldl   0xffffffa0(%ebp)
0x0804851a <create_geo_to_topo+139>:	fxch   %st(3)
0x0804851c <create_geo_to_topo+141>:	fstpl  0x20(%ebx)
0x0804851f <create_geo_to_topo+144>:	fxch   %st(1)
0x08048521 <create_geo_to_topo+146>:	fstpl  0x28(%ebx)
0x08048524 <create_geo_to_topo+149>:	fxch   %st(1)
0x08048526 <create_geo_to_topo+151>:	fstpl  0x40(%ebx)
0x08048529 <create_geo_to_topo+154>:	fldl   0xc(%ebp)
0x0804852c <create_geo_to_topo+157>:	fldl   0x14(%ebp)
0x0804852f <create_geo_to_topo+160>:	mov    %eax,0x4(%esp)
0x08048533 <create_geo_to_topo+164>:	movsd  0x80486c0,%xmm0
0x0804853b <create_geo_to_topo+172>:	lea    0xffffffb8(%ebp),%eax
0x0804853e <create_geo_to_topo+175>:	fstpl  0xffffffc0(%ebp)
0x08048541 <create_geo_to_topo+178>:	movapd %xmm0,0xffffff88(%ebp)
0x08048546 <create_geo_to_topo+183>:	fstpl  0xffffffb8(%ebp)
0x08048549 <create_geo_to_topo+186>:	fstpl  0xffffffc8(%ebp)
0x0804854c <create_geo_to_topo+189>:	mov    %eax,(%esp)
0x0804854f <create_geo_to_topo+192>:	call   0x8048474 <geo_lla_xyz>
0x08048554 <create_geo_to_topo+197>:	fldl   0xffffffd8(%ebp)
0x08048557 <create_geo_to_topo+200>:	fchs   
0x08048559 <create_geo_to_topo+202>:	fstpl  0x48(%ebx)
0x0804855c <create_geo_to_topo+205>:	add    $0x84,%esp
0x08048562 <create_geo_to_topo+211>:	pop    %ebx
0x08048563 <create_geo_to_topo+212>:	pop    %ebp
0x08048564 <create_geo_to_topo+213>:	ret    
End of assembler dump.
(gdb) quit
The program is running.  Exit anyway? (y or n) 

~/test/gcc_bug_14776> /lib/libc.so.6 
GNU C Library stable release version 2.3.2, by Roland McGrath et al.
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 3.3 20030226 (prerelease) (SuSE Linux).
Compiled on a Linux 2.4.20 system on 2003-03-13.
Available extensions:
	GNU libio by Per Bothner
	crypt add-on version 2.1 by Michael Glad and others
	linuxthreads-0.10 by Xavier Leroy
	NoVersion patch for broken glibc 2.0 binaries
	BIND-8.2.3-T5B
	libthread_db work sponsored by Alpha Processor Inc
	NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
Report bugs using the `glibcbug' script to <bugs@gnu.org>.
Comment 11 Peter Seiderer 2004-07-06 12:16:19 UTC
The previously attached example program compiled with gcc-3.4.1 gives
a segmentation fault too (the same as with 3.4.0 and gcc-3.4.1-20040625, but not
with gcc-3.3.3).

This happens because of a misaligned 'movapd %xmm0,0xffffff88(%ebp)'.
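
Editorial note: the misalignment can be checked by hand from the gdb register dump in this thread (ebp = 0xbf7ffa44) and the displacement in the faulting instruction (0xffffff88, i.e. -0x78). A small sketch of that arithmetic follows; the function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Effective address of an %ebp-relative operand on a 32-bit target.
 * disp is the displacement as the disassembler prints it (e.g. 0xffffff88),
 * so the sum wraps modulo 2^32, which is the same as subtracting 0x78. */
uint32_t ebp_operand_address(uint32_t ebp, uint32_t disp)
{
    return (uint32_t)(ebp + disp);
}

/* movapd requires its memory operand to be 16-byte aligned. */
int is_16_byte_aligned(uint32_t addr)
{
    return (addr & 0xFu) == 0;
}
```

With the values from the dump, ebp_operand_address(0xbf7ffa44, 0xffffff88) is 0xbf7ff9cc, which is 12 modulo 16, so the store faults.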

Reading specs from /opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../gcc-3.4.1/configure --prefix=/opt/gcc-3.4.1
--enable-threads=posix --enable-languages=c,c++,java
Thread model: posix
gcc version 3.4.1
 /opt/gcc-3.4.1/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -E -quiet -v
-D_REENTRANT movapd_align_bug_pthread.c -march=pentium4 -Wall
-fworking-directory -O2 -o movapd_align_bug_pthread.i
ignoring nonexistent directory
"/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /opt/gcc-3.4.1/include
 /opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/include
 /usr/include
End of search list.
 /opt/gcc-3.4.1/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1 -fpreprocessed
movapd_align_bug_pthread.i -quiet -dumpbase movapd_align_bug_pthread.c
-march=pentium4 -auxbase movapd_align_bug_pthread -g -O2 -Wall -version -o
movapd_align_bug_pthread.s
GNU C version 3.4.1 (i686-pc-linux-gnu)
	compiled by GNU C version 3.4.1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
 as -V -Qy -o movapd_align_bug_pthread.o movapd_align_bug_pthread.s
GNU assembler version 2.13.90.0.18 (i486-suse-linux) using BFD version
2.13.90.0.18 20030121 (SuSE Linux)
 /opt/gcc-3.4.1/libexec/gcc/i686-pc-linux-gnu/3.4.1/collect2 --eh-frame-hdr -m
elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o
/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/crtbegin.o
-L/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1
-L/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/../../..
movapd_align_bug_pthread.o -lm -lgcc -lgcc_eh -lpthread -lc -lgcc -lgcc_eh
/opt/gcc-3.4.1/lib/gcc/i686-pc-linux-gnu/3.4.1/crtend.o /usr/lib/crtn.o
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library
"/lib/libthread_db.so.1".

(gdb) run
Starting program: /home/seiderer/test/gcc_bug_14776/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 11430)]
[New Thread 32769 (LWP 11432)]
[New Thread 16386 (LWP 11433)]
[New Thread 32771 (LWP 11434)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16386 (LWP 11433)]
create_geo_to_topo (xfp=0xbf7ffa6c, lat=1, lon=1)
    at movapd_align_bug_pthread.c:41
41		xfp->m[0].x = -sin_lon;
(gdb) info reg
eax            0xbf7ff9fc	-1082131972
ecx            0x401b7d60	1075543392
edx            0x6	6
ebx            0xbf7ffa6c	-1082131860
esp            0xbf7ff9bc	0xbf7ff9bc
ebp            0xbf7ffa44	0xbf7ffa44
esi            0xbf7ffbe0	-1082131488
edi            0x0	0
eip            0x8048541	0x8048541
eflags         0x10246	66118
cs             0x23	35
ss             0x2b	43
ds             0x2b	43
es             0x2b	43
fs             0x0	0
gs             0x0	0
(gdb) disassemble 
Dump of assembler code for function create_geo_to_topo:
0x0804848f <create_geo_to_topo+0>:	push   %ebp
0x08048490 <create_geo_to_topo+1>:	mov    %esp,%ebp
0x08048492 <create_geo_to_topo+3>:	push   %ebx
0x08048493 <create_geo_to_topo+4>:	sub    $0x84,%esp
0x08048499 <create_geo_to_topo+10>:	fldl   0x14(%ebp)
0x0804849c <create_geo_to_topo+13>:	mov    0x8(%ebp),%ebx
0x0804849f <create_geo_to_topo+16>:	fstpl  (%esp)
0x080484a2 <create_geo_to_topo+19>:	call   0x80483ac <sin>
0x080484a7 <create_geo_to_topo+24>:	fldl   0x14(%ebp)
0x080484aa <create_geo_to_topo+27>:	fstpl  (%esp)
0x080484ad <create_geo_to_topo+30>:	fstpl  0xffffffb0(%ebp)
0x080484b0 <create_geo_to_topo+33>:	call   0x804836c <cos>
0x080484b5 <create_geo_to_topo+38>:	fstpl  0xffffffa8(%ebp)
0x080484b8 <create_geo_to_topo+41>:	fldl   0xc(%ebp)
0x080484bb <create_geo_to_topo+44>:	fstpl  (%esp)
0x080484be <create_geo_to_topo+47>:	call   0x80483ac <sin>
0x080484c3 <create_geo_to_topo+52>:	fldl   0xc(%ebp)
0x080484c6 <create_geo_to_topo+55>:	fstpl  (%esp)
0x080484c9 <create_geo_to_topo+58>:	fstpl  0xffffffa0(%ebp)
0x080484cc <create_geo_to_topo+61>:	call   0x804836c <cos>
0x080484d1 <create_geo_to_topo+66>:	fldl   0xffffffa8(%ebp)
0x080484d4 <create_geo_to_topo+69>:	fldl   0xffffffb0(%ebp)
0x080484d7 <create_geo_to_topo+72>:	fxch   %st(1)
0x080484d9 <create_geo_to_topo+74>:	fstl   0x18(%ebx)
0x080484dc <create_geo_to_topo+77>:	fchs   
0x080484de <create_geo_to_topo+79>:	fxch   %st(1)
0x080484e0 <create_geo_to_topo+81>:	fchs   
0x080484e2 <create_geo_to_topo+83>:	fxch   %st(1)
0x080484e4 <create_geo_to_topo+85>:	fmull  0xffffffa0(%ebp)
0x080484e7 <create_geo_to_topo+88>:	fxch   %st(1)
0x080484e9 <create_geo_to_topo+90>:	fstl   (%ebx)
0x080484eb <create_geo_to_topo+92>:	fxch   %st(2)
0x080484ed <create_geo_to_topo+94>:	fstl   0x38(%ebx)
0x080484f0 <create_geo_to_topo+97>:	fxch   %st(1)
0x080484f2 <create_geo_to_topo+99>:	fstpl  0x8(%ebx)
0x080484f5 <create_geo_to_topo+102>:	fldl   0xffffffa8(%ebp)
0x080484f8 <create_geo_to_topo+105>:	fldz   
0x080484fa <create_geo_to_topo+107>:	fxch   %st(1)
0x080484fc <create_geo_to_topo+109>:	fmul   %st(2),%st
0x080484fe <create_geo_to_topo+111>:	fxch   %st(3)
0x08048500 <create_geo_to_topo+113>:	fmull  0xffffffa0(%ebp)
0x08048503 <create_geo_to_topo+116>:	fxch   %st(3)
0x08048505 <create_geo_to_topo+118>:	lea    0xffffffd8(%ebp),%eax
0x08048508 <create_geo_to_topo+121>:	fstpl  0x10(%ebx)
0x0804850b <create_geo_to_topo+124>:	fldl   0xffffffb0(%ebp)
0x0804850e <create_geo_to_topo+127>:	fxch   %st(1)
0x08048510 <create_geo_to_topo+129>:	fstl   0x30(%ebx)
0x08048513 <create_geo_to_topo+132>:	fxch   %st(1)
0x08048515 <create_geo_to_topo+134>:	fmulp  %st,%st(2)
0x08048517 <create_geo_to_topo+136>:	fldl   0xffffffa0(%ebp)
0x0804851a <create_geo_to_topo+139>:	fxch   %st(3)
0x0804851c <create_geo_to_topo+141>:	fstpl  0x20(%ebx)
0x0804851f <create_geo_to_topo+144>:	fxch   %st(1)
0x08048521 <create_geo_to_topo+146>:	fstpl  0x28(%ebx)
0x08048524 <create_geo_to_topo+149>:	fxch   %st(1)
0x08048526 <create_geo_to_topo+151>:	fstpl  0x40(%ebx)
0x08048529 <create_geo_to_topo+154>:	fldl   0xc(%ebp)
0x0804852c <create_geo_to_topo+157>:	fldl   0x14(%ebp)
0x0804852f <create_geo_to_topo+160>:	mov    %eax,0x4(%esp)
0x08048533 <create_geo_to_topo+164>:	movsd  0x80486c0,%xmm0
0x0804853b <create_geo_to_topo+172>:	lea    0xffffffb8(%ebp),%eax
0x0804853e <create_geo_to_topo+175>:	fstpl  0xffffffc0(%ebp)
0x08048541 <create_geo_to_topo+178>:	movapd %xmm0,0xffffff88(%ebp)
0x08048546 <create_geo_to_topo+183>:	fstpl  0xffffffb8(%ebp)
0x08048549 <create_geo_to_topo+186>:	fstpl  0xffffffc8(%ebp)
0x0804854c <create_geo_to_topo+189>:	mov    %eax,(%esp)
0x0804854f <create_geo_to_topo+192>:	call   0x8048474 <geo_lla_xyz>
0x08048554 <create_geo_to_topo+197>:	fldl   0xffffffd8(%ebp)
0x08048557 <create_geo_to_topo+200>:	fchs   
0x08048559 <create_geo_to_topo+202>:	fstpl  0x48(%ebx)
0x0804855c <create_geo_to_topo+205>:	add    $0x84,%esp
0x08048562 <create_geo_to_topo+211>:	pop    %ebx
0x08048563 <create_geo_to_topo+212>:	pop    %ebp
0x08048564 <create_geo_to_topo+213>:	ret    
End of assembler dump.
(gdb) quit
The program is running.  Exit anyway? (y or n) 
Comment 12 Brian Ford 2004-09-22 17:37:17 UTC
Fixed in Cygwin by:
http://www.cygwin.com/ml/cygwin-cvs/2004-q2/msg00124.html
for single threaded executables, and by:
http://www.cygwin.com/ml/cygwin-cvs/2004-q2/msg00108.html
for multi threaded ones.  Win32 callbacks would still be an issue, though.
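
Editorial note: the linked Cygwin changes are not reproduced here. Purely as an illustration of what aligning an initial stack pointer to 16 bytes means in general, and not as a description of the actual Cygwin code, a stack that grows downward can be aligned by rounding the pointer down. The helper name align_down_16 is hypothetical.

```c
#include <stdint.h>

/* Round a 32-bit address down to the nearest multiple of 16.
 * align_down_16 is a hypothetical helper, not code from the Cygwin patches. */
uint32_t align_down_16(uint32_t sp)
{
    return sp & ~(uint32_t)0xF;
}
```

For example, the esp value 0xbf7ff9bc seen in the gdb dumps in this thread rounds down to 0xbf7ff9b0.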
Comment 13 Andrew Pinski 2004-10-11 12:03:54 UTC
*** Bug 17930 has been marked as a duplicate of this bug. ***
Comment 14 Frank Otto 2004-10-11 12:09:04 UTC
(In reply to comment #13)
> *** Bug 17930 has been marked as a duplicate of this bug. ***

However, Bug 17930 might still be worth looking at, since it
includes a fortran source file with which you can easily reproduce
this bug, even on non-cygwin targets.
Comment 15 Frank Otto 2004-10-11 12:29:53 UTC
Created attachment 7324 [details]
fortran source for a different test case

This is the Fortran77 source for a different test case
which reproduces this bug with gcc-3.4.1 and gcc-3.4.2 on
target i486-pc-linux-gnu.

(At least bugzilla says it's the same bug, so I'm posting
this here, too.)

However, I was just able to check with a gcc-3.5 snapshot;
for this testcase, the bug seems to be resolved. Details:

frank:gccbug> gfortran -O -msse2 -mfpmath=sse -g -Wall gccbug.f -v
Driving: gfortran -O -msse2 -mfpmath=sse -g -Wall gccbug.f -v -lgfortranbegin
-lgfortran -lm -shared-libgcc
Reading specs from /usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f95,objc
--prefix=/usr/lib/gcc-snapshot --enable-shared --with-system-zlib --enable-nls
--enable-threads=posix --without-included-gettext --disable-werror
--enable-__cxa_atexit --enable-libstdcxx-allocator=mt --enable-clocale=gnu
--enable-libstdcxx-debug --enable-java-gc=boehm --enable-java-awt=gtk
i486-linux-gnu
Thread model: posix
gcc version 3.5.0 20040717 (experimental)
 /usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/3.5.0/f951 gccbug.f
-ffixed-form -quiet -dumpbase gccbug.f -msse2 -mfpmath=sse -mtune=i486 -auxbase
gccbug -g -O -Wall -version -o /tmp/cc16CTBA.s
GNU F95 version 3.5.0 20040717 (experimental) (i486-linux-gnu)
	compiled by GNU C version 3.5.0 20040717 (experimental).
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
 In file gccbug.f:25

      return
	   1
Warning: Extension: RETURN statement in main program at (1)
gccbug.f: In function `choleskyzhp':
gccbug.f:53: warning: 'y' may be used uninitialized in this function
 as -V -Qy -o /tmp/cciP9Rce.o /tmp/cc16CTBA.s
GNU assembler version 2.15 (i386-linux) using BFD version 2.15
 /usr/lib/gcc-snapshot/libexec/gcc/i486-linux-gnu/3.5.0/collect2 --eh-frame-hdr
-m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o
/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/crtbegin.o
-L/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0
-L/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/../../.. /tmp/cciP9Rce.o
-lgfortranbegin -lgfortran -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc
/usr/lib/gcc-snapshot/lib/gcc/i486-linux-gnu/3.5.0/crtend.o /usr/lib/crtn.o

With gcc-3.5, my program runs correctly.
Comment 16 Wolfgang Bangerth 2004-12-13 20:18:52 UTC
Here is a small testcase that segfaults in a movapd instruction 
when compiled with the 3.4 branch: 
---------------------------- 
      subroutine choleskyzhp (a) 
      integer    i,j,k 
      complex*16 a(5,5),x 
 
      do j = 1,5 
         do i = j,5 
            do k = 1,j-1 
               x = a(i,k)*dconjg(a(j,k)) 
            enddo 
            a(i,j) = x 
         enddo 
      enddo 
 
      return 
      end 
 
 
      program test 
      complex*16 work(5,5,2) 
      call choleskyzhp(work(1,1,2)) 
      return 
      end 
-------------------------- 
 
g/x> /home/bangerth/bin/gcc-3.4.*-pre/bin/g77 -O -msse2 -g x.f ; ./a.out  
 
g/x> /home/bangerth/bin/gcc-3.4.*-pre/bin/g77 -O -msse2 -mfpmath=sse -g 
x.f ; ./a.out  
Segmentation fault 
 
(gdb) r 
`/home/bangerth/tmp/g/x/a.out' has changed; re-reading symbols. 
Starting program: /home/bangerth/tmp/g/x/a.out  
 
Program received signal SIGSEGV, Segmentation fault. 
0x08048636 in choleskyzhp_ (a=0xbfffe920) at x.f:12 
(gdb) disass 
Dump of assembler code for function choleskyzhp_: 
[...] 
0x08048636 <choleskyzhp_+34>:	movapd %xmm0,0xffffffc8(%ebp) 
 
 
Given that g77 doesn't exist any more on mainline, I don't know how to 
reproduce this bug there. Maybe some of the other dups listed in PR 17990 
can help. I'll take a look. 
 
W. 
Comment 17 Wolfgang Bangerth 2004-12-13 20:20:56 UTC
I should probably say that my installation is a little dusty already: 
  gcc version 3.4.3 20041015 (prerelease) 
Maybe someone with a newer version can try to reproduce this. 
W. 
Comment 18 Wolfgang Bangerth 2004-12-13 21:10:45 UTC
In case someone wondered about the mapping of the rank-3 array to the rank-2 
array in my previous testcase: here is something even simpler: 
========================== 
      subroutine choleskyzhp ()  
      integer    i,j,k  
      complex*16 a(500),x  
  
      do i = 1,5  
         do j = 1,5  
            do k = 1,j 
               x = a(i*5+j)*dconjg(a(j*5+j)) 
            enddo  
            a(i*5+k) = x  
         enddo  
      enddo  
  
      return  
      end  
  
  
      program test  
      call choleskyzhp 
      return  
      end  
============================ 
It fails in the same way as my previous testcase, i.e. in a movapd 
instruction. 
 
W. 
Comment 19 Richard Henderson 2004-12-14 00:37:11 UTC
According to comment #12, this was a cygwin bug wrt the initial alignment
of the main and thread stacks.  Even the test program in comment #10, which
is supposed to be reproducible on linux, is not reproducible here with
Fedora Core 3.  I can only assume that SuSE 8.2 ships a broken thread library.

Marking PR17930 as a duplicate was a mistake.  While the symptom is the same,
this one has not been shown to be a gcc bug at all.

If you find a new test case that you think shows this problem again, first
thing to do is validate that the stack is correctly aligned by the system.
Note that the return address for the function should be at address 12 modulo 16.
If that's true, then file another bug; this one's a bit disjointed already.
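
Editorial note: the "12 modulo 16" check can be applied to the gdb session from comment #10. The sketch below assumes the standard prologue shown in that disassembly (push %ebp; mov %esp,%ebp), under which the return address is stored at ebp + 4. The function names are hypothetical.

```c
#include <stdint.h>

/* After "push %ebp; mov %esp,%ebp" the return address sits at ebp + 4. */
uint32_t return_address_slot(uint32_t ebp)
{
    return (uint32_t)(ebp + 4u);
}

/* The stack was aligned by the caller if the return address
 * is stored at an address that is 12 modulo 16. */
int stack_aligned_at_entry(uint32_t ebp)
{
    return (return_address_slot(ebp) & 0xFu) == 12u;
}
```

For ebp = 0xbf7ffa44 the return address is at 0xbf7ffa48, which is 8 modulo 16 rather than 12, so in that session the thread stack was already misaligned when create_geo_to_topo was entered.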
Comment 20 Wolfgang Bangerth 2004-12-14 03:56:33 UTC
My testcase was actually for x86 linux, but since you fixed it in 
that other PR today, I assume that the same bug triggered both PRs, 
so closing this one should be ok. 
 
W.