Bug 52736

Summary: [4.5/4.6/4.7/4.8 Regression] miscompilation: store to aliased __m128d is 8 Bytes off
Product: gcc Reporter: Matthias Kretz <kretz>
Component: target Assignee: Jakub Jelinek <jakub>
Status: RESOLVED FIXED    
Severity: normal CC: jakub
Priority: P3    
Version: 4.7.0   
Target Milestone: 4.5.4   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2012-03-27 00:00:00
Attachments: testcase
-S output
gcc48-pr52736.patch

Description Matthias Kretz 2012-03-27 06:40:22 UTC
Created attachment 27007 [details]
testcase

The attached testcase fails when compiled with:
g++ -msse2 -O1 main.cpp

The first aliased store on line 35 stores to the second element in the vector instead of the first. The store on line 37 stores to the exact same location in memory (which is the correct location for line 37).

Slight modification of the example can lead to the compiler optimizing the code to use half vector loads, in which case it uses the correct one.
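
The attachment is not reproduced in this report. As a rough illustration only, the kind of aliased element store described above might look like the sketch below. It is not the attached testcase: the names v2df, aliased_double, store_element and load_element are hypothetical, and it uses the GCC vector extension in place of __m128d, on the assumption that both have the same two-double, 16-byte layout.

```cpp
#include <cassert>
#include <cstring>

// GCC/Clang vector extension: two doubles in 16 bytes (assumed to match __m128d).
typedef double v2df __attribute__((vector_size(16)));
// may_alias permits this double type to access storage declared as another type.
typedef double aliased_double __attribute__((may_alias));

// Hypothetical helper: overwrite one element through an aliased pointer.
inline void store_element(v2df &v, int index, double value) {
    assert(index == 0 || index == 1);
    aliased_double *p = reinterpret_cast<aliased_double *>(&v);
    p[index] = value;  // index 0 must land at byte offset 0, index 1 at offset 8
}

// Hypothetical helper: read one element back by copying the raw bytes.
inline double load_element(const v2df &v, int index) {
    assert(index == 0 || index == 1);
    double out[2];
    std::memcpy(out, &v, sizeof out);
    return out[index];
}
```

With a correct compiler, storing to index 0 and then to index 1 should leave both elements holding the values written. The symptom reported here corresponds to the index 0 store ending up in element 1.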
Comment 1 Richard Biener 2012-03-27 07:42:36 UTC
I cannot reproduce this on trunk.  What is your target?  Please provide the
output of the compiler command when you append '-v'.
Comment 2 Matthias Kretz 2012-03-27 08:00:47 UTC
Using built-in specs.
COLLECT_GCC=/opt/gcc-4.7.0/bin/g++
COLLECT_LTO_WRAPPER=/opt/gcc-4.7.0/libexec/gcc/x86_64-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ./configure --prefix=/opt/gcc-4.7.0 --build=x86_64-linux-gnu --host=x86_64-linux-gnu --enable-languages=c,c++,fortran --with-gmp=/opt/gcc-4.7.0 --with-mpfr=/opt/gcc-4.7.0 --with-ppl=/opt/gcc-4.7.0 --with-cloog=/opt/gcc-4.7.0 --with-libelf=/opt/gcc-4.7.0 --with-mpc=/opt/gcc-4.7.0 --enable-lto
Thread model: posix
gcc version 4.7.0 (GCC) 
COLLECT_GCC_OPTIONS='-v' '-Wall' '-Wextra' '-msse2' '-O1' '-o' 'gcc-bug' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 /opt/gcc-4.7.0/libexec/gcc/x86_64-linux-gnu/4.7.0/cc1plus -quiet -v -imultiarch x86_64-linux-gnu -D_GNU_SOURCE gcc-bug.cpp -quiet -dumpbase gcc-bug.cpp -msse2 -mtune=generic -march=x86-64 -auxbase gcc-bug -O1 -Wall -Wextra -version -o /tmp/ccfqD6uO.s
GNU C++ (GCC) version 4.7.0 (x86_64-linux-gnu)
        compiled by GNU C version 4.7.0, GMP version 5.0.4, MPFR version 3.1.0, MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../include/c++/4.7.0
 /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../include/c++/4.7.0/x86_64-linux-gnu
 /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../include/c++/4.7.0/backward
 /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/include
 /usr/local/include
 /opt/gcc-4.7.0/include
 /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C++ (GCC) version 4.7.0 (x86_64-linux-gnu)
        compiled by GNU C version 4.7.0, GMP version 5.0.4, MPFR version 3.1.0, MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 1ed577a2a8985e4282c7df02a5129e15
COLLECT_GCC_OPTIONS='-v' '-Wall' '-Wextra' '-msse2' '-O1' '-o' 'gcc-bug' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../x86_64-linux-gnu/bin/as --64 -o /tmp/ccJeVBlU.o /tmp/ccfqD6uO.s
COMPILER_PATH=/opt/gcc-4.7.0/libexec/gcc/x86_64-linux-gnu/4.7.0/:/opt/gcc-4.7.0/libexec/gcc/x86_64-linux-gnu/4.7.0/:/opt/gcc-4.7.0/libexec/gcc/x86_64-linux-gnu/:/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/:/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/:/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../x86_64-linux-gnu/bin/
LIBRARY_PATH=/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/:/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../x86_64-linux-gnu/lib/:/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-Wall' '-Wextra' '-msse2' '-O1' '-o' 'gcc-bug' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 /opt/gcc-4.7.0/libexec/gcc/x86_64-linux-gnu/4.7.0/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o gcc-bug /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/crtbegin.o -L/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0 -L/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../../../x86_64-linux-gnu/lib -L/opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/../../.. /tmp/ccJeVBlU.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /opt/gcc-4.7.0/lib/gcc/x86_64-linux-gnu/4.7.0/crtend.o /usr/lib/x86_64-linux-gnu/crtn.o


I will also add the output of -S. Line 16 is the broken one. Instead of "movsd %xmm0, 8(%rsp)" it should be "movsd %xmm0, (%rsp)".
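
To make the difference between the two addressing forms concrete, here is a small byte-level model, assuming the vector occupies a 16-byte stack slot with the first element at offset 0 and the second at offset 8. Slot, store_double_at and element are hypothetical names for this sketch. It only models where a scalar double store lands, not the actual code generation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the 16-byte stack slot that holds the vector.
struct Slot {
    alignas(16) unsigned char bytes[16];
};

// Models a scalar double store such as "movsd %xmm0, OFFSET(%rsp)".
inline void store_double_at(Slot &slot, std::size_t offset, double value) {
    assert(offset + sizeof value <= sizeof slot.bytes);
    std::memcpy(slot.bytes + offset, &value, sizeof value);
}

// Reads element `index` (0 or 1) of the vector stored in the slot.
inline double element(const Slot &slot, int index) {
    assert(index == 0 || index == 1);
    double d;
    std::memcpy(&d, slot.bytes + index * sizeof(double), sizeof d);
    return d;
}
```

In this model, a store at offset 0 changes only the first element, while a store at offset 8 changes only the second. That is the 8-byte discrepancy named in the summary.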
Comment 3 Matthias Kretz 2012-03-27 08:01:23 UTC
Created attachment 27010 [details]
-S output
Comment 4 Jakub Jelinek 2012-03-27 13:54:34 UTC
Looking at it.
Comment 5 Jakub Jelinek 2012-03-27 14:51:17 UTC
The bug seems to have been introduced in gcc 4.0.
Comment 6 Jakub Jelinek 2012-03-27 14:52:13 UTC
Created attachment 27015 [details]
gcc48-pr52736.patch

Untested fix.
Comment 7 Matthias Kretz 2012-03-27 16:58:42 UTC
With the patch my unit test passes again with 4.7.0. In my code I don't see any regressions from the patch.
Comment 8 Matthias Kretz 2012-03-27 17:02:59 UTC
I might have been too fast. On the Intel Sandy Bridge machine, where I first debugged the problem, things have improved. On an AMD Magny-Cours I get lots of failures. I need to investigate whether I broke something unrelated to this patch.
Comment 9 Matthias Kretz 2012-03-27 21:20:49 UTC
All good. The error is mine. I don't see any regressions from the use of the patch.
Comment 10 Jakub Jelinek 2012-03-28 08:01:17 UTC
Author: jakub
Date: Wed Mar 28 08:01:00 2012
New Revision: 185904

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185904
Log:
	PR target/52736
	* config/i386/sse.md (sse2_loadlpd splitter): Use offset 0
	instead of 8 in adjust_address.

	* gcc.target/i386/pr52736.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr52736.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
Comment 11 Jakub Jelinek 2012-03-28 08:03:41 UTC
Author: jakub
Date: Wed Mar 28 08:03:11 2012
New Revision: 185905

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185905
Log:
	PR target/52736
	* config/i386/sse.md (sse2_loadlpd splitter): Use offset 0
	instead of 8 in adjust_address.

	* gcc.target/i386/pr52736.c: New test.

Added:
    branches/gcc-4_7-branch/gcc/testsuite/gcc.target/i386/pr52736.c
Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/config/i386/sse.md
    branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
Comment 12 Jakub Jelinek 2012-03-28 08:10:25 UTC
Author: jakub
Date: Wed Mar 28 08:09:55 2012
New Revision: 185906

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185906
Log:
	PR target/52736
	* config/i386/sse.md (sse2_loadlpd splitter): Use offset 0
	instead of 8 in adjust_address.

	* gcc.target/i386/pr52736.c: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/i386/pr52736.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/i386/sse.md
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
Comment 13 Jakub Jelinek 2012-03-28 08:15:36 UTC
Fixed for 4.6+ (note, even in 4.6 the bug is just latent).
Comment 14 Matthias Kretz 2012-03-28 09:03:05 UTC
Thanks! Great response time!

And yes, I started to notice the bug on 4.6.x now, too. In different unit tests, though. I'll see if I can come up with a workaround for the "broken" compilers.