The compiler generates unaligned access for Polyhedron channel.f90 test when compiled with -O2 -mtune=k8: /home/uros/gcc-build/gcc/gfortran -B/home/uros/gcc-build/gcc -B/home/uros/gcc-build/x86_64-unknown-linux-gnu/libgfortran/ -B/home/uros/gcc-build/x86_64-unknown-linux-gnu/libquadmath/.libs -L/home/uros/gcc-build/x86_64-unknown-linux-gnu/libgfortran/.libs -L-L/home/uros/gcc-build/x86_64-unknown-linux-gnu/libquadmath/.libs -O2 -ffast-math -mtune=k8 channel.f90 LD_LIBRARY_PATH=/home/uros/gcc-build/gcc:/home/uros/gcc-build/x86_64-unknown-linux-gnu/libgfortran/.libs:/home/uros/gcc-build/x86_64-unknown-linux-gnu/libquadmath/.libs ./a.out ... Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: #0 0x2AC9058B3C27 #1 0x2AC9058B2E20 #2 0x31CDC3002F #3 0x402AFB in ddx at channel.f90:247 Segmentation fault (gdb) r ... Program received signal SIGSEGV, Segmentation fault. 0x0000000000402afb in ddx (array=..., __result=...) at channel.f90:247 247 ddx(2:I-1,1:J) = array(3:I,1:J)-array(1:I-2,1:J) ! interior points (gdb) bt #0 0x0000000000402afb in ddx (array=..., __result=...) at channel.f90:247 #1 sw () at channel.f90:148 #2 0x0000000000404cfd in main (argc=<optimized out>, argv=<optimized out>) at channel.f90:234 #3 0x00000031cdc1d9f4 in __libc_start_main () from /lib64/libc.so.6 #4 0x0000000000400909 in _start () (gdb) disass ... 0x0000000000402acc <+7644>: movaps %xmm4,-0x20(%rdx) 0x0000000000402ad0 <+7648>: movlpd -0x10(%rax),%xmm0 0x0000000000402ad5 <+7653>: movhpd -0x8(%rax),%xmm0 0x0000000000402ada <+7658>: movapd %xmm0,%xmm4 0x0000000000402ade <+7662>: subpd %xmm3,%xmm4 0x0000000000402ae2 <+7666>: movaps %xmm4,-0x10(%rdx) 0x0000000000402ae6 <+7670>: cmp $0xf4,%rdi 0x0000000000402aed <+7677>: jne 0x402a71 <sw+7553> 0x0000000000402aef <+7679>: lea 0x20(%rcx),%rdi 0x0000000000402af3 <+7683>: add $0x20,%rax 0x0000000000402af7 <+7687>: add $0x20,%rdx => 0x0000000000402afb <+7691>: movapd -0x20(%rax),%xmm3 0x0000000000402b00 <+7696>: mov $0xf6,%r11d 0x0000000000402b06 <+7702>: xor %ecx,%ecx 0x0000000000402b08 <+7704>: movapd %xmm3,%xmm4 0x0000000000402b0c <+7708>: subpd %xmm0,%xmm4 ... (gdb) p/x $rax $3 = 0x8c5a38 The failure can also be seen on SUSE gcc tester [1]. [1] http://gcc.opensuse.org/c++bench/polyhedron/polyhedron-performance-latest
Created attachment 35042 [details] Testcase from Polyhedron testsuite
The problematic instruction (insn 1717) is generated from: ;; vect__1095.501_3524 = MEM[base: vectp.499_3571, offset: 0B]; (insn 1717 1716 0 (set (reg:V2DF 1511 [ vect__1095.501 ]) (mem:V2DF (reg/f:DI 638 [ vectp.499 ]) [7 MEM[base: vectp.499_3571, offset: 0B]+0 S16 A256])) channel.f90:247 -1 (nil))
Confirmed between r220156 (2015-01-27, OK) and r220302 (2015-01-31, segfault). I am not sure this is a fortran problem (no segfault if the code is compiled with '-O3 -fno-tree-vectorize -mtune=k8').
(In reply to Uroš Bizjak from comment #0) > The compiler generates unaligned access for Polyhedron channel.f90 test when > compiled with -O2 -mtune=k8: Whoops, this should read "-O3 -mtune=k8". > > /home/uros/gcc-build/gcc/gfortran -B/home/uros/gcc-build/gcc > -B/home/uros/gcc-build/x86_64-unknown-linux-gnu/libgfortran/ > -B/home/uros/gcc-build/x86_64-unknown-linux-gnu/libquadmath/.libs > -L/home/uros/gcc-build/x86_64-unknown-linux-gnu/libgfortran/.libs > -L-L/home/uros/gcc-build/x86_64-unknown-linux-gnu/libquadmath/.libs -O2 > -ffast-math -mtune=k8 channel.f90 And here. Correct flags are "-O3 -mtune=k8". Everything reported is compiled with these two flags only.
Confirmed: see comment 3.
Most likely r220244, but have to verify that.
Also crashes with -mtune=amdfam10. But in this case it even crashes when compiled with 4.9.2.
Works fine with -fwrapv...
> Also crashes with -mtune=amdfam10. But in this case it even > crashes when compiled with 4.9.2. Revision r204000 (2013-10-24) is OK, r204945 (2013-11-18) is not. > Works fine with -fwrapv... Confirmed.
(In reply to Dominique d'Humieres from comment #9) > > Also crashes with -mtune=amdfam10. But in this case it even > > crashes when compiled with 4.9.2. > > Revision r204000 (2013-10-24) is OK, r204945 (2013-11-18) is not. > > > Works fine with -fwrapv... > > Confirmed. r204257 in particular.
(In reply to Dominique d'Humieres from comment #9) > > Also crashes with -mtune=amdfam10. But in this case it even > > crashes when compiled with 4.9.2. > > Revision r204000 (2013-10-24) is OK, r204945 (2013-11-18) is not. Probably the cause of failure for "channel" testcase at SUSE's amdfam10 tester [1], but nobody noticed. [1] http://gcc.opensuse.org/c++bench-frescobaldi/polyhedron/polyhedron-summary.txt-2-0.html
So, I believe it is incorrect ALIGN info as can be seen in the -fdump-tree-all-alias dumps. Seems with current trunk on x86_64-linux and -g -quiet -mtune=amdfam10 -O3 pr65450.f90 the problematic memory load which should have been movupd and is movapd instead is using _3571: # ALIGN = 32, MISALIGN = 0 vectp.499_3571 = vectp.499_1481 + 64; which has been created by aprefetch pass from: # ALIGN = 32, MISALIGN = 0 vectp.499_135 = vectp.499_1480 + 16; which has been created by pcom pass from: # ALIGN = 32, MISALIGN = 0 vectp.499_1480 = vectp.499_1481 + 16; which has been created during vectorization by vect_create_data_ref_ptr -> create_iv -> make_ssa_name for dr: #(Data Ref: # bb: 43 # stmt: _1095 = MEM[(real(kind=8)[0:D.3649] *)_558][_1094]; # ref: MEM[(real(kind=8)[0:D.3649] *)_558][_1094]; # base_object: MEM[(real(kind=8)[0:D.3649] *)_558]; # Access function 0: {_1086 + 3, +, 1}_41 #) - _1480 is indx_after_incr. The base_object is indeed 32 byte aligned: # RANGE [-64424509440000, 60000] NONZERO 18446744073709551600 _557 = _556 * 30000; # PT = { D.3692 } (nonlocal) # ALIGN = 32, MISALIGN = 0 _558 = &u[_557]; - u is a common aligned to 32 bytes: static real(kind=8) u[90000]; and 30000 is divisible by 16, and it is ARRAY_REF, so the offset from &u[0] is a multiple of 128 bytes. But that doesn't tell anything about what values _1094 can have. I see vect_create_addr_base_for_vector_ref already always overrides the align/misalign info after duplicating DR_PTR_INFO, and so does bump_vector_ptr, but vect_create_data_ref_ptr trusts DR_PTR_INFO. But from what I can understand, it is just the points to info, and alignment info in there is solely for the base address, but not necessarily for the whole DR. Richard, do you agree? Now the question is what we can do here, if in all the spots in vect_create_data_ref_ptr we should just set it to unknown alignment, or if we can do better (and how).
Created attachment 35047 [details] gcc5-pr65450.patch Fix that passed bootstrap/regtest on x86_64-linux and i686-linux. It is I think conservatively correct, on the other side we don't have any ccp pass after vectorization that could recompute that info, only vrp2 which doesn't do zero bits computation, so perhaps we should try harder.
Created attachment 35053 [details] gcc5-pr65450.patch Updated (but this time untested except on the testcase) patch, that tries to preserve the alignment info in DR_PTR_INFO from DR_MISALIGNMENT. Will see if it is possible to reduce the testcase somewhat.
Created attachment 35055 [details] gcc5-pr65450.patch And now with the reduced testcase, going to bootstrap/regtest it now.
I think DR_PTR_INFO is indeed for the base object and was supposed to preserve points-to-info only. Patch looks ok.
Author: jakub Date: Wed Mar 18 13:54:12 2015 New Revision: 221490 URL: https://gcc.gnu.org/viewcvs?rev=221490&root=gcc&view=rev Log: PR tree-optimization/65450 * tree-vect-data-refs.c (vect_duplicate_ssa_name_ptr_info): New function. (vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr): Use it instead of duplicate_ssa_name_ptr_info. * gfortran.dg/pr65450.f90: New test. Added: trunk/gcc/testsuite/gfortran.dg/pr65450.f90 Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-vect-data-refs.c
Fixed on the trunk so far.
Author: jakub Date: Wed Jun 3 15:27:43 2015 New Revision: 224087 URL: https://gcc.gnu.org/viewcvs?rev=224087&root=gcc&view=rev Log: Backported from mainline 2015-03-18 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/65450 * tree-vect-data-refs.c (vect_duplicate_ssa_name_ptr_info): New function. (vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr): Use it instead of duplicate_ssa_name_ptr_info. * gfortran.dg/pr65450.f90: New test. Added: branches/gcc-4_9-branch/gcc/testsuite/gfortran.dg/pr65450.f90 Modified: branches/gcc-4_9-branch/gcc/ChangeLog branches/gcc-4_9-branch/gcc/testsuite/ChangeLog branches/gcc-4_9-branch/gcc/tree-vect-data-refs.c
Fixed for 4.9+.