This Fortran program makes the compiler crash on both x86_64 and powerpc when a veclibabi is specified that supports optimizing a pow invocation on V2DF variables.

On x86_64 you need to use:
    -O3 -march=core2 -mavx -ffast-math -mveclibabi=svml

On powerpc you need to use:
    -O3 -mcpu=power7 -ffast-math -mveclibabi=mass

Here is the debugger trace:

Current directory is /home/meissner/fsf-build-x86_64/trunk/gcc/
GNU gdb (GDB) Fedora (7.1-34.fc13)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /data/meissner/fsf-build-x86_64/trunk/gcc/f951...done.
Breakpoint 1 at 0x61b210: file /home/meissner/fsf-src/trunk/gcc/diagnostic.c, line 878.
Breakpoint 2 at 0x61bfd0: file /home/meissner/fsf-src/trunk/gcc/diagnostic.c, line 819.
Breakpoint 3 at 0x492d50
Breakpoint 4 at 0x492500
(gdb) r -O3 -quiet -march=core2 -mavx -ffast-math -mveclibabi=svml foo3.f
Starting program: /data/meissner/fsf-build-x86_64/trunk/gcc/f951 -O3 -quiet -march=core2 -mavx -ffast-math -mveclibabi=svml foo3.f

Program received signal SIGSEGV, Segmentation fault.
0x0000000000975884 in nested_in_vect_loop_p (stmt=0x7ffff05d6f68, gsi=0x7fffffffc5b0, strided_store=<value optimized out>, slp_node=<value optimized out>, slp_node_instance=<value optimized out>)
    at /home/meissner/fsf-src/trunk/gcc/tree-vectorizer.h:316
Missing separate debuginfos, use: debuginfo-install glibc-2.12-3.x86_64 libgcc-4.4.4-10.fc13.x86_64 libstdc++-4.4.4-10.fc13.x86_64
(gdb) where
#0  0x0000000000975884 in nested_in_vect_loop_p (stmt=0x7ffff05d6f68, gsi=0x7fffffffc5b0, strided_store=<value optimized out>, slp_node=<value optimized out>, slp_node_instance=<value optimized out>)
    at /home/meissner/fsf-src/trunk/gcc/tree-vectorizer.h:316
#1  vect_transform_stmt (stmt=0x7ffff05d6f68, gsi=0x7fffffffc5b0, strided_store=<value optimized out>, slp_node=<value optimized out>, slp_node_instance=<value optimized out>)
    at /home/meissner/fsf-src/trunk/gcc/tree-vect-stmts.c:4607
#2  0x0000000000982a44 in vect_transform_loop (loop_vinfo=0x185e7f0)
    at /home/meissner/fsf-src/trunk/gcc/tree-vect-loop.c:4798
#3  0x000000000098e5a3 in vectorize_loops ()
    at /home/meissner/fsf-src/trunk/gcc/tree-vectorizer.c:225
#4  0x00000000007969ef in execute_one_pass (pass=0x12270c0)
    at /home/meissner/fsf-src/trunk/gcc/passes.c:1573
#5  0x0000000000796c95 in execute_pass_list (pass=0x12270c0)
    at /home/meissner/fsf-src/trunk/gcc/passes.c:1628
#6  0x0000000000796ca7 in execute_pass_list (pass=0x12272a0)
    at /home/meissner/fsf-src/trunk/gcc/passes.c:1629
#7  0x0000000000796ca7 in execute_pass_list (pass=0x1226460)
    at /home/meissner/fsf-src/trunk/gcc/passes.c:1629
#8  0x000000000088f716 in tree_rest_of_compilation (fndecl=0x7ffff0865900)
    at /home/meissner/fsf-src/trunk/gcc/tree-optimize.c:452
#9  0x0000000000a2d4d6 in cgraph_expand_function (node=0x7ffff065f408)
    at /home/meissner/fsf-src/trunk/gcc/cgraphunit.c:1469
#10 0x0000000000a2feda in cgraph_expand_all_functions ()
    at /home/meissner/fsf-src/trunk/gcc/cgraphunit.c:1548
#11 cgraph_optimize ()
    at /home/meissner/fsf-src/trunk/gcc/cgraphunit.c:1804
#12 0x0000000000a3043a in cgraph_finalize_compilation_unit ()
    at /home/meissner/fsf-src/trunk/gcc/cgraphunit.c:1012
#13 0x000000000075691d in write_global_declarations ()
    at /home/meissner/fsf-src/trunk/gcc/langhooks.c:310
#14 0x0000000000836f0c in compile_file (argc=8, argv=0x7fffffffc938)
    at /home/meissner/fsf-src/trunk/gcc/toplev.c:967
#15 do_compile (argc=8, argv=0x7fffffffc938)
    at /home/meissner/fsf-src/trunk/gcc/toplev.c:2394
#16 toplev_main (argc=8, argv=0x7fffffffc938)
    at /home/meissner/fsf-src/trunk/gcc/toplev.c:2435
#17 0x0000003db681ec5d in __libc_start_main () from /lib64/libc.so.6
#18 0x00000000004a4309 in _start ()
(gdb) quit
A debugging session is active.

	Inferior 1 [process 21141] will be killed.

Quit anyway? (y or n) y

Debugger finished

If I compile it without the -mveclibabi= switch, it works fine. On x86 it also works with -mveclibabi=acml, since acml does not provide a pow that supports V2DF arguments.
Created attachment 21826
Fortran program derived from SPEC 2006 that shows the bug.
Reduced testcase:

> cat bug.f90
      integer index(18),i,j,k,l,ipiv(18),info,ichange,neq,lda,ldb, &
              nrhs,iplas
      real*8 ep0(6),al10(18),al20(18),dg0(18),ep(6),al1(18), &
             al2(18),dg(18),ddg(18),xm(6,18),h(18,18),ck(18),cn(18), &
             c(18),d(18),phi(18),delta(18),r0(18),q(18),b(18),cphi(18), &
             q1(18),q2(18),stri(6),htri(18),sg(18),r(42),xmc(6,18),aux(18), &
             t(42),gl(18,18),gr(18,18),ee(6),c1111,c1122,c1212,dd, &
             skl(3,3),xmtran(3,3),ddsdde(6,6),xx(6,18)
      do
         do i=1,18
            htri(i)=dabs(sg(i))-r0(i)-ck(i)*(dg(i)/dtime)**(1.d0/cn(i))
            do j=1,18
            enddo
         enddo
         do
            if(i.ne.j) then
               gr(index(i),1)=htri(i)
            endif
            call dgesv(neq,nrhs,gl,lda,ipiv,gr,ldb,info)
         enddo
      enddo
      end

with a slightly different backtrace:

#0  0x00000000009aeb56 in vect_transform_stmt (stmt=0x7f588feacb28, gsi=0x7fffceb29b40, strided_store=0x7fffceb29b7f "", slp_node=0x0, slp_node_instance=<value optimized out>)
    at /data03/vondele/gcc_trunk/gcc/gcc/tree-vectorizer.h:315
#1  0x00000000009b5dd5 in vect_transform_loop (loop_vinfo=0x14570b0)
    at /data03/vondele/gcc_trunk/gcc/gcc/tree-vect-loop.c:4797
#2  0x00000000009c9275 in vectorize_loops ()
    at /data03/vondele/gcc_trunk/gcc/gcc/tree-vectorizer.c:225
#3  0x00000000007bb69f in execute_one_pass (pass=0x128d680)
    at /data03/vondele/gcc_trunk/gcc/gcc/passes.c:1573
#4  0x00000000007bb995 in execute_pass_list (pass=0x128d680)
    at /data03/vondele/gcc_trunk/gcc/gcc/passes.c:1628
#5  0x00000000007bb9ad in execute_pass_list (pass=0x128d4a0)
    at /data03/vondele/gcc_trunk/gcc/gcc/passes.c:1629
#6  0x00000000007bb9ad in execute_pass_list (pass=0x128cae0)
    at /data03/vondele/gcc_trunk/gcc/gcc/passes.c:1629
#7  0x00000000008bfd06 in tree_rest_of_compilation (fndecl=0x7f588ff95d00)
    at /data03/vondele/gcc_trunk/gcc/gcc/tree-optimize.c:452
#8  0x0000000000a6c0b9 in cgraph_expand_function (node=0x7f588fda4000)
    at /data03/vondele/gcc_trunk/gcc/gcc/cgraphunit.c:1469
#9  0x0000000000a6fe99 in cgraph_optimize ()
    at /data03/vondele/gcc_trunk/gcc/gcc/cgraphunit.c:1548
#10 0x0000000000a7027d in cgraph_finalize_compilation_unit ()
    at /data03/vondele/gcc_trunk/gcc/gcc/cgraphunit.c:1012
#11 0x000000000077b5af in write_global_declarations ()
    at /data03/vondele/gcc_trunk/gcc/gcc/langhooks.c:310
#12 0x00000000008631d3 in toplev_main (argc=22, argv=0x7fffceb29e98)
    at /data03/vondele/gcc_trunk/gcc/gcc/toplev.c:984
#13 0x00007f5890738436 in __libc_start_main () from /lib64/libc.so.6
gimple_bb (stmt) returns NULL for that statement (D.1575_33 = __builtin_pow (D.1542_14, D.1574_32)).

We can avoid vectorization in such cases, but it looks like this should be fixed so that the actual basic block is returned.

Ira
(In reply to comment #3)
> gimple_bb (stmt) returns NULL for that statement (D.1575_33 = __builtin_pow
> (D.1542_14, D.1574_32)).
>
> We can avoid vectorization in such cases, but looks like it should be fixed to
> return the actual basic block.

If it returns NULL then that stmt was not properly inserted into the insn stream (or removed).

> Ira
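To make the two comments above concrete, here is a minimal sketch in C of why a statement with no basic block crashes a "nested in loop" test, and what "avoid vectorization in such cases" could look like. All types and function names here (struct stmt, stmt_bb, nested_in_loop_checked, demo_nested) are hypothetical stand-ins, not GCC's real definitions, and the claim that frame #0 dereferences the block pointer is an inference from the trace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for GCC's internal types. */
struct loop { struct loop *inner; };
struct basic_block { struct loop *loop_father; };
struct stmt { struct basic_block *bb; };  /* bb is NULL when the statement is in no block */

/* Analogue of gimple_bb: the owning block, or NULL for a detached statement. */
static const struct basic_block *stmt_bb(const struct stmt *s)
{
    return s->bb;
}

/* Unchecked test: when outer has an inner loop, this dereferences the
   block pointer, so a detached statement would fault here. */
static int nested_in_loop(const struct loop *outer, const struct stmt *s)
{
    return outer->inner != NULL && outer->inner == stmt_bb(s)->loop_father;
}

/* Guarded variant: -1 means "statement is detached, do not vectorize". */
static int nested_in_loop_checked(const struct loop *outer, const struct stmt *s)
{
    assert(outer != NULL && s != NULL);
    if (stmt_bb(s) == NULL)
        return -1;
    return nested_in_loop(outer, s);
}

/* Small driver: guarded result for an attached or a detached statement. */
int demo_nested(int attached)
{
    struct loop inner = { NULL };
    struct loop outer = { &inner };
    struct basic_block bb = { &inner };
    struct stmt s;
    s.bb = attached ? &bb : NULL;
    return nested_in_loop_checked(&outer, &s);
}
```

Calling demo_nested(1) exercises the normal path and demo_nested(0) the detached one. A guard like this only hides the symptom; as noted in the reply, a NULL block means the statement pointer itself is stale.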
Right. This patch fixes it:

Index: tree-vect-stmts.c
===================================================================
--- tree-vect-stmts.c   (revision 164332)
+++ tree-vect-stmts.c   (working copy)
@@ -4478,6 +4478,7 @@ vect_transform_stmt (gimple stmt, gimple
     case call_vec_info_type:
       gcc_assert (!slp_node);
       done = vectorizable_call (stmt, gsi, &vec_stmt);
+      stmt = gsi_stmt (*gsi);
       break;

     case reduc_vec_info_type:

I am going to test it now.

Thanks,
Ira
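The idea behind the one-line patch can be sketched in C with a toy model: once a transform replaces the statement under an iterator, a pointer saved earlier refers to a detached statement, and re-reading it through the iterator yields the live one. Everything below (struct iter, iter_stmt, iter_replace, refetch_demo) is a hypothetical simplification for illustration and does not reflect GCC's actual statement-iterator implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: a block holding a single statement. */
struct block;
struct stmt  { struct block *bb; int id; };  /* bb == NULL means detached */
struct block { struct stmt *slot; };
struct iter  { struct block *bb; };          /* stand-in for a statement iterator */

/* Analogue of gsi_stmt: the statement currently under the iterator. */
static struct stmt *iter_stmt(const struct iter *it)
{
    return it->bb->slot;
}

/* Replace the statement under the iterator; the old one becomes detached. */
static void iter_replace(struct iter *it, struct stmt *new_stmt)
{
    assert(it != NULL && it->bb != NULL && new_stmt != NULL);
    if (it->bb->slot != NULL)
        it->bb->slot->bb = NULL;
    new_stmt->bb = it->bb;
    it->bb->slot = new_stmt;
}

/* Returns 1 if the saved pointer went stale and the re-fetched one is live. */
int refetch_demo(void)
{
    struct block bb = { NULL };
    struct stmt call = { NULL, 1 };
    struct stmt dummy = { NULL, 2 };
    struct iter it = { &bb };
    call.bb = &bb;
    bb.slot = &call;

    struct stmt *stmt = iter_stmt(&it);   /* pointer saved before the transform */
    iter_replace(&it, &dummy);            /* the transform swaps in a new statement */
    int stale_detached = (stmt->bb == NULL);

    stmt = iter_stmt(&it);                /* mirrors: stmt = gsi_stmt (*gsi); */
    return stale_detached && stmt->bb == &bb && stmt->id == 2;
}
```

In this model the re-fetch is what keeps later code from touching a statement whose block pointer is NULL, which is the situation described in the earlier comments.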
Subject: Bug 45714

Author: irar
Date: Sun Sep 19 14:23:40 2010
New Revision: 164420

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164420
Log:
	PR tree-optimization/45714
	* tree-vect-stmts.c (vect_transform_stmt): Use a dummy statement
	created in vectorizable_call instead of the original statement in
	def stmt updates.

Added:
    trunk/gcc/testsuite/gfortran.dg/vect/pr45714-a.f
    trunk/gcc/testsuite/gfortran.dg/vect/pr45714-b.f
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-stmts.c
Fixed.
Author: wschmidt
Date: Wed Mar 16 18:00:23 2011
New Revision: 171057

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=171057
Log:
gcc:

	Backport from mainline:
	2010-09-19  Ira Rosen  <irar@il.ibm.com>

	PR tree-optimization/45714
	* tree-vect-stmts.c (vect_transform_stmt): Use a dummy statement
	created in vectorizable_call instead of the original statement in
	def stmt updates.

	Backport from mainline:
	2011-03-16  Alan Modra  <amodra@gmail.com>

	PR target/45844
	* config/rs6000/rs6000.c (rs6000_legitimize_reload_address): Don't
	create invalid offset address for vsx splat insn.
	* config/rs6000/predicates.md (splat_input_operand): New.
	* config/rs6000/vsx.md (vsx_splat_*): Use it.

gcc/testsuite:

	Backport from mainline:
	2010-09-19  Ira Rosen  <irar@il.ibm.com>

	PR tree-optimization/45714
	* gfortran.dg/vect/pr45714-a.f: New test.
	* gfortran.dg/vect/pr45714-b.f: New test.

Added:
    branches/ibm/gcc-4_5-branch/gcc/testsuite/gfortran.dg/vect/pr45714-a.f
    branches/ibm/gcc-4_5-branch/gcc/testsuite/gfortran.dg/vect/pr45714-b.f
Modified:
    branches/ibm/gcc-4_5-branch/gcc/ChangeLog.ibm
    branches/ibm/gcc-4_5-branch/gcc/config/rs6000/predicates.md
    branches/ibm/gcc-4_5-branch/gcc/config/rs6000/rs6000.c
    branches/ibm/gcc-4_5-branch/gcc/config/rs6000/vsx.md
    branches/ibm/gcc-4_5-branch/gcc/testsuite/ChangeLog.ibm
    branches/ibm/gcc-4_5-branch/gcc/tree-vect-stmts.c