Bug 33299 - [4.3 Regression] miscompilation with gfortran -O2 -ffast-math -ftree-vectorize
Summary: [4.3 Regression] miscompilation with gfortran -O2 -ffast-math -ftree-vectorize
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.3.0
: P1 normal
Target Milestone: 4.3.0
Assignee: dorit
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2007-09-04 09:47 UTC by jpr
Modified: 2007-09-08 09:24 UTC (History)
2 users (show)

See Also:
Host: x86_64 linux
Target:
Build:
Known to work: 4.2.1
Known to fail:
Last reconfirmed: 2007-09-04 09:55:14


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description jpr 2007-09-04 09:47:36 UTC
The following gets miscompiled with gcc 4.3, works with 4.2.0 & 4.2.1 at least.

PROGRAM test
  REAL(8) :: f,dist(2)
  dist = [1.0_8, 0.5_8]
  PRINT*,f(1.0_8, dist), ' should be: ', MINVAL(dist)
END PROGRAM test

FUNCTION f( x, dist ) RESULT(s)
  REAL(8) :: dist(2), x, s
  s = MINVAL(dist)
  IF( x < 0 ) s = -s
END FUNCTION f



gfortran -v -O2 -ffast-math -ftree-vectorize -o test test.f90
Driving: gfortran -v -O2 -ffast-math -ftree-vectorize -o test test.f90 -lgfortranbegin -lgfortra n -lm -shared-libgcc
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: /projects/tob/gcc/configure --enable-languages=c,fortran --prefix=/projects/tob /gcc-trunk
Thread model: posix
gcc version 4.3.0 20070904 (experimental) [trunk revision 128067] (GCC)
COLLECT_GCC_OPTIONS='-v' '-O2' '-ffast-math' '-ftree-vectorize' '-o' 'test' '-shared-libgcc' '-m tune=generic'
 /wrk/jpr/gcc-trunk/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/f951 test.f90 -quiet -dump base test.f90 -mtune=generic -auxbase test -O2 -version -ffast-math -ftree-vectorize -fintrinsic -modules-path /wrk/jpr/gcc-trunk/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.3.0/finclude -o /tmp/ jpr/ccEvFIMX.s
GNU F95 (GCC) version 4.3.0 20070904 (experimental) [trunk revision 128067] (x86_64-unknown-linu x-gnu)
        compiled by GNU C version 4.3.0 20070904 (experimental) [trunk revision 128067], GMP ver sion 4.2.1, MPFR version 2.2.1.
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
COLLECT_GCC_OPTIONS='-v' '-O2' '-ffast-math' '-ftree-vectorize' '-o' 'test' '-shared-libgcc' '-m tune=generic'
 as -V -Qy -o /tmp/jpr/ccJEvzPa.o /tmp/jpr/ccEvFIMX.s
GNU assembler version 2.15.92.0.2 (x86_64-redhat-linux) using BFD version 2.15.92.0.2 20040927
COMPILER_PATH=/wrk/jpr/gcc-trunk/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/:/wrk/jpr/gcc -trunk/bin/../libexec/gcc/
LIBRARY_PATH=/wrk/jpr/gcc-trunk/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.3.0/:/wrk/jpr/gcc-trun k/bin/../lib/gcc/:/v/linux26_x86_64/opt/papi/3.5.0/lib64/../lib64/:/wrk/jpr/gcc-trunk/bin/../lib /gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/v/linu x26_x86_64/opt/papi/3.5.0/lib64/:/wrk/jpr/gcc-trunk/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.3. 0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-O2' '-ffast-math' '-ftree-vectorize' '-o' 'test' '-shared-libgcc' '-m tune=generic'
 /wrk/jpr/gcc-trunk/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/collect2 --eh-frame-hdr -m  elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o test /usr/lib/../lib64/crt1.o /usr/li b/../lib64/crti.o /wrk/jpr/gcc-trunk/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.3.0/crtbegin.o -L /wrk/jpr/gcc-trunk/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.3.0 -L/wrk/jpr/gcc-trunk/bin/../lib /gcc -L/v/linux26_x86_64/opt/papi/3.5.0/lib64/../lib64 -L/wrk/jpr/gcc-trunk/bin/../lib/gcc/x86_6 4-unknown-linux-gnu/4.3.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/v/linux26_x86 _64/opt/papi/3.5.0/lib64 -L/wrk/jpr/gcc-trunk/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../. ./.. /tmp/jpr/ccJEvzPa.o -lgfortranbegin -lgfortran -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /wrk/jpr /gcc-trunk/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.3.0/crtfastmath.o /wrk/jpr/gcc-trunk/bin/.. /lib/gcc/x86_64-unknown-linux-gnu/4.3.0/crtend.o /usr/lib/../lib64/crtn.o

./test
   1.00000000000000       should be:   0.500000000000000

Regrads, Juha
Comment 1 Richard Biener 2007-09-04 09:55:14 UTC
Confirmed.  It looks like the vectorizer forgets to update the PHI node for
stmp_var:

<bb 5>:
  # vect_var_.32_31 = PHI <vect_var_.32_29(3)>
  # s_4 = PHI <s_2(3)>
  # s_10 = PHI <s_2(3)>
  stmp_var_.34_32 = BIT_FIELD_REF <vect_var_.32_31, 64, 0>;
  stmp_var_.34_33 = BIT_FIELD_REF <vect_var_.32_31, 64, 64>;
  stmp_var_.34_34 = MIN_EXPR <stmp_var_.34_33, stmp_var_.34_32>;
  D.1396_15 = *x_14(D);
  if (D.1396_15 < 0.0)
    goto <bb 6>;
  else
    goto <bb 7>;

<bb 6>:
  s_16 = -stmp_var_.34_34;

<bb 7>: 
  # s_1 = PHI <s_4(5), s_16(6)>
  return s_1;

s_4 should be stmp_var_.34_34 instead.
Comment 2 dorit 2007-09-04 11:44:17 UTC
(In reply to comment #1)
> Confirmed.  It looks like the vectorizer forgets to update the PHI node for
> stmp_var:

yes. I suspect I didn't expect at the time that there would be two loop-closed-ssa-form phi-nodes at the loop exit for s_3, so I probably update just one of them (s_10) and not the other (s_4). This is how it looks before vectorization:

<bb 7>:
  # s_4 = PHI <s_3(3)>
  # s_10 = PHI <s_3(3)>
  D.1368_15 = *x_14(D);
  if (D.1368_15 < 0.0)
    goto <bb 8>;
  else
    goto <bb 9>;

<bb 8>:
  s_16 = -s_10;

<bb 9>:
  # s_1 = PHI <s_4(7), s_16(8)>
  return s_1;

I'll prepare a fix.
Comment 3 dorit 2007-09-04 19:11:22 UTC
I'm testing this patch:

Index: tree-vect-transform.c
===================================================================
*** tree-vect-transform.c	(revision 128037)
--- tree-vect-transform.c	(working copy)
*************** vect_create_epilog_for_reduction (tree v
*** 1964,1969 ****
--- 1964,1971 ----
    tree operation = GIMPLE_STMT_OPERAND (stmt, 1);
    bool nested_in_vect_loop = false;
    int op_type;
+   VEC(tree,heap) *phis = NULL;
+   int i;

    if (nested_in_vect_loop_p (loop, stmt))
      {
*************** vect_finalize_reduction:
*** 2260,2270 ****
        epilog_stmt = build_gimple_modify_stmt (new_dest, expr);
        new_temp = make_ssa_name (new_dest, epilog_stmt);
        GIMPLE_STMT_OPERAND (epilog_stmt, 0) = new_temp;
- #if 0
-       bsi_insert_after (&exit_bsi, epilog_stmt, BSI_NEW_STMT);
- #else
        bsi_insert_before (&exit_bsi, epilog_stmt, BSI_SAME_STMT);
- #endif
      }


--- 2262,2268 ----
*************** vect_finalize_reduction:
*** 2274,2318 ****
       Find the loop-closed-use at the loop exit of the original scalar result.
       (The reduction result is expected to have two immediate uses - one at the
       latch block, and one at the loop exit).  */
!   exit_phi = NULL;
    FOR_EACH_IMM_USE_FAST (use_p, imm_iter, scalar_dest)
      {
        if (!flow_bb_inside_loop_p (loop, bb_for_stmt (USE_STMT (use_p))))
  	{
  	  exit_phi = USE_STMT (use_p);
! 	  break;
  	}
      }
    /* We expect to have found an exit_phi because of loop-closed-ssa form.  */
!   gcc_assert (exit_phi);
  
!   if (nested_in_vect_loop)
      {
!       stmt_vec_info stmt_vinfo = vinfo_for_stmt (exit_phi);
  
!       /* FORNOW. Currently not supporting the case that an inner-loop reduction
! 	 is not used in the outer-loop (but only outside the outer-loop).  */
!       gcc_assert (STMT_VINFO_RELEVANT_P (stmt_vinfo) 
! 		  && !STMT_VINFO_LIVE_P (stmt_vinfo));
! 
!       epilog_stmt = adjustment_def ? epilog_stmt :  new_phi;
!       STMT_VINFO_VEC_STMT (stmt_vinfo) = epilog_stmt;
!       set_stmt_info (get_stmt_ann (epilog_stmt),
!                      new_stmt_vec_info (epilog_stmt, loop_vinfo));

!       if (vect_print_dump_info (REPORT_DETAILS))
!         {
!           fprintf (vect_dump, "vector of partial results after inner-loop:");
!           print_generic_expr (vect_dump, epilog_stmt, TDF_SLIM);
!         }
!       return;
      }
- 
-   /* Replace the uses:  */
-   orig_name = PHI_RESULT (exit_phi);
-   FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, orig_name)
-     FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
-       SET_USE (use_p, new_temp);
  } 

  
--- 2272,2313 ----
       Find the loop-closed-use at the loop exit of the original scalar result.
       (The reduction result is expected to have two immediate uses - one at the 
       latch block, and one at the loop exit).  */
!   phis = VEC_alloc (tree, heap, 10);
    FOR_EACH_IMM_USE_FAST (use_p, imm_iter, scalar_dest)
      {
        if (!flow_bb_inside_loop_p (loop, bb_for_stmt (USE_STMT (use_p))))
  	{
  	  exit_phi = USE_STMT (use_p);
! 	  VEC_quick_push (tree, phis, exit_phi);
  	}
      }
    /* We expect to have found an exit_phi because of loop-closed-ssa form.  */
!   gcc_assert (!VEC_empty (tree, phis));
  
!   for (i = 0; VEC_iterate (tree, phis, i, exit_phi); i++)
      {
!       if (nested_in_vect_loop)
! 	{
! 	  stmt_vec_info stmt_vinfo = vinfo_for_stmt (exit_phi);
  
! 	  /* FORNOW. Currently not supporting the case that an inner-loop reduction
! 	     is not used in the outer-loop (but only outside the outer-loop).  */
! 	  gcc_assert (STMT_VINFO_RELEVANT_P (stmt_vinfo) 
! 		      && !STMT_VINFO_LIVE_P (stmt_vinfo));
!
! 	  epilog_stmt = adjustment_def ? epilog_stmt :  new_phi;
! 	  STMT_VINFO_VEC_STMT (stmt_vinfo) = epilog_stmt;
! 	  set_stmt_info (get_stmt_ann (epilog_stmt),
! 	  new_stmt_vec_info (epilog_stmt, loop_vinfo));
! 	  continue;
! 	}
  
!       /* Replace the uses:  */
!       orig_name = PHI_RESULT (exit_phi);
!       FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, orig_name)
! 	FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
! 	  SET_USE (use_p, new_temp);
      }
  } 

Comment 4 dorit 2007-09-04 19:14:42 UTC
(by the way, fast-math should not be required here, but that's a different bug... will fix that soonish)
Comment 5 dorit 2007-09-07 15:00:23 UTC
Subject: Bug 33299

Author: dorit
Date: Fri Sep  7 15:00:11 2007
New Revision: 128242

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128242
Log:
        PR tree-optimization/33299
        * tree-vect-transform.c (vect_create_epilog_for_reduction): Update uses
        for all relevant loop-exit phis, not just the first.


Added:
    trunk/gcc/testsuite/gfortran.dg/vect/fast-math-pr33299.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gfortran.dg/vect/vect.exp
    trunk/gcc/tree-vect-transform.c

Comment 6 dorit 2007-09-08 09:24:11 UTC
fix committed