Bug 51058

Summary: [4.7 Regression] ICE: gimple check: expected gimple_assign(error_mark), have gimple_call() in gimple_assign_rhs_code, at gimple.h:1992
Product: gcc Reporter: Dominique d'Humieres <dominiq>
Component: tree-optimizationAssignee: irar
Status: RESOLVED FIXED    
Severity: normal CC: irar, jakub, rguenther
Priority: P3    
Version: 4.7.0   
Target Milestone: 4.7.0   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2011-11-09 00:00:00
Attachments: gcc47-pr51058.patch

Description Dominique d'Humieres 2011-11-09 13:26:15 UTC
On x86_64-apple-darwin10 at revision 181200, compiling the polyhedron test mdbx.f90 with '-O2 -ftree-vectorize' gives an ICE:

[macbook] lin/test% gfc -c mdbx.f90 -O2 -ftree-vectorize
mdbx.f90: In function 'mlist':
mdbx.f90:1199:0: internal compiler error: gimple check: expected gimple_assign(error_mark), have gimple_call() in gimple_assign_rhs_code, at gimple.h:1992

Revision 181085 is OK. The following reduced test reproduces the ICE:

      SUBROUTINE MLIST(MOLsp,PBCx,PBCy,PBCz, X0)
      IMPLICIT NONE
      INTEGER, PARAMETER :: NM=16384
      INTEGER :: MOLsp, i
      DOUBLE PRECISION :: PBCx, PBCy, PBCz, boxjmp, HALf=1./2.
      DOUBLE PRECISION :: X0(3,-2:NM)

         DO i = 1 , MOLsp
            boxjmp = PBCx*INT(X0(1,i)+SIGN(HALf,X0(1,i)))
            X0(1,i) = X0(1,i) - boxjmp
            boxjmp = PBCy*INT(X0(2,i)+SIGN(HALf,X0(2,i)))
            X0(2,i) = X0(2,i) - boxjmp
            boxjmp = PBCz*INT(X0(3,i)+SIGN(HALf,X0(3,i)))
            X0(3,i) = X0(3,i) - boxjmp
         ENDDO
      END
Comment 1 Ira Rosen 2011-11-09 17:06:38 UTC
I guess it'd better be

      SUBROUTINE MLIST(MOLsp,PBCx,PBCy,PBCz, X0)
      IMPLICIT NONE
      INTEGER, PARAMETER :: NM=16384
      INTEGER :: MOLsp, i
      REAL :: PBCx, PBCy, PBCz, boxjmp, HALf=1./2.
      REAL :: X0(2,-2:NM)

         DO i = 1 , MOLsp
            boxjmp = PBCx*INT(X0(1,i)+SIGN(HALf,X0(1,i)))
            X0(1,i) = X0(1,i) - boxjmp
            boxjmp = PBCy*INT(X0(2,i)+SIGN(HALf,X0(2,i)))
            X0(2,i) = X0(2,i) - boxjmp
         ENDDO
      END

otherwise it's an interleaving group of 3, which is only supported on NEON, AFAIK. I also changed the type to REAL to make it work (fail) on power.


This patch fixes the problem:
Index: tree-vect-slp.c
===================================================================
--- tree-vect-slp.c     (revision 181190)
+++ tree-vect-slp.c     (working copy)
@@ -2191,10 +2191,13 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
   VEC (tree, heap) *voprnds = VEC_alloc (tree, heap, number_of_vectors);
   bool constant_p, is_store;
   tree neutral_op = NULL;
-  enum tree_code code = gimple_assign_rhs_code (stmt);
+  enum tree_code code = ERROR_MARK;
   gimple def_stmt;
   struct loop *loop;

+  if (is_gimple_assign (stmt))
+    code = gimple_assign_rhs_code (stmt);
+
   if (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def
       && reduc_index != -1)
     {
@@ -2287,9 +2290,7 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
         {
           if (is_store)
             op = gimple_assign_rhs1 (stmt);
-          else if (gimple_assign_rhs_code (stmt) != COND_EXPR)
-            op = gimple_op (stmt, op_num + 1);
-         else
+          else if (code == COND_EXPR)
            {
              if (op_num == 0 || op_num == 1)
                {
@@ -2304,6 +2305,10 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
                    op = gimple_assign_rhs3 (stmt);
                }
            }
+         else if (is_gimple_call (stmt))
+           op = gimple_op (stmt, op_num + 3);
+         else
+            op = gimple_op (stmt, op_num + 1);

           if (reduc_index != -1)
             {

I'll test it tomorrow.


(My GCC address is irar@gcc.gnu.org, and not ira@gcc.gnu.org).
Comment 2 Jakub Jelinek 2011-11-09 17:15:17 UTC
(In reply to comment #1)
> I guess it'd better be
> 
>       SUBROUTINE MLIST(MOLsp,PBCx,PBCy,PBCz, X0)
>       IMPLICIT NONE
>       INTEGER, PARAMETER :: NM=16384
>       INTEGER :: MOLsp, i
>       REAL :: PBCx, PBCy, PBCz, boxjmp, HALf=1./2.
>       REAL :: X0(2,-2:NM)
> 
>          DO i = 1 , MOLsp
>             boxjmp = PBCx*INT(X0(1,i)+SIGN(HALf,X0(1,i)))
>             X0(1,i) = X0(1,i) - boxjmp
>             boxjmp = PBCy*INT(X0(2,i)+SIGN(HALf,X0(2,i)))
>             X0(2,i) = X0(2,i) - boxjmp
>          ENDDO
>       END
> 
> otherwise it's an interleaving group of 3, which is only supported on NEON,
> AFAIK. I also changed the type to REAL to make it work (fail) on power.
> 
> 
> This patch fixes the problem:
> Index: tree-vect-slp.c
> ===================================================================
> --- tree-vect-slp.c     (revision 181190)
> +++ tree-vect-slp.c     (working copy)
> @@ -2191,10 +2191,13 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
>    VEC (tree, heap) *voprnds = VEC_alloc (tree, heap, number_of_vectors);
>    bool constant_p, is_store;
>    tree neutral_op = NULL;
> -  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  enum tree_code code = ERROR_MARK;
>    gimple def_stmt;
>    struct loop *loop;
> 
> +  if (is_gimple_assign (stmt))
> +    code = gimple_assign_rhs_code (stmt);
> +

You could as well do
  else if (is_gimple_call (stmt))
    code = CALL_EXPR;
like elsewhere and in the loop just test code == CALL_EXPR.

> @@ -2304,6 +2305,10 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
>                     op = gimple_assign_rhs3 (stmt);
>                 }
>             }
> +         else if (is_gimple_call (stmt))
> +           op = gimple_op (stmt, op_num + 3);

Guess it would be nicer to use gimple_call_arg (stmt, op_num); here.
Comment 3 Jakub Jelinek 2011-11-09 17:17:54 UTC
Or use
  enum tree_code code = gimple_expr_code (stmt);
Comment 4 Ira Rosen 2011-11-09 17:40:26 UTC
Thanks.

Index: tree-vect-slp.c
===================================================================
--- tree-vect-slp.c     (revision 181190)
+++ tree-vect-slp.c     (working copy)
@@ -2191,7 +2191,7 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
   VEC (tree, heap) *voprnds = VEC_alloc (tree, heap, number_of_vectors);
   bool constant_p, is_store;
   tree neutral_op = NULL;
-  enum tree_code code = gimple_assign_rhs_code (stmt);
+  enum tree_code code = gimple_expr_code (stmt);
   gimple def_stmt;
   struct loop *loop;

@@ -2287,22 +2287,32 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
         {
           if (is_store)
             op = gimple_assign_rhs1 (stmt);
-          else if (gimple_assign_rhs_code (stmt) != COND_EXPR)
-            op = gimple_op (stmt, op_num + 1);
-         else
+          else
            {
-             if (op_num == 0 || op_num == 1)
+             switch (code)
                {
-                 tree cond = gimple_assign_rhs1 (stmt);
-                 op = TREE_OPERAND (cond, op_num);
+                 case COND_EXPR:
+                   if (op_num == 0 || op_num == 1)
+                     {
+                       tree cond = gimple_assign_rhs1 (stmt);
+                       op = TREE_OPERAND (cond, op_num);
+                     }
+                   else
+                     {
+                       if (op_num == 2)
+                         op = gimple_assign_rhs2 (stmt);
+                       else
+                         op = gimple_assign_rhs3 (stmt);
+                     }
+                   break;
+
+                 case CALL_EXPR:
+                   op = gimple_call_arg (stmt, op_num);
+                   break;
+
+                 default:
+                   op = gimple_op (stmt, op_num + 1);
                }
-             else
-               {
-                 if (op_num == 2)
-                   op = gimple_assign_rhs2 (stmt);
-                 else
-                   op = gimple_assign_rhs3 (stmt);
-               }
            }

           if (reduc_index != -1)
Comment 5 Dominique d'Humieres 2011-11-09 18:27:58 UTC
> I guess it'd better be
>
>       SUBROUTINE MLIST(MOLsp,PBCx,PBCy,PBCz, X0)
>       IMPLICIT NONE
>       INTEGER, PARAMETER :: NM=16384
>       INTEGER :: MOLsp, i
>       REAL :: PBCx, PBCy, PBCz, boxjmp, HALf=1./2.
>       REAL :: X0(2,-2:NM)
>
>          DO i = 1 , MOLsp
>             boxjmp = PBCx*INT(X0(1,i)+SIGN(HALf,X0(1,i)))
>             X0(1,i) = X0(1,i) - boxjmp
>             boxjmp = PBCy*INT(X0(2,i)+SIGN(HALf,X0(2,i)))
>             X0(2,i) = X0(2,i) - boxjmp
>          ENDDO
>       END
>
> otherwise it's an interleaving group of 3, which is only supported on NEON,
> AFAIK. I also changed the type to REAL to make it work (fail) on power.

Well, the original code in mdbx.f90 has the interleaving group of 3 (I only changed the "decoration" around the loop to keep a valid code). When trying to reduce the code removing one block made the ICE to disappear, but the above test fails on x86_64 and not on ppc.

> (My GCC address is irar@gcc.gnu.org, and not ira@gcc.gnu.org).

Sorry! I'll try to remember it next time;-)
Comment 6 rguenther@suse.de 2011-11-10 09:29:52 UTC
On Wed, 9 Nov 2011, jakub at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51058
> 
> --- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-11-09 17:17:54 UTC ---
> Or use
>   enum tree_code code = gimple_expr_code (stmt);

One of the functions that IMHO should be removed ;)  (like
gimple_expr_type)
Comment 7 irar 2011-11-10 10:14:28 UTC
Author: irar
Date: Thu Nov 10 10:14:24 2011
New Revision: 181251

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181251
Log:

        PR tree-optimization/51058
        * tree-vect-slp.c (vect_get_constant_vectors): Handle CALL_EXPR.


Added:
    trunk/gcc/testsuite/gfortran.dg/vect/pr51058.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-slp.c
Comment 8 Dominique d'Humieres 2011-11-10 14:22:48 UTC
Revision 181251 fixes the ICEs for the codes in comment #0 and #1 (thanks). However at revision 181255 compiling the original mdbx.f90, the extracted subroutine, or the reduced test

!*==MLIST.spg  processed by SPAG 6.55Dc at 09:26 on 23 Sep 2005
      SUBROUTINE MLIST(Lmethd)
      IMPLICIT DOUBLE PRECISION(A-H,O-Z)
      PARAMETER (NM=16384)
      COMMON /COUNT / NFI , LCOunt , LISter , KNTsta , KNTgor , LEP ,   &
     &                MANyon
      COMMON /LCS   / X0(3,-2:NM) , X(3,-2:NM,5) , XIN(3,-2:NM)
         DO i = 1 , MOLsp
            boxjmp = PBCx*INT(X0(1,i)+SIGN(HALf,X0(1,i)))
            X0(1,i) = X0(1,i) - boxjmp
            XIN(1,i) = XIN(1,i) - boxjmp
            boxjmp = PBCy*INT(X0(2,i)+SIGN(HALf,X0(2,i)))
            X0(2,i) = X0(2,i) - boxjmp
            XIN(2,i) = XIN(2,i) - boxjmp
            boxjmp = PBCz*INT(X0(3,i)+SIGN(HALf,X0(3,i)))
            X0(3,i) = X0(3,i) - boxjmp
            XIN(3,i) = XIN(3,i) - boxjmp
         ENDDO
      END

fails with

mlist_red_1.f90: In function 'mlist':
mlist_red_1.f90:2:0: internal compiler error: Segmentation fault

The segmentation fault goes away when compiling under gdb.
Comment 9 Jakub Jelinek 2011-11-10 18:40:05 UTC
The problem is that we have several SLP instances referring to the same call stmt, so vectorizable_call is called on the same stmt several times.
But vectorizable_call wants to replace the scalar call with a assignment and when it does the first time it is called, we get a segfault the second time because the original scalar call stmt has bb set to NULL by gsi_replace.

--- gcc/tree-vect-stmts.c.jj 12011-11-10 18:09:12.000000000 +0100
+++ gcc/tree-vect-stmts.c 2011-11-10 18:44:07.135949119 +0100
@@ -1886,6 +1886,9 @@ vectorizable_call (gimple stmt, gimple_s
      it defines is mapped to the new definition.  So just replace
      rhs of the statement with something harmless.  */
 
+  if (slp_node)
+    return true;
+
   type = TREE_TYPE (scalar_dest);
   if (is_pattern_stmt_p (stmt_info))
     lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info));
@@ -1893,8 +1896,7 @@ vectorizable_call (gimple stmt, gimple_s
     lhs = gimple_call_lhs (stmt);
   new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
   set_vinfo_for_stmt (new_stmt, stmt_info);
-  if (!slp_node)
-    set_vinfo_for_stmt (stmt, NULL);
+  set_vinfo_for_stmt (stmt, NULL);
   STMT_VINFO_STMT (stmt_info) = new_stmt;
   gsi_replace (gsi, new_stmt, false);
   SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;

patch fixes the ICE, but then the scalar stmt stays in the tree at the end of *.vect pass (and at least in this case is DCEd afterwards).
If we wanted to do this replacement, I guess we could do that from
vect_schedule_slp after all vect_schedule_slp_instance calls have returned, but
we'd need to again recursively go through the instance tree.
Comment 10 Ira Rosen 2011-11-11 06:03:14 UTC
(In reply to comment #9)

> 
> patch fixes the ICE, but then the scalar stmt stays in the tree at the end of
> *.vect pass (and at least in this case is DCEd afterwards).

I wonder if we still really need this rhs change.

> If we wanted to do this replacement, I guess we could do that from
> vect_schedule_slp after all vect_schedule_slp_instance calls have returned,

Right, but only for pure SLP stmts.

> but we'd need to again recursively go through the instance tree.

Through all the instances.
We could also collect the calls during the analysis.
Comment 11 Ira Rosen 2011-11-11 06:07:05 UTC
(In reply to comment #5)
> 
> Well, the original code in mdbx.f90 has the interleaving group of 3 (I only
> changed the "decoration" around the loop to keep a valid code). 

Sorry, you are right. It's loop SLP and we do unrolling to get a multiple of vector size.
Comment 12 Jakub Jelinek 2011-11-11 12:33:44 UTC
Created attachment 25795 [details]
gcc47-pr51058.patch

Updated patch.
Comment 13 Jakub Jelinek 2011-11-11 19:56:17 UTC
Author: jakub
Date: Fri Nov 11 19:56:13 2011
New Revision: 181298

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181298
Log:
	PR tree-optimization/51058
	* tree-vect-slp.c (vect_remove_slp_scalar_calls): New function.
	(vect_schedule_slp): Call it.
	* tree-vect-stmts.c (vectorizable_call): If slp_node != NULL,
	don't replace scalar calls with setting of their lhs to zero here.

	* gcc.dg/vect/fast-math-vect-call-1.c: Add f4 test.
	* gfortran.dg/vect/pr51058-2.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/vect/pr51058-2.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
    trunk/gcc/tree-vect-slp.c
    trunk/gcc/tree-vect-stmts.c
Comment 14 Jakub Jelinek 2011-11-24 15:35:31 UTC
Assuming this is fixed.