Bug 36630

Summary: [4.3/4.4 Regression] ICE in vect_update_ivs_after_vectorizer
Product: gcc Reporter: Jakub Jelinek <jakub>
Component: tree-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: fang, gcc-bugs, razya, rguenth, spop
Priority: P2 Keywords: ice-on-valid-code
Version: 4.3.1   
Target Milestone: 4.3.3   
Host: Target: x86_64-linux
Build: Known to work:
Known to fail: Last reconfirmed: 2008-06-25 14:27:12

Description Jakub Jelinek 2008-06-25 12:47:27 UTC
void
foo (unsigned char *x, short y)
{
  short i;

  i = 2;
  while (i < y)
    {
      x[i - 1] = x[i];
      i = i + 1;
    }
}

ICEs at -O3 on x86_64-linux in 4.3/4.4, works in 4.2 with -O3 -ftree-vectorize.
Comment 1 Richard Biener 2008-06-25 14:27:12 UTC
Confirmed.
Comment 2 Ira Rosen 2008-06-26 11:57:42 UTC
For access function (short int) {(short unsigned int) i_44, +, 1}_1) evolution_part_in_loop_num() returns NULL, which causes the failure.

I tried to peel NOP_EXPRs from POLYNOMIAL_CHRECs in evolution_part_in_loop_num(), but it caused lots of testsuite failures, and I am not sure this is the correct way.

The access function is for vectorizer created calculation of peeling needed for misaligned access:
<bb 11>:
  # ivtmp.42_89 = PHI <ivtmp.42_90(16), 0(10)>
  # ivtmp.41_87 = PHI <ivtmp.41_88(16), vect_p.36_86(10)>
  # ivtmp.34_78 = PHI <ivtmp.34_79(16), vect_p.29_77(10)>
  # i_18 = PHI <i_13(16), i_44(10)>
  D.1578_4 = (long unsigned int) i_18;
  D.1579_5 = D.1578_4 + -1;
  D.1580_7 = x_6(D) + D.1579_5;
  D.1581_9 = x_6(D) + D.1578_4;
  vect_var_.35_80 = M*ivtmp.34_78{misalignment: 0};
  D.1582_10 = *D.1581_9;
  *ivtmp.41_87 = vect_var_.35_80;
  i.0_11 = (short unsigned int) i_18;
  D.1584_12 = i.0_11 + 1;
  i_13 = (short int) D.1584_12;
  ivtmp.34_79 = ivtmp.34_78 + 16;
  ivtmp.41_88 = ivtmp.41_87 + 16;
  ivtmp.42_90 = ivtmp.42_89 + 1;
  if (ivtmp.42_90 < bnd.26_52)
    goto <bb 16>;
  else
    goto <bb 12>;
Comment 3 rguenther@suse.de 2008-06-26 12:11:31 UTC
Subject: Re:  [4.3/4.4 Regression] ICE in
 vect_update_ivs_after_vectorizer

On Thu, 26 Jun 2008, irar at il dot ibm dot com wrote:

> ------- Comment #2 from irar at il dot ibm dot com  2008-06-26 11:57 -------
> For access function (short int) {(short unsigned int) i_44, +, 1}_1)
> evolution_part_in_loop_num() returns NULL, which causes the failure.
>
> I tried to peel NOP_EXPRs from POLYNOMIAL_CHRECs in
> evolution_part_in_loop_num(), but it caused lots of testsuite failures, and I
> am not sure this is the correct way.

Well, the problem is that an unsigned induction variable may validly
wrap while a signed induction variable may not.  This is probably the
cause of the testsuite failures.  Can't you simply handle a NULL return
from evolution_part_in_loop_num in the vectorizer?

Richard.
Comment 4 Ira Rosen 2008-06-26 18:33:51 UTC
(In reply to comment #3)
> Can't you simply handle a NULL return
> from evolution_part_in_loop_num in the vectorizer?

The problem is that this happens during the transformation (for the code created by the vectorizer). We assume that all the checks were done during the analysis phase, and don't expect anything to go wrong during the transformation. This is why there is an assert that the evolution is not NULL. 

But, anyway, shouldn't evolution_part_in_loop_num return a valid value for (short int) {(short unsigned int) i_44, +, 1}_1?

Ira

> Richard.

Comment 5 Ira Rosen 2008-07-16 06:19:59 UTC
This seems to fix the problem:

Index: tree-chrec.c
===================================================================
--- tree-chrec.c        (revision 137271)
+++ tree-chrec.c        (working copy)
@@ -696,6 +696,8 @@ chrec_component_in_loop_num (tree chrec,
   tree component;
   struct loop *loop = get_loop (loop_num), *chloop;

+  STRIP_NOPS (chrec);
+
   if (automatically_generated_chrec_p (chrec))
     return chrec;


I am fully checking it now. 

Ira
Comment 6 Joseph S. Myers 2008-08-27 22:04:45 UTC
4.3.2 is released, changing milestones to 4.3.3.
Comment 7 Richard Biener 2008-08-28 15:09:45 UTC
I still think that handling NULL from evolution_part_in_loop_num is the
correct thing to do.  Even if you need to move this check to the analysis
phase.

The interesting thing is that the access function during vect_analyze_scalar_cycles_1 is

{2, +, 1}_1

which is because after the vectorized part of the loop the prologue
remains which has a non-constant evolution start.

So with the reasoning that you analyzed the access function of the
original loop properly you can probably strip the conversion that
confuses you at just vect_update_ivs_after_vectorizer.  (Or store
the relevant information during analysis where the evolution is
still simple enough)

This would fix the ICE, but I wonder if it may cause wrong code because
of mismatched types somehow. 
Comment 8 Ira Rosen 2008-09-03 10:43:20 UTC
(In reply to comment #7)
> I still think that handling NULL from evolution_part_in_loop_num is the
> correct thing to do.  Even if you need to move this check to the analysis
> phase.
>
> The interesting thing is that the access function during
> vect_analyze_scalar_cycles_1 is
> 
> {2, +, 1}_1
> 
> which is because after the vectorized part of the loop the prologue
> remains which has a non-constant evolution start.

 
I tried to do the check during the analysis, but it seems impossible, since, as you wrote, this access function does not exist during the analysis.

> 
> So with the reasoning that you analyzed the access function of the
> original loop properly you can probably strip the conversion that
> confuses you at just vect_update_ivs_after_vectorizer.  

I tested this on the vectorizer tests:

Index: tree-vect-transform.c
===================================================================
--- tree-vect-transform.c       (revision 139930)
+++ tree-vect-transform.c       (working copy)
@@ -6529,6 +6529,7 @@ vect_update_ivs_after_vectorizer (loop_v

       access_fn = analyze_scalar_evolution (loop, PHI_RESULT (phi));
       gcc_assert (access_fn);
+      STRIP_NOPS (access_fn);
       evolution_part =
         unshare_expr (evolution_part_in_loop_num (access_fn, loop->num));
       gcc_assert (evolution_part != NULL_TREE);


> (Or store
> the relevant information during analysis where the evolution is
> still simple enough)
> 
> This would fix the ICE, but I wonder if it may cause wrong code because
> of mismatched types somehow. 

At least, this does not happen with the vectorizer testsuite.

Another thing, 4.4 does not vectorize this loop anymore (and, therefore, there is no ICE), because of unknown number of iterations: 

(analyze_scalar_evolution
  (loop_nb = 1)
  (scalar = i_18)
(get_scalar_evolution
  (scalar = i_18)
  (scalar_evolution = i_18))
(set_scalar_evolution
  (scalar = i_18)
  (scalar_evolution = i_18))
)
  (set_nb_iterations_in_loop = scev_not_known))
(get_loop_exit_condition
  if (y_3(D) > i_13)
)



Comment 9 Sebastian Pop 2008-09-03 21:02:48 UTC
Subject: Re:  [4.3/4.4 Regression] ICE in vect_update_ivs_after_vectorizer

> Another thing, 4.4 does not vectorize this loop anymore (and, therefore, there
> is no ICE), because of unknown number of iterations:

I also have remarked on one of the graphite testcases scop-matmult.c
that we had a regression in the precision of the number of iterations from
last graphite merge till yesterday's merge, i.e. from 138275 to 139870
We used to have a symbolic number of iterations, but now the result
is scev_dont_know.  I have not yet tracked this down to the patch that
produced this regression.

Sebastian
Comment 10 irar 2008-09-07 07:15:28 UTC
Subject: Bug 36630

Author: irar
Date: Sun Sep  7 07:14:03 2008
New Revision: 140081

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=140081
Log:
	PR tree-optimization/36630
	* tree-vect-transform.c (vect_update_ivs_after_vectorizer):
	Call STRIP_NOPS before calling evolution_part_in_loop_num.


Added:
    branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/vect/pr36630.c
Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_3-branch/gcc/tree-vect-transform.c

Comment 11 irar 2008-09-07 10:07:00 UTC
Subject: Bug 36630

Author: irar
Date: Sun Sep  7 10:05:37 2008
New Revision: 140085

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=140085
Log:
	PR tree-optimization/36630
	* tree-vect-transform.c (vect_update_ivs_after_vectorizer):
	Call STRIP_NOPS before calling evolution_part_in_loop_num.


Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr36630.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-transform.c

Comment 12 Ira Rosen 2008-09-07 11:05:06 UTC
Fixed.
Comment 13 Ira Rosen 2008-09-08 07:44:53 UTC
(In reply to comment #9)
> Subject: Re:  [4.3/4.4 Regression] ICE in vect_update_ivs_after_vectorizer
> 
> > Another thing, 4.4 does not vectorize this loop anymore (and, therefore, there
> > is no ICE), because of unknown number of iterations:
> 
> I also have remarked on one of the graphite testcases scop-matmult.c
> that we had a regression in the precision of the number of iterations from
> last graphite merge till yesterday's merge, i.e. from 138275 to 139870
> We used to have a symbolic number of iterations, but now the result
> is scev_dont_know.  I have not yet tracked this down to the patch that
> produced this regression.
> 
> Sebastian
> 

I opened pr37416 for this issue.

Ira