22236 – [4.1 Regression] wrong code for casts and scev

Bug 22236 - [4.1 Regression] wrong code for casts and scev

Summary: [4.1 Regression] wrong code for casts and scev

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	4.1.0

Importance:	P2 critical
Target Milestone:	4.1.0
Assignee:	Not yet assigned to anyone

URL:	http://gcc.gnu.org/ml/gcc-patches/200...
Keywords:	wrong-code

Depends on:
Blocks:	14490
	Show dependency tree / graph

Reported:	2005-06-29 20:36 UTC by Andrew Pinski
Modified:	2023-12-31 19:10 UTC (History)
CC List:	3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2005-07-22 19:07:15

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Andrew Pinski 2005-06-29 20:36:23 UTC

The following should not abort but does at -O3.
void abort(void);
void
foo (signed char a)
{
  if (a - 0x7F > 1)
    abort();
}
int main()
{
  unsigned char b;
  for(b = 0;b <0xFF;b++)
   foo (b);
}

Comment 1 Andrew Pinski 2005-06-29 20:37:44 UTC

I should note that after I fix PR 14490, this becomes a latent bug.

Comment 2 Andrew Pinski 2005-06-29 20:50:41 UTC

I think this was introduced by:
2005-06-07  Sebastian Pop  <pop@cri.ensmp.fr>

        PR 18403 and meta PR 21861.
        * Makefile.in (tree-chrec.o): Depend on CFGLOOP_H and TREE_FLOW_H.
        * tree-chrec.c: Include cfgloop.h and tree-flow.h.
        (evolution_function_is_invariant_rec_p,
        evolution_function_is_invariant_p): New.
        (chrec_convert): Use an extra parameter AT_STMT for refining the
        information that is passed down to convert_step.  Integrate the
        code that was in count_ev_in_wider_type.


The reason why I say that is because it worked on 20050603 but failed on 20050610.

I also think this is the cause of PR 22212 too.

Comment 3 Andrew Pinski 2005-06-29 20:59:24 UTC

Here is a testcase which will fail even after the patch for 14490:
void abort(void);

static inline void
foo (signed char a)
{
  int b = a - 0x7F;
  if (b > 1)
    abort();
}

int main()
{
  unsigned char b;
  for(b = 0;b <0xFF;b++)
   foo (b);
}

Comment 4 Andrew Pinski 2005-06-30 02:04:35 UTC

Note you might need -fno-vrp to expose the bug.

Comment 5 Andrew Pinski 2005-06-30 02:04:49 UTC

Note you might need -fno-tree-vrp to expose the bug.

Comment 6 Andrew Pinski 2005-07-06 13:40:26 UTC

Confirmed:
-O0 passes
-O1 fails
-O2 passes (because of VRP)
-O3 passes (because of VRP).

Comment 7 Steven Bosscher 2005-07-22 21:02:16 UTC

Sebastian, an you look into this please?

Comment 8 sebastian.pop@cri.ensmp.fr 2005-07-26 10:06:28 UTC

Subject: Re:  [4.1 Regression] wrong code for casts and scev

After inlining, we end up with a loop containing the following code:

   b.0_3 = (signed char) b_8;
   D.1621_4 = (int) b.0_3;
   a_5 = (signed char) D.1621_4;
   D.1640_6 = (int) a_5;
   b_7 = D.1640_6 - 127;
   if (b_7 > 1) goto <L3>; else goto <L9>;
   
that is equivalent to:

   b_7 = ((int) (signed char) (int) (signed char) b_8) - 127;
   if (b_7 > 1) goto <L3>; else goto <L9>;

with b_8 = (unsigned char) {1, +, 1}

   b_7 = ((int) (signed char) (int) (signed char) {(uchar)1, +, (uchar)1}) - 127;
   if (b_7 > 1) goto <L3>; else goto <L9>;

A sequence of unsigned char 1, 2, ..., 255 has to be converted to
signed char that would wrap with -fwrapv: 1, 2, ..., 127, -128, ...
So here is a patch that keeps the cast if the sequence wraps.  The
remaining problem is that with this patch, chrec_convert gets more
picky about the things that it transforms: in some of the testcases of
the vectorizer, we get some fails because the number of iterations is
not known, making scev_probably_wraps_p to return true, and finally
the conversion is kept.

I have bootstrapped this patch on x86_64, but there are some
regressions:

FAIL: gcc.dg/vect/vect-46.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times Vectorizing an unaligned access 2
FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times Alignment of access forced using peeling 1
FAIL: gcc.dg/vect/vect-52.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-52.c scan-tree-dump-times Vectorizing an unaligned access 2
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times Alignment of access forced using peeling 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times Vectorizing an unaligned access 2
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vectorized 1 loops 3
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times Alignment of access forced using peeling 3

In all these cases, the loop bound is a parameter.  If IP-constant
propagation is not used, chrec_convert has not enough information for
removing the cast.  I propose to modify all these testcases to make
the loop bound explicit.

FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times Vectorizing an unaligned access 1
FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times Vectorizing an unaligned access 1

For these two regressions, the problem is the same: we end with an
evolution: ib_16 + (aint *) ((long unsigned int) {off_11, +, 1}_1 * 4)
in which the casts cannot be removed because the offset is not known,
and even if the number of iterations is known, chrec_convert cannot
prove that it does not overflow.  I propose to propagate the offset
by hand, or to wait for ipcp ;-)

FAIL: gcc.dg/vect/vect-87.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-87.c scan-tree-dump-times Alignment of access forced using peeling 1
FAIL: gcc.dg/vect/vect-88.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-88.c scan-tree-dump-times Alignment of access forced using peeling 1

For these two testcases we'll need IP-value range propagation.  I
think that we'll have to modify these testcases by inlining the code.
Dorit, could you look at vect-{77,78,87,88}.c testcases?  Thanks.

	* tree-cfg.c (print_succ_bbs): Correctly print successors.
	* tree-chrec.c (chrec_convert): Call scev_probably_wraps_p for checking
	that the iv does not wrap before converting the iv.
	* tree-ssa-loop-ivcanon.c: Correct typo in comment.
	* tree-ssa-loop-ivopts.c (idx_find_step): Add a fixme note.
	* tree-ssa-loop-niter.c (scev_probably_wraps_p): Check that AT_STMT is
	not null.
	(convert_step): Add a comment.

Index: tree-cfg.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-cfg.c,v
retrieving revision 2.211
diff -d -u -p -r2.211 tree-cfg.c
--- tree-cfg.c	12 Jul 2005 13:20:28 -0000	2.211
+++ tree-cfg.c	26 Jul 2005 09:59:38 -0000
@@ -4454,7 +4454,7 @@ static void print_pred_bbs (FILE *, basi
 static void print_succ_bbs (FILE *, basic_block bb);
 
 
-/* Print the predecessors indexes of edge E on FILE.  */
+/* Print on FILE the indexes for the predecessors of basic_block BB.  */
 
 static void
 print_pred_bbs (FILE *file, basic_block bb)
@@ -4463,11 +4463,11 @@ print_pred_bbs (FILE *file, basic_block 
   edge_iterator ei;
 
   FOR_EACH_EDGE (e, ei, bb->preds)
-    fprintf (file, "bb_%d", e->src->index);
+    fprintf (file, "bb_%d ", e->src->index);
 }
 
 
-/* Print the successors indexes of edge E on FILE.  */
+/* Print on FILE the indexes for the successors of basic_block BB.  */
 
 static void
 print_succ_bbs (FILE *file, basic_block bb)
@@ -4476,7 +4476,7 @@ print_succ_bbs (FILE *file, basic_block 
   edge_iterator ei;
 
   FOR_EACH_EDGE (e, ei, bb->succs)
-    fprintf (file, "bb_%d", e->src->index);
+    fprintf (file, "bb_%d ", e->dest->index);
 }
 
 
Index: tree-chrec.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-chrec.c,v
retrieving revision 2.22
diff -d -u -p -r2.22 tree-chrec.c
--- tree-chrec.c	13 Jul 2005 10:08:36 -0000	2.22
+++ tree-chrec.c	26 Jul 2005 09:59:38 -0000
@@ -1110,9 +1110,24 @@ chrec_convert (tree type, tree chrec, tr
 
   if (evolution_function_is_affine_p (chrec))
     {
-      tree step = convert_step (current_loops->parray[CHREC_VARIABLE (chrec)],
- 				type, CHREC_LEFT (chrec), CHREC_RIGHT (chrec),
- 				at_stmt);
+      tree step;
+      bool dummy;
+
+      /* Avoid conversion of (signed char) {(uchar)1, +, (uchar)1}_x
+	 when it is not possible to prove that the scev does not wrap.
+	 See PR22236, where a sequence 1, 2, ..., 255 has to be
+	 converted to signed char, but this would wrap: 1, 2, ...,
+	 127, -128, ...  The result should not be {(schar)1, +,
+	 (schar)1}_x, but instead, we should keep the conversion:
+	 (schar) {(uchar)1, +, (uchar)1}_x.  */
+      if (scev_probably_wraps_p (type, CHREC_LEFT (chrec), CHREC_RIGHT (chrec),
+				 at_stmt,
+				 current_loops->parray[CHREC_VARIABLE (chrec)],
+				 &dummy))
+	return fold_convert (type, chrec);
+
+      step = convert_step (current_loops->parray[CHREC_VARIABLE (chrec)], type,
+			   CHREC_LEFT (chrec), CHREC_RIGHT (chrec), at_stmt);
       if (!step)
  	return fold_convert (type, chrec);
 
Index: tree-ssa-loop-ivcanon.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivcanon.c,v
retrieving revision 2.18
diff -d -u -p -r2.18 tree-ssa-loop-ivcanon.c
--- tree-ssa-loop-ivcanon.c	21 Jul 2005 07:24:10 -0000	2.18
+++ tree-ssa-loop-ivcanon.c	26 Jul 2005 09:59:38 -0000
@@ -252,7 +252,7 @@ try_unroll_loop_completely (struct loops
 }
 
 /* Adds a canonical induction variable to LOOP if suitable.  LOOPS is the loops
-   tree.  CREATE_IV is true if we may create a new iv.  UL determines what
+   tree.  CREATE_IV is true if we may create a new iv.  UL determines 
    which loops we are allowed to completely unroll.  If TRY_EVAL is true, we try
    to determine the number of iterations of a loop by direct evaluation. 
    Returns true if cfg is changed.  */
Index: tree-ssa-loop-ivopts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
retrieving revision 2.85
diff -d -u -p -r2.85 tree-ssa-loop-ivopts.c
--- tree-ssa-loop-ivopts.c	21 Jul 2005 07:24:10 -0000	2.85
+++ tree-ssa-loop-ivopts.c	26 Jul 2005 09:59:38 -0000
@@ -1443,6 +1443,8 @@ idx_find_step (tree base, tree *idx, voi
     /* The step for pointer arithmetics already is 1 byte.  */
     step = build_int_cst (sizetype, 1);
 
+  /* FIXME: convert_step should not be used outside chrec_convert: fix
+     this by calling chrec_convert.  */
   iv_step = convert_step (dta->ivopts_data->current_loop,
 			  sizetype, iv->base, iv->step, dta->stmt);
 
Index: tree-ssa-loop-niter.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-niter.c,v
retrieving revision 2.33
diff -d -u -p -r2.33 tree-ssa-loop-niter.c
--- tree-ssa-loop-niter.c	21 Jul 2005 07:24:12 -0000	2.33
+++ tree-ssa-loop-niter.c	26 Jul 2005 09:59:38 -0000
@@ -1707,7 +1707,7 @@ scev_probably_wraps_p (tree type, tree b
      i_2 to wrap around, but not i.0_6, because it is of a signed
      type.  This causes VRP to erroneously fold the predicate above
      because it thinks that i.0_6 cannot be negative.  */
-  if (TREE_CODE (at_stmt) == MODIFY_EXPR)
+  if (at_stmt && TREE_CODE (at_stmt) == MODIFY_EXPR)
     {
       tree rhs = TREE_OPERAND (at_stmt, 1);
       tree outer_t = TREE_TYPE (rhs);
@@ -1748,7 +1748,8 @@ scev_probably_wraps_p (tree type, tree b
 }
 
 /* Return the conversion to NEW_TYPE of the STEP of an induction
-   variable BASE + STEP * I at AT_STMT.  */
+   variable BASE + STEP * I at AT_STMT.  When it fails, return
+   NULL_TREE.  */
 
 tree
 convert_step (struct loop *loop, tree new_type, tree base, tree step,
Index: testsuite/gcc.dg/vect/vect-46.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c,v
retrieving revision 1.7
diff -d -u -p -r1.7 vect-46.c
--- testsuite/gcc.dg/vect/vect-46.c	31 Mar 2005 18:34:18 -0000	1.7
+++ testsuite/gcc.dg/vect/vect-46.c	26 Jul 2005 09:59:52 -0000
@@ -23,11 +23,11 @@ void bar (afloat *pa, afloat *pb, afloat
 
 
 int
-main1 (int n , afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc)
+main1 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc)
 {
   int i;
 
-  for (i = 0; i < n; i++)
+  for (i = 0; i < N; i++)
     {
       pa[i] = pb[i] * pc[i];
     }
@@ -38,14 +38,13 @@ main1 (int n , afloat * __restrict__ pa,
 int main (void)
 {
   int i;
-  int n=N;
   afloat a[N];
   afloat b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
   afloat c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
 
   check_vect ();
 
-  main1 (n,a,b,c);
+  main1 (a,b,c);
   bar (a,b,c);
   return 0;
 }
Index: testsuite/gcc.dg/vect/vect-50.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-50.c,v
retrieving revision 1.9
diff -d -u -p -r1.9 vect-50.c
--- testsuite/gcc.dg/vect/vect-50.c	31 Mar 2005 18:34:18 -0000	1.9
+++ testsuite/gcc.dg/vect/vect-50.c	26 Jul 2005 09:59:52 -0000
@@ -22,11 +22,11 @@ void bar (float *pa, float *pb, float *p
 
 
 int
-main1 (int n, float * __restrict__ pa, float * __restrict__ pb, float * __restrict__ pc)
+main1 (float * __restrict__ pa, float * __restrict__ pb, float * __restrict__ pc)
 {
   int i;
 
-  for (i = 0; i < n; i++)
+  for (i = 0; i < N; i++)
     {
       pa[i] = pb[i] * pc[i];
     }
@@ -45,7 +45,7 @@ int main (void)
 
   check_vect ();
 
-  main1 (N,a,b,c);
+  main1 (a,b,c);
   return 0;
 }
 
Index: testsuite/gcc.dg/vect/vect-52.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-52.c,v
retrieving revision 1.9
diff -d -u -p -r1.9 vect-52.c
--- testsuite/gcc.dg/vect/vect-52.c	31 Mar 2005 18:34:18 -0000	1.9
+++ testsuite/gcc.dg/vect/vect-52.c	26 Jul 2005 09:59:52 -0000
@@ -23,11 +23,11 @@ void bar (float *pa, float *pb, float *p
 
 
 int
-main1 (int n, afloat * __restrict__ pa, float * __restrict__ pb, float * __restrict__ pc)
+main1 (afloat * __restrict__ pa, float * __restrict__ pb, float * __restrict__ pc)
 {
   int i;
 
-  for (i = 0; i < n; i++)
+  for (i = 0; i < N; i++)
     {
       pa[i] = pb[i] * pc[i];
     }
@@ -46,8 +46,8 @@ int main (void)
 
   check_vect ();
 
-  main1 (N,a,&b[1],c);
-  main1 (N,a,&b[1],&c[1]);
+  main1 (a,&b[1],c);
+  main1 (a,&b[1],&c[1]);
   return 0;
 }
 
Index: testsuite/gcc.dg/vect/vect-58.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-58.c,v
retrieving revision 1.10
diff -d -u -p -r1.10 vect-58.c
--- testsuite/gcc.dg/vect/vect-58.c	31 Mar 2005 18:34:18 -0000	1.10
+++ testsuite/gcc.dg/vect/vect-58.c	26 Jul 2005 09:59:52 -0000
@@ -23,11 +23,11 @@ void bar (afloat *pa, afloat *pb, afloat
 
 
 int
-main1 (int n , afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc)
+main1 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc)
 {
   int i;
 
-  for (i = 0; i < n/2; i++)
+  for (i = 0; i < N/2; i++)
     {
       pa[i+1] = pb[i+1] * pc[i+1];
     }
@@ -40,14 +40,13 @@ main1 (int n , afloat * __restrict__ pa,
 int main (void)
 {
   int i;
-  int n=N;
   afloat a[N];
   afloat b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
   afloat c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
 
   check_vect ();
 
-  main1 (n,a,b,c);
+  main1 (a,b,c);
   return 0;
 }
 
Index: testsuite/gcc.dg/vect/vect-60.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-60.c,v
retrieving revision 1.9
diff -d -u -p -r1.9 vect-60.c
--- testsuite/gcc.dg/vect/vect-60.c	31 Mar 2005 18:34:18 -0000	1.9
+++ testsuite/gcc.dg/vect/vect-60.c	26 Jul 2005 09:59:52 -0000
@@ -23,11 +23,11 @@ void bar (afloat *pa, afloat *pb, afloat
 
 
 int
-main1 (int n , afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc)
+main1 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc)
 {
   int i;
 
-  for (i = 0; i < n/2; i++)
+  for (i = 0; i < N/2; i++)
     {
       pa[i] = pb[i+1] * pc[i+1];
     }
@@ -40,14 +40,13 @@ main1 (int n , afloat * __restrict__ pa,
 int main (void)
 {
   int i;
-  int n=N;
   afloat a[N];
   afloat b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
   afloat c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
 
   check_vect ();
 
-  main1 (n,a,b,c);
+  main1 (a,b,c);
   return 0;
 }
 
Index: testsuite/gcc.dg/vect/vect-77.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-77.c,v
retrieving revision 1.11
diff -d -u -p -r1.11 vect-77.c
--- testsuite/gcc.dg/vect/vect-77.c	7 Jun 2005 19:51:25 -0000	1.11
+++ testsuite/gcc.dg/vect/vect-77.c	26 Jul 2005 09:59:52 -0000
@@ -20,7 +20,6 @@ int main1 (aint *ib, int off)
       ia[i] = ib[i+off];
     }
 
-
   /* check results:  */
   for (i = 0; i < N; i++)
     {
Index: testsuite/gcc.dg/vect/vect-92.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-92.c,v
retrieving revision 1.4
diff -d -u -p -r1.4 vect-92.c
--- testsuite/gcc.dg/vect/vect-92.c	16 Jul 2005 18:56:53 -0000	1.4
+++ testsuite/gcc.dg/vect/vect-92.c	26 Jul 2005 09:59:52 -0000
@@ -50,17 +50,17 @@ main2 (afloat * __restrict__ pa, afloat 
 }
 
 int
-main3 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc, int n)
+main3 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat * __restrict__ pc)
 {
   int i;
 
-  for (i = 0; i < n; i++)
+  for (i = 0; i < N-1; i++)
     {
       pa[i+1] = pb[i+1] * pc[i+1];
     }
 
   /* check results:  */
-  for (i = 0; i < n; i++)
+  for (i = 0; i < N-1; i++)
     {
       if (pa[i+1] != (pb[i+1] * pc[i+1]))
 	abort ();
@@ -80,7 +80,7 @@ int main (void)
 
   main1 (a,b,c);
   main2 (a,b,c);
-  main3 (a,b,c,N-1);
+  main3 (a,b,c);
 
   return 0;
 }

Comment 9 richard.guenther@gmail.com 2005-07-26 10:38:42 UTC

Subject: Re:  [4.1 Regression] wrong code for casts and scev

You should also add the (two) testcases from the PR.

Richard.

Comment 10 Dorit Naishlos 2005-07-26 13:59:05 UTC

Subject: Re:  [4.1 Regression] wrong code for casts and
 scev





Hi Sebastian,

The modifications you suggest will make the tests uninteresting - they were
introduced with unknown loop-bound/offset on purpose. In fact, for most of
these tests there are already versions with explicit constant values for
the loop-bound/offset -
vect-46.c with explicit loop-bound becomes vect-40.c
vect-50.c with explicit loop-bound becomes vect-44.c
vect-52.c with explicit loop-bound becomes vect-48.c
vect-58.c with explicit loop-bound becomes vect-54.c
vect-60.c with explicit loop-bound becomes vect-56.c
vect-77.c and vect-78.c with explicit offset become vect-75.c
and vect-92.c also uses unknown loop-bound on purpose.

Can we change something else in the tests to make the evolution-analyzer
return something saner? by changing types of variables? by using some flag?

(by the way, where does it fail the vectorizer? (what are the last things
that dump details reports?))

dorit

>
> FAIL: gcc.dg/vect/vect-46.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times Vectorizing an
> unaligned access 2
> FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times Alignment of access
> forced using peeling 1
> FAIL: gcc.dg/vect/vect-52.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-52.c scan-tree-dump-times Vectorizing an
> unaligned access 2
> FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times Alignment of access
> forced using peeling 1
> FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times Vectorizing an
> unaligned access 2
> FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vectorized 1 loops 3
> FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times Alignment of access
> forced using peeling 3
>
> In all these cases, the loop bound is a parameter.  If IP-constant
> propagation is not used, chrec_convert has not enough information for
> removing the cast.  I propose to modify all these testcases to make
> the loop bound explicit.
>
> FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times Vectorizing an
> unaligned access 1
> FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times Vectorizing an
> unaligned access 1
>
> For these two regressions, the problem is the same: we end with an
> evolution: ib_16 + (aint *) ((long unsigned int) {off_11, +, 1}_1 * 4)
> in which the casts cannot be removed because the offset is not known,
> and even if the number of iterations is known, chrec_convert cannot
> prove that it does not overflow.  I propose to propagate the offset
> by hand, or to wait for ipcp ;-)
>
> FAIL: gcc.dg/vect/vect-87.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-87.c scan-tree-dump-times Alignment of access
> forced using peeling 1
> FAIL: gcc.dg/vect/vect-88.c scan-tree-dump-times vectorized 1 loops 1
> FAIL: gcc.dg/vect/vect-88.c scan-tree-dump-times Alignment of access
> forced using peeling 1
>
> For these two testcases we'll need IP-value range propagation.  I
> think that we'll have to modify these testcases by inlining the code.
> Dorit, could you look at vect-{77,78,87,88}.c testcases?  Thanks.
>
>    * tree-cfg.c (print_succ_bbs): Correctly print successors.
>    * tree-chrec.c (chrec_convert): Call scev_probably_wraps_p for
checking
>    that the iv does not wrap before converting the iv.
>    * tree-ssa-loop-ivcanon.c: Correct typo in comment.
>    * tree-ssa-loop-ivopts.c (idx_find_step): Add a fixme note.
>    * tree-ssa-loop-niter.c (scev_probably_wraps_p): Check that AT_STMT is
>    not null.
>    (convert_step): Add a comment.
>
> Index: tree-cfg.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-cfg.c,v
> retrieving revision 2.211
> diff -d -u -p -r2.211 tree-cfg.c
> --- tree-cfg.c   12 Jul 2005 13:20:28 -0000   2.211
> +++ tree-cfg.c   26 Jul 2005 09:59:38 -0000
> @@ -4454,7 +4454,7 @@ static void print_pred_bbs (FILE *, basi
>  static void print_succ_bbs (FILE *, basic_block bb);
>
>
> -/* Print the predecessors indexes of edge E on FILE.  */
> +/* Print on FILE the indexes for the predecessors of basic_block BB.  */
>
>  static void
>  print_pred_bbs (FILE *file, basic_block bb)
> @@ -4463,11 +4463,11 @@ print_pred_bbs (FILE *file, basic_block
>    edge_iterator ei;
>
>    FOR_EACH_EDGE (e, ei, bb->preds)
> -    fprintf (file, "bb_%d", e->src->index);
> +    fprintf (file, "bb_%d ", e->src->index);
>  }
>
>
> -/* Print the successors indexes of edge E on FILE.  */
> +/* Print on FILE the indexes for the successors of basic_block BB.  */
>
>  static void
>  print_succ_bbs (FILE *file, basic_block bb)
> @@ -4476,7 +4476,7 @@ print_succ_bbs (FILE *file, basic_block
>    edge_iterator ei;
>
>    FOR_EACH_EDGE (e, ei, bb->succs)
> -    fprintf (file, "bb_%d", e->src->index);
> +    fprintf (file, "bb_%d ", e->dest->index);
>  }
>
>
> Index: tree-chrec.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-chrec.c,v
> retrieving revision 2.22
> diff -d -u -p -r2.22 tree-chrec.c
> --- tree-chrec.c   13 Jul 2005 10:08:36 -0000   2.22
> +++ tree-chrec.c   26 Jul 2005 09:59:38 -0000
> @@ -1110,9 +1110,24 @@ chrec_convert (tree type, tree chrec, tr
>
>    if (evolution_function_is_affine_p (chrec))
>      {
> -      tree step = convert_step
> (current_loops->parray[CHREC_VARIABLE (chrec)],
> -             type, CHREC_LEFT (chrec), CHREC_RIGHT (chrec),
> -             at_stmt);
> +      tree step;
> +      bool dummy;
> +
> +      /* Avoid conversion of (signed char) {(uchar)1, +, (uchar)1}_x
> +    when it is not possible to prove that the scev does not wrap.
> +    See PR22236, where a sequence 1, 2, ..., 255 has to be
> +    converted to signed char, but this would wrap: 1, 2, ...,
> +    127, -128, ...  The result should not be {(schar)1, +,
> +    (schar)1}_x, but instead, we should keep the conversion:
> +    (schar) {(uchar)1, +, (uchar)1}_x.  */
> +      if (scev_probably_wraps_p (type, CHREC_LEFT (chrec),
> CHREC_RIGHT (chrec),
> +             at_stmt,
> +             current_loops->parray[CHREC_VARIABLE (chrec)],
> +             &dummy))
> +   return fold_convert (type, chrec);
> +
> +      step = convert_step (current_loops->parray[CHREC_VARIABLE
> (chrec)], type,
> +            CHREC_LEFT (chrec), CHREC_RIGHT (chrec), at_stmt);
>        if (!step)
>      return fold_convert (type, chrec);
>
> Index: tree-ssa-loop-ivcanon.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivcanon.c,v
> retrieving revision 2.18
> diff -d -u -p -r2.18 tree-ssa-loop-ivcanon.c
> --- tree-ssa-loop-ivcanon.c   21 Jul 2005 07:24:10 -0000   2.18
> +++ tree-ssa-loop-ivcanon.c   26 Jul 2005 09:59:38 -0000
> @@ -252,7 +252,7 @@ try_unroll_loop_completely (struct loops
>  }
>
>  /* Adds a canonical induction variable to LOOP if suitable.  LOOPS
> is the loops
> -   tree.  CREATE_IV is true if we may create a new iv.  UL determines
what
> +   tree.  CREATE_IV is true if we may create a new iv.  UL determines
>     which loops we are allowed to completely unroll.  If TRY_EVAL is
> true, we try
>     to determine the number of iterations of a loop by direct evaluation.

>     Returns true if cfg is changed.  */
> Index: tree-ssa-loop-ivopts.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
> retrieving revision 2.85
> diff -d -u -p -r2.85 tree-ssa-loop-ivopts.c
> --- tree-ssa-loop-ivopts.c   21 Jul 2005 07:24:10 -0000   2.85
> +++ tree-ssa-loop-ivopts.c   26 Jul 2005 09:59:38 -0000
> @@ -1443,6 +1443,8 @@ idx_find_step (tree base, tree *idx, voi
>      /* The step for pointer arithmetics already is 1 byte.  */
>      step = build_int_cst (sizetype, 1);
>
> +  /* FIXME: convert_step should not be used outside chrec_convert: fix
> +     this by calling chrec_convert.  */
>    iv_step = convert_step (dta->ivopts_data->current_loop,
>             sizetype, iv->base, iv->step, dta->stmt);
>
> Index: tree-ssa-loop-niter.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-niter.c,v
> retrieving revision 2.33
> diff -d -u -p -r2.33 tree-ssa-loop-niter.c
> --- tree-ssa-loop-niter.c   21 Jul 2005 07:24:12 -0000   2.33
> +++ tree-ssa-loop-niter.c   26 Jul 2005 09:59:38 -0000
> @@ -1707,7 +1707,7 @@ scev_probably_wraps_p (tree type, tree b
>       i_2 to wrap around, but not i.0_6, because it is of a signed
>       type.  This causes VRP to erroneously fold the predicate above
>       because it thinks that i.0_6 cannot be negative.  */
> -  if (TREE_CODE (at_stmt) == MODIFY_EXPR)
> +  if (at_stmt && TREE_CODE (at_stmt) == MODIFY_EXPR)
>      {
>        tree rhs = TREE_OPERAND (at_stmt, 1);
>        tree outer_t = TREE_TYPE (rhs);
> @@ -1748,7 +1748,8 @@ scev_probably_wraps_p (tree type, tree b
>  }
>
>  /* Return the conversion to NEW_TYPE of the STEP of an induction
> -   variable BASE + STEP * I at AT_STMT.  */
> +   variable BASE + STEP * I at AT_STMT.  When it fails, return
> +   NULL_TREE.  */
>
>  tree
>  convert_step (struct loop *loop, tree new_type, tree base, tree step,
> Index: testsuite/gcc.dg/vect/vect-46.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c,v
> retrieving revision 1.7
> diff -d -u -p -r1.7 vect-46.c
> --- testsuite/gcc.dg/vect/vect-46.c   31 Mar 2005 18:34:18 -0000   1.7
> +++ testsuite/gcc.dg/vect/vect-46.c   26 Jul 2005 09:59:52 -0000
> @@ -23,11 +23,11 @@ void bar (afloat *pa, afloat *pb, afloat
>
>
>  int
> -main1 (int n , afloat * __restrict__ pa, afloat * __restrict__ pb,
> afloat * __restrict__ pc)
> +main1 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat *
> __restrict__ pc)
>  {
>    int i;
>
> -  for (i = 0; i < n; i++)
> +  for (i = 0; i < N; i++)
>      {
>        pa[i] = pb[i] * pc[i];
>      }
> @@ -38,14 +38,13 @@ main1 (int n , afloat * __restrict__ pa,
>  int main (void)
>  {
>    int i;
> -  int n=N;
>    afloat a[N];
>    afloat b[N] =
{0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
>    afloat c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
>
>    check_vect ();
>
> -  main1 (n,a,b,c);
> +  main1 (a,b,c);
>    bar (a,b,c);
>    return 0;
>  }
> Index: testsuite/gcc.dg/vect/vect-50.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-50.c,v
> retrieving revision 1.9
> diff -d -u -p -r1.9 vect-50.c
> --- testsuite/gcc.dg/vect/vect-50.c   31 Mar 2005 18:34:18 -0000   1.9
> +++ testsuite/gcc.dg/vect/vect-50.c   26 Jul 2005 09:59:52 -0000
> @@ -22,11 +22,11 @@ void bar (float *pa, float *pb, float *p
>
>
>  int
> -main1 (int n, float * __restrict__ pa, float * __restrict__ pb,
> float * __restrict__ pc)
> +main1 (float * __restrict__ pa, float * __restrict__ pb, float *
> __restrict__ pc)
>  {
>    int i;
>
> -  for (i = 0; i < n; i++)
> +  for (i = 0; i < N; i++)
>      {
>        pa[i] = pb[i] * pc[i];
>      }
> @@ -45,7 +45,7 @@ int main (void)
>
>    check_vect ();
>
> -  main1 (N,a,b,c);
> +  main1 (a,b,c);
>    return 0;
>  }
>
> Index: testsuite/gcc.dg/vect/vect-52.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-52.c,v
> retrieving revision 1.9
> diff -d -u -p -r1.9 vect-52.c
> --- testsuite/gcc.dg/vect/vect-52.c   31 Mar 2005 18:34:18 -0000   1.9
> +++ testsuite/gcc.dg/vect/vect-52.c   26 Jul 2005 09:59:52 -0000
> @@ -23,11 +23,11 @@ void bar (float *pa, float *pb, float *p
>
>
>  int
> -main1 (int n, afloat * __restrict__ pa, float * __restrict__ pb,
> float * __restrict__ pc)
> +main1 (afloat * __restrict__ pa, float * __restrict__ pb, float *
> __restrict__ pc)
>  {
>    int i;
>
> -  for (i = 0; i < n; i++)
> +  for (i = 0; i < N; i++)
>      {
>        pa[i] = pb[i] * pc[i];
>      }
> @@ -46,8 +46,8 @@ int main (void)
>
>    check_vect ();
>
> -  main1 (N,a,&b[1],c);
> -  main1 (N,a,&b[1],&c[1]);
> +  main1 (a,&b[1],c);
> +  main1 (a,&b[1],&c[1]);
>    return 0;
>  }
>
> Index: testsuite/gcc.dg/vect/vect-58.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-58.c,v
> retrieving revision 1.10
> diff -d -u -p -r1.10 vect-58.c
> --- testsuite/gcc.dg/vect/vect-58.c   31 Mar 2005 18:34:18 -0000   1.10
> +++ testsuite/gcc.dg/vect/vect-58.c   26 Jul 2005 09:59:52 -0000
> @@ -23,11 +23,11 @@ void bar (afloat *pa, afloat *pb, afloat
>
>
>  int
> -main1 (int n , afloat * __restrict__ pa, afloat * __restrict__ pb,
> afloat * __restrict__ pc)
> +main1 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat *
> __restrict__ pc)
>  {
>    int i;
>
> -  for (i = 0; i < n/2; i++)
> +  for (i = 0; i < N/2; i++)
>      {
>        pa[i+1] = pb[i+1] * pc[i+1];
>      }
> @@ -40,14 +40,13 @@ main1 (int n , afloat * __restrict__ pa,
>  int main (void)
>  {
>    int i;
> -  int n=N;
>    afloat a[N];
>    afloat b[N] =
{0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
>    afloat c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
>
>    check_vect ();
>
> -  main1 (n,a,b,c);
> +  main1 (a,b,c);
>    return 0;
>  }
>
> Index: testsuite/gcc.dg/vect/vect-60.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-60.c,v
> retrieving revision 1.9
> diff -d -u -p -r1.9 vect-60.c
> --- testsuite/gcc.dg/vect/vect-60.c   31 Mar 2005 18:34:18 -0000   1.9
> +++ testsuite/gcc.dg/vect/vect-60.c   26 Jul 2005 09:59:52 -0000
> @@ -23,11 +23,11 @@ void bar (afloat *pa, afloat *pb, afloat
>
>
>  int
> -main1 (int n , afloat * __restrict__ pa, afloat * __restrict__ pb,
> afloat * __restrict__ pc)
> +main1 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat *
> __restrict__ pc)
>  {
>    int i;
>
> -  for (i = 0; i < n/2; i++)
> +  for (i = 0; i < N/2; i++)
>      {
>        pa[i] = pb[i+1] * pc[i+1];
>      }
> @@ -40,14 +40,13 @@ main1 (int n , afloat * __restrict__ pa,
>  int main (void)
>  {
>    int i;
> -  int n=N;
>    afloat a[N];
>    afloat b[N] =
{0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
>    afloat c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
>
>    check_vect ();
>
> -  main1 (n,a,b,c);
> +  main1 (a,b,c);
>    return 0;
>  }
>
> Index: testsuite/gcc.dg/vect/vect-77.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-77.c,v
> retrieving revision 1.11
> diff -d -u -p -r1.11 vect-77.c
> --- testsuite/gcc.dg/vect/vect-77.c   7 Jun 2005 19:51:25 -0000   1.11
> +++ testsuite/gcc.dg/vect/vect-77.c   26 Jul 2005 09:59:52 -0000
> @@ -20,7 +20,6 @@ int main1 (aint *ib, int off)
>        ia[i] = ib[i+off];
>      }
>
> -
>    /* check results:  */
>    for (i = 0; i < N; i++)
>      {
> Index: testsuite/gcc.dg/vect/vect-92.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/vect/vect-92.c,v
> retrieving revision 1.4
> diff -d -u -p -r1.4 vect-92.c
> --- testsuite/gcc.dg/vect/vect-92.c   16 Jul 2005 18:56:53 -0000   1.4
> +++ testsuite/gcc.dg/vect/vect-92.c   26 Jul 2005 09:59:52 -0000
> @@ -50,17 +50,17 @@ main2 (afloat * __restrict__ pa, afloat
>  }
>
>  int
> -main3 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat *
> __restrict__ pc, int n)
> +main3 (afloat * __restrict__ pa, afloat * __restrict__ pb, afloat *
> __restrict__ pc)
>  {
>    int i;
>
> -  for (i = 0; i < n; i++)
> +  for (i = 0; i < N-1; i++)
>      {
>        pa[i+1] = pb[i+1] * pc[i+1];
>      }
>
>    /* check results:  */
> -  for (i = 0; i < n; i++)
> +  for (i = 0; i < N-1; i++)
>      {
>        if (pa[i+1] != (pb[i+1] * pc[i+1]))
>     abort ();
> @@ -80,7 +80,7 @@ int main (void)
>
>    main1 (a,b,c);
>    main2 (a,b,c);
> -  main3 (a,b,c,N-1);
> +  main3 (a,b,c);
>
>    return 0;
>  }

Comment 11 sebastian.pop@cri.ensmp.fr 2005-07-26 15:15:06 UTC

Subject: Re:  [4.1 Regression] wrong code for casts and scev

Dorit Naishlos wrote:
> 
> The modifications you suggest will make the tests uninteresting - they were
> introduced with unknown loop-bound/offset on purpose. In fact, for most of
> these tests there are already versions with explicit constant values for
> the loop-bound/offset -

Okay, I won't modify any of these testcases.

> vect-46.c with explicit loop-bound becomes vect-40.c
> vect-50.c with explicit loop-bound becomes vect-44.c
> vect-52.c with explicit loop-bound becomes vect-48.c
> vect-58.c with explicit loop-bound becomes vect-54.c
> vect-60.c with explicit loop-bound becomes vect-56.c
> vect-77.c and vect-78.c with explicit offset become vect-75.c
> and vect-92.c also uses unknown loop-bound on purpose.
> 
> Can we change something else in the tests to make the evolution-analyzer
> return something saner? by changing types of variables? by using some flag?
> 

I don't think it is possible to properly convert these ivs without
knowing an approximation of the number of iterations.

We can get this either from -fipcp, but this would make the testcases
redundant as you said, or having the IP alias analysis that tells us
that the only pointers passed to main1 () in vect-46 point to data
whose size is 256, and from this extract an estimation of the number
of iterations, or last solution explained below.

> (by the way, where does it fail the vectorizer? (what are the last things
> that dump details reports?))

compiling vect-46.c produces the following:

(...
(set_scalar_evolution 
  (scalar = D.2054_15)
  (scalar_evolution = (afloat * restrict) {0, +, 4}_1 + pb_14))
)

/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: Access function of ptr: (afloat * restrict) {0, +, 4}_1 + pb_14
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: not vectorized: ptr is loop invariant.
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: not vectorized: unhandled data ref: D.2055_16 = *D.2054_15
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: bad data references.
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: vectorized 0 loops in function.

Now that we keep the cast, (afloat * restrict) {(uint) 0, +, (uint)
4}_1 + pb_14 cannot be folded into {(afloat * restrict)pb_14, +,
(afloat * restrict)4}_1, that's why the vectorizer cannot recognize
the pattern.

The main problem is that the code in the loop contains a cast of the
signed iv i_18 to unsigned:

    bb_2 (preds = {bb_3 bb_1 }, succs = {bb_3 bb_5 })
    {
      # TMT.5_17 = PHI <TMT.5_27(3), TMT.5_26(1)>;
      # i_18 = PHI <i_24(3), 0(1)>;
    <L0>:;
      D.2050_6 = (long unsigned int) i_18;
      D.2051_7 = D.2050_6 * 4;
      D.2052_8 = (afloat * restrict) D.2051_7;
      D.2053_10 = D.2052_8 + pa_9;
      D.2054_15 = D.2052_8 + pb_14;
      #   VUSE <TMT.5_17>;
      D.2055_16 = *D.2054_15;
      D.2056_21 = D.2052_8 + pc_20;
      #   VUSE <TMT.5_17>;
      D.2057_22 = *D.2056_21;
      D.2058_23 = D.2055_16 * D.2057_22;
      #   TMT.5_27 = V_MAY_DEF <TMT.5_17>;
      *D.2053_10 = D.2058_23;
      i_24 = i_18 + 1;
      if (n_3 > i_24) goto <L9>; else goto <L11>;

    }

The solution is to have an estimation of the loop bound based on the
fact that i_24 and i_18 do not wrap.  From this estimation, I think
that the other vars D.2050_6, D.2051_7 and D.2052_8 can be proved to
not wrap.  I'm working on some code for estimating niter from signed
non wrapping ivs.

Sebastian

Comment 12 Daniel Berlin 2005-07-26 18:34:03 UTC

Subject: Re:  [4.1 Regression] wrong code for
	casts and scev

On Tue, 2005-07-26 at 12:10 +0200, Sebastian Pop wrote:
> After inlining, we end up with a loop containing the following code:
> 
>    b.0_3 = (signed char) b_8;
>    D.1621_4 = (int) b.0_3;
>    a_5 = (signed char) D.1621_4;
>    D.1640_6 = (int) a_5;
>    b_7 = D.1640_6 - 127;
>    if (b_7 > 1) goto <L3>; else goto <L9>;
>    
> that is equivalent to:
> 
>    b_7 = ((int) (signed char) (int) (signed char) b_8) - 127;
>    if (b_7 > 1) goto <L3>; else goto <L9>;
> 
> with b_8 = (unsigned char) {1, +, 1}
> 
>    b_7 = ((int) (signed char) (int) (signed char) {(uchar)1, +, (uchar)1}) - 127;
>    if (b_7 > 1) goto <L3>; else goto <L9>;
> 
> A sequence of unsigned char 1, 2, ..., 255 has to be converted to
> signed char that would wrap with -fwrapv: 1, 2, ..., 127, -128, ...
> So here is a patch that keeps the cast if the sequence wraps. 

You mean keep the cast at all, or keep it in the evolution describing
the number of iterations in the loop.

If you mean keep it in the number of iterations of the loop, that seems
wrong:

Let's look at the case again
void abort(void);

static inline void
foo (signed char a)
{
  int b = a - 0x7F;
  if (b > 1)
    abort();
}

int main()
{
  unsigned char b;
  for(b = 0;b <0xFF;b++)
   foo (b);
}

Note that foo doesn't *actually* change the value of the passed in parameter.
The loop still iterates 0xFF times.

Why is the cast affecting the number of iterations in the loop at all?
We should still calculate the number of iterations to be 255 in any case.

also,

in *all* of the vectorizer testcases, number_of_iterations_in_loop should be determined to be "n", an int.

The code in those cases provably doesn't wrap (signed integer arithmetic doesn't wrap without -fwrapv), 
So why do we think it should wrap, when it's not casted at all?

Comment 13 Daniel Berlin 2005-07-26 18:41:18 UTC

Subject: Re:  [4.1 Regression] wrong code for
	casts and scev

On Tue, 2005-07-26 at 17:19 +0200, Sebastian Pop wrote:
> Dorit Naishlos wrote:
> > 
> > The modifications you suggest will make the tests uninteresting - they were
> > introduced with unknown loop-bound/offset on purpose. In fact, for most of
> > these tests there are already versions with explicit constant values for
> > the loop-bound/offset -
> 
> Okay, I won't modify any of these testcases.
> 
> > vect-46.c with explicit loop-bound becomes vect-40.c
> > vect-50.c with explicit loop-bound becomes vect-44.c
> > vect-52.c with explicit loop-bound becomes vect-48.c
> > vect-58.c with explicit loop-bound becomes vect-54.c
> > vect-60.c with explicit loop-bound becomes vect-56.c
> > vect-77.c and vect-78.c with explicit offset become vect-75.c
> > and vect-92.c also uses unknown loop-bound on purpose.
> > 
> > Can we change something else in the tests to make the evolution-analyzer
> > return something saner? by changing types of variables? by using some flag?
> > 
> 
> I don't think it is possible to properly convert these ivs without
> knowing an approximation of the number of iterations.

I disagree.

They are int.  
Nothing modifies the upper bound in the loop.  
The upper bound is also int.
They are not casted back and forth to unsigned.
The only definition of the induction variable is in the bumper statement
(i = i + 1).
This should be more than sufficient to prove it does not wrap, and that
we can convert it.
--Dan

Comment 14 sebastian.pop@cri.ensmp.fr 2005-07-27 09:01:56 UTC

Subject: Re:  [4.1 Regression] wrong code for casts and scev

dberlin at dberlin dot org wrote:
> > A sequence of unsigned char 1, 2, ..., 255 has to be converted to
> > signed char that would wrap with -fwrapv: 1, 2, ..., 127, -128, ...
> > So here is a patch that keeps the cast if the sequence wraps. 
> 
> You mean keep the cast at all, or keep it in the evolution describing
> the number of iterations in the loop.
> 

Sorry for not being clear.  When chrec_convert fails, it simply
returns fold_convert (type, chrec), and because the folder does not
know about the chrecs at all it just returns a convert_expr 
(outer_type) {(inner_type)x, +, (inner_type)y}.

> If you mean keep it in the number of iterations of the loop, that seems
> wrong:
> 
> 
> Let's look at the case again
> void abort(void);
> 
> static inline void
> foo (signed char a)
> {
>   int b = a - 0x7F;
>   if (b > 1)
>     abort();
> }
> 
> int main()
> {
>   unsigned char b;
>   for(b = 0;b <0xFF;b++)
>    foo (b);
> }
> 
> Note that foo doesn't *actually* change the value of the passed in parameter.
> The loop still iterates 0xFF times.
> 
> Why is the cast affecting the number of iterations in the loop at all?
> We should still calculate the number of iterations to be 255 in any case.
> 

Once inlined, you have a loop with two exits: the first is the natural
exit after 255 iterations, the second is a jump outside the loop to a
bb that contains the abort function.  

If we have an evolution {(signed char) 0, +, (signed char) 1} - 127
can be folded to {(schar) -127, +, (schar) 1}, and the loop exit
guarded by the "if (b > 1)" becomes true for iteration 128.

So, there are two exits: the first one exits after 255 iterations the
second one exits after 128, and niter takes the minimum between these
two estimations, that is 128.

Now the failing step is that it is not possible to convert the
unsigned sequence {0, +, 1} that stands for 0, 1, ..., 255, into the
signed sequence {0, +, 1} that stands for 0, 1, ..., 127.  Here, a
part of the sequence has been lost in the translation.  

The solution here is to keep the cast around the first sequence:
(schar) {(uchar)0, +, (uchar)1}.  This is always safe with both
wrapping and non-wrapping semantics.

> also,
> 
> in *all* of the vectorizer testcases, number_of_iterations_in_loop should be determined to be "n", an int.
> 
> The code in those cases provably doesn't wrap (signed integer arithmetic doesn't wrap without -fwrapv), 
> So why do we think it should wrap, when it's not casted at all?
> 

The scev_probably_wraps_p analyzer is still too weak for infering that
the sequences used by the vectorizer don't wrap.

Let's take the example from vect-77.c

  D.2035_21 = offD.2023_11 + iD.2026_30;
  D.2036_22 = (long unsigned intD.4) D.2035_21;
  D.2037_23 = D.2036_22 * 4;
  D.2038_24 = (aintD.2020 *) D.2037_23;
  D.2039_25 = ibD.2022_16 + D.2038_24;

because you have this cast to unsigned and then the multiplication by
4 and an unknown initial value off_11, even if you know that the main
index i_30 doesn't wrap, you still can chose an offset large enough
for making D.2037_23 wrap.  So we need some extra information about
the range of offset if we want to convert the scev of D.2035_21.

For the moment this is the point where the analyzer fails to properly
convert the scev, and produces:

(set_scalar_evolution 
  (scalar = D.2036_22)
  (scalar_evolution = (long unsigned int) {off_11, +, 1}_1))

Is it possible to assume that a variable converted to a pointer type
does not wrap? As in: D.2038_24 = (aintD.2020 *) D.2037_23;

Comment 15 sebastian.pop@cri.ensmp.fr 2005-07-27 09:04:47 UTC

Subject: Re:  [4.1 Regression] wrong code for casts and scev

dberlin at dberlin dot org wrote:
> > I don't think it is possible to properly convert these ivs without
> > knowing an approximation of the number of iterations.
> 
> I disagree.
> 

Okay, we could (and should) use other informations than the number of
iterations for proving non-wrappingness ;-)

Comment 16 GCC Commits 2005-08-13 17:28:53 UTC

Subject: Bug 22236

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	spop@gcc.gnu.org	2005-08-13 17:28:43

Modified files:
	gcc            : ChangeLog tree-cfg.c tree-chrec.c 
	                 tree-data-ref.c tree-data-ref.h 
	                 tree-flow-inline.h tree-flow.h 
	                 tree-scalar-evolution.c tree-ssa-loop-ivcanon.c 
	                 tree-ssa-loop-ivopts.c tree-ssa-loop-niter.c 
	                 tree-vect-analyze.c tree-vrp.c 
Added files:
	gcc/testsuite/gcc.dg/tree-ssa: pr22236.c 

Log message:
	PR tree-optimization/22236
	* tree-cfg.c (print_pred_bbs, print_succ_bbs): Correctly print
	successors and predecessors.
	* tree-chrec.c (chrec_convert): Before converting, check that
	sequences don't wrap.
	* tree-data-ref.c (compute_estimated_nb_iterations): Moved ...
	(analyze_array): Extern.
	(find_data_references_in_loop): Remove call to
	compute_estimated_nb_iterations.
	* tree-data-ref.h (analyze_array): Declared.
	* tree-flow-inline.h (single_ssa_tree_operand, single_ssa_use_operand,
	single_ssa_def_operand, zero_ssa_operands): Fix documentation.
	* tree-flow.h (scev_probably_wraps_p): Declare with an extra parameter.
	* tree-scalar-evolution.c (instantiate_parameters_1): Factor entry
	condition.
	* tree-ssa-loop-ivcanon.c: Fix documentation.
	* tree-ssa-loop-ivopts.c (idx_find_step): Add a fixme note.
	* tree-ssa-loop-niter.c (compute_estimated_nb_iterations): ... here.
	(infer_loop_bounds_from_undefined): New.
	(estimate_numbers_of_iterations_loop): Use
	infer_loop_bounds_from_undefined.
	(used_in_pointer_arithmetic_p): New.
	(scev_probably_wraps_p): Pass an extra parameter.  Call
	used_in_pointer_arithmetic_p.  Check that AT_STMT is not null.
	(convert_step): Fix documentation.
	* tree-vrp.c (adjust_range_with_scev): Call instantiate_parameters.
	Use initial_condition_in_loop_num and evolution_part_in_loop_num
	instead of CHREC_LEFT and CHREC_RIGHT.  Adjust the call to
	scev_probably_wraps_p.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9720&r2=2.9721
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-cfg.c.diff?cvsroot=gcc&r1=2.215&r2=2.216
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-chrec.c.diff?cvsroot=gcc&r1=2.22&r2=2.23
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-data-ref.c.diff?cvsroot=gcc&r1=2.35&r2=2.36
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-data-ref.h.diff?cvsroot=gcc&r1=2.10&r2=2.11
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-flow-inline.h.diff?cvsroot=gcc&r1=2.55&r2=2.56
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-flow.h.diff?cvsroot=gcc&r1=2.131&r2=2.132
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-scalar-evolution.c.diff?cvsroot=gcc&r1=2.34&r2=2.35
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-loop-ivcanon.c.diff?cvsroot=gcc&r1=2.19&r2=2.20
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-loop-ivopts.c.diff?cvsroot=gcc&r1=2.86&r2=2.87
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-loop-niter.c.diff?cvsroot=gcc&r1=2.35&r2=2.36
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-vect-analyze.c.diff?cvsroot=gcc&r1=2.34&r2=2.35
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-vrp.c.diff?cvsroot=gcc&r1=2.51&r2=2.52
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.dg/tree-ssa/pr22236.c.diff?cvsroot=gcc&r1=NONE&r2=1.1

Comment 17 Andrew Pinski 2005-08-17 03:04:25 UTC

Hmm, we now have a missed optimization of removing the loop but that is not a regression so closing 
as fixed.