[patch] PR 30730, PR 26900

Zdenek Dvorak rakdver@atrey.karlin.mff.cuni.cz
Sun Mar 11 19:42:00 GMT 2007


Hello,

this is another attempt to fix some # of iterations analysis problems.
Unlike the previous try (http://gcc.gnu.org/ml/gcc-patches/2007-02/msg00931.html),
it avoids adding new transformations to fold-const, since such
transformations are not likely to be useful in any other context, and
might slow down rest of the compiler unnecessarily.

In addition to being possibly inefficient, another problem with my
previous attempts that try to use fold-const is that it is often
nontrivial to even formulate the assumptions, because of the risk of
introducing overflows.  For example, consider the following loop:

int i;

if (a - 2 < b)
  {
    i = a;
    while (i < b)
     i += 2;
  }

Since i is signed, it cannot overflow, and the number of iterations of
this loop can be expressed as ((unsigned) b - (unsigned) a + 1) / 2,
unless the computation overflows or a > b + 1.  These assumptions can be
verified using the information contained in the guard before the loop.
However, formulating these assumptions as tree expressions is nontrivial
-- in particular, asking whether a > b + 1 would be wrong if b + 1
overflows.

But, it turns out that most of these assumptions can be formulated as
queries on the value range of the difference of the bounds of the
loop, where the difference is computed without overflows (in unlimited
precision).  This also has an advantage that such computation without
overflows is easier to reason about, and the value range is fairly
easy to derive from the loop guard(s).

A small problem is that the bounds of such a value range may not belong
to the original type -- we need one more bit of precision to express
them -- and we do not necessarily have a wider type to represent the
value.  So, the patch uses gmp to represent them.

Some cleanups should follow up this patch.  I guess that with some
effort, we might be able to get rid of simplify_using_initial_conditions
or at least limit its usage.  Also, it might be nice to check whether
some code cannot be shared with vrp (although not caring about the
value ranges themselves, but rather about the value range of the
difference in unlimited precision seems to make this somewhat unlikely).

Bootstrapped & regtested on i686 and x86_64.  Given the size of the
patch, I will wait for some time to give people opportunity to comment
before commiting it.

Zdenek

	PR tree-optimization/30730
	PR tree-optimization/26900
	* tree-ssa-loop-niter.c: Include gmp.h.
	(bounds): New type.
	(mpz_set_double_int, get_type_bounds, mpz_to_double_int,
	split_to_var_and_offset, determine_value_range,
	bound_difference_of_offsetted_base, refine_bounds_using_guard,
	bound_difference, bounds_add, bounds_negate,
	number_of_iterations_ne_max, dump_affine_iv): New functions.
	(number_of_iterations_ne, number_of_iterations_lt_to_ne,
	assert_loop_rolls_lt, assert_loop_rolls_le): Use bounds on the
	difference of initial and final value of control iv to validate
	results.
	(number_of_iterations_cond): Add loop parameter.  Determine bounds
	on the difference of the extremes of the control iv.  Add dumps.
	(expand_simple_operations): Handle phi nodes.
	(simplify_using_initial_conditions): Do not record used conditions.
	(number_of_iterations_exit): Pass loop to number_of_iterations_cond.
	Do not set additional_info.
	(implies_nonnegative_p, implies_ge_p): Removed.
	(derive_constant_upper_bound): Do not use parameter `additional'.
	(record_estimate): Parameter `additional' removed.  Parameter
	`i_bound' added.  Do not call derive_constant_upper_bound.
	(record_nonwrapping_iv): Use derive_constant_upper_bound to
	bound the number of iterations estimate.
	(estimate_numbers_of_iterations_loop): Pass the estimate from
	the number of iterations analysis to record_estimate.
	* tree.h (multiple_of_p): Declare.
	* tree-scalar-evolution.c (expression_expensive_p): Removed.
	(scev_const_prop): Do not check expression_expensive_p.
	* fold-const.c (multiple_of_p): Exported.
	* double-int.c (double_int_mask): Exported.
	* double-int.h (double_int_mask): Declare.
	* tree-flow.h (struct tree_niter_desc): Removed additional_info
	field.  Added max field.

	* gcc.dg/tree-ssa/loop-26.c: New test.

Index: tree-ssa-loop-niter.c
===================================================================
*** tree-ssa-loop-niter.c	(revision 122817)
--- tree-ssa-loop-niter.c	(working copy)
*************** Software Foundation, 51 Franklin Street,
*** 42,50 ****
--- 42,55 ----
  #include "flags.h"
  #include "toplev.h"
  #include "tree-inline.h"
+ #include "gmp.h"
  
  #define SWAP(X, Y) do { void *tmp = (X); (X) = (Y); (Y) = tmp; } while (0)
  
+ /* The maximum number of dominator BBs we search for conditions
+    of loop header copies we use for simplifying a conditional
+    expression.  */
+ #define MAX_DOMINATORS_TO_WALK 8
  
  /*
  
*************** Software Foundation, 51 Franklin Street,
*** 52,57 ****
--- 57,548 ----
  
  */
  
+ /* Bounds on some value, BELOW <= X <= UP.  */
+ 
+ typedef struct
+ {
+   mpz_t below, up;
+ } bounds;
+ 
+ /* Sets RESULT to VAL, taken unsigned if UNS is true and as signed
+    otherwise.  */
+ 
+ static void
+ mpz_set_double_int (mpz_t result, double_int val, bool uns)
+ {
+   bool negate = false;
+   unsigned HOST_WIDE_INT vp[2];
+ 
+   if (!uns && double_int_negative_p (val))
+     {
+       negate = true;
+       val = double_int_neg (val);
+     }
+ 
+   vp[0] = val.low;
+   vp[1] = (unsigned HOST_WIDE_INT) val.high;
+   mpz_import (result, 2, -1, sizeof (HOST_WIDE_INT), 0, 0, vp);
+ 
+   if (negate)
+     mpz_neg (result, result);
+ }
+ 
+ /* Stores bounds of TYPE to MIN and MAX.  */
+ 
+ static void
+ get_type_bounds (tree type, mpz_t min, mpz_t max)
+ {
+   if (TYPE_UNSIGNED (type))
+     {
+       mpz_set_ui (min, 0);
+       mpz_set_double_int (max, double_int_mask (TYPE_PRECISION (type)), true);
+     }
+   else
+     {
+       double_int mx, mn;
+       
+       mx = double_int_mask (TYPE_PRECISION (type) - 1);
+       mn = double_int_sext (double_int_add (mx, double_int_one),
+ 			    TYPE_PRECISION (type));
+       mpz_set_double_int (max, mx, true);
+       mpz_set_double_int (min, mn, false);
+     }
+ }
+ 
+ /* Returns VAL converted to TYPE.  If VAL does not fit in TYPE,
+    the minimum or maximum value of the type is returned instead.  */
+ 
+ static double_int
+ mpz_to_double_int (tree type, mpz_t val)
+ {
+   mpz_t min, max;
+   unsigned HOST_WIDE_INT vp[2];
+   bool negate = false;
+   size_t count;
+   double_int res;
+ 
+   mpz_init (min);
+   mpz_init (max);
+   get_type_bounds (type, min, max);
+ 
+   if (mpz_cmp (val, min) < 0)
+     mpz_set (val, min);
+   else if (mpz_cmp (val, max) > 0)
+     mpz_set (val, max);
+ 
+   if (mpz_sgn (val) < 0)
+     negate = true;
+ 
+   vp[0] = 0;
+   vp[1] = 0;
+   mpz_export (vp, &count, -1, sizeof (HOST_WIDE_INT), 0, 0, val);
+   gcc_assert (count <= 2);
+   
+   mpz_clear (min);
+   mpz_clear (max);
+ 
+   res.low = vp[0];
+   res.high = (HOST_WIDE_INT) vp[1];
+ 
+   res = double_int_ext (res, TYPE_PRECISION (type), TYPE_UNSIGNED (type));
+   if (negate)
+     res = double_int_neg (res);
+ 
+   return res;
+ }
+ 
+ /* Splits expression EXPR to a variable part VAR and constant OFFSET.  */
+ 
+ static void
+ split_to_var_and_offset (tree expr, tree *var, mpz_t offset)
+ {
+   tree type = TREE_TYPE (expr);
+   tree op0, op1;
+   double_int off;
+   bool negate = false;
+ 
+   *var = expr;
+   mpz_set_ui (offset, 0);
+ 
+   switch (TREE_CODE (expr))
+     {
+     case MINUS_EXPR:
+       negate = true;
+       /* Fallthru.  */
+ 
+     case PLUS_EXPR:
+       op0 = TREE_OPERAND (expr, 0);
+       op1 = TREE_OPERAND (expr, 1);
+ 
+       if (TREE_CODE (op1) != INTEGER_CST)
+ 	break;
+ 
+       *var = op0;
+       /* Always sign extend the offset.  */
+       off = double_int_sext (tree_to_double_int (op1),
+ 			     TYPE_PRECISION (type));
+       mpz_set_double_int (offset, off, false);
+       break;
+ 
+     case INTEGER_CST:
+       *var = build_int_cst_type (type, 0);
+       off = tree_to_double_int (expr);
+       mpz_set_double_int (offset, off, TYPE_UNSIGNED (type));
+       break;
+ 
+     default:
+       break;
+     }
+ }
+ 
+ /* Stores estimate on the minimum/maximum value of the expression VAR + OFF
+    in TYPE to MIN and MAX.  */
+ 
+ static void
+ determine_value_range (tree type, tree var, mpz_t off,
+ 		       mpz_t min, mpz_t max)
+ {
+   /* If the expression is a constant, we know its value exactly.  */
+   if (integer_zerop (var))
+     {
+       mpz_set (min, off);
+       mpz_set (max, off);
+       return;
+     }
+ 
+   /* If the computation may wrap, we know nothing about the value, except for
+      the range of the type.  */
+   get_type_bounds (type, min, max);
+   if (!nowrap_type_p (type))
+     return;
+ 
+   /* Since the addition of OFF does not wrap, if OFF is positive, then we may
+      add it to MIN, otherwise to MAX.  */
+   if (mpz_sgn (off) < 0)
+     mpz_add (max, max, off);
+   else
+     mpz_add (min, min, off);
+ }
+ 
+ /* Stores the bounds on the difference of the values of the expressions
+    (var + X) and (var + Y), computed in TYPE, to BNDS.  */
+ 
+ static void
+ bound_difference_of_offsetted_base (tree type, mpz_t x, mpz_t y,
+ 				    bounds *bnds)
+ {
+   int rel = mpz_cmp (x, y);
+   bool may_wrap = !nowrap_type_p (type);
+   mpz_t m;
+ 
+   /* If X == Y, then the expressions are always equal.
+      If X > Y, there are the following possibilities:
+        a) neither of var + X and var + Y overflow or underflow, or both of
+ 	  them do.  Then their difference is X - Y.
+        b) var + X overflows, and var + Y does not.  Then the values of the
+ 	  expressions are var + X - M and var + Y, where M is the range of
+ 	  the type, and their difference is X - Y - M.
+        c) var + Y underflows and var + X does not.  Their difference again
+ 	  is M - X + Y.
+        Therefore, if the arithmetics in type does not overflow, then the
+        bounds are (X - Y, X - Y), otherwise they are (X - Y - M, X - Y)
+      Similarly, if X < Y, the bounds are either (X - Y, X - Y) or
+      (X - Y, X - Y + M).  */
+ 
+   if (rel == 0)
+     {
+       mpz_set_ui (bnds->below, 0);
+       mpz_set_ui (bnds->up, 0);
+       return;
+     }
+ 
+   mpz_init (m);
+   mpz_set_double_int (m, double_int_mask (TYPE_PRECISION (type)), true);
+   mpz_add_ui (m, m, 1);
+   mpz_sub (bnds->up, x, y);
+   mpz_set (bnds->below, bnds->up);
+ 
+   if (may_wrap)
+     {
+       if (rel > 0)
+ 	mpz_sub (bnds->below, bnds->below, m);
+       else
+ 	mpz_add (bnds->up, bnds->up, m);
+     }
+ 
+   mpz_clear (m);
+ }
+ 
+ /* From condition C0 CMP C1 derives information regarding the
+    difference of values of VARX + OFFX and VARY + OFFY, computed in TYPE,
+    and stores it to BNDS.  */
+ 
+ static void
+ refine_bounds_using_guard (tree type, tree varx, mpz_t offx,
+ 			   tree vary, mpz_t offy,
+ 			   tree c0, enum tree_code cmp, tree c1,
+ 			   bounds *bnds)
+ {
+   tree varc0, varc1, tmp;
+   mpz_t offc0, offc1, loffx, loffy, bnd;
+   bool lbound = false;
+   bool no_wrap = nowrap_type_p (type);
+   bool x_ok, y_ok;
+ 
+   switch (cmp)
+     {
+     case LT_EXPR:
+     case LE_EXPR:
+     case GT_EXPR:
+     case GE_EXPR:
+       break;
+ 
+     case EQ_EXPR:
+       /* We could derive quite precise information from EQ_EXPR, however, such
+ 	 a guard is unlikely to appear, so we do not bother with handling it. 
+ 	 TODO.  */
+       return;
+ 
+     case NE_EXPR:
+       /* NE_EXPR comparisons do not contain much of useful information (except for
+ 	 special cases like comparing with the bounds of the type, TODO).  */
+       return;
+     default:
+       return;
+     } 
+ 
+   mpz_init (offc0);
+   mpz_init (offc1);
+   split_to_var_and_offset (expand_simple_operations (c0), &varc0, offc0);
+   split_to_var_and_offset (expand_simple_operations (c1), &varc1, offc1);
+ 
+   /* We are only interested in comparisons of expressions based on VARX and
+      VARY.  TODO -- we might also be able to derive some bounds from
+      expressions containing just one of the variables.  */
+ 
+   if (operand_equal_p (varx, varc1, 0))
+     {
+       tmp = varc0; varc0 = varc1; varc1 = tmp;
+       mpz_swap (offc0, offc1);
+       cmp = swap_tree_comparison (cmp);
+     }
+ 
+   if (!operand_equal_p (varx, varc0, 0)
+       || !operand_equal_p (vary, varc1, 0))
+     goto end;
+ 
+   mpz_init_set (loffx, offx);
+   mpz_init_set (loffy, offy);
+ 
+   if (cmp == GT_EXPR || cmp == GE_EXPR)
+     {
+       tmp = varx; varx = vary; vary = tmp;
+       mpz_swap (offc0, offc1);
+       mpz_swap (loffx, loffy);
+       cmp = swap_tree_comparison (cmp);
+       lbound = true;
+     }
+ 
+   /* If there is no overflow, the condition implies that
+ 
+      (VARX + OFFX) cmp (VARY + OFFY) + (OFFX - OFFY + OFFC1 - OFFC0).
+ 
+      The overflows and underflows may complicate things a bit; each
+      overflow decreases the appropriate offset by M, and underflow
+      increases it by M.  The above inequality would not necessarily be
+      true if
+    
+      -- VARX + OFFX underflows and VARX + OFFC0 does not, or
+ 	VARX + OFFC0 overflows, but VARX + OFFX does not.
+ 	This may only happen if OFFX < OFFC0.
+      -- VARY + OFFY overflows and VARY + OFFC1 does not, or
+ 	VARY + OFFC1 underflows and VARY + OFFY does not.
+ 	This may only happen if OFFY > OFFC1.  */
+ 
+   if (no_wrap)
+     {
+       x_ok = true;
+       y_ok = true;
+     }
+   else
+     {
+       x_ok = (integer_zerop (varx)
+ 	      || mpz_cmp (loffx, offc0) >= 0);
+       y_ok = (integer_zerop (vary)
+ 	      || mpz_cmp (loffy, offc1) <= 0);
+     }
+ 
+   if (x_ok && y_ok)
+     {
+       mpz_init (bnd);
+       mpz_sub (bnd, loffx, loffy);
+       mpz_add (bnd, bnd, offc1);
+       mpz_sub (bnd, bnd, offc0);
+ 
+       if (cmp == LT_EXPR)
+ 	mpz_sub_ui (bnd, bnd, 1);
+ 
+       if (lbound)
+ 	{
+ 	  mpz_neg (bnd, bnd);
+ 	  if (mpz_cmp (bnds->below, bnd) < 0)
+ 	    mpz_set (bnds->below, bnd);
+ 	}
+       else
+ 	{
+ 	  if (mpz_cmp (bnd, bnds->up) < 0)
+ 	    mpz_set (bnds->up, bnd);
+ 	}
+       mpz_clear (bnd);
+     }
+ 
+   mpz_clear (loffx);
+   mpz_clear (loffy);
+ end:
+   mpz_clear (offc0);
+   mpz_clear (offc1);
+ }
+ 
+ /* Stores the bounds on the value of the expression X - Y in LOOP to BNDS.
+    The subtraction is considered to be performed in arbitrary precision,
+    without overflows.
+  
+    We do not attempt to be too clever regarding the value ranges of X and
+    Y; most of the time, they are just integers or ssa names offsetted by
+    integer.  However, we try to use the information contained in the
+    comparisons before the loop (usually created by loop header copying).  */
+ 
+ static void
+ bound_difference (struct loop *loop, tree x, tree y, bounds *bnds)
+ {
+   tree type = TREE_TYPE (x);
+   tree varx, vary;
+   mpz_t offx, offy;
+   mpz_t minx, maxx, miny, maxy;
+   int cnt = 0;
+   edge e;
+   basic_block bb;
+   tree cond, c0, c1, ctype;
+   enum tree_code cmp;
+ 
+   mpz_init (bnds->below);
+   mpz_init (bnds->up);
+   mpz_init (offx);
+   mpz_init (offy);
+   split_to_var_and_offset (x, &varx, offx);
+   split_to_var_and_offset (y, &vary, offy);
+ 
+   if (!integer_zerop (varx)
+       && operand_equal_p (varx, vary, 0))
+     {
+       /* Special case VARX == VARY -- we just need to compare the
+          offsets.  The matters are a bit more complicated in the
+ 	 case addition of offsets may wrap.  */
+       bound_difference_of_offsetted_base (type, offx, offy, bnds);
+     }
+   else
+     {
+       /* Otherwise, use the value ranges to determine the initial
+ 	 estimates on below and up.  */
+       mpz_init (minx);
+       mpz_init (maxx);
+       mpz_init (miny);
+       mpz_init (maxy);
+       determine_value_range (type, varx, offx, minx, maxx);
+       determine_value_range (type, vary, offy, miny, maxy);
+ 
+       mpz_sub (bnds->below, minx, maxy);
+       mpz_sub (bnds->up, maxx, miny);
+       mpz_clear (minx);
+       mpz_clear (maxx);
+       mpz_clear (miny);
+       mpz_clear (maxy);
+     }
+ 
+   /* If both X and Y are constants, we cannot get any more precise.  */
+   if (integer_zerop (varx) && integer_zerop (vary))
+     goto end;
+ 
+   /* Now walk the dominators of the loop header and use the entry
+      guards to refine the estimates.  */
+   for (bb = loop->header;
+        bb != ENTRY_BLOCK_PTR && cnt < MAX_DOMINATORS_TO_WALK;
+        bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+     {
+       if (!single_pred_p (bb))
+ 	continue;
+       e = single_pred_edge (bb);
+ 
+       if (!(e->flags & (EDGE_TRUE_VALUE | EDGE_FALSE_VALUE)))
+ 	continue;
+ 
+       cond = COND_EXPR_COND (last_stmt (e->src));
+       if (!COMPARISON_CLASS_P (cond))
+ 	continue;
+       c0 = TREE_OPERAND (cond, 0);
+       cmp = TREE_CODE (cond);
+       c1 = TREE_OPERAND (cond, 1);
+       ctype = TREE_TYPE (c0);
+ 
+       if (!tree_ssa_useless_type_conversion_1 (ctype, type))
+ 	continue;
+ 
+       if (e->flags & EDGE_FALSE_VALUE)
+ 	cmp = invert_tree_comparison (cmp, false);
+ 
+       refine_bounds_using_guard (type, varx, offx, vary, offy,
+ 				 c0, cmp, c1, bnds);
+       ++cnt;
+     }
+ 
+ end:
+   mpz_clear (offx);
+   mpz_clear (offy);
+ }
+ 
+ /* Update the bounds in BNDS that restrict the value of X to the bounds
+    that restrict the value of X + DELTA.  X can be obtained as a
+    difference of two values in TYPE.  */
+ 
+ static void
+ bounds_add (bounds *bnds, double_int delta, tree type)
+ {
+   mpz_t mdelta, max;
+ 
+   mpz_init (mdelta);
+   mpz_set_double_int (mdelta, delta, false);
+ 
+   mpz_init (max);
+   mpz_set_double_int (max, double_int_mask (TYPE_PRECISION (type)), true);
+ 
+   mpz_add (bnds->up, bnds->up, mdelta);
+   mpz_add (bnds->below, bnds->below, mdelta);
+ 
+   if (mpz_cmp (bnds->up, max) > 0)
+     mpz_set (bnds->up, max);
+ 
+   mpz_neg (max, max);
+   if (mpz_cmp (bnds->below, max) < 0)
+     mpz_set (bnds->below, max);
+ 
+   mpz_clear (mdelta);
+   mpz_clear (max);
+ }
+ 
+ /* Update the bounds in BNDS that restrict the value of X to the bounds
+    that restrict the value of -X.  */
+ 
+ static void
+ bounds_negate (bounds *bnds)
+ {
+   mpz_t tmp;
+ 
+   mpz_init_set (tmp, bnds->up);
+   mpz_neg (bnds->up, bnds->below);
+   mpz_neg (bnds->below, tmp);
+   mpz_clear (tmp);
+ }
+ 
  /* Returns inverse of X modulo 2^s, where MASK = 2^s-1.  */
  
  static tree
*************** inverse (tree x, tree mask)
*** 96,121 ****
    return rslt;
  }
  
  /* Determines number of iterations of loop whose ending condition
     is IV <> FINAL.  TYPE is the type of the iv.  The number of
     iterations is stored to NITER.  NEVER_INFINITE is true if
     we know that the exit must be taken eventually, i.e., that the IV
     ever reaches the value FINAL (we derived this earlier, and possibly set
!    NITER->assumptions to make sure this is the case).  */
  
  static bool
  number_of_iterations_ne (tree type, affine_iv *iv, tree final,
! 			 struct tree_niter_desc *niter, bool never_infinite)
  {
    tree niter_type = unsigned_type_for (type);
    tree s, c, d, bits, assumption, tmp, bound;
  
    niter->control = *iv;
    niter->bound = final;
    niter->cmp = NE_EXPR;
  
!   /* Rearrange the terms so that we get inequality s * i <> c, with s
!      positive.  Also cast everything to the unsigned type.  */
    if (tree_int_cst_sign_bit (iv->step))
      {
        s = fold_convert (niter_type,
--- 587,657 ----
    return rslt;
  }
  
+ /* Derives the upper bound BND on the number of executions of loop with exit
+    condition S * i <> C, assuming that the loop is not infinite.  If
+    NO_OVERFLOW is true, then the control variable of the loop does not
+    overflow.  If NO_OVERFLOW is true or BNDS.below >= 0, then BNDS.up
+    contains the upper bound on the value of C.  */
+ 
+ static void
+ number_of_iterations_ne_max (mpz_t bnd, bool no_overflow, tree c, tree s,
+ 			     bounds *bnds)
+ {
+   double_int max;
+   mpz_t d;
+ 
+   /* If the control variable does not overflow, the number of iterations is
+      at most c / s.  Otherwise it is at most the period of the control
+      variable.  */
+   if (!no_overflow && !multiple_of_p (TREE_TYPE (c), c, s))
+     {
+       max = double_int_mask (TYPE_PRECISION (TREE_TYPE (c))
+ 			     - tree_low_cst (num_ending_zeros (s), 1));
+       mpz_set_double_int (bnd, max, true);
+       return;
+     }
+ 
+   /* Determine the upper bound on C.  */
+   if (no_overflow || mpz_sgn (bnds->below) >= 0)
+     mpz_set (bnd, bnds->up);
+   else if (TREE_CODE (c) == INTEGER_CST)
+     mpz_set_double_int (bnd, tree_to_double_int (c), true);
+   else
+     mpz_set_double_int (bnd, double_int_mask (TYPE_PRECISION (TREE_TYPE (c))),
+ 			true);
+ 
+   mpz_init (d);
+   mpz_set_double_int (d, tree_to_double_int (s), true);
+   mpz_fdiv_q (bnd, bnd, d);
+   mpz_clear (d);
+ }
+ 
  /* Determines number of iterations of loop whose ending condition
     is IV <> FINAL.  TYPE is the type of the iv.  The number of
     iterations is stored to NITER.  NEVER_INFINITE is true if
     we know that the exit must be taken eventually, i.e., that the IV
     ever reaches the value FINAL (we derived this earlier, and possibly set
!    NITER->assumptions to make sure this is the case).  BNDS contains the
!    bounds on the difference FINAL - IV->base.  */
  
  static bool
  number_of_iterations_ne (tree type, affine_iv *iv, tree final,
! 			 struct tree_niter_desc *niter, bool never_infinite,
! 			 bounds *bnds)
  {
    tree niter_type = unsigned_type_for (type);
    tree s, c, d, bits, assumption, tmp, bound;
+   mpz_t max;
  
    niter->control = *iv;
    niter->bound = final;
    niter->cmp = NE_EXPR;
  
!   /* Rearrange the terms so that we get inequality S * i <> C, with S
!      positive.  Also cast everything to the unsigned type.  If IV does
!      not overflow, BNDS bounds the value of C.  Also, this is the
!      case if the computation |FINAL - IV->base| does not overflow, i.e.,
!      if BNDS->below in the result is nonnegative.  */
    if (tree_int_cst_sign_bit (iv->step))
      {
        s = fold_convert (niter_type,
*************** number_of_iterations_ne (tree type, affi
*** 123,128 ****
--- 659,665 ----
        c = fold_build2 (MINUS_EXPR, niter_type,
  		       fold_convert (niter_type, iv->base),
  		       fold_convert (niter_type, final));
+       bounds_negate (bnds);
      }
    else
      {
*************** number_of_iterations_ne (tree type, affi
*** 132,137 ****
--- 669,679 ----
  		       fold_convert (niter_type, iv->base));
      }
  
+   mpz_init (max);
+   number_of_iterations_ne_max (max, iv->no_overflow, c, s, bnds);
+   niter->max = mpz_to_double_int (niter_type, max);
+   mpz_clear (max);
+ 
    /* First the trivial cases -- when the step is 1.  */
    if (integer_onep (s))
      {
*************** number_of_iterations_ne (tree type, affi
*** 175,191 ****
     of the step.  The assumptions necessary to ensure that the computation
     of the final value does not overflow are recorded in NITER.  If we
     find the final value, we adjust DELTA and return TRUE.  Otherwise
!    we return false.  */
  
  static bool
  number_of_iterations_lt_to_ne (tree type, affine_iv *iv0, affine_iv *iv1,
  			       struct tree_niter_desc *niter,
! 			       tree *delta, tree step)
  {
    tree niter_type = TREE_TYPE (step);
    tree mod = fold_build2 (FLOOR_MOD_EXPR, niter_type, *delta, step);
    tree tmod;
    tree assumption = boolean_true_node, bound, noloop;
  
    if (TREE_CODE (mod) != INTEGER_CST)
      return false;
--- 717,737 ----
     of the step.  The assumptions necessary to ensure that the computation
     of the final value does not overflow are recorded in NITER.  If we
     find the final value, we adjust DELTA and return TRUE.  Otherwise
!    we return false.  BNDS bounds the value of IV1->base - IV0->base,
!    and will be updated by the same amount as DELTA.  */
  
  static bool
  number_of_iterations_lt_to_ne (tree type, affine_iv *iv0, affine_iv *iv1,
  			       struct tree_niter_desc *niter,
! 			       tree *delta, tree step,
! 			       bounds *bnds)
  {
    tree niter_type = TREE_TYPE (step);
    tree mod = fold_build2 (FLOOR_MOD_EXPR, niter_type, *delta, step);
    tree tmod;
+   mpz_t mmod;
    tree assumption = boolean_true_node, bound, noloop;
+   bool ret = false;
  
    if (TREE_CODE (mod) != INTEGER_CST)
      return false;
*************** number_of_iterations_lt_to_ne (tree type
*** 193,198 ****
--- 739,748 ----
      mod = fold_build2 (MINUS_EXPR, niter_type, step, mod);
    tmod = fold_convert (type, mod);
  
+   mpz_init (mmod);
+   mpz_set_double_int (mmod, tree_to_double_int (mod), true);
+   mpz_neg (mmod, mmod);
+ 
    if (integer_nonzerop (iv0->step))
      {
        /* The final value of the iv is iv1->base + MOD, assuming that this
*************** number_of_iterations_lt_to_ne (tree type
*** 205,216 ****
  	  assumption = fold_build2 (LE_EXPR, boolean_type_node,
  				    iv1->base, bound);
  	  if (integer_zerop (assumption))
! 	    return false;
  	}
!       noloop = fold_build2 (GT_EXPR, boolean_type_node,
! 			    iv0->base,
! 			    fold_build2 (PLUS_EXPR, type,
! 					 iv1->base, tmod));
      }
    else
      {
--- 755,769 ----
  	  assumption = fold_build2 (LE_EXPR, boolean_type_node,
  				    iv1->base, bound);
  	  if (integer_zerop (assumption))
! 	    goto end;
  	}
!       if (mpz_cmp (mmod, bnds->below) < 0)
! 	noloop = boolean_false_node;
!       else
! 	noloop = fold_build2 (GT_EXPR, boolean_type_node,
! 			      iv0->base,
! 			      fold_build2 (PLUS_EXPR, type,
! 					   iv1->base, tmod));
      }
    else
      {
*************** number_of_iterations_lt_to_ne (tree type
*** 224,235 ****
  	  assumption = fold_build2 (GE_EXPR, boolean_type_node,
  				    iv0->base, bound);
  	  if (integer_zerop (assumption))
! 	    return false;
  	}
!       noloop = fold_build2 (GT_EXPR, boolean_type_node,
! 			    fold_build2 (MINUS_EXPR, type,
! 					 iv0->base, tmod),
! 			    iv1->base);
      }
  
    if (!integer_nonzerop (assumption))
--- 777,791 ----
  	  assumption = fold_build2 (GE_EXPR, boolean_type_node,
  				    iv0->base, bound);
  	  if (integer_zerop (assumption))
! 	    goto end;
  	}
!       if (mpz_cmp (mmod, bnds->below) < 0)
! 	noloop = boolean_false_node;
!       else
! 	noloop = fold_build2 (GT_EXPR, boolean_type_node,
! 			      fold_build2 (MINUS_EXPR, type,
! 					   iv0->base, tmod),
! 			      iv1->base);
      }
  
    if (!integer_nonzerop (assumption))
*************** number_of_iterations_lt_to_ne (tree type
*** 240,247 ****
      niter->may_be_zero = fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
  				      niter->may_be_zero,
  				      noloop);
    *delta = fold_build2 (PLUS_EXPR, niter_type, *delta, mod);
!   return true;
  }
  
  /* Add assertions to NITER that ensure that the control variable of the loop
--- 796,808 ----
      niter->may_be_zero = fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
  				      niter->may_be_zero,
  				      noloop);
+   bounds_add (bnds, tree_to_double_int (mod), type);
    *delta = fold_build2 (PLUS_EXPR, niter_type, *delta, mod);
! 
!   ret = true;
! end:
!   mpz_clear (mmod);
!   return ret;
  }
  
  /* Add assertions to NITER that ensure that the control variable of the loop
*************** assert_no_overflow_lt (tree type, affine
*** 315,328 ****
  }
  
  /* Add an assumption to NITER that a loop whose ending condition
!    is IV0 < IV1 rolls.  TYPE is the type of the control iv.  */
  
  static void
  assert_loop_rolls_lt (tree type, affine_iv *iv0, affine_iv *iv1,
! 		      struct tree_niter_desc *niter)
  {
    tree assumption = boolean_true_node, bound, diff;
    tree mbz, mbzl, mbzr;
  
    if (integer_nonzerop (iv0->step))
      {
--- 876,950 ----
  }
  
  /* Add an assumption to NITER that a loop whose ending condition
!    is IV0 < IV1 rolls.  TYPE is the type of the control iv.  BNDS
!    bounds the value of IV1->base - IV0->base.  */
  
  static void
  assert_loop_rolls_lt (tree type, affine_iv *iv0, affine_iv *iv1,
! 		      struct tree_niter_desc *niter, bounds *bnds)
  {
    tree assumption = boolean_true_node, bound, diff;
    tree mbz, mbzl, mbzr;
+   bool rolls_p, no_overflow_p;
+   double_int dstep;
+   mpz_t mstep, max;
+ 
+   /* We are going to compute the number of iterations as
+      (iv1->base - iv0->base + step - 1) / step, computed in the unsigned
+      variant of TYPE.  This formula only works if 
+      
+      -step + 1 <= (iv1->base - iv0->base) <= MAX - step + 1
+    
+      (where MAX is the maximum value of the unsigned variant of TYPE, and
+      the computations in this formula are performed in full precision
+      (without overflows).
+ 
+      Usually, for loops with exit condition iv0->base + step * i < iv1->base,
+      we have a condition of form iv0->base - step < iv1->base before the loop,
+      and for loops iv0->base < iv1->base - step * i the condition
+      iv0->base < iv1->base + step, due to loop header copying, which enable us
+      to prove the lower bound.
+      
+      The upper bound is more complicated.  Unless the expressions for initial
+      and final value themselves contain enough information, we usually cannot
+      derive it from the context.  */
+ 
+   /* First check whether the answer does not follow from the bounds we gathered
+      before.  */
+   if (integer_nonzerop (iv0->step))
+     dstep = tree_to_double_int (iv0->step);
+   else
+     {
+       dstep = double_int_sext (tree_to_double_int (iv1->step),
+ 			       TYPE_PRECISION (type));
+       dstep = double_int_neg (dstep);
+     }
+ 
+   mpz_init (mstep);
+   mpz_set_double_int (mstep, dstep, true);
+   mpz_neg (mstep, mstep);
+   mpz_add_ui (mstep, mstep, 1);
+ 
+   rolls_p = mpz_cmp (mstep, bnds->below) <= 0;
+ 
+   mpz_init (max);
+   mpz_set_double_int (max, double_int_mask (TYPE_PRECISION (type)), true);
+   mpz_add (max, max, mstep);
+   no_overflow_p = (mpz_cmp (bnds->up, max) <= 0
+ 		   /* For pointers, only values lying inside a single object
+ 		      can be compared or manipulated by pointer arithmetics.
+ 		      Gcc in general does not allow or handle objects larger
+ 		      than half of the address space, hence the upper bound
+ 		      is satisfied for pointers.  */
+ 		   || POINTER_TYPE_P (type));
+   mpz_clear (mstep);
+   mpz_clear (max);
+ 
+   if (rolls_p && no_overflow_p)
+     return;
+ 
+   /* Now the hard part; we must formulate the assumption(s) as expressions, and
+      we must be careful not to introduce overflow.  */
  
    if (integer_nonzerop (iv0->step))
      {
*************** assert_loop_rolls_lt (tree type, affine_
*** 362,388 ****
        mbzr = fold_build2 (MINUS_EXPR, type, iv1->base, diff);
      }
  
-   mbz = fold_build2 (GT_EXPR, boolean_type_node, mbzl, mbzr);
- 
    if (!integer_nonzerop (assumption))
      niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
  				      niter->assumptions, assumption);
!   if (!integer_zerop (mbz))
!     niter->may_be_zero = fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
! 				      niter->may_be_zero, mbz);
  }
  
  /* Determines number of iterations of loop whose ending condition
     is IV0 < IV1.  TYPE is the type of the iv.  The number of
!    iterations is stored to NITER.  */
  
  static bool
  number_of_iterations_lt (tree type, affine_iv *iv0, affine_iv *iv1,
  			 struct tree_niter_desc *niter,
! 			 bool never_infinite ATTRIBUTE_UNUSED)
  {
    tree niter_type = unsigned_type_for (type);
    tree delta, step, s;
  
    if (integer_nonzerop (iv0->step))
      {
--- 984,1014 ----
        mbzr = fold_build2 (MINUS_EXPR, type, iv1->base, diff);
      }
  
    if (!integer_nonzerop (assumption))
      niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
  				      niter->assumptions, assumption);
!   if (!rolls_p)
!     {
!       mbz = fold_build2 (GT_EXPR, boolean_type_node, mbzl, mbzr);
!       niter->may_be_zero = fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
! 					niter->may_be_zero, mbz);
!     }
  }
  
  /* Determines number of iterations of loop whose ending condition
     is IV0 < IV1.  TYPE is the type of the iv.  The number of
!    iterations is stored to NITER.  BNDS bounds the difference
!    IV1->base - IV0->base.  */
  
  static bool
  number_of_iterations_lt (tree type, affine_iv *iv0, affine_iv *iv1,
  			 struct tree_niter_desc *niter,
! 			 bool never_infinite ATTRIBUTE_UNUSED,
! 			 bounds *bnds)
  {
    tree niter_type = unsigned_type_for (type);
    tree delta, step, s;
+   mpz_t mstep, tmp;
  
    if (integer_nonzerop (iv0->step))
      {
*************** number_of_iterations_lt (tree type, affi
*** 412,421 ****
  	 for (i = iv1->base; i > iv0->base; i--).
  	     
  	 In both cases # of iterations is iv1->base - iv0->base, assuming that
! 	 iv1->base >= iv0->base.  */
!       niter->may_be_zero = fold_build2 (LT_EXPR, boolean_type_node,
! 					iv1->base, iv0->base);
        niter->niter = delta;
        return true;
      }
  
--- 1038,1055 ----
  	 for (i = iv1->base; i > iv0->base; i--).
  	     
  	 In both cases # of iterations is iv1->base - iv0->base, assuming that
! 	 iv1->base >= iv0->base.
! 
!          First try to derive a lower bound on the value of
! 	 iv1->base - iv0->base, computed in full precision.  If the difference
! 	 is nonnegative, we are done, otherwise we must record the
! 	 condition.  */
! 
!       if (mpz_sgn (bnds->below) < 0)
! 	niter->may_be_zero = fold_build2 (LT_EXPR, boolean_type_node,
! 					  iv1->base, iv0->base);
        niter->niter = delta;
+       niter->max = mpz_to_double_int (niter_type, bnds->up);
        return true;
      }
  
*************** number_of_iterations_lt (tree type, affi
*** 428,434 ****
    /* If we can determine the final value of the control iv exactly, we can
       transform the condition to != comparison.  In particular, this will be
       the case if DELTA is constant.  */
!   if (number_of_iterations_lt_to_ne (type, iv0, iv1, niter, &delta, step))
      {
        affine_iv zps;
  
--- 1062,1069 ----
    /* If we can determine the final value of the control iv exactly, we can
       transform the condition to != comparison.  In particular, this will be
       the case if DELTA is constant.  */
!   if (number_of_iterations_lt_to_ne (type, iv0, iv1, niter, &delta, step,
! 				     bnds))
      {
        affine_iv zps;
  
*************** number_of_iterations_lt (tree type, affi
*** 438,444 ****
  	 zps does not overflow.  */
        zps.no_overflow = true;
  
!       return number_of_iterations_ne (type, &zps, delta, niter, true);
      }
  
    /* Make sure that the control iv does not overflow.  */
--- 1073,1079 ----
  	 zps does not overflow.  */
        zps.no_overflow = true;
  
!       return number_of_iterations_ne (type, &zps, delta, niter, true, bnds);
      }
  
    /* Make sure that the control iv does not overflow.  */
*************** number_of_iterations_lt (tree type, affi
*** 448,459 ****
    /* We determine the number of iterations as (delta + step - 1) / step.  For
       this to work, we must know that iv1->base >= iv0->base - step + 1,
       otherwise the loop does not roll.  */
!   assert_loop_rolls_lt (type, iv0, iv1, niter);
  
    s = fold_build2 (MINUS_EXPR, niter_type,
  		   step, build_int_cst (niter_type, 1));
    delta = fold_build2 (PLUS_EXPR, niter_type, delta, s);
    niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, delta, step);
    return true;
  }
  
--- 1083,1105 ----
    /* We determine the number of iterations as (delta + step - 1) / step.  For
       this to work, we must know that iv1->base >= iv0->base - step + 1,
       otherwise the loop does not roll.  */
!   assert_loop_rolls_lt (type, iv0, iv1, niter, bnds);
  
    s = fold_build2 (MINUS_EXPR, niter_type,
  		   step, build_int_cst (niter_type, 1));
    delta = fold_build2 (PLUS_EXPR, niter_type, delta, s);
    niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, delta, step);
+ 
+   mpz_init (mstep);
+   mpz_init (tmp);
+   mpz_set_double_int (mstep, tree_to_double_int (step), true);
+   mpz_add (tmp, bnds->up, mstep);
+   mpz_sub_ui (tmp, tmp, 1);
+   mpz_fdiv_q (tmp, tmp, mstep);
+   niter->max = mpz_to_double_int (niter_type, tmp);
+   mpz_clear (mstep);
+   mpz_clear (tmp);
+ 
    return true;
  }
  
*************** number_of_iterations_lt (tree type, affi
*** 462,472 ****
     iterations is stored to NITER.  NEVER_INFINITE is true if
     we know that this condition must eventually become false (we derived this
     earlier, and possibly set NITER->assumptions to make sure this
!    is the case).  */
  
  static bool
  number_of_iterations_le (tree type, affine_iv *iv0, affine_iv *iv1,
! 			 struct tree_niter_desc *niter, bool never_infinite)
  {
    tree assumption;
  
--- 1108,1119 ----
     iterations is stored to NITER.  NEVER_INFINITE is true if
     we know that this condition must eventually become false (we derived this
     earlier, and possibly set NITER->assumptions to make sure this
!    is the case).  BNDS bounds the difference IV1->base - IV0->base.  */
  
  static bool
  number_of_iterations_le (tree type, affine_iv *iv0, affine_iv *iv1,
! 			 struct tree_niter_desc *niter, bool never_infinite,
! 			 bounds *bnds)
  {
    tree assumption;
  
*************** number_of_iterations_le (tree type, affi
*** 497,503 ****
    else
      iv0->base = fold_build2 (MINUS_EXPR, type,
  			     iv0->base, build_int_cst (type, 1));
!   return number_of_iterations_lt (type, iv0, iv1, niter, never_infinite);
  }
  
  /* Determine the number of iterations according to condition (for staying
--- 1144,1171 ----
    else
      iv0->base = fold_build2 (MINUS_EXPR, type,
  			     iv0->base, build_int_cst (type, 1));
! 
!   bounds_add (bnds, double_int_one, type);
! 
!   return number_of_iterations_lt (type, iv0, iv1, niter, never_infinite, bnds);
! }
! 
! /* Dumps description of affine induction variable IV to FILE.  */
! 
! static void
! dump_affine_iv (FILE *file, affine_iv *iv)
! {
!   if (!integer_zerop (iv->step))
!     fprintf (file, "[");
! 
!   print_generic_expr (dump_file, iv->base, TDF_SLIM);
! 
!   if (!integer_zerop (iv->step))
!     {
!       fprintf (file, ", + , ");
!       print_generic_expr (dump_file, iv->step, TDF_SLIM);
!       fprintf (file, "]%s", iv->no_overflow ? "(no_overflow)" : "");
!     }
  }
  
  /* Determine the number of iterations according to condition (for staying
*************** number_of_iterations_le (tree type, affi
*** 507,512 ****
--- 1175,1182 ----
     type TYPE, which must be an integer or pointer type.  The steps of the
     ivs must be constants (or NULL_TREE, which is interpreted as constant zero).
  
+    LOOP is the loop whose number of iterations we are determining.
+ 
     ONLY_EXIT is true if we are sure this is the only way the loop could be
     exited (including possibly non-returning function calls, exceptions, etc.)
     -- in this case we can use the information whether the control induction
*************** number_of_iterations_le (tree type, affi
*** 518,528 ****
     was determined (possibly with some assumptions).  */
  
  static bool
! number_of_iterations_cond (tree type, affine_iv *iv0, enum tree_code code,
  			   affine_iv *iv1, struct tree_niter_desc *niter,
  			   bool only_exit)
  {
!   bool never_infinite;
  
    /* The meaning of these assumptions is this:
       if !assumptions
--- 1188,1200 ----
     was determined (possibly with some assumptions).  */
  
  static bool
! number_of_iterations_cond (struct loop *loop,
! 			   tree type, affine_iv *iv0, enum tree_code code,
  			   affine_iv *iv1, struct tree_niter_desc *niter,
  			   bool only_exit)
  {
!   bool never_infinite, ret;
!   bounds bnds;
  
    /* The meaning of these assumptions is this:
       if !assumptions
*************** number_of_iterations_cond (tree type, af
*** 532,538 ****
    niter->assumptions = boolean_true_node;
    niter->may_be_zero = boolean_false_node;
    niter->niter = NULL_TREE;
!   niter->additional_info = boolean_true_node;
  
    niter->bound = NULL_TREE;
    niter->cmp = ERROR_MARK;
--- 1204,1210 ----
    niter->assumptions = boolean_true_node;
    niter->may_be_zero = boolean_false_node;
    niter->niter = NULL_TREE;
!   niter->max = double_int_zero;
  
    niter->bound = NULL_TREE;
    niter->cmp = ERROR_MARK;
*************** number_of_iterations_cond (tree type, af
*** 618,640 ****
    if (integer_zerop (fold_build2 (code, boolean_type_node, iv0->base, iv1->base)))
      {
        niter->niter = build_int_cst (unsigned_type_for (type), 0);
        return true;
      }
! 
    /* OK, now we know we have a senseful loop.  Handle several cases, depending
       on what comparison operator is used.  */
    switch (code)
      {
      case NE_EXPR:
        gcc_assert (integer_zerop (iv1->step));
!       return number_of_iterations_ne (type, iv0, iv1->base, niter, never_infinite);
      case LT_EXPR:
!       return number_of_iterations_lt (type, iv0, iv1, niter, never_infinite);
      case LE_EXPR:
!       return number_of_iterations_le (type, iv0, iv1, niter, never_infinite);
      default:
        gcc_unreachable ();
      }
  }
  
  /* Substitute NEW for OLD in EXPR and fold the result.  */
--- 1290,1378 ----
    if (integer_zerop (fold_build2 (code, boolean_type_node, iv0->base, iv1->base)))
      {
        niter->niter = build_int_cst (unsigned_type_for (type), 0);
+       niter->max = double_int_zero;
        return true;
      }
! 	  
    /* OK, now we know we have a senseful loop.  Handle several cases, depending
       on what comparison operator is used.  */
+   bound_difference (loop, iv1->base, iv0->base, &bnds);
+ 
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     {
+       fprintf (dump_file,
+ 	       "Analysing # of iterations of loop %d\n", loop->num);
+ 
+       fprintf (dump_file, "  exit condition ");
+       dump_affine_iv (dump_file, iv0);
+       fprintf (dump_file, " %s ",
+ 	       code == NE_EXPR ? "!="
+ 	       : code == LT_EXPR ? "<"
+ 	       : "<=");
+       dump_affine_iv (dump_file, iv1);
+       fprintf (dump_file, "\n");
+ 
+       fprintf (dump_file, "  bounds on difference of bases: ");
+       mpz_out_str (dump_file, 10, bnds.below);
+       fprintf (dump_file, " ... ");
+       mpz_out_str (dump_file, 10, bnds.up);
+       fprintf (dump_file, "\n");
+     }
+ 
    switch (code)
      {
      case NE_EXPR:
        gcc_assert (integer_zerop (iv1->step));
!       ret = number_of_iterations_ne (type, iv0, iv1->base, niter,
! 				     never_infinite, &bnds);
!       break;
! 
      case LT_EXPR:
!       ret = number_of_iterations_lt (type, iv0, iv1, niter, never_infinite,
! 				     &bnds);
!       break;
! 
      case LE_EXPR:
!       ret = number_of_iterations_le (type, iv0, iv1, niter, never_infinite,
! 				     &bnds);
!       break;
! 
      default:
        gcc_unreachable ();
      }
+ 
+   mpz_clear (bnds.up);
+   mpz_clear (bnds.below);
+ 
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     {
+       if (ret)
+ 	{
+ 	  fprintf (dump_file, "  result:\n");
+ 	  if (!integer_nonzerop (niter->assumptions))
+ 	    {
+ 	      fprintf (dump_file, "    under assumptions ");
+ 	      print_generic_expr (dump_file, niter->assumptions, TDF_SLIM);
+ 	      fprintf (dump_file, "\n");
+ 	    }
+ 
+ 	  if (!integer_zerop (niter->may_be_zero))
+ 	    {
+ 	      fprintf (dump_file, "    zero if ");
+ 	      print_generic_expr (dump_file, niter->may_be_zero, TDF_SLIM);
+ 	      fprintf (dump_file, "\n");
+ 	    }
+ 
+ 	  fprintf (dump_file, "    # of iterations ");
+ 	  print_generic_expr (dump_file, niter->niter, TDF_SLIM);
+ 	  fprintf (dump_file, ", bounded by ");
+ 	  dump_double_int (dump_file, niter->max, true);
+ 	  fprintf (dump_file, "\n");
+ 	}
+       else
+ 	fprintf (dump_file, "  failed\n\n");
+     }
+   return ret;
  }
  
  /* Substitute NEW for OLD in EXPR and fold the result.  */
*************** expand_simple_operations (tree expr)
*** 718,723 ****
--- 1456,1479 ----
      return expr;
  
    stmt = SSA_NAME_DEF_STMT (expr);
+   if (TREE_CODE (stmt) == PHI_NODE)
+     {
+       basic_block src, dest;
+ 
+       if (PHI_NUM_ARGS (stmt) != 1)
+ 	return expr;
+       e = PHI_ARG_DEF (stmt, 0);
+ 
+       /* Avoid propagating through loop exit phi nodes, which
+ 	 could break loop-closed SSA form restrictions.  */
+       dest = bb_for_stmt (stmt);
+       src = single_pred (dest);
+       if (TREE_CODE (e) == SSA_NAME
+ 	  && src->loop_father != dest->loop_father)
+ 	return expr;
+ 
+       return expand_simple_operations (e);
+     }
    if (TREE_CODE (stmt) != GIMPLE_MODIFY_STMT)
      return expr;
  
*************** tree_simplify_using_condition (tree cond
*** 861,883 ****
    return tree_simplify_using_condition_1 (cond, expr);
  }
  
- /* The maximum number of dominator BBs we search for conditions
-    of loop header copies we use for simplifying a conditional
-    expression.  */
- #define MAX_DOMINATORS_TO_WALK 8
- 
  /* Tries to simplify EXPR using the conditions on entry to LOOP.
-    Record the conditions used for simplification to CONDS_USED.
     Returns the simplified expression (or EXPR unchanged, if no
     simplification was possible).*/
  
  static tree
! simplify_using_initial_conditions (struct loop *loop, tree expr,
! 				   tree *conds_used)
  {
    edge e;
    basic_block bb;
!   tree exp, cond;
    int cnt = 0;
  
    if (TREE_CODE (expr) == INTEGER_CST)
--- 1617,1632 ----
    return tree_simplify_using_condition_1 (cond, expr);
  }
  
  /* Tries to simplify EXPR using the conditions on entry to LOOP.
     Returns the simplified expression (or EXPR unchanged, if no
     simplification was possible).*/
  
  static tree
! simplify_using_initial_conditions (struct loop *loop, tree expr)
  {
    edge e;
    basic_block bb;
!   tree cond;
    int cnt = 0;
  
    if (TREE_CODE (expr) == INTEGER_CST)
*************** simplify_using_initial_conditions (struc
*** 900,914 ****
        cond = COND_EXPR_COND (last_stmt (e->src));
        if (e->flags & EDGE_FALSE_VALUE)
  	cond = invert_truthvalue (cond);
!       exp = tree_simplify_using_condition (cond, expr);
! 
!       if (exp != expr)
! 	*conds_used = fold_build2 (TRUTH_AND_EXPR,
! 				   boolean_type_node,
! 				   *conds_used,
! 				   cond);
! 
!       expr = exp;
        ++cnt;
      }
  
--- 1649,1655 ----
        cond = COND_EXPR_COND (last_stmt (e->src));
        if (e->flags & EDGE_FALSE_VALUE)
  	cond = invert_truthvalue (cond);
!       expr = tree_simplify_using_condition (cond, expr);
        ++cnt;
      }
  
*************** number_of_iterations_exit (struct loop *
*** 1065,1071 ****
  
    iv0.base = expand_simple_operations (iv0.base);
    iv1.base = expand_simple_operations (iv1.base);
!   if (!number_of_iterations_cond (type, &iv0, code, &iv1, niter,
  				  loop_only_exit_p (loop, exit)))
      {
        fold_undefer_and_ignore_overflow_warnings ();
--- 1806,1812 ----
  
    iv0.base = expand_simple_operations (iv0.base);
    iv1.base = expand_simple_operations (iv1.base);
!   if (!number_of_iterations_cond (loop, type, &iv0, code, &iv1, niter,
  				  loop_only_exit_p (loop, exit)))
      {
        fold_undefer_and_ignore_overflow_warnings ();
*************** number_of_iterations_exit (struct loop *
*** 1081,1095 ****
        niter->niter = simplify_using_outer_evolutions (loop, niter->niter);
      }
  
-   niter->additional_info = boolean_true_node;
    niter->assumptions
  	  = simplify_using_initial_conditions (loop,
! 					       niter->assumptions,
! 					       &niter->additional_info);
    niter->may_be_zero
  	  = simplify_using_initial_conditions (loop,
! 					       niter->may_be_zero,
! 					       &niter->additional_info);
  
    fold_undefer_and_ignore_overflow_warnings ();
  
--- 1822,1833 ----
        niter->niter = simplify_using_outer_evolutions (loop, niter->niter);
      }
  
    niter->assumptions
  	  = simplify_using_initial_conditions (loop,
! 					       niter->assumptions);
    niter->may_be_zero
  	  = simplify_using_initial_conditions (loop,
! 					       niter->may_be_zero);
  
    fold_undefer_and_ignore_overflow_warnings ();
  
*************** find_loop_niter_by_eval (struct loop *lo
*** 1469,1523 ****
  
  */
  
- /* Returns true if we can prove that COND ==> VAL >= 0.  */
- 
- static bool
- implies_nonnegative_p (tree cond, tree val)
- {
-   tree type = TREE_TYPE (val);
-   tree compare;
- 
-   if (tree_expr_nonnegative_p (val))
-     return true;
- 
-   if (integer_nonzerop (cond))
-     return false;
- 
-   compare = fold_build2 (GE_EXPR,
- 			 boolean_type_node, val, build_int_cst (type, 0));
-   compare = tree_simplify_using_condition_1 (cond, compare);
- 
-   return integer_nonzerop (compare);
- }
- 
- /* Returns true if we can prove that COND ==> A >= B.  */
- 
- static bool
- implies_ge_p (tree cond, tree a, tree b)
- {
-   tree compare = fold_build2 (GE_EXPR, boolean_type_node, a, b);
- 
-   if (integer_nonzerop (compare))
-     return true;
- 
-   if (integer_nonzerop (cond))
-     return false;
- 
-   compare = tree_simplify_using_condition_1 (cond, compare);
- 
-   return integer_nonzerop (compare);
- }
- 
  /* Returns a constant upper bound on the value of expression VAL.  VAL
     is considered to be unsigned.  If its type is signed, its value must
!    be nonnegative.
!    
!    The condition ADDITIONAL must be satisfied (for example, if VAL is
!    "(unsigned) n" and ADDITIONAL is "n > 0", then we can derive that
!    VAL is at most (unsigned) MAX_INT).  */
   
  static double_int
! derive_constant_upper_bound (tree val, tree additional)
  {
    tree type = TREE_TYPE (val);
    tree op0, op1, subtype, maxt;
--- 2207,2218 ----
  
  */
  
  /* Returns a constant upper bound on the value of expression VAL.  VAL
     is considered to be unsigned.  If its type is signed, its value must
!    be nonnegative.  */
   
  static double_int
! derive_constant_upper_bound (tree val)
  {
    tree type = TREE_TYPE (val);
    tree op0, op1, subtype, maxt;
*************** derive_constant_upper_bound (tree val, t
*** 1544,1550 ****
  	  /* If TYPE is also signed, the fact that VAL is nonnegative implies
  	     that OP0 is nonnegative.  */
  	  && TYPE_UNSIGNED (type)
! 	  && !implies_nonnegative_p (additional, op0))
  	{
  	  /* If we cannot prove that the casted expression is nonnegative,
  	     we cannot establish more useful upper bound than the precision
--- 2239,2245 ----
  	  /* If TYPE is also signed, the fact that VAL is nonnegative implies
  	     that OP0 is nonnegative.  */
  	  && TYPE_UNSIGNED (type)
! 	  && !tree_expr_nonnegative_p (op0))
  	{
  	  /* If we cannot prove that the casted expression is nonnegative,
  	     we cannot establish more useful upper bound than the precision
*************** derive_constant_upper_bound (tree val, t
*** 1554,1560 ****
  
        /* We now know that op0 is an nonnegative value.  Try deriving an upper
  	 bound for it.  */
!       bnd = derive_constant_upper_bound (op0, additional);
  
        /* If the bound does not fit in TYPE, max. value of TYPE could be
  	 attained.  */
--- 2249,2255 ----
  
        /* We now know that op0 is an nonnegative value.  Try deriving an upper
  	 bound for it.  */
!       bnd = derive_constant_upper_bound (op0);
  
        /* If the bound does not fit in TYPE, max. value of TYPE could be
  	 attained.  */
*************** derive_constant_upper_bound (tree val, t
*** 1569,1575 ****
        op1 = TREE_OPERAND (val, 1);
  
        if (TREE_CODE (op1) != INTEGER_CST
! 	  || !implies_nonnegative_p (additional, op0))
  	return max;
  
        /* Canonicalize to OP0 - CST.  Consider CST to be signed, in order to
--- 2264,2270 ----
        op1 = TREE_OPERAND (val, 1);
  
        if (TREE_CODE (op1) != INTEGER_CST
! 	  || !tree_expr_nonnegative_p (op0))
  	return max;
  
        /* Canonicalize to OP0 - CST.  Consider CST to be signed, in order to
*************** derive_constant_upper_bound (tree val, t
*** 1580,1586 ****
        if (TREE_CODE (val) == PLUS_EXPR)
  	cst = double_int_neg (cst);
  
!       bnd = derive_constant_upper_bound (op0, additional);
  
        if (double_int_negative_p (cst))
  	{
--- 2275,2281 ----
        if (TREE_CODE (val) == PLUS_EXPR)
  	cst = double_int_neg (cst);
  
!       bnd = derive_constant_upper_bound (op0);
  
        if (double_int_negative_p (cst))
  	{
*************** derive_constant_upper_bound (tree val, t
*** 1611,1625 ****
  	   */
  
  	  /* This should only happen if the type is unsigned; however, for
! 	     programs that use overflowing signed arithmetics even with
  	     -fno-wrapv, this condition may also be true for signed values.  */
  	  if (double_int_ucmp (bnd, cst) < 0)
  	    return max;
  
! 	  if (TYPE_UNSIGNED (type)
! 	      && !implies_ge_p (additional,
! 				op0, double_int_to_tree (type, cst)))
! 	    return max;
  
  	  bnd = double_int_add (bnd, double_int_neg (cst));
  	}
--- 2306,2323 ----
  	   */
  
  	  /* This should only happen if the type is unsigned; however, for
! 	     buggy programs that use overflowing signed arithmetics even with
  	     -fno-wrapv, this condition may also be true for signed values.  */
  	  if (double_int_ucmp (bnd, cst) < 0)
  	    return max;
  
! 	  if (TYPE_UNSIGNED (type))
! 	    {
! 	      tree tem = fold_binary (GE_EXPR, boolean_type_node, op0,
! 				      double_int_to_tree (type, cst));
! 	      if (!tem || integer_nonzerop (tem))
! 		return max;
! 	    }
  
  	  bnd = double_int_add (bnd, double_int_neg (cst));
  	}
*************** derive_constant_upper_bound (tree val, t
*** 1634,1640 ****
  	  || tree_int_cst_sign_bit (op1))
  	return max;
  
!       bnd = derive_constant_upper_bound (op0, additional);
        return double_int_udiv (bnd, tree_to_double_int (op1), FLOOR_DIV_EXPR);
  
      case BIT_AND_EXPR:
--- 2332,2338 ----
  	  || tree_int_cst_sign_bit (op1))
  	return max;
  
!       bnd = derive_constant_upper_bound (op0);
        return double_int_udiv (bnd, tree_to_double_int (op1), FLOOR_DIV_EXPR);
  
      case BIT_AND_EXPR:
*************** derive_constant_upper_bound (tree val, t
*** 1649,1675 ****
        if (TREE_CODE (stmt) != GIMPLE_MODIFY_STMT
  	  || GIMPLE_STMT_OPERAND (stmt, 0) != val)
  	return max;
!       return derive_constant_upper_bound (GIMPLE_STMT_OPERAND (stmt, 1),
! 	  				  additional);
  
      default: 
        return max;
      }
  }
  
! /* Records that AT_STMT is executed at most BOUND + 1 times in LOOP.  The
!    additional condition ADDITIONAL is recorded with the bound.  IS_EXIT
     is true if the loop is exited immediately after STMT, and this exit
     is taken at last when the STMT is executed BOUND + 1 times.
     REALISTIC is true if the estimate comes from a reliable source
!    (number of iterations analysis, or size of data accessed in the loop).  */
  
  static void
! record_estimate (struct loop *loop, tree bound, tree additional, tree at_stmt,
! 		 bool is_exit, bool realistic)
  {
    struct nb_iter_bound *elt = xmalloc (sizeof (struct nb_iter_bound));
-   double_int i_bound = derive_constant_upper_bound (bound, additional);
  
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
--- 2347,2371 ----
        if (TREE_CODE (stmt) != GIMPLE_MODIFY_STMT
  	  || GIMPLE_STMT_OPERAND (stmt, 0) != val)
  	return max;
!       return derive_constant_upper_bound (GIMPLE_STMT_OPERAND (stmt, 1));
  
      default: 
        return max;
      }
  }
  
! /* Records that AT_STMT is executed at most BOUND + 1 times in LOOP.  IS_EXIT
     is true if the loop is exited immediately after STMT, and this exit
     is taken at last when the STMT is executed BOUND + 1 times.
     REALISTIC is true if the estimate comes from a reliable source
!    (number of iterations analysis, or size of data accessed in the loop).
!    I_BOUND is an unsigned double_int upper estimate on BOUND.  */
  
  static void
! record_estimate (struct loop *loop, tree bound, double_int i_bound,
! 		 tree at_stmt, bool is_exit, bool realistic)
  {
    struct nb_iter_bound *elt = xmalloc (sizeof (struct nb_iter_bound));
  
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
*************** record_nonwrapping_iv (struct loop *loop
*** 1701,1706 ****
--- 2397,2403 ----
  {
    tree niter_bound, extreme, delta;
    tree type = TREE_TYPE (base), unsigned_type;
+   double_int max;
  
    if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
      return;
*************** record_nonwrapping_iv (struct loop *loop
*** 1741,1748 ****
    /* STMT is executed at most NITER_BOUND + 1 times, since otherwise the value
       would get out of the range.  */
    niter_bound = fold_build2 (FLOOR_DIV_EXPR, unsigned_type, delta, step);
!   record_estimate (loop, niter_bound, boolean_true_node, stmt,
! 		   false, data_size_bounds_p);
  }
  
  /* Initialize LOOP->ESTIMATED_NB_ITERATIONS with the lowest safe
--- 2438,2445 ----
    /* STMT is executed at most NITER_BOUND + 1 times, since otherwise the value
       would get out of the range.  */
    niter_bound = fold_build2 (FLOOR_DIV_EXPR, unsigned_type, delta, step);
!   max = derive_constant_upper_bound (niter_bound);
!   record_estimate (loop, niter_bound, max, stmt, false, data_size_bounds_p);
  }
  
  /* Initialize LOOP->ESTIMATED_NB_ITERATIONS with the lowest safe
*************** estimate_numbers_of_iterations_loop (str
*** 2065,2072 ****
  	niter = build3 (COND_EXPR, type, niter_desc.may_be_zero,
  			build_int_cst (type, 0),
  			niter);
!       record_estimate (loop, niter,
! 		       niter_desc.additional_info,
  		       last_stmt (ex->src),
  		       true, true);
      }
--- 2762,2768 ----
  	niter = build3 (COND_EXPR, type, niter_desc.may_be_zero,
  			build_int_cst (type, 0),
  			niter);
!       record_estimate (loop, niter, niter_desc.max,
  		       last_stmt (ex->src),
  		       true, true);
      }
Index: tree.h
===================================================================
*** tree.h	(revision 122817)
--- tree.h	(working copy)
*************** extern enum tree_code invert_tree_compar
*** 4481,4486 ****
--- 4481,4487 ----
  
  extern bool tree_expr_nonzero_p (tree);
  extern bool tree_expr_nonzero_warnv_p (tree, bool *);
+ extern int multiple_of_p (tree, tree, tree);
  
  /* In builtins.c */
  extern tree fold_call_expr (tree, bool);
Index: tree-scalar-evolution.c
===================================================================
*** tree-scalar-evolution.c	(revision 122817)
--- tree-scalar-evolution.c	(working copy)
*************** scev_finalize (void)
*** 2864,2877 ****
    BITMAP_FREE (already_instantiated);
  }
  
- /* Returns true if EXPR looks expensive.  */
- 
- static bool
- expression_expensive_p (tree expr)
- {
-   return force_expr_to_var_cost (expr) >= target_spill_cost;
- }
- 
  /* Replace ssa names for that scev can prove they are constant by the
     appropriate constants.  Also perform final value replacement in loops,
     in case the replacement expressions are cheap.
--- 2864,2869 ----
*************** scev_const_prop (void)
*** 2958,2967 ****
  	continue;
  
        niter = number_of_latch_executions (loop);
!       if (niter == chrec_dont_know
! 	  /* If computing the number of iterations is expensive, it may be
! 	     better not to introduce computations involving it.  */
! 	  || expression_expensive_p (niter))
  	continue;
  
        /* Ensure that it is possible to insert new statements somewhere.  */
--- 2950,2962 ----
  	continue;
  
        niter = number_of_latch_executions (loop);
!       /* We used to check here whether the computation of NITER is expensive,
! 	 and avoided final value elimination if that is the case.  The problem
! 	 is that it is hard to evaluate whether the expression is too
! 	 expensive, as we do not know what optimization opportunities the
! 	 the elimination of the final value may reveal.  Therefore, we now
! 	 eliminate the final values of induction variables unconditionally.  */
!       if (niter == chrec_dont_know)
  	continue;
  
        /* Ensure that it is possible to insert new statements somewhere.  */
Index: fold-const.c
===================================================================
*** fold-const.c	(revision 122817)
--- fold-const.c	(working copy)
*************** static tree fold_truthop (enum tree_code
*** 131,137 ****
  static tree optimize_minmax_comparison (enum tree_code, tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
  static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *);
- static int multiple_of_p (tree, tree, tree);
  static tree fold_binary_op_with_conditional_arg (enum tree_code, tree,
  						 tree, tree,
  						 tree, tree, int);
--- 131,136 ----
*************** fold_build_call_array_initializer (tree 
*** 13115,13121 ****
     (where the same SAVE_EXPR (J) is used in the original and the
     transformed version).  */
  
! static int
  multiple_of_p (tree type, tree top, tree bottom)
  {
    if (operand_equal_p (top, bottom, 0))
--- 13114,13120 ----
     (where the same SAVE_EXPR (J) is used in the original and the
     transformed version).  */
  
! int
  multiple_of_p (tree type, tree top, tree bottom)
  {
    if (operand_equal_p (top, bottom, 0))
Index: testsuite/gcc.dg/tree-ssa/loop-26.c
===================================================================
*** testsuite/gcc.dg/tree-ssa/loop-26.c	(revision 0)
--- testsuite/gcc.dg/tree-ssa/loop-26.c	(revision 0)
***************
*** 0 ****
--- 1,29 ----
+ /* PR 30730, PR 26900, number of iterations analysis should be able to
+    determine number of iterations of the following loops unconditionally.  */
+ 
+ /* { dg-do compile } */
+ /* { dg-options "-O -fstrict-overflow -fdump-tree-empty" } */
+ 
+ unsigned foo(unsigned int n)
+ {
+   unsigned x = 0;;
+ 
+   while (n > 10)
+     {
+       n -= 2;
+       x++;
+     }
+ 
+   return x;
+ }
+ 
+ int foo0(int i0, int i1)
+ {
+   int i, j = 0;
+   for (i=i0; i<=i1+1; ++i)
+     ++j;
+   return j;
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "Removing empty loop" 2 "empty" } } */
+ /* { dg-final { cleanup-tree-dump "empty" } } */
Index: double-int.c
===================================================================
*** double-int.c	(revision 122817)
--- double-int.c	(working copy)
*************** Software Foundation, 51 Franklin Street,
*** 26,32 ****
  
  /* Returns mask for PREC bits.  */
  
! static inline double_int
  double_int_mask (unsigned prec)
  {
    unsigned HOST_WIDE_INT m;
--- 26,32 ----
  
  /* Returns mask for PREC bits.  */
  
! double_int
  double_int_mask (unsigned prec)
  {
    unsigned HOST_WIDE_INT m;
Index: double-int.h
===================================================================
*** double-int.h	(revision 122817)
--- double-int.h	(working copy)
*************** void dump_double_int (FILE *, double_int
*** 134,139 ****
--- 134,140 ----
  double_int double_int_ext (double_int, unsigned, bool);
  double_int double_int_sext (double_int, unsigned);
  double_int double_int_zext (double_int, unsigned);
+ double_int double_int_mask (unsigned);
  
  #define ALL_ONES (~((unsigned HOST_WIDE_INT) 0))
  
Index: tree-flow.h
===================================================================
*** tree-flow.h	(revision 122817)
--- tree-flow.h	(working copy)
*************** struct tree_niter_desc
*** 822,839 ****
  			   a loop (provided that assumptions == true and
  			   may_be_zero == false), more precisely the number
  			   of executions of the latch of the loop.  */
!   tree additional_info;	/* The boolean expression.  Sometimes we use additional
! 			   knowledge to simplify the other expressions
! 			   contained in this structure (for example the
! 			   knowledge about value ranges of operands on entry to
! 			   the loop).  If this is a case, conjunction of such
! 			   condition is stored in this field, so that we do not
! 			   lose the information: for example if may_be_zero
! 			   is (n <= 0) and niter is (unsigned) n, we know
! 			   that the number of iterations is at most
! 			   MAX_SIGNED_INT.  However if the (n <= 0) assumption
! 			   is eliminated (by looking at the guard on entry of
! 			   the loop), then the information would be lost.  */
  
    /* The simplified shape of the exit condition.  The loop exits if
       CONTROL CMP BOUND is false, where CMP is one of NE_EXPR,
--- 822,829 ----
  			   a loop (provided that assumptions == true and
  			   may_be_zero == false), more precisely the number
  			   of executions of the latch of the loop.  */
!   double_int max;	/* The upper bound on the number of iterations of
! 			   the loop.  */
  
    /* The simplified shape of the exit condition.  The loop exits if
       CONTROL CMP BOUND is false, where CMP is one of NE_EXPR,



More information about the Gcc-patches mailing list