This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: Determine more IVs to be non-overflowing

From: Richard Biener <rguenther at suse dot de>
To: Jan Hubicka <hubicka at ucw dot cz>
Cc: gcc-patches at gcc dot gnu dot org
Date: Wed, 29 Jun 2016 09:50:38 +0200 (CEST)
Subject: Re: Determine more IVs to be non-overflowing
Authentication-results: sourceware.org; auth=none
References: <20160627140938 dot GF98078 at kam dot mff dot cuni dot cz>
On Mon, 27 Jun 2016, Jan Hubicka wrote:

> Hi,
> this patch makes simple_iv to determine more often that IV can not overflow.
> First I commonized the logic in simple_iv with nowrap_type_p because it tests
> the same. Second I added iv_can_overflow_p which uses known upper bound on
> number of iteration to see if the IV calculation can overflow.

+  if (!get_max_loop_iterations (loop, &nit))
+    /* 10GHz CPU runing one cycle loop will reach 2^60 iterations in 260
+       years.  I won't live long enough to be forced to fix the
+       miscompilation.  Having the limit here will let us to consider
+       64bit IVs with base 0 and step 1...16 as non-wrapping which makes
+       niter and ivopts go smoother.  */
+    nit = ((widest_int)1 << 60);

I think that's on the border of being acceptable.  Consider said loop
being autoparallelized and run on a PFlop cluster.  Please take it out
for now.

+  if (INTEGRAL_TYPE_P (type))
+    {
+      maxt = TYPE_MAX_VALUE (type);
+      mint = TYPE_MIN_VALUE (type);
+    }
+  else
+    {
+      maxt = upper_bound_in_type (type, type);
+      mint = lower_bound_in_type (type, type);
+    }

I think we don't support any non-integral/pointer IVs and thus you
can simply use

  type_min/max = wi::max/min_value (TYPE_PRECISION (type), TYPE_SIGN 
(type));

+  type_min = wi::to_widest (mint);
+  type_max = wi::to_widest (maxt);
+  if ((base_max + step_max * (nit + 1)) > (type_max)
+      || type_min > (base_min + step_min * (nit + 1)))
+    return true;

so I'm not convinced that widest_int precision is enough here.  Can't
you use wide_ints instead of widest_ints and use the wi::add / wi::mult 
overloads which provide the overflow flag?  Otherwise do what
VRP does and use FIXED_WIDE_INT to properly handle __int128_t IVs.

> One interesting thig is that the ivcanon IV variables that goes from niter to 0
> are believed to be wrapping.  This is because the type is unsigned and -1 is
> then large number.
> 
> It is not specified what overflow means, I suppose one can think of it as overflow
> in the calucaltion what sort of happens in this case.
> Inspecting the code I think both users (niter and ivopts) agrees with different
> interpretation that the induction can not wrap: that is the sequence produced
> is always monotonously increasing or decreasing.

That makes sense.

> I would suggest to incrementally rename it to no_wrap_p and add comment in this
> sense and handle this case, too.  It will need update at one place in ivopts where
> we are detemrining the direction of iteration:
> 
>   /* We need to know that the candidate induction variable does not overflow.
>      While more complex analysis may be used to prove this, for now just
>      check that the variable appears in the original program and that it
>      is computed in a type that guarantees no overflows.  */
>   cand_type = TREE_TYPE (cand->iv->base);
>   if (cand->pos != IP_ORIGINAL || !nowrap_type_p (cand_type))
>     return false;
> ..
>   step = int_cst_value (cand->iv->step);                                        
> ...
>   /* Determine the new comparison operator.  */                                 
>   comp = step < 0 ? GT_EXPR : LT_EXPR;                                          
> 
> (which is the only occurence of step). I guess we can add function iv_direction
> that will return -1,0,1 for monotonously decreasing, constant/unknown and
> monotonously increasing inductions.

scev provides scev_direction for this (of course we don't have the chrec
anymore here).

Richard.

> 
> Honza
> 
> 	* tree-scalar-evolution.h (iv_can_overflow_p): Declare.
> 	* tree-scalar-evolution.c (iv_can_overflow_p): New funcition.
> 	(simple_iv): Use it.
> 	* tree-ssa-loop-niter.c (nowrap_type_p): Use ANY_INTEGRAL_TYPE_P
> 
> 	* gcc.dg/tree-ssa/scev-14.c: New testcase.
> Index: tree-scalar-evolution.h
> ===================================================================
> --- tree-scalar-evolution.h	(revision 237798)
> +++ tree-scalar-evolution.h	(working copy)
> @@ -38,6 +38,7 @@ extern unsigned int scev_const_prop (voi
>  extern bool expression_expensive_p (tree);
>  extern bool simple_iv (struct loop *, struct loop *, tree, struct affine_iv *,
>  		       bool);
> +extern bool iv_can_overflow_p (struct loop *, tree, tree, tree);
>  extern tree compute_overall_effect_of_inner_loop (struct loop *, tree);
>  
>  /* Returns the basic block preceding LOOP, or the CFG entry block when
> Index: tree-scalar-evolution.c
> ===================================================================
> --- tree-scalar-evolution.c	(revision 237798)
> +++ tree-scalar-evolution.c	(working copy)
> @@ -3309,6 +3310,70 @@ scev_reset (void)
>      }
>  }
>  
> +/* Return true if the IV calculation in TYPE can overflow based on the knowledge
> +   of the upper bound on the number of iterations of LOOP, the BASE and STEP
> +   of IV.
> +
> +   We do not use information whether TYPE can overflow so it is safe to
> +   use this test even for derived IVs not computed every iteration or
> +   hypotetical IVs to be inserted into code.  */
> +
> +bool
> +iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step)
> +{
> +  widest_int nit, base_min, base_max, step_min, step_max, type_min, type_max;
> +  wide_int min, max;
> +  tree maxt, mint;
> +
> +  if (TREE_CODE (base) == INTEGER_CST)
> +    base_min = base_max = wi::to_widest (base);
> +  else if (TREE_CODE (base) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base))
> +	   && get_range_info (base, &min, &max) == VR_RANGE)
> +    {
> +      base_min = widest_int::from (min, TYPE_SIGN (TREE_TYPE (base)));
> +      base_max = widest_int::from (max, TYPE_SIGN (TREE_TYPE (base)));
> +    }
> +  else
> +    return true;
> +
> +  if (TREE_CODE (step) == INTEGER_CST)
> +    step_min = step_max = wi::to_widest (step);
> +  else if (TREE_CODE (step) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base))
> +	   && get_range_info (step, &min, &max) == VR_RANGE)
> +    {
> +      step_min = widest_int::from (min, TYPE_SIGN (TREE_TYPE (step)));
> +      step_max = widest_int::from (max, TYPE_SIGN (TREE_TYPE (step)));
> +    }
> +  else
> +    return true;
> +
> +  if (!get_max_loop_iterations (loop, &nit))
> +    /* 10GHz CPU runing one cycle loop will reach 2^60 iterations in 260
> +       years.  I won't live long enough to be forced to fix the
> +       miscompilation.  Having the limit here will let us to consider 
> +       64bit IVs with base 0 and step 1...16 as non-wrapping which makes
> +       niter and ivopts go smoother.  */
> +    nit = ((widest_int)1 << 60);
> +
> +  if (INTEGRAL_TYPE_P (type))
> +    {
> +      maxt = TYPE_MAX_VALUE (type);
> +      mint = TYPE_MIN_VALUE (type);
> +    }
> +  else
> +    {
> +      maxt = upper_bound_in_type (type, type);
> +      mint = lower_bound_in_type (type, type);
> +    }
> +
> +  type_min = wi::to_widest (mint);
> +  type_max = wi::to_widest (maxt);
> +  if ((base_max + step_max * (nit + 1)) > (type_max)
> +      || type_min > (base_min + step_min * (nit + 1)))
> +    return true;
> +  return false;
> +}
> +
>  /* Checks whether use of OP in USE_LOOP behaves as a simple affine iv with
>     respect to WRTO_LOOP and returns its base and step in IV if possible
>     (see analyze_scalar_evolution_in_loop for more details on USE_LOOP
> @@ -3375,8 +3440,12 @@ simple_iv (struct loop *wrto_loop, struc
>    if (tree_contains_chrecs (iv->base, NULL))
>      return false;
>  
> -  iv->no_overflow = (!folded_casts && ANY_INTEGRAL_TYPE_P (type)
> -		     && TYPE_OVERFLOW_UNDEFINED (type));
> +  iv->no_overflow = !folded_casts && nowrap_type_p (type);
> +
> +  if (!iv->no_overflow
> +      && !iv_can_overflow_p (wrto_loop, type, iv->base, iv->step))
> +    iv->no_overflow = true;
> +
>  
>    /* Try to simplify iv base:
>  
> Index: tree-ssa-loop-niter.c
> ===================================================================
> --- tree-ssa-loop-niter.c	(revision 237798)
> +++ tree-ssa-loop-niter.c	(working copy)
> @@ -4105,7 +4105,7 @@ n_of_executions_at_most (gimple *stmt,
>  bool
>  nowrap_type_p (tree type)
>  {
> -  if (INTEGRAL_TYPE_P (type)
> +  if (ANY_INTEGRAL_TYPE_P (type)
>        && TYPE_OVERFLOW_UNDEFINED (type))
>      return true;
>  
> Index: testsuite/gcc.dg/tree-ssa/scev-14.c
> ===================================================================
> --- testsuite/gcc.dg/tree-ssa/scev-14.c	(revision 0)
> +++ testsuite/gcc.dg/tree-ssa/scev-14.c	(working copy)
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
> +int a[100];
> +void t(unsigned int n)
> +{
> +  unsigned int i;
> +  for (i=0; i<n; i++)
> +     a[i]++;
> +}
> +/* { dg-final { scan-tree-dump "Overflowness wrto loop niter:	No-overflow"  "ivopts" } } */
> +/* { dg-final { scan-tree-dump-not "Overflowness wrto loop niter:	Overflow" "ivopts" } } */
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Follow-Ups:
- Re: Determine more IVs to be non-overflowing
  - From: Jan Hubicka
References:
- Determine more IVs to be non-overflowing
  - From: Jan Hubicka
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]