[PATCH GCC]Proving no-trappness for array ref in tree if-conv using loop niter information.

Bin.Cheng amker.cheng@gmail.com
Fri May 6 09:40:00 GMT 2016


On Tue, May 3, 2016 at 11:08 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, May 3, 2016 at 12:01 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Mon, May 2, 2016 at 10:00 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>>>> Hi,
>>>>>> Tree if-conversion sometimes cannot convert conditional array reference into unconditional one.  Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping.  This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information.  To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap.  This patch also factors out base_object_writable checking if the base object is writable or not.
>>>>>> Bootstrap and test on x86_64 and aarch64, is it OK?
>>>>>
>>>>> I think you miss to handle the case optimally where the only
>>>>> non-ARRAY_REF idx is the dereference of the
>>>>> base-pointer for, say, p->a[i].  In this case we can use
>>>>> base_master_dr to see if p is unconditionally dereferenced
>>>> Yes, will pick up this case.
>>>>
>>>>> in the loop.  You also fail to handle the case where we have
>>>>> MEM_REF[&x].a[i] that is, you see a decl base.
>>>> I am having difficulty in creating this case for ifcvt, any advices?  Thanks.
>>>
>>> Sth like
>>>
>>> float a[128];
>>> float foo (int n, int i)
>>> {
>>>   return (*((float(*)[n])a))[i];
>>> }
>>>
>>> should do the trick (w/o the component-ref).  Any other type-punning
>>> would do it, too.
>>>
>>>>> I suppose for_each_index should be fixed for this particular case (to
>>>>> return true), same for TARGET_MEM_REF TMR_BASE.
>>>>>
>>>>> +  /* The case of nonconstant bounds could be handled, but it would be
>>>>> +     complicated.  */
>>>>> +  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
>>>>> +      || !high || TREE_CODE (high) != INTEGER_CST)
>>>>> +    return false;
>>>>> +
>>>>>
>>>>> handling of a non-zero but constant low bound is important - otherwise
>>>>> all this is a no-op for Fortran.  It
>>>>> shouldn't be too difficult to handle after all.  In fact I think your
>>>>> code does handle it correctly already.
>>>>>
>>>>> +  if (!init || TREE_CODE (init) != INTEGER_CST
>>>>> +      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
>>>>> +    return false;
>>>>>
>>>>> step == 0 should be easy to handle as well, no?  The index will simply
>>>>> always be 'init' ...
>>>>>
>>>>> +  /* In case the relevant bound of the array does not fit in type, or
>>>>> +     it does, but bound + step (in type) still belongs into the range of the
>>>>> +     array, the index may wrap and still stay within the range of the array
>>>>> +     (consider e.g. if the array is indexed by the full range of
>>>>> +     unsigned char).
>>>>> +
>>>>> +     To make things simpler, we require both bounds to fit into type, although
>>>>> +     there are cases where this would not be strictly necessary.  */
>>>>> +  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
>>>>> +    return false;
>>>>> +
>>>>> +  low = fold_convert (type, low);
>>>>>
>>>>> please use wide_int for all of this.
>>>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not
>>>> sure what's the meaning by "handle "low = fold_convert (type, low);"
>>>> related code in wide_int".   Do you mean to use tree_int_cst_compare
>>>> instead of tree_int_cst_compare in the following code?
>>>
>>> I don't think you need any kind of fits-to-type check here.  You'd simply
>>> use to_widest () when operating on / comparing with high/low.
>> But what would happen if low/high and init/step are different in type
>> sign-ness?  Anything special I need to do before using wi::ltu_p or
>> wi::lts_p directly?
>
> You want to use to_widest (min) which extends according to sign to
> an "infinite" precision signed integer.  So you can then use the new
> operator< overloads as well.
>
Hi,
Here is the updated patch.  It includes below changes according to
review comments:

1) It uses widest_int for all INTEGER_CST tree computations, which
simplifies the patch a lot.
2) It covers array with non-zero low bound, which is important for Fortran.
3) It picks up a boundary case so that ifc-11.c/vect-23.c/vect-24.c
can be handled.
4) It also checks within bound array reference inside a structure like
p->a[i] by using base_master_dr in tree-if-conv.c so that ifc-12.c can
be handled.

It leaves two other review comments not addressed:
1) It doesn't handle array reference whose idx is a wrapping SCEV.
Because only read is safe and vectorizer itself may be confused by it
now.
2) It doesn't handle array reference in the form of
"MEM[(float[0:(sizetype) ((long int) SAVE_EXPR <m.2> + -1)]
*)&b][i_1];". To handle this case, existing code in
array_at_struct_end_p as well as this patch need to be improved.  I
tend to handle it in an independent patch.

With these changes, now cases pr61194.c and vect-23.c can be
vectorized, I removed XFAIL from them.  Also vect-mask-store-move-1.c
is affected and not vectorized now, this is tricky and it exposes a
known bug PR65206 in vectorizer's dependence analyzer.  This should be
handled independently too.  Also gcc.dg/vect/vect-24.c now is XPASS on
AArch64, but not x86_64.  Root cause is dom2 jump threads first
iteration of loop thus idx_within_array_bound is failed.  I didn't
check if jump threading is good in this case, either I remove the
XFAIL mark.  I tend to improve idx_within_array_bound by using VRP
information in a following patch.
Otherwise bootstrap and test on x86_64 and AArch64 are fine.  Is this
version OK?

Thanks,
bin

2016-05-04  Bin Cheng  <bin.cheng@arm.com>

    * tree-if-conv.c (tree-ssa-loop.h): Include header file.
    (tree-ssa-loop-niter.h): Ditto.
    (idx_within_array_bound, ref_within_array_bound): New functions.
    (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
    Factor out check on writable base object to ...
    (base_object_writable): ... here.

gcc/testsuite/ChangeLog
2016-05-04  Bin Cheng  <bin.cheng@arm.com>

    * gcc.dg/tree-ssa/ifc-9.c: New test.
    * gcc.dg/tree-ssa/ifc-10.c: New test.
    * gcc.dg/tree-ssa/ifc-11.c: New test.
    * gcc.dg/tree-ssa/ifc-12.c: New test.
    * gcc.dg/vect/pr61194.c: Remove XFAIL.
    * gcc.dg/vect/vect-23.c: Remove XFAIL.
    * gcc.dg/vect/vect-mask-store-move-1.c: Revise test check.
-------------- next part --------------
diff --git a/gcc/testsuite/gcc.dg/vect/pr61194.c b/gcc/testsuite/gcc.dg/vect/pr61194.c
index 8d74e00..f7c71b9 100644
--- a/gcc/testsuite/gcc.dg/vect/pr61194.c
+++ b/gcc/testsuite/gcc.dg/vect/pr61194.c
@@ -38,4 +38,4 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-23.c b/gcc/testsuite/gcc.dg/vect/vect-23.c
index 44bed75..e463f1b 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-23.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-23.c
@@ -123,5 +123,5 @@ int main (void)
   return main1 ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-24.c b/gcc/testsuite/gcc.dg/vect/vect-24.c
index 09a6d7e..d27410b 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-24.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-24.c
@@ -123,5 +123,5 @@ int main (void)
   return main1 ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c b/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
index f5cae4f..f928dbf 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
@@ -15,4 +15,4 @@ void foo (int n)
       }
 }
 
-/* { dg-final { scan-tree-dump-times "Move stmt to created bb" 6 "vect" { target { i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "Move stmt to created bb" 4 "vect" { target { i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 32ced16..5527a7c 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -106,6 +106,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "tree-data-ref.h"
 #include "tree-scalar-evolution.h"
+#include "tree-ssa-loop.h"
+#include "tree-ssa-loop-niter.h"
 #include "tree-ssa-loop-ivopts.h"
 #include "tree-ssa-address.h"
 #include "dbgcnt.h"
@@ -740,6 +742,105 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
     }
 }
 
+/* Return TRUE if can prove the index IDX of an array reference REF is
+   within array bound.  Return false otherwise.  */
+
+static bool
+idx_within_array_bound (tree ref, tree *idx, void *dta)
+{
+  bool overflow;
+  widest_int niter, valid_niter, delta, wi_step;
+  tree ev, init, step;
+  tree low, high;
+  struct loop *loop = (struct loop*) dta;
+
+  /* Only support within-bound access for array references.  */
+  if (TREE_CODE (ref) != ARRAY_REF)
+    return false;
+
+  /* For arrays at the end of the structure, we are not guaranteed that they
+     do not really extend over their declared size.  However, for arrays of
+     size greater than one, this is unlikely to be intended.  */
+  if (array_at_struct_end_p (ref))
+    return false;
+
+  ev = analyze_scalar_evolution (loop, *idx);
+  ev = instantiate_parameters (loop, ev);
+  init = initial_condition (ev);
+  step = evolution_part_in_loop_num (ev, loop->num);
+
+  if (!init || TREE_CODE (init) != INTEGER_CST
+      || (step && TREE_CODE (step) != INTEGER_CST))
+    return false;
+
+  low = array_ref_low_bound (ref);
+  high = array_ref_up_bound (ref);
+
+  /* The case of nonconstant bounds could be handled, but it would be
+     complicated.  */
+  if (TREE_CODE (low) != INTEGER_CST
+      || !high || TREE_CODE (high) != INTEGER_CST)
+    return false;
+
+  /* Check if the intial idx is within bound.  */
+  if (wi::to_widest (init) < wi::to_widest (low)
+      || wi::to_widest (init) > wi::to_widest (high))
+    return false;
+
+  /* The idx is always within bound.  */
+  if (!step || integer_zerop (step))
+    return true;
+
+  if (!max_loop_iterations (loop, &niter))
+    return false;
+
+  if (wi::to_widest (step) < 0)
+    {
+      delta = wi::to_widest (init) - wi::to_widest (low);
+      wi_step = -wi::to_widest (step);
+    }
+  else
+    {
+      delta = wi::to_widest (high) - wi::to_widest (init);
+      wi_step = wi::to_widest (step);
+    }
+
+  valid_niter = wi::div_floor (delta, wi_step, SIGNED, &overflow);
+  /* The iteration space of idx is within array bound.  */
+  if (!overflow && niter <= valid_niter)
+    return true;
+
+  return false;
+}
+
+/* Return TRUE if ref is a within bound array reference.  */
+
+static bool
+ref_within_array_bound (gimple *stmt, tree ref)
+{
+  struct loop *loop = loop_containing_stmt (stmt);
+
+  gcc_assert (loop != NULL);
+  return for_each_index (&ref, idx_within_array_bound, loop);
+}
+
+
+/* Given a memory reference expression T, return TRUE if base object
+   it refers to is writable.  The base object of a memory reference
+   is the main object being referenced, which is returned by function
+   get_base_address.  */
+
+static bool
+base_object_writable (tree ref)
+{
+  tree base_tree = get_base_address (ref);
+
+  return (base_tree
+	  && DECL_P (base_tree)
+	  && decl_binds_to_current_def_p (base_tree)
+	  && !TREE_READONLY (base_tree));
+}
+
 /* Return true when the memory references of STMT won't trap in the
    if-converted code.  There are two things that we have to check for:
 
@@ -787,8 +888,13 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
   if (DR_W_UNCONDITIONALLY (*master_dr))
     return true;
 
-  /* If a is unconditionally accessed then ... */
-  if (DR_RW_UNCONDITIONALLY (*master_dr))
+  /* If a is unconditionally accessed then ...
+
+     Even a is conditional access, we can treat it as an unconditional
+     one if it's an array reference and all its index are within array
+     bound.  */
+  if (DR_RW_UNCONDITIONALLY (*master_dr)
+      || ref_within_array_bound (stmt, DR_REF (a)))
     {
       /* an unconditional read won't trap.  */
       if (DR_IS_READ (a))
@@ -799,16 +905,11 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
       if (base_master_dr
 	  && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
 	return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
-      else
-	{
-	  /* or the base is know to be not readonly.  */
-	  tree base_tree = get_base_address (DR_REF (a));
-	  if (DECL_P (base_tree)
-	      && decl_binds_to_current_def_p (base_tree)
-	      && ! TREE_READONLY (base_tree))
-	    return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
-	}
+      /* or the base is known to be not readonly.  */
+      else if (base_object_writable (DR_REF (a)))
+	return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
     }
+
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c
new file mode 100644
index 0000000..70b7422
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */
+/* { dg-require-visibility "" } */
+
+int b[256] = {0}, y;
+void bar (int *);
+int foo (int x, int n)
+{
+  int i;
+  int a[128];
+
+  for (i = 0; i < n; i++)
+    {
+      a[i] = i;
+      if (x > i)
+	b[i] = y;
+    }
+  bar (a);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c
new file mode 100644
index 0000000..bacf428
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */
+/* { dg-require-visibility "" } */
+
+int a[1024] = {0.0};
+int b[1024] = {0.0};
+int c[1024] = {0.0};
+int foo (float *x)
+{
+  int i = 0;
+
+  for (i = 0; i < 1024; i++)
+    {
+      c[i] = (x[i] > 0.0) ? a[i] : b[i];
+    }
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
new file mode 100644
index 0000000..89d42b4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */
+/* { dg-require-visibility "" } */
+
+struct st
+{
+  int a[1024];
+  int b[1024];
+};
+
+struct st s = {0};
+int foo (int x)
+{
+  int i;
+  struct st *p = &s;
+
+  for (i = 0; i < 1024; i++)
+    {
+      if (x > i)
+	p->a[i] = p->b[i];
+    }
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c
new file mode 100644
index 0000000..24c19c0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-stats" } */
+/* { dg-require-visibility "" } */
+
+extern int b[256], y;
+void bar (int *, int);
+int foo (int x, int n)
+{
+  int i;
+  int a[128];
+
+  for (i = 0; i < n; i++)
+    {
+      a[i] = i;
+      if (x > i)
+	y = b[i];
+    }
+  bar (a, y);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */


More information about the Gcc-patches mailing list