[PATCH] tree-optimization/98137 - enhance split_constant_offset range handling

Richard Biener rguenther@suse.de
Fri Dec 4 11:45:52 GMT 2020


When looking through conversions of possibly wrapping operations,
split_constant_offset currently gives up on range analysis if the
downstream analysis does not yield an SSA name.  That's overly
conservative: we have a helper, determine_value_range, that can
deal with arbitrary expressions.  Use that.  This helps data
reference group analysis so the testcase is fully SLP vectorized,
making use of the whole-function "BB" vectorization capabilities
we now have.
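
To illustrate why the range matters, here is a minimal standalone
sketch; it is not the GCC implementation.  The helper name
can_split_offset and its signature are hypothetical.  It models a
32-bit unsigned index that is widened to 64 bits: the constant
offset can only be moved out of the conversion if VAR + OFF cannot
wrap in the narrow type for any VAR in the known range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not GCC code).
   VAR is a 32-bit unsigned value known to lie in [VAR_MIN, VAR_MAX].
   Return 1 if VAR + OFF cannot wrap in 32 bits for any such VAR, in
   which case (uint64_t)(VAR + OFF) equals (uint64_t)VAR + OFF and the
   constant offset can be split out of the widening conversion.  */
static int
can_split_offset (uint32_t var_min, uint32_t var_max, int32_t off)
{
  assert (var_min <= var_max);
  /* Do the arithmetic in 64 bits, where it cannot overflow.  */
  int64_t lo = (int64_t) var_min + off;
  int64_t hi = (int64_t) var_max + off;
  return lo >= 0 && hi <= (int64_t) UINT32_MAX;
}

/* The two forms of the address computation being compared.  */
static uint64_t
add_then_widen (uint32_t var, uint32_t off)
{
  return (uint64_t) (uint32_t) (var + off);
}

static uint64_t
widen_then_add (uint32_t var, uint32_t off)
{
  return (uint64_t) var + off;
}
```

In the testcase l_n * 10 lies in [0, 80] and l_m is at most 9, so
can_split_offset (0, 80, 9) holds and both forms agree.  With an
unknown range [0, UINT32_MAX] and offset 1 the check fails, and the
two forms indeed differ for VAR == UINT32_MAX.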

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for trunk?

Thanks,
Richard.

2020-12-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/98137
	* tree-data-ref.c (split_constant_offset_1): Use
	determine_value_range instead of get_range_info to handle
	arbitrary expressions.

	* gcc.dg/vect/bb-slp-pr98137.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr98137.c | 27 ++++++++++++++++++++++
 gcc/tree-data-ref.c                        | 24 +++++++++++--------
 2 files changed, 41 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr98137.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr98137.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98137.c
new file mode 100644
index 00000000000..af43a1347ca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98137.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+/* { dg-require-effective-target vect_double } */
+
+void
+gemm (const double* __restrict__ A, const double* __restrict__ B,
+      double* __restrict__ C)
+{
+  unsigned int l_m = 0;
+  unsigned int l_n = 0;
+  unsigned int l_k = 0;
+
+  for ( l_n = 0; l_n < 9; l_n++ ) {
+    /* Use -O3 so this loop is unrolled completely early.  */
+    for ( l_m = 0; l_m < 10; l_m++ ) { C[(l_n*10)+l_m] = 0.0; }
+    for ( l_k = 0; l_k < 17; l_k++ ) {
+      /* Use -O3 so this loop is unrolled completely early.  */
+      for ( l_m = 0; l_m < 10; l_m++ ) {
+        C[(l_n*10)+l_m] += A[(l_k*20)+l_m] * B[(l_n*20)+l_k];
+      }
+    }
+  }
+}
+
+/* Exact scanning is difficult but we expect all loads and stores
+   and computations to be vectorized.  */
+/* { dg-final { scan-tree-dump "optimized: basic block" "slp1" } } */
diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 3bf460cccfd..e8308ce8250 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -763,18 +763,22 @@ split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
 		tree tmp_var, tmp_off;
 		split_constant_offset (op0, &tmp_var, &tmp_off, cache, limit);
 
-		/* See whether we have an SSA_NAME whose range is known
-		   to be [A, B].  */
-		if (TREE_CODE (tmp_var) != SSA_NAME)
-		  return false;
+		/* See whether we have a known range [A, B] for tmp_var.  */
 		wide_int var_min, var_max;
-		value_range_kind vr_type = get_range_info (tmp_var, &var_min,
-							   &var_max);
-		wide_int var_nonzero = get_nonzero_bits (tmp_var);
 		signop sgn = TYPE_SIGN (itype);
-		if (intersect_range_with_nonzero_bits (vr_type, &var_min,
-						       &var_max, var_nonzero,
-						       sgn) != VR_RANGE)
+		if (TREE_CODE (tmp_var) == SSA_NAME)
+		  {
+		    value_range_kind vr_type
+		      = get_range_info (tmp_var, &var_min, &var_max);
+		    wide_int var_nonzero = get_nonzero_bits (tmp_var);
+		    if (intersect_range_with_nonzero_bits (vr_type, &var_min,
+							   &var_max,
+							   var_nonzero,
+							   sgn) != VR_RANGE)
+		      return false;
+		  }
+		else if (determine_value_range (tmp_var, &var_min, &var_max)
+			 != VR_RANGE)
 		  return false;
 
 		/* See whether the range of OP0 (i.e. TMP_VAR + TMP_OFF)
-- 
2.26.2

