[PATCH] Fix PR82255 (vectorizer cost model overcounts some vector load costs)

Bill Schmidt wschmidt@linux.vnet.ibm.com
Tue Sep 19 17:38:00 GMT 2017


Hi,

https://gcc.gnu.org/PR82255 identifies a problem in the vector cost model
where a vectorized load is treated as having the cost of a strided load
in a case where we will not actually generate a strided load.  This is
simply a mismatch between the conditions tested in the cost model and
those tested in the code that generates vectorized instructions.  This
patch fixes the problem by recognizing when only a single non-strided
load will be generated and reporting the cost accordingly.
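
As a rough illustration of the intended rule (not the GCC code itself),
the decision can be modeled in standalone C.  The enum, the function name
needs_vec_construct, and the plain int parameters are hypothetical
stand-ins for the vectorizer's internal types; only the comparison of the
group size against the number of vector lanes is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vectorizer's memory access kinds.  */
enum access_kind
{
  ACCESS_CONTIGUOUS,
  ACCESS_ELEMENTWISE,
  ACCESS_STRIDED_SLP
};

/* Return true if the cost model should charge a vec_construct, i.e. the
   vector has to be assembled from several smaller loads.  GROUP_SIZE is
   the number of adjacent elements accessed together; NUNITS is the number
   of lanes in the vector type.  */
static bool
needs_vec_construct (enum access_kind kind, int group_size, int nunits)
{
  assert (nunits > 0);

  /* Only elementwise and strided-SLP accesses are candidates.  */
  if (kind != ACCESS_ELEMENTWISE && kind != ACCESS_STRIDED_SLP)
    return false;

  /* When the group fills the whole vector, one ordinary vector load
     suffices and nothing needs to be constructed.  */
  return group_size < nunits;
}
```

Under this model, a strided-SLP access whose group of 16 elements fills a
16-lane vector is charged no vec_construct, while a group of 4 elements
in the same vector type still is.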

I believe this patch is sufficient to catch all such cases, but I admit
that the code in vectorizable_load is complex enough that I could have
missed a trick.

I've added a test in the PowerPC cost model subdirectory.  Even though
this isn't a target-specific issue, the test does rely on a 16-byte
vector size, so placing it there seems safest.
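
The dependence on vector size can be sketched with a little arithmetic.
The helper below is hypothetical and only illustrates how the lane count
follows from the vector width and the element size; it is not part of the
patch or of GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements of ELEM_SIZE bytes that fit in a vector of
   VECTOR_BYTES bytes.  Hypothetical helper, for illustration only.  */
static size_t
lanes_per_vector (size_t vector_bytes, size_t elem_size)
{
  assert (elem_size > 0 && vector_bytes % elem_size == 0);
  return vector_bytes / elem_size;
}
```

With 16-byte vectors and unsigned char elements this gives 16 lanes,
matching the 16 iterations of the inner loop in the test.  With a 32-byte
vector it would give 32 lanes, so a group of 16 elements would presumably
no longer fill a vector and the test's expectation might not hold.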

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this ok for trunk?

Thanks!
Bill


[gcc]

2017-09-19  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR tree-optimization/82255
	* tree-vect-stmts.c (vect_model_load_cost): Don't count
	vec_construct cost when a true strided load isn't present.

[gcc/testsuite]

2017-09-19  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR tree-optimization/82255
	* gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c: New file.


Index: gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c	(working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+/* PR82255: Ensure we don't require a vec_construct cost when we aren't
+   going to generate a strided load.  */
+
+extern int abs (int __x) __attribute__ ((__nothrow__, __leaf__)) __attribute__ ((__const__));
+
+static int
+foo (unsigned char *w, int i, unsigned char *x, int j)
+{
+  int tot = 0;
+  for (int a = 0; a < 16; a++)
+    {
+      for (int b = 0; b < 16; b++)
+	tot += abs (w[b] - x[b]);
+      w += i;
+      x += j;
+    }
+  return tot;
+}
+
+void
+bar (unsigned char *w, unsigned char *x, int i, int *result)
+{
+  *result = foo (w, 16, x, i);
+}
+
+/* { dg-final { scan-tree-dump-times "vec_construct required" 0 "vect" } } */
+
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	(revision 252760)
+++ gcc/tree-vect-stmts.c	(working copy)
@@ -1091,8 +1091,20 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
 			prologue_cost_vec, body_cost_vec, true);
   if (memory_access_type == VMAT_ELEMENTWISE
       || memory_access_type == VMAT_STRIDED_SLP)
-    inside_cost += record_stmt_cost (body_cost_vec, ncopies, vec_construct,
-				     stmt_info, 0, vect_body);
+    {
+      stmt_vec_info stmt_info = vinfo_for_stmt (first_stmt);
+      int group_size = GROUP_SIZE (stmt_info);
+      int nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
+      if (group_size < nunits)
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "vect_model_load_cost: vec_construct required\n");
+	  inside_cost += record_stmt_cost (body_cost_vec, ncopies,
+					   vec_construct, stmt_info, 0,
+					   vect_body);
+	}
+    }
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,


