[PATCH] Fix PR82255 (vectorizer cost model overcounts some vector load costs)
Bill Schmidt
wschmidt@linux.vnet.ibm.com
Tue Sep 19 17:38:00 GMT 2017
Hi,
https://gcc.gnu.org/PR82255 identifies a problem in the vector cost model
where a vectorized load is treated as having the cost of a strided load
in a case where we will not actually generate a strided load. This is
simply a mismatch between the conditions tested in the cost model and
those tested in the code that generates vectorized instructions. This
patch fixes the problem by recognizing when only a single non-strided
load will be generated and reporting the cost accordingly.
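To make the condition concrete, here is a minimal standalone sketch of the check the patch adds. It is not the GCC code itself: the enum and the function needs_vec_construct are hypothetical stand-ins, with plain integers in place of GROUP_SIZE and TYPE_VECTOR_SUBPARTS. The assumption is that a vec_construct is only needed when the access group is smaller than the number of vector lanes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vectorizer's memory access kinds;
   only the idea is borrowed from GCC, not the real type.  */
enum access_kind
{
  ACCESS_CONTIGUOUS,
  ACCESS_ELEMENTWISE,
  ACCESS_STRIDED_SLP
};

/* Return true if the cost model should charge a vec_construct.
   GROUP_SIZE is the number of elements in the access group and
   NUNITS is the number of lanes in the vector type.  This mirrors
   the patch's "group_size < nunits" test under the assumptions
   stated above.  */
static bool
needs_vec_construct (enum access_kind kind, int group_size, int nunits)
{
  assert (group_size > 0 && nunits > 0);

  /* Only elementwise and strided-SLP accesses can need a construct.  */
  if (kind != ACCESS_ELEMENTWISE && kind != ACCESS_STRIDED_SLP)
    return false;

  /* A group that fills the whole vector is loaded in one piece.  */
  return group_size < nunits;
}
```

For the new test, assuming the inner loop yields a group of 16 unsigned chars and the vector has 16 lanes, needs_vec_construct (ACCESS_STRIDED_SLP, 16, 16) is false, so no vec_construct cost is recorded. A group of 4 with the same vector type would still be charged.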
I believe this patch is sufficient to catch all such cases, but I admit
that the code in vectorizable_load is complex enough that I could have
missed a trick.
I've added a test in the PowerPC cost model subdirectory. Even though
this isn't a target-specific issue, the test does rely on a 16-byte
vector size, so placing it there seems safest.
Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this ok for trunk?
Thanks!
Bill
[gcc]
2017-09-19 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR tree-optimization/82255
* tree-vect-stmts.c (vect_model_load_cost): Don't count
vec_construct cost when a true strided load isn't present.
[gcc/testsuite]
2017-09-19 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR tree-optimization/82255
* gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c: New file.
Index: gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c (nonexistent)
+++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c (working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+/* PR82255: Ensure we don't require a vec_construct cost when we aren't
+ going to generate a strided load. */
+
+extern int abs (int __x) __attribute__ ((__nothrow__, __leaf__)) __attribute__ ((__const__));
+
+static int
+foo (unsigned char *w, int i, unsigned char *x, int j)
+{
+ int tot = 0;
+ for (int a = 0; a < 16; a++)
+ {
+ for (int b = 0; b < 16; b++)
+ tot += abs (w[b] - x[b]);
+ w += i;
+ x += j;
+ }
+ return tot;
+}
+
+void
+bar (unsigned char *w, unsigned char *x, int i, int *result)
+{
+ *result = foo (w, 16, x, i);
+}
+
+/* { dg-final { scan-tree-dump-times "vec_construct required" 0 "vect" } } */
+
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c (revision 252760)
+++ gcc/tree-vect-stmts.c (working copy)
@@ -1091,8 +1091,20 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
prologue_cost_vec, body_cost_vec, true);
if (memory_access_type == VMAT_ELEMENTWISE
|| memory_access_type == VMAT_STRIDED_SLP)
- inside_cost += record_stmt_cost (body_cost_vec, ncopies, vec_construct,
- stmt_info, 0, vect_body);
+ {
+ stmt_vec_info stmt_info = vinfo_for_stmt (first_stmt);
+ int group_size = GROUP_SIZE (stmt_info);
+ int nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
+ if (group_size < nunits)
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+ "vect_model_load_cost: vec_construct required");
+ inside_cost += record_stmt_cost (body_cost_vec, ncopies,
+ vec_construct, stmt_info, 0,
+ vect_body);
+ }
+ }
if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,