This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] Fix PR64909


The vectorizer cost model does not deal well with targets whose scalar
stmt cost is not 1.  It passes the scalar iteration _cost_ to routines
that scale it by the target's scalar stmt cost again.  This is visible,
for example, on x86_64 for all AMD archs, which use a high scalar stmt
cost (6).
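
To make the double scaling concrete, here is a minimal sketch in C.  It
is not GCC code: model_stmt_cost, peel_cost_double_scaled and
peel_cost_count_based are hypothetical names, and the model assumes every
statement in an iteration is costed as a plain scalar stmt.

```c
/* Hypothetical stand-in for the target's scalar_stmt cost; 6 matches the
   AMD tunings mentioned above.  */
enum { SCALAR_STMT_COST = 6 };

/* Simplified model of a cost hook that takes a statement COUNT and
   multiplies it by the per-statement cost.  Not the real add_stmt_cost.  */
static int
model_stmt_cost (int count)
{
  return count * SCALAR_STMT_COST;
}

/* Peeling cost when an iteration COST is passed where a count is
   expected, so the per-statement cost is applied twice.  */
int
peel_cost_double_scaled (int peel_iters, int stmts_per_iter)
{
  int iter_cost = stmts_per_iter * SCALAR_STMT_COST;
  return model_stmt_cost (peel_iters * iter_cost);
}

/* Peeling cost when the iteration cost is first converted back to a
   statement count.  */
int
peel_cost_count_based (int peel_iters, int stmts_per_iter)
{
  int iter_cost = stmts_per_iter * SCALAR_STMT_COST;
  int iter_stmts = iter_cost / SCALAR_STMT_COST;
  return model_stmt_cost (peel_iters * iter_stmts);
}
```

With 3 peeled iterations of 2 statements each, the first function yields
216 and the second 36, i.e. the peeling cost is overestimated by a factor
equal to the scalar stmt cost.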

I am testing the following patch to fix that.  For GCC 6 we might want
to avoid the roundoff errors that the division can introduce.
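
The roundoff in question comes from integer division.  A small sketch,
with hypothetical helper names and made-up cost values (it assumes an
iteration cost need not be a multiple of the scalar stmt cost, e.g.
when loads or stores are costed differently):

```c
/* Statement-count estimate as computed by dividing the iteration cost
   by the per-statement cost; integer division truncates.  */
int
iter_stmts_estimate (int iter_cost, int stmt_cost)
{
  return iter_cost / stmt_cost;
}

/* Part of the iteration cost that is dropped once the estimate is
   scaled back up by the per-statement cost.  */
int
truncation_loss (int iter_cost, int stmt_cost)
{
  return iter_cost - iter_stmts_estimate (iter_cost, stmt_cost) * stmt_cost;
}
```

For an iteration cost of 15 and a stmt cost of 6 the estimate is 2
statements, so 3 cost units are lost; an iteration cost below the stmt
cost rounds down to 0 statements.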

Richard.

2015-02-10  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/64909
	* tree-vect-loop.c (vect_estimate_min_profitable_iters): Properly
	pass a scalar-stmt count estimate to the cost model.
	* tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Likewise.

	* gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c: New testcase.

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	(revision 220540)
+++ gcc/tree-vect-loop.c	(working copy)
@@ -2834,6 +2834,11 @@ vect_estimate_min_profitable_iters (loop
      statements.  */
 
   scalar_single_iter_cost = vect_get_single_scalar_iteration_cost (loop_vinfo);
+  /* ???  Below we use this cost as number of stmts with scalar_stmt cost,
+     thus divide by that.  This introduces rounding errors, thus better
+     introduce a new cost kind (raw_cost?  scalar_iter_cost?). */
+  int scalar_single_iter_stmts
+    = scalar_single_iter_cost / vect_get_stmt_cost (scalar_stmt);
 
   /* Add additional cost for the peeled instructions in prologue and epilogue
      loop.
@@ -2868,10 +2873,10 @@ vect_estimate_min_profitable_iters (loop
       /* FORNOW: Don't attempt to pass individual scalar instructions to
 	 the model; just assume linear cost for scalar iterations.  */
       (void) add_stmt_cost (target_cost_data,
-			    peel_iters_prologue * scalar_single_iter_cost,
+			    peel_iters_prologue * scalar_single_iter_stmts,
 			    scalar_stmt, NULL, 0, vect_prologue);
       (void) add_stmt_cost (target_cost_data, 
-			    peel_iters_epilogue * scalar_single_iter_cost,
+			    peel_iters_epilogue * scalar_single_iter_stmts,
 			    scalar_stmt, NULL, 0, vect_epilogue);
     }
   else
@@ -2887,7 +2892,7 @@ vect_estimate_min_profitable_iters (loop
 
       (void) vect_get_known_peeling_cost (loop_vinfo, peel_iters_prologue,
 					  &peel_iters_epilogue,
-					  scalar_single_iter_cost,
+					  scalar_single_iter_stmts,
 					  &prologue_cost_vec,
 					  &epilogue_cost_vec);
 
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	(revision 220540)
+++ gcc/tree-vect-data-refs.c	(working copy)
@@ -1184,10 +1206,13 @@ vect_peeling_hash_get_lowest_cost (_vect
     }
 
   single_iter_cost = vect_get_single_scalar_iteration_cost (loop_vinfo);
-  outside_cost += vect_get_known_peeling_cost (loop_vinfo, elem->npeel,
-					       &dummy, single_iter_cost,
-					       &prologue_cost_vec,
-					       &epilogue_cost_vec);
+  outside_cost += vect_get_known_peeling_cost
+    (loop_vinfo, elem->npeel, &dummy,
+     /* ???  We use this cost as number of stmts with scalar_stmt cost,
+	thus divide by that.  This introduces rounding errors, thus better 
+	introduce a new cost kind (raw_cost?  scalar_iter_cost?). */
+     single_iter_cost / vect_get_stmt_cost (scalar_stmt),
+     &prologue_cost_vec, &epilogue_cost_vec);
 
   /* Prologue and epilogue costs are added to the target model later.
      These costs depend only on the scalar iteration cost, the
Index: gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr64909.c	(working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-mtune=bdver1" } */
+
+unsigned short a[32];
+unsigned int b[32];
+void t()
+{
+  int i;
+  for (i=0;i<12;i++)
+    b[i]=a[i];
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */

