This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[patch, vectorizer] Fix PR tree-optimization/37194 - vectorizer cost model
- From: Ira Rosen <IRAR at il dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Wed, 7 Jan 2009 13:27:42 +0200
- Subject: [patch, vectorizer] Fix PR tree-optimization/37194 - vectorizer cost model
Hi,
This patch fixes the calculation of the scalar outside cost in the
vectorizer cost model.
The vectorizer currently handles unknown store misalignment by peeling a
statically unknown number of scalar iterations. For such cases the cost
model computes the scalar outside cost not for the original scalar loop,
but for a version that includes the run-time guards. However, when the
original number of iterations is known and no loop versioning for alignment
or aliasing is performed, the decision whether to vectorize the loop can be
made statically. Therefore, the vector cost should be compared to the cost
of the original scalar loop.
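The rule described above can be sketched roughly as the following standalone
C function. This is only an illustration, not the code in
tree-vect-transform.c: the struct, its field names, the function name and the
branch cost values are all hypothetical, and the cost returned for the
epilogue case is an assumption, since the patch hunk ends before that line.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target branch cost macros.
   The numeric values are made up for illustration.  */
enum { COND_TAKEN_BRANCH_COST = 3, COND_NOT_TAKEN_BRANCH_COST = 1 };

/* Hypothetical summary of the loop properties the cost model consults.  */
struct loop_facts
{
  bool niters_known;              /* iteration count known at compile time */
  bool versioning_for_alignment;  /* loop is versioned for alignment */
  bool versioning_for_alias;      /* loop is versioned for aliasing */
  int peeling_for_alignment;      /* negative: peel amount unknown statically */
};

/* Return the guard cost charged to the scalar version of the loop.  */
int
scalar_guard_cost (const struct loop_facts *lf)
{
  bool versioning = lf->versioning_for_alignment || lf->versioning_for_alias;

  /* Known iteration count and no versioning: the decision is made at
     compile time, so the scalar loop carries no guard cost.  */
  if (lf->niters_known && !versioning)
    return 0;

  /* The cost model check is folded into the versioning condition.  */
  if (versioning)
    return COND_NOT_TAKEN_BRANCH_COST;

  /* The cost model check is placed at prologue generation.  */
  if (lf->peeling_for_alignment < 0)
    return 2 * COND_TAKEN_BRANCH_COST + COND_NOT_TAKEN_BRANCH_COST;

  /* The cost model check is placed at epilogue generation
     (assumed cost; not shown in the patch hunk).  */
  return 2 * COND_TAKEN_BRANCH_COST;
}
```

With a known iteration count and no versioning, as in the loops of the new
test, this sketch charges no guard cost even if the peeling amount is unknown.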
Bootstrapped and tested on powerpc64-suse-linux.
O.K. for 4.4 and 4.3?
Thanks,
Ira
ChangeLog:
PR tree-optimization/37194
* tree-vect-transform.c (vect_estimate_min_profitable_iters):
Don't add the cost of cost model guard in prologue to scalar
outside cost in case of known number of iterations.
testsuite/ChangeLog:
PR tree-optimization/37194
* gcc.dg/vect/costmodel/ppc/costmodel-pr37194.c: New test.
Index: tree-vect-transform.c
===================================================================
--- tree-vect-transform.c (revision 143113)
+++ tree-vect-transform.c (working copy)
@@ -122,7 +122,6 @@ vect_estimate_min_profitable_iters (loop
int vec_outside_cost = 0;
int scalar_single_iter_cost = 0;
int scalar_outside_cost = 0;
- bool runtime_test = false;
int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
@@ -141,15 +140,7 @@ vect_estimate_min_profitable_iters (loop
return 0;
}
- /* If the number of iterations is unknown, or the
- peeling-for-misalignment amount is unknown, we will have to generate
- a runtime test to test the loop count against the threshold. */
- if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
- || (byte_misalign < 0))
- runtime_test = true;
-
/* Requires loop versioning tests to handle misalignment. */
-
if (VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo)))
{
/* FIXME: Make cost depend on complexity of individual check. */
@@ -337,7 +328,12 @@ vect_estimate_min_profitable_iters (loop
conditions/branch directions. Change the estimates below to
something more reasonable. */
- if (runtime_test)
+ /* If the number of iterations is known and we do not do versioning, we can
+ decide whether to vectorize at compile time. Hence the scalar version
+ does not carry cost model guard costs. */
+ if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+ || VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
+ || VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo)))
{
/* Cost model check occurs at versioning. */
if (VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
@@ -345,8 +341,8 @@ vect_estimate_min_profitable_iters (loop
scalar_outside_cost += TARG_COND_NOT_TAKEN_BRANCH_COST;
else
{
- /* Cost model occurs at prologue generation. */
- if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+ /* Cost model check occurs at prologue generation. */
+ if (LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0)
scalar_outside_cost += 2 * TARG_COND_TAKEN_BRANCH_COST
+ TARG_COND_NOT_TAKEN_BRANCH_COST;
/* Cost model check occurs at epilogue generation. */
Index: testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr37194.c
===================================================================
--- testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr37194.c (revision 0)
+++ testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr37194.c (revision 0)
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target vect_float } */
+/* { dg-do compile } */
+
+#include <stdlib.h>
+#include "../../tree-vect.h"
+
+__attribute__ ((noinline)) void
+ggSpectrum_Set8(float * data, float d)
+{
+ int i;
+
+ for (i = 0; i < 8; i++)
+ data[i] = d;
+}
+
+__attribute__ ((noinline)) void
+ggSpectrum_Set20(float * data, float d)
+{
+ int i;
+
+ for (i = 0; i < 20; i++)
+ data[i] = d;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+