[PATCH][RFC] Do some vectorizer-friendly canonicalization before vectorization
Dorit Nuzman
DORIT@il.ibm.com
Wed Nov 22 22:23:00 GMT 2006
> Richard Guenther <rguenther@suse.de> wrote on 21/11/2006 13:21:56:
>
> >
> > Ok, while the "pattern" is not exactly a pattern but only one
> > instruction,
>
> it's a very simple pattern indeed...
>
> > here's the pattern variant.
>
> thanks!
>
Hi Richard,
I didn't take a close enough look at the testcases you added, and missed
the fact that they would fail on ppc:
Running /home/dorit/mainline_dn/gcc/gcc/testsuite/gcc.dg/vect/vect.exp ...
FAIL: gcc.dg/vect/vect-pow-1.c scan-tree-dump pattern recognized
FAIL: gcc.dg/vect/vect-pow-2.c scan-tree-dump pattern recognized
There are a few issues here:
(1) The tests fail on ppc because there's no vector support for doubles, so
vectorization gives up even before it gets to the pattern_recog pass. If
you want to leave the test as is (operating on doubles) you should probably
add this:
/* { dg-require-effective-target vect_double } */
(2) We usually don't set the compilation flags explicitly in the testcase,
like you have in vect-pow-*.c:
/* { dg-options "-O2 -ftree-vectorize -fno-math-errno
-fdump-tree-vect-details" } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math
-fdump-tree-vect-details" } */
because vect.exp has this entire logic to determine which compilation flags
to use according to the target (for example, in the case of ppc you also
need to specify -maltivec). So what we do when we want to use additional
flags - like your -fast-math and -fno-math-errno - is use the prefix of the
testcase name to indicate which additional flags to use. E.g., all the
tests we want to compile with -fast-math are prefixed with
fast-math-vect-*.c. So for fast-math the required bits in vect.exp are
already there (you just need to change the name of the testcase). For
-fno-math-errno you also need to add this in vect.exp:
Index: vect.exp
===================================================================
--- vect.exp (revision 119088)
+++ vect.exp (working copy)
@@ -133,6 +133,12 @@
dg-runtest [lsort [glob -nocomplain
$srcdir/$subdir/no-section-anchors-*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
+# -fno-math-errno
+set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
+lappend DEFAULT_VECTCFLAGS "-fno-math-errno"
+dg-runtest [lsort [glob -nocomplain
$srcdir/$subdir/no-math-errno-*.\[cS\]]] \
+ "" $DEFAULT_VECTCFLAGS
+
# -funswitch-loops tests
set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
lappend DEFAULT_VECTCFLAGS "-funswitch-loops"
(3) With respect to checking target support for the pattern: the general
scheme of things is that each 'vect_recog_X_pattern' function does one of
the following (this is documented in the code, but maybe not clearly
enough):
- option1 (should be the default): leave it for the caller -
'vect_pattern_recog_1' - to check target support. In this case
'vect_recog_X_pattern' initializes 'type_in' to the original (scalar) type
of the arguments of the pattern.
- option2: if the above is not applicable, the 'vect_recog_X_pattern'
function verifies target support for the pattern by itself, and initializes
'type_in' to the relevant vector_type.
What you have in 'vect_recog_pow_pattern' is that you don't check target
support, but you do initialize 'type_in' to a vector type:
*type_in = get_vectype_for_scalar_type (TREE_TYPE (base));
which in turn makes 'vect_pattern_recog_1' think that it doesn't need to
check target support for you.
I think in your case the first option applies for the mult, but the second
option applies for the sqrt (cause the target-support check in
'vect_pattern_recog_1' is not general enough - doesn't handle a
pattern_expr that is a function call).
(4) Another bit to consider: if you want to check only whether a pattern
was detected, regardless of available target support, you'd want to have
another printout before the target-support check, like the other
'vect_recog_X_pattern' functions have (otherwise you may need to add a
target keyword to be able to expect on which targets the tests succeed or
not).
So, taking the last two points into account, a fix to the
tree-vect-patterns.c part of your patch would look something like the
following (this code was not built/tested):
Index: tree-vect-patterns.c
===================================================================
--- tree-vect-patterns.c (revision 119088)
+++ tree-vect-patterns.c (working copy)
@@ -434,6 +434,7 @@
tree expr;
tree type;
tree fn, arglist, base, exp;
+ tree pattern_expr;
if (TREE_CODE (last_stmt) != MODIFY_EXPR)
return NULL;
@@ -474,7 +475,16 @@
&& tree_low_cst (exp, 0) == 2)
|| (TREE_CODE (exp) == REAL_CST
&& REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconst2)))
- return build2 (MULT_EXPR, TREE_TYPE (base), base, base);
+ {
+ pattern_expr = build2 (MULT_EXPR, TREE_TYPE (base), base, base);
+ if (vect_print_dump_info (REPORT_DETAILS))
+ {
+ fprintf (vect_dump, "vect_recog_pow_pattern: detected: ");
+ print_generic_expr (vect_dump, pattern_expr, TDF_SLIM);
+ }
+ *type_in = TREE_TYPE (base);
+ return pattern_expr;
+ }
/* Catch square root. */
if (TREE_CODE (exp) == REAL_CST
@@ -482,10 +492,22 @@
{
tree newfn = mathfn_built_in (TREE_TYPE (base), BUILT_IN_SQRT);
tree newarglist = build_tree_list (NULL_TREE, base);
- return build_function_call_expr (newfn, newarglist);
+
+ pattern_expr = build_function_call_expr (newfn, newarglist);
+ if (vect_print_dump_info (REPORT_DETAILS))
+ {
+ fprintf (vect_dump, "vect_recog_pow_pattern: detected: ");
+ print_generic_expr (vect_dump, pattern_expr, TDF_SLIM);
+ }
+
+ if ( supported by target ) /* TODO */
+ {
+ *type_in = get_vectype_for_scalar_type (TREE_TYPE (base));
+ return pattern_expr;
+ }
}
return NULL_TREE;
and then in your testcases you can choose to have this change if you don't
care about actual target support:
@@ -10,5 +11,5 @@
x[i] = __builtin_pow (x[i], 0.5);
}
-/* { dg-final { scan-tree-dump "pattern recognized" "vect" } } */
+/* { dg-final { scan-tree-dump "vect_recog_pow_pattern: detected" "vect" }
} */
/* { dg-final { cleanup-tree-dump "vect" } } */
(or leave it as is if you do).
Sorry I didn't notice these points before. I can test the fixed patch on
ppc if that helps.
dorit
> dorit
>
> > The sqrt transformation
> > will only be recognized after someone approves the vectorization
> > of builtins and I dig out my i386 backend patch to enable the
> > use of __builtin_ia32_sqrtpd and other SSE intrinsics we have there.
> >
> > Richard.
> >
> > 2006-11-21 Richard Guenther <rguenther@suse.de>
> >
> > * tree-vectorizer.h (NUM_PATTERNS): Increase.
> > * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
> > vect_recog_pow_pattern.
> > (vect_recog_pow_pattern): New function.
> >
> > * gcc.dg/vect/vect-pow-1.c: New testcase.
> > * gcc.dg/vect/vect-pow-2.c: Likewise.
> >
> > Index: tree-vect-patterns.c
> > ===================================================================
> > *** tree-vect-patterns.c (revision 119016)
> > --- tree-vect-patterns.c (working copy)
> > *************** static bool widened_name_p (tree, tree,
> > *** 50,59 ****
> > static tree vect_recog_widen_sum_pattern (tree, tree *, tree *);
> > static tree vect_recog_widen_mult_pattern (tree, tree *, tree *);
> > static tree vect_recog_dot_prod_pattern (tree, tree *, tree *);
> > static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] =
{
> > vect_recog_widen_mult_pattern,
> > vect_recog_widen_sum_pattern,
> > ! vect_recog_dot_prod_pattern};
> >
> >
> > /* Function widened_name_p
> > --- 50,61 ----
> > static tree vect_recog_widen_sum_pattern (tree, tree *, tree *);
> > static tree vect_recog_widen_mult_pattern (tree, tree *, tree *);
> > static tree vect_recog_dot_prod_pattern (tree, tree *, tree *);
> > + static tree vect_recog_pow_pattern (tree, tree *, tree *);
> > static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] =
{
> > vect_recog_widen_mult_pattern,
> > vect_recog_widen_sum_pattern,
> > ! vect_recog_dot_prod_pattern,
> > ! vect_recog_pow_pattern};
> >
> >
> > /* Function widened_name_p
> > *************** vect_recog_widen_mult_pattern (tree last
> > *** 400,405 ****
> > --- 402,494 ----
> > }
> >
> >
> > + /* Function vect_recog_pow_pattern
> > +
> > + Try to find the following pattern:
> > +
> > + x = POW (y, N);
> > +
> > + with POW being one of pow, powf, powi, powif and N being
> > + either 2 or 0.5.
> > +
> > + Input:
> > +
> > + * LAST_STMT: A stmt from which the pattern search begins.
> > +
> > + Output:
> > +
> > + * TYPE_IN: The type of the input arguments to the pattern.
> > +
> > + * TYPE_OUT: The type of the output of this pattern.
> > +
> > + * Return value: A new stmt that will be used to replace the
sequence
> of
> > + stmts that constitute the pattern. In this case it will be:
> > + x * x
> > + or
> > + sqrt (x)
> > + */
> > +
> > + static tree
> > + vect_recog_pow_pattern (tree last_stmt, tree *type_in, tree
*type_out)
> > + {
> > + tree expr;
> > + tree type;
> > + tree fn, arglist, base, exp;
> > +
> > + if (TREE_CODE (last_stmt) != MODIFY_EXPR)
> > + return NULL;
> > +
> > + expr = TREE_OPERAND (last_stmt, 1);
> > + type = TREE_TYPE (expr);
> > +
> > + if (TREE_CODE (expr) != CALL_EXPR)
> > + return NULL_TREE;
> > +
> > + fn = get_callee_fndecl (expr);
> > + arglist = TREE_OPERAND (expr, 1);
> > + switch (DECL_FUNCTION_CODE (fn))
> > + {
> > + case BUILT_IN_POWIF:
> > + case BUILT_IN_POWI:
> > + case BUILT_IN_POWF:
> > + case BUILT_IN_POW:
> > + base = TREE_VALUE (arglist);
> > + exp = TREE_VALUE (TREE_CHAIN (arglist));
> > + if (TREE_CODE (exp) != REAL_CST
> > + && TREE_CODE (exp) != INTEGER_CST)
> > + return NULL_TREE;
> > + break;
> > +
> > + default:;
> > + return NULL_TREE;
> > + }
> > +
> > + /* We now have a pow or powi builtin function call with a constant
> > + exponent. */
> > +
> > + *type_in = get_vectype_for_scalar_type (TREE_TYPE (base));
> > + *type_out = NULL_TREE;
> > +
> > + /* Catch squaring. */
> > + if ((host_integerp (exp, 0)
> > + && TREE_INT_CST_LOW (exp) == 2)
> > + || (TREE_CODE (exp) == REAL_CST
> > + && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconst2)))
> > + return build2 (MULT_EXPR, TREE_TYPE (base), base, base);
> > +
> > + /* Catch square root. */
> > + if (TREE_CODE (exp) == REAL_CST
> > + && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconsthalf))
> > + {
> > + tree newfn = mathfn_built_in (TREE_TYPE (base), BUILT_IN_SQRT);
> > + tree newarglist = build_tree_list (NULL_TREE, base);
> > + return build_function_call_expr (newfn, newarglist);
> > + }
> > +
> > + return NULL_TREE;
> > + }
> > +
> > +
> > /* Function vect_recog_widen_sum_pattern
> >
> > Try to find the following pattern:
> > Index: tree-vectorizer.h
> > ===================================================================
> > *** tree-vectorizer.h (revision 119016)
> > --- tree-vectorizer.h (working copy)
> > *************** extern loop_vec_info vect_analyze_loop (
> > *** 357,363 ****
> > Additional pattern recognition functions can (and will) be added
> > in the future. */
> > typedef tree (* vect_recog_func_ptr) (tree, tree *, tree *);
> > ! #define NUM_PATTERNS 3
> > void vect_pattern_recog (loop_vec_info);
> >
> >
> > --- 357,363 ----
> > Additional pattern recognition functions can (and will) be added
> > in the future. */
> > typedef tree (* vect_recog_func_ptr) (tree, tree *, tree *);
> > ! #define NUM_PATTERNS 4
> > void vect_pattern_recog (loop_vec_info);
> >
> >
> > Index: testsuite/gcc.dg/vect/vect-pow-1.c
> > ===================================================================
> > *** testsuite/gcc.dg/vect/vect-pow-1.c (revision 0)
> > --- testsuite/gcc.dg/vect/vect-pow-1.c (revision 0)
> > ***************
> > *** 0 ****
> > --- 1,14 ----
> > + /* { dg-do compile } */
> > + /* { dg-options "-O2 -ftree-vectorize -ffast-math -fdump-tree-
> > vect-details" } */
> > +
> > + double x[256];
> > +
> > + void foo(void)
> > + {
> > + int i;
> > + for (i=0; i<256; ++i)
> > + x[i] = x[i] * x[i];
> > + }
> > +
> > + /* { dg-final { scan-tree-dump "pattern recognized" "vect" } } */
> > + /* { dg-final { cleanup-tree-dump "vect" } } */
> > Index: testsuite/gcc.dg/vect/vect-pow-2.c
> > ===================================================================
> > *** testsuite/gcc.dg/vect/vect-pow-2.c (revision 0)
> > --- testsuite/gcc.dg/vect/vect-pow-2.c (revision 0)
> > ***************
> > *** 0 ****
> > --- 1,14 ----
> > + /* { dg-do compile } */
> > + /* { dg-options "-O2 -ftree-vectorize -fno-math-errno -fdump-tree-
> > vect-details" } */
> > +
> > + double x[256];
> > +
> > + void foo(void)
> > + {
> > + int i;
> > + for (i=0; i<256; ++i)
> > + x[i] = __builtin_pow (x[i], 0.5);
> > + }
> > +
> > + /* { dg-final { scan-tree-dump "pattern recognized" "vect" } } */
> > + /* { dg-final { cleanup-tree-dump "vect" } } */
>
More information about the Gcc-patches
mailing list