[PATCH][RFC] Do some vectorizer-friendly canonicalization before vectorization
Richard Guenther
rguenther@suse.de
Tue Nov 21 11:34:00 GMT 2006
On Mon, 20 Nov 2006, Dorit Nuzman wrote:
> >
> > Currently, especially with -funsafe-math-optimizations, we pessimize
> > vectorization by canonicalizing multiplications to calls to pow ().
> > This patch addresses this by doing a different canonicalization (or
> > tree-level expansion dependent on how you view this) before
> > vectorization.
> >
>
> funny - we just stumbled into such an occurrence - had to avoid using
> -ffast-math in order to be able to vectorize an x*x that was converted to
> pow(x,2).
Yes, I investigated why, for polyhedron, we don't vectorize much even
with vectorization of sqrt enabled - it is precisely because of that
transformation.
>
> > This is a simple prototype hooked into the vectorizer and only
> > transforming loop bodies. The only transformations implemented
> > for this are pow (x, 2) to x * x and pow (x, 0.5) to sqrt (x) because
> > both x * x and sqrt (x) are easy to vectorize.
> >
> > Does this look like a reasonable approach?
> >
>
> Actually, I envision this type of transformation taking place as part of
> our idiom-recognition pass in the vectorizer. In vect_pattern_recog() we
> already scan all the stmts in the loop, looking for a certain pattern
> (dot-product, widening-multiplication, maybe saturation in the future, and
> possibly pow in this case), and replace it with a new stmt (the
> 'pattern_stmt') that represents/implements the pattern (multiply/sqrt in
> this case). In fact, we don't really replace the original stmts - we add
> the 'pattern_stmt' with its def unused, and just mark the original stmts
> that they were recognized as part of a pattern to be replaced by the
> 'pattern_stmt'. Later on, the vectorizer knows to vectorize the
> pattern_stmt rather than the original stmts. So, if the loop doesn't get
> vectorized, the code doesn't change. This is explained in detail in
> tree-vect-patterns.c.
>
> If you agree that this transformation fits with the vect_pattern_recog
> approach, what you need to do is basically:
> - update VECT_NUM_PATTERNS
> - add a new function in tree-vect-patterns.c, say -
> "vect_recog_unsafe_math_patterns".
> - add the above function to the initialization of the
> vect_vect_recog_func_ptrs array
>
> The content of vect_recog_unsafe_math_patterns would basically be your
> maybe_replace_pow_expr function, except instead of replacing the expr, just
> return it.
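For illustration only, the scheme described above (a fixed-size table of
recognizer functions, each returning a replacement or nothing, with the
original statement left untouched) can be modelled outside of GCC roughly
as follows. This is a minimal sketch with toy types; struct stmt,
recog_func, TOY_NUM_PATTERNS, toy_recog_funcs and toy_pattern_recog are
names made up here and are not part of tree-vect-patterns.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy statement (hypothetical, not a GCC tree): a call to CALLEE with a
   constant exponent, plus a side link to the statement that should be
   vectorized in its place.  */
struct stmt
{
  const char *callee;
  double exponent;
  const char *pattern_stmt;	/* NULL until a pattern is recognized.  */
};

/* A recognizer returns the replacement, or NULL if it does not match.  */
typedef const char *(*recog_func) (const struct stmt *);

static const char *
recog_square (const struct stmt *s)
{
  if (strcmp (s->callee, "pow") == 0 && s->exponent == 2.0)
    return "x * x";
  return NULL;
}

static const char *
recog_sqrt (const struct stmt *s)
{
  if (strcmp (s->callee, "pow") == 0 && s->exponent == 0.5)
    return "sqrt (x)";
  return NULL;
}

/* Adding a recognizer means bumping the count and extending the table.  */
#define TOY_NUM_PATTERNS 2
static recog_func toy_recog_funcs[TOY_NUM_PATTERNS] = {
  recog_square,
  recog_sqrt
};

/* Try each recognizer in turn.  The original statement is not rewritten;
   only the side link is set.  Returns 1 if some pattern matched.  */
static int
toy_pattern_recog (struct stmt *s)
{
  size_t i;

  assert (s != NULL);
  for (i = 0; i < TOY_NUM_PATTERNS; i++)
    {
      const char *replacement = toy_recog_funcs[i] (s);
      if (replacement != NULL)
	{
	  s->pattern_stmt = replacement;
	  return 1;
	}
    }
  return 0;
}
```

Because only the side link is written, a loop that ends up not being
vectorized still sees its original call, which is the property described
in the quoted text.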
Ok, while the "pattern" is not exactly a pattern but only one
instruction, here's the pattern variant. The sqrt transformation
will only be recognized after someone approves the vectorization
of builtins and I dig out my i386 backend patch to enable the
use of __builtin_ia32_sqrtpd and other SSE intrinsics we have there.
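As a rough illustration of the decision the recognizer in the patch below
makes (this is not GCC code; every toy_ name is made up for this sketch):
the callee must be one of the pow-family builtins, the exponent must be a
constant, an integer or real 2 selects the multiply, and a real 0.5
selects the square root.

```c
/* Hypothetical stand-ins for the builtin codes and constant kinds.  */
enum toy_builtin { TOY_POW, TOY_POWF, TOY_POWI, TOY_POWIF, TOY_OTHER };
enum toy_cst_kind { TOY_INT_CST, TOY_REAL_CST, TOY_NOT_CST };
enum toy_rewrite { TOY_NONE, TOY_SQUARE, TOY_SQRT };

/* Toy exponent operand: only the field selected by KIND is meaningful.  */
struct toy_exp
{
  enum toy_cst_kind kind;
  long ival;
  double rval;
};

/* Decide which rewrite, if any, applies to FN (base, EXP).  */
static enum toy_rewrite
toy_classify_pow (enum toy_builtin fn, struct toy_exp exp)
{
  /* Only the pow family with a constant exponent is of interest.  */
  if (fn == TOY_OTHER || exp.kind == TOY_NOT_CST)
    return TOY_NONE;

  /* Squaring: the exponent may be an integer (powi) or a real (pow).  */
  if ((exp.kind == TOY_INT_CST && exp.ival == 2)
      || (exp.kind == TOY_REAL_CST && exp.rval == 2.0))
    return TOY_SQUARE;

  /* Square root: only a real constant can be 0.5.  */
  if (exp.kind == TOY_REAL_CST && exp.rval == 0.5)
    return TOY_SQRT;

  return TOY_NONE;
}
```

Any other constant exponent falls through to TOY_NONE, matching the
"return nothing, leave the statement alone" behaviour of a recognizer
that does not match.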
Richard.
2006-11-21 Richard Guenther <rguenther@suse.de>
* tree-vectorizer.h (NUM_PATTERNS): Increase.
* tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
vect_recog_pow_pattern.
(vect_recog_pow_pattern): New function.
* gcc.dg/vect/vect-pow-1.c: New testcase.
* gcc.dg/vect/vect-pow-2.c: Likewise.
Index: tree-vect-patterns.c
===================================================================
*** tree-vect-patterns.c (revision 119016)
--- tree-vect-patterns.c (working copy)
*************** static bool widened_name_p (tree, tree,
*** 50,59 ****
static tree vect_recog_widen_sum_pattern (tree, tree *, tree *);
static tree vect_recog_widen_mult_pattern (tree, tree *, tree *);
static tree vect_recog_dot_prod_pattern (tree, tree *, tree *);
static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
vect_recog_widen_mult_pattern,
vect_recog_widen_sum_pattern,
! vect_recog_dot_prod_pattern};
/* Function widened_name_p
--- 50,61 ----
static tree vect_recog_widen_sum_pattern (tree, tree *, tree *);
static tree vect_recog_widen_mult_pattern (tree, tree *, tree *);
static tree vect_recog_dot_prod_pattern (tree, tree *, tree *);
+ static tree vect_recog_pow_pattern (tree, tree *, tree *);
static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
vect_recog_widen_mult_pattern,
vect_recog_widen_sum_pattern,
! vect_recog_dot_prod_pattern,
! vect_recog_pow_pattern};
/* Function widened_name_p
*************** vect_recog_widen_mult_pattern (tree last
*** 400,405 ****
--- 402,494 ----
}
+ /* Function vect_recog_pow_pattern
+
+ Try to find the following pattern:
+
+ x = POW (y, N);
+
+ with POW being one of pow, powf, powi, powif and N being
+ either 2 or 0.5.
+
+ Input:
+
+ * LAST_STMT: A stmt from which the pattern search begins.
+
+ Output:
+
+ * TYPE_IN: The type of the input arguments to the pattern.
+
+ * TYPE_OUT: The type of the output of this pattern.
+
+ * Return value: A new stmt that will be used to replace the sequence of
+ stmts that constitute the pattern. In this case it will be:
+ y * y
+ or
+ sqrt (y)
+ */
+
+ static tree
+ vect_recog_pow_pattern (tree last_stmt, tree *type_in, tree *type_out)
+ {
+ tree expr;
+ tree type;
+ tree fn, arglist, base, exp;
+
+ if (TREE_CODE (last_stmt) != MODIFY_EXPR)
+ return NULL_TREE;
+
+ expr = TREE_OPERAND (last_stmt, 1);
+ type = TREE_TYPE (expr);
+
+ if (TREE_CODE (expr) != CALL_EXPR)
+ return NULL_TREE;
+
+ fn = get_callee_fndecl (expr);
+ arglist = TREE_OPERAND (expr, 1);
+ switch (DECL_FUNCTION_CODE (fn))
+ {
+ case BUILT_IN_POWIF:
+ case BUILT_IN_POWI:
+ case BUILT_IN_POWF:
+ case BUILT_IN_POW:
+ base = TREE_VALUE (arglist);
+ exp = TREE_VALUE (TREE_CHAIN (arglist));
+ if (TREE_CODE (exp) != REAL_CST
+ && TREE_CODE (exp) != INTEGER_CST)
+ return NULL_TREE;
+ break;
+
+ default:
+ return NULL_TREE;
+ }
+
+ /* We now have a pow or powi builtin function call with a constant
+ exponent. */
+
+ *type_in = get_vectype_for_scalar_type (TREE_TYPE (base));
+ *type_out = NULL_TREE;
+
+ /* Catch squaring. */
+ if ((host_integerp (exp, 0)
+ && TREE_INT_CST_LOW (exp) == 2)
+ || (TREE_CODE (exp) == REAL_CST
+ && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconst2)))
+ return build2 (MULT_EXPR, TREE_TYPE (base), base, base);
+
+ /* Catch square root. */
+ if (TREE_CODE (exp) == REAL_CST
+ && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconsthalf))
+ {
+ tree newfn = mathfn_built_in (TREE_TYPE (base), BUILT_IN_SQRT);
+ tree newarglist = build_tree_list (NULL_TREE, base);
+ return build_function_call_expr (newfn, newarglist);
+ }
+
+ return NULL_TREE;
+ }
+
+
/* Function vect_recog_widen_sum_pattern
Try to find the following pattern:
Index: tree-vectorizer.h
===================================================================
*** tree-vectorizer.h (revision 119016)
--- tree-vectorizer.h (working copy)
*************** extern loop_vec_info vect_analyze_loop (
*** 357,363 ****
Additional pattern recognition functions can (and will) be added
in the future. */
typedef tree (* vect_recog_func_ptr) (tree, tree *, tree *);
! #define NUM_PATTERNS 3
void vect_pattern_recog (loop_vec_info);
--- 357,363 ----
Additional pattern recognition functions can (and will) be added
in the future. */
typedef tree (* vect_recog_func_ptr) (tree, tree *, tree *);
! #define NUM_PATTERNS 4
void vect_pattern_recog (loop_vec_info);
Index: testsuite/gcc.dg/vect/vect-pow-1.c
===================================================================
*** testsuite/gcc.dg/vect/vect-pow-1.c (revision 0)
--- testsuite/gcc.dg/vect/vect-pow-1.c (revision 0)
***************
*** 0 ****
--- 1,14 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -ftree-vectorize -ffast-math -fdump-tree-vect-details" } */
+
+ double x[256];
+
+ void foo(void)
+ {
+ int i;
+ for (i=0; i<256; ++i)
+ x[i] = x[i] * x[i];
+ }
+
+ /* { dg-final { scan-tree-dump "pattern recognized" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-pow-2.c
===================================================================
*** testsuite/gcc.dg/vect/vect-pow-2.c (revision 0)
--- testsuite/gcc.dg/vect/vect-pow-2.c (revision 0)
***************
*** 0 ****
--- 1,14 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -ftree-vectorize -fno-math-errno -fdump-tree-vect-details" } */
+
+ double x[256];
+
+ void foo(void)
+ {
+ int i;
+ for (i=0; i<256; ++i)
+ x[i] = __builtin_pow (x[i], 0.5);
+ }
+
+ /* { dg-final { scan-tree-dump "pattern recognized" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */