[PATCH][RFC] Do some vectorizer-friendly canonicalization before vectorization

Richard Guenther rguenther@suse.de
Tue Nov 21 11:34:00 GMT 2006


On Mon, 20 Nov 2006, Dorit Nuzman wrote:

> >
> > Currently, especially with -funsafe-math-optimizations, we pessimize
> > vectorization by canonicalizing multiplications to calls to pow ().
> > This patch addresses this by doing a different canonicalization (or
> > tree-level expansion dependent on how you view this) before
> > vectorization.
> >
> 
> funny - we just stumbled into such an occurrence - had to avoid using
> -fast-math in order to be able to vectorize an x*x that was converted to
> pow(x,2).

Yes, I investigated why for polyhedron even with vectorization for
sqrt enabled we don't do too much vectorization - we don't just because
of that transformation.

> 
> > This is a simple prototype hooked into the vectorizer and only
> > transforming loop bodies.  The only transformations implemented
> > for this are pow (x, 2) to x * x and pow (x, 0.5) to sqrt (x) because
> > both x * x and sqrt (x) are easy to vectorize.
> >
> > Does this look like a reasonable approach?
> >
> 
> Actually, I envision this type of transformation taking place as part of
> our idiom-recognition pass in the vectorizer. In vect_pattern_recog() we
> already scan all the stmts in the loop, looking for a certain pattern
> (dot-product, widening-multiplication, maybe saturation in the future, and
> possibly pow in this case), and replace it with a new stmt (the
> 'pattern_stmt') that represents/implements the pattern (multiply/sqrt in
> this case). In fact, we don't really replace the original stmts - we add
> the 'pattern_stmt' with its def unused, and just mark the original stmts
> that they were recognized as part of a pattern to be replaced by the
> 'pattern_stmt'. Later on, the vectorizer knows to vectorize the
> pattern_stmt rather than the original stmts. So, if the loop doesn't get
> vectorized, the code doesn't change. This is explained in detail in
> tree-vect-patterns.c.
> 
> If you agree that this transformation fits with the vect_pattern_recog
> approach, what you need to do is basically:
> - update VECT_NUM_PATTERNS
> - add a new function in tree-vect-patterns.c, say -
> "vect_recog_unsafe_math_patterns".
> - add the above function to the initialization of the
> vect_vect_recog_func_ptrs array
> 
> The content of vect_recog_unsafe_math_patterns would basically be your
> maybe_replace_pow_expr function, expect instead of replacing the expr, just
> return it.

Ok, while the "pattern" is not exactly a pattern but only one
instruction, here's the pattern variant.  The sqrt transformation
will only be recognized after someone approves the vectorization
of builtins and I dig out my i386 backend patch to enable the
use of __builtin_ia32_sqrtpd and other SSE intrinsics we have there.

Richard.

2006-11-21  Richard Guenther  <rguenther@suse.de>

	* tree-vectorizer.h (NUM_PATTERNS): Increase.
	* tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
	vect_recog_pow_pattern.
	(vect_recog_pow_pattern): New function.

	* gcc.dg/vect/vect-pow-1.c: New testcase.
	* gcc.dg/vect/vect-pow-2.c: Likewise.

Index: tree-vect-patterns.c
===================================================================
*** tree-vect-patterns.c	(revision 119016)
--- tree-vect-patterns.c	(working copy)
*************** static bool widened_name_p (tree, tree, 
*** 50,59 ****
  static tree vect_recog_widen_sum_pattern (tree, tree *, tree *);
  static tree vect_recog_widen_mult_pattern (tree, tree *, tree *);
  static tree vect_recog_dot_prod_pattern (tree, tree *, tree *);
  static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
  	vect_recog_widen_mult_pattern,
  	vect_recog_widen_sum_pattern,
! 	vect_recog_dot_prod_pattern};
  
  
  /* Function widened_name_p
--- 50,61 ----
  static tree vect_recog_widen_sum_pattern (tree, tree *, tree *);
  static tree vect_recog_widen_mult_pattern (tree, tree *, tree *);
  static tree vect_recog_dot_prod_pattern (tree, tree *, tree *);
+ static tree vect_recog_pow_pattern (tree, tree *, tree *);
  static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
  	vect_recog_widen_mult_pattern,
  	vect_recog_widen_sum_pattern,
! 	vect_recog_dot_prod_pattern,
! 	vect_recog_pow_pattern};
  
  
  /* Function widened_name_p
*************** vect_recog_widen_mult_pattern (tree last
*** 400,405 ****
--- 402,494 ----
  }
  
  
+ /* Function vect_recog_pow_pattern
+ 
+    Try to find the following pattern:
+ 
+      x = POW (y, N);
+ 
+    with POW being one of pow, powf, powi, powif and N being
+    either 2 or 0.5.
+ 
+    Input:
+ 
+    * LAST_STMT: A stmt from which the pattern search begins.
+ 
+    Output:
+ 
+    * TYPE_IN: The type of the input arguments to the pattern.
+ 
+    * TYPE_OUT: The type of the output of this pattern.
+ 
+    * Return value: A new stmt that will be used to replace the sequence of
+    stmts that constitute the pattern. In this case it will be:
+         x * x
+    or
+ 	sqrt (x)
+ */
+ 
+ static tree
+ vect_recog_pow_pattern (tree last_stmt, tree *type_in, tree *type_out)
+ {
+   tree expr;
+   tree type;
+   tree fn, arglist, base, exp;
+ 
+   if (TREE_CODE (last_stmt) != MODIFY_EXPR)
+     return NULL;
+ 
+   expr = TREE_OPERAND (last_stmt, 1);
+   type = TREE_TYPE (expr);
+ 
+   if (TREE_CODE (expr) != CALL_EXPR)
+     return NULL_TREE;
+ 
+   fn = get_callee_fndecl (expr);
+   arglist = TREE_OPERAND (expr, 1);
+   switch (DECL_FUNCTION_CODE (fn))
+     {
+     case BUILT_IN_POWIF:
+     case BUILT_IN_POWI:
+     case BUILT_IN_POWF:
+     case BUILT_IN_POW:
+       base = TREE_VALUE (arglist);
+       exp = TREE_VALUE (TREE_CHAIN (arglist));
+       if (TREE_CODE (exp) != REAL_CST
+ 	  && TREE_CODE (exp) != INTEGER_CST)
+         return NULL_TREE;
+       break;
+ 
+     default:;
+       return NULL_TREE;
+     }
+ 
+   /* We now have a pow or powi builtin function call with a constant
+      exponent.  */
+ 
+   *type_in = get_vectype_for_scalar_type (TREE_TYPE (base));
+   *type_out = NULL_TREE;
+ 
+   /* Catch squaring.  */
+   if ((host_integerp (exp, 0)
+        && TREE_INT_CST_LOW (exp) == 2)
+       || (TREE_CODE (exp) == REAL_CST
+           && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconst2)))
+     return build2 (MULT_EXPR, TREE_TYPE (base), base, base);
+ 
+   /* Catch square root.  */
+   if (TREE_CODE (exp) == REAL_CST
+       && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconsthalf))
+     {
+       tree newfn = mathfn_built_in (TREE_TYPE (base), BUILT_IN_SQRT);
+       tree newarglist = build_tree_list (NULL_TREE, base);
+       return build_function_call_expr (newfn, newarglist);
+     }
+ 
+   return NULL_TREE;
+ }
+ 
+ 
  /* Function vect_recog_widen_sum_pattern
  
     Try to find the following pattern:
Index: tree-vectorizer.h
===================================================================
*** tree-vectorizer.h	(revision 119016)
--- tree-vectorizer.h	(working copy)
*************** extern loop_vec_info vect_analyze_loop (
*** 357,363 ****
     Additional pattern recognition functions can (and will) be added
     in the future.  */
  typedef tree (* vect_recog_func_ptr) (tree, tree *, tree *);
! #define NUM_PATTERNS 3
  void vect_pattern_recog (loop_vec_info);
  
  
--- 357,363 ----
     Additional pattern recognition functions can (and will) be added
     in the future.  */
  typedef tree (* vect_recog_func_ptr) (tree, tree *, tree *);
! #define NUM_PATTERNS 4
  void vect_pattern_recog (loop_vec_info);
  
  
Index: testsuite/gcc.dg/vect/vect-pow-1.c
===================================================================
*** testsuite/gcc.dg/vect/vect-pow-1.c	(revision 0)
--- testsuite/gcc.dg/vect/vect-pow-1.c	(revision 0)
***************
*** 0 ****
--- 1,14 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -ftree-vectorize -ffast-math -fdump-tree-vect-details" } */
+ 
+ double x[256];
+ 
+ void foo(void)
+ {
+   int i;
+   for (i=0; i<256; ++i)
+     x[i] = x[i] * x[i];
+ }
+ 
+ /* { dg-final { scan-tree-dump "pattern recognized" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-pow-2.c
===================================================================
*** testsuite/gcc.dg/vect/vect-pow-2.c	(revision 0)
--- testsuite/gcc.dg/vect/vect-pow-2.c	(revision 0)
***************
*** 0 ****
--- 1,14 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -ftree-vectorize -fno-math-errno -fdump-tree-vect-details" } */
+ 
+ double x[256];
+ 
+ void foo(void)
+ {
+   int i;
+   for (i=0; i<256; ++i)
+     x[i] = __builtin_pow (x[i], 0.5);
+ }
+ 
+ /* { dg-final { scan-tree-dump "pattern recognized" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */



More information about the Gcc-patches mailing list