[PATCH, i386]: Vectorize conversions for i386

Sun Feb 11 12:58:00 GMT 2007

> Uros Bizjak <ubizjak@gmail.com> wrote on 08/02/2007 16:50:23:
>
> > Hello!
> >
> > This patch implements conversions for i386 and x86_64 targets. Patch
>
> Cool, thanks!
>
> > builds on (yet uncommitted) patch that implements vectorized conversions
> > infrastructure by Tehila Meyzels
> > (http://gcc.gnu.org/ml/gcc-patches/2007-02/msg00494.html).
> >
>
> I plan to commit it soonish. The vectorization-of-induction patch that I
> committed a couple of days ago actually exposed a couple of problems in
> this patch - I'm fixing it now - will commit the fixed version (and submit
> the fixes to the list) this weekend.
>

Here is the patch I ended up committing.

The diffs relative to Tehila's patch are as follows:

1) Mark the arguments of the call-expr for renaming:

+      FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
+        {
+          if (TREE_CODE (sym) == SSA_NAME)
+            sym = SSA_NAME_VAR (sym);
+          mark_sym_for_renaming (sym);
+        }

This is necessary because on some targets (e.g. Altivec) the builtins we
call are not marked as readonly, which IIUC causes the compiler to create
vdefs/vuses. Some of these were not getting renamed, which resulted in an
ICE.

2) This bit was missing in the patch I was testing, which caused most of
the failures I saw:

+  /* Supportable by target?  */
+  if (!targetm.vectorize.builtin_conversion (code, vectype_in))
+    {
+      if (vect_print_dump_info (REPORT_DETAILS))
+        fprintf (vect_dump, "op not supported by target.");
+      return false;
+    }

3) Updates to the vectorizer testsuite to reflect that more loops get
vectorized.
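
To illustrate item 2, here is a minimal, self-contained model of the check.
It is not GCC code: the enums, conv_fn, target_builtin_conversion and
vectorizable_conversion_p are hypothetical stand-ins for the tree codes,
vector modes and targetm.vectorize.builtin_conversion. The only point is
that a target returning NULL for a (code, mode) pair makes the vectorizer
give up on that statement.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for GCC's tree codes and vector modes.  */
enum conv_code { CONV_INT_TO_FLOAT, CONV_FLOAT_TO_INT_TRUNC, CONV_OTHER };
enum vec_mode { MODE_V4SI, MODE_V4SF, MODE_V2DF };

/* A "builtin" in this model converts four lanes from IN to OUT.  */
typedef void (*conv_fn) (const void *in, void *out);

static void
cvt_v4si_to_v4sf (const void *in, void *out)
{
  const int *src = in;
  float *dst = out;
  for (int i = 0; i < 4; i++)
    dst[i] = (float) src[i];
}

static void
cvt_v4sf_to_v4si (const void *in, void *out)
{
  const float *src = in;
  int *dst = out;
  for (int i = 0; i < 4; i++)
    dst[i] = (int) src[i];	/* Truncates toward zero.  */
}

/* Model of the target hook: return the conversion routine for CODE
   applied to an input vector of mode MODE, or NULL if unsupported.  */
conv_fn
target_builtin_conversion (enum conv_code code, enum vec_mode mode)
{
  switch (code)
    {
    case CONV_INT_TO_FLOAT:
      return mode == MODE_V4SI ? cvt_v4si_to_v4sf : NULL;
    case CONV_FLOAT_TO_INT_TRUNC:
      return mode == MODE_V4SF ? cvt_v4sf_to_v4si : NULL;
    default:
      return NULL;
    }
}

/* Model of the vectorizer-side check: return 1 if the conversion can
   be vectorized on this target, 0 otherwise.  */
int
vectorizable_conversion_p (enum conv_code code, enum vec_mode mode)
{
  if (!target_builtin_conversion (code, mode))
    {
      fprintf (stderr, "op not supported by target.\n");
      return 0;
    }
  return 1;
}
```

In this model, vectorizable_conversion_p (CONV_INT_TO_FLOAT, MODE_V2DF)
returns 0, which mirrors what the missing check is meant to catch.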

Index: testsuite/ChangeLog
===================================================================

--- testsuite/ChangeLog (revision 121815)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,13 @@
+2007-02-11  Tehila Meyzels <tehila@il.ibm.com>
+         Dorit Nuzman  <dorit@il.ibm.com>
+
+     * gcc.dg/vect/vect-intfloat-conversion-1.c:  New test.
+     * gcc.dg/vect/vect-intfloat-conversion-2.c:  New test.
+     * gcc.dg/vect/vect-93.c: Another loop gets vectorized  on powerpc.
+     * gcc.dg/vect/vect-113.c: Likewise.
+
+     * gcc.dg/vect/vect-iv-11.c: A loop gets vectorized.
+
 2007-02-10  Richard Henderson  <rth@redhat.com>

      * lib/target-supports.exp (check_effective_target_tls): Redefine
Index: testsuite/gcc.dg/vect/vect-93.c
===================================================================
--- testsuite/gcc.dg/vect/vect-93.c (revision 121815)
+++ testsuite/gcc.dg/vect/vect-93.c (working copy)
@@ -65,12 +65,21 @@
   return 0;
 }

-/* in main1 */
-/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
+/* 2 loops vectorized in main1, 2 loops vectorized in main:
+   the first loop in main requires vectorization of conversions,
+   the second loop in main requires vectorization of misaliged load:  */
+
+/* main && main1 together: */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 2 "vect" { target powerpc*-*-* } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail vect_no_align } } } */

-/* in main */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
+/* in main1: */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target {! powerpc*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target vect_no_align } } } */
+
+/* in main: */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
+
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-iv-11.c
===================================================================
--- testsuite/gcc.dg/vect/vect-iv-11.c    (revision 121815)
+++ testsuite/gcc.dg/vect/vect-iv-11.c    (working copy)
@@ -28,5 +28,5 @@
   return 0;
 }

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-113.c
===================================================================
--- testsuite/gcc.dg/vect/vect-113.c      (revision 121815)
+++ testsuite/gcc.dg/vect/vect-113.c      (working copy)
@@ -11,7 +11,7 @@
   int i;
   float a[N];

-  /* Induction.  */
+  /* Induction and type conversion.  */
   for ( i = 0; i < N; i++)
   {
     a[i] = i;
@@ -32,5 +32,5 @@
   return main1 ();
 }

-/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target powerpc*-*-* } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */


Bootstrapped on powerpc-linux and on i386-linux.

dorit

(See attached file: induc.feb11.txt)


> dorit
>
> > Patch was bootstrapped on x86_64-pc-linux-gnu and regression tested for
> > c,c++ and gfortran.
> >
> > This patch also includes two testcases from original patch, changed to
> > account for i386 and x86_64 targets and a testcase that checks float-int
> > conversions.
> >
> > 2007-02-08  Uros Bizjak  <ubizjak@gmail.com>
> >
> >         * config/i386/i386.c (TARGET_VECTORIZE_BUILTIN_CONVERSION): Define.
> >         (ix86_builtin_conversion): New function.
> >
> > testsuite/ChangeLog:
> >
> >         * gcc.dg/vect/vect-intfloat-conversion-1.c: Scan for vectorized loop
> >         also for i?86-*-* and x86_64-*-* targets.
> >         * gcc.dg/vect/vect-intfloat-conversion-2.c: Ditto.
> >         * gcc.dg/vect/vect-floatint-conversion-1.c: New.
> >
> > Uros.
> >
> > Index: config/i386/i386.c
> > ===================================================================
> > --- config/i386/i386.c   (revision 121711)
> > +++ config/i386/i386.c   (working copy)
> > @@ -1516,6 +1516,7 @@
> >  static void ix86_init_builtins (void);
> >  static rtx ix86_expand_builtin (tree, rtx, rtx, enum machine_mode, int);
> >  static tree ix86_builtin_vectorized_function (enum built_in_function, tree, tree);
> > +static tree ix86_builtin_conversion (enum tree_code, tree);
> >  static const char *ix86_mangle_fundamental_type (tree);
> >  static tree ix86_stack_protect_fail (void);
> >  static rtx ix86_internal_arg_pointer (void);
> > @@ -1580,8 +1581,11 @@
> >  #define TARGET_INIT_BUILTINS ix86_init_builtins
> >  #undef TARGET_EXPAND_BUILTIN
> >  #define TARGET_EXPAND_BUILTIN ix86_expand_builtin
> > +
> >  #undef TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
> >  #define TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION ix86_builtin_vectorized_function
> > +#undef TARGET_VECTORIZE_BUILTIN_CONVERSION
> > +#define TARGET_VECTORIZE_BUILTIN_CONVERSION ix86_builtin_conversion
> >
> >  #undef TARGET_ASM_FUNCTION_EPILOGUE
> >  #define TARGET_ASM_FUNCTION_EPILOGUE ix86_output_function_epilogue
> > @@ -18056,6 +18060,40 @@
> >    return NULL_TREE;
> >  }
> >
> > +/* Returns a decl of a function that implements conversion of the
> > +   input vector of type TYPE, or NULL_TREE if it is not available.  */
> > +
> > +static tree
> > +ix86_builtin_conversion (enum tree_code code, tree type)
> > +{
> > +  if (TREE_CODE (type) != VECTOR_TYPE)
> > +    return NULL_TREE;
> > +
> > +  switch (code)
> > +    {
> > +    case FLOAT_EXPR:
> > +      switch (TYPE_MODE (type))
> > +   {
> > +   case V4SImode:
> > +     return ix86_builtins[IX86_BUILTIN_CVTDQ2PS];
> > +   default:
> > +     return NULL_TREE;
> > +   }
> > +
> > +    case FIX_TRUNC_EXPR:
> > +      switch (TYPE_MODE (type))
> > +   {
> > +   case V4SFmode:
> > +     return ix86_builtins[IX86_BUILTIN_CVTTPS2DQ];
> > +   default:
> > +     return NULL_TREE;
> > +   }
> > +    default:
> > +      return NULL_TREE;
> > +
> > +    }
> > +}
> > +
> >  /* Store OPERAND to the memory after reload is completed.  This means
> >     that we can't easily use assign_stack_local.  */
> >  rtx
> > Index: testsuite/gcc.dg/vect/vect-intfloat-conversion-1.c
> > ===================================================================
> > --- testsuite/gcc.dg/vect/vect-intfloat-conversion-1.c   (revision 0)
> > +++ testsuite/gcc.dg/vect/vect-intfloat-conversion-1.c   (revision 0)
> > @@ -0,0 +1,38 @@
> > +/* { dg-require-effective-target vect_int } */
> > +
> > +#include <stdarg.h>
> > +#include "tree-vect.h"
> > +
> > +#define N 32
> > +
> > +int main1 ()
> > +{
> > +  int i;
> > +  int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
> > +  float fa[N];
> > +
> > +  /* int -> float */
> > +  for (i = 0; i < N; i++)
> > +    {
> > +      fa[i] = (float) ib[i];
> > +    }
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +    {
> > +      if (fa[i] != (float) ib[i])
> > +        abort ();
> > +    }
> > +
> > +  return 0;
> > +}
> > +
> > +int main (void)
> > +{
> > +  check_vect ();
> > +
> > +  return main1 ();
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
> > +/* { dg-final { cleanup-tree-dump "vect" } } */
> > Index: testsuite/gcc.dg/vect/vect-intfloat-conversion-2.c
> > ===================================================================
> > --- testsuite/gcc.dg/vect/vect-intfloat-conversion-2.c   (revision 0)
> > +++ testsuite/gcc.dg/vect/vect-intfloat-conversion-2.c   (revision 0)
> > @@ -0,0 +1,40 @@
> > +/* { dg-require-effective-target vect_int } */
> > +
> > +#include <stdarg.h>
> > +#include "tree-vect.h"
> > +
> > +#define N 32
> > +
> > +int main1 ()
> > +{
> > +  int i;
> > +  int int_arr[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
> > +  float float_arr[N];
> > +  char char_arr[N];
> > +
> > +  for (i = 0; i < N; i++){
> > +    float_arr[i] = (float) int_arr[i];
> > +    char_arr[i] = 0;
> > +  }
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +    {
> > +      if (float_arr[i] != (float) int_arr[i])
> > +        abort ();
> > +      if (char_arr[i] != 0)
> > +   abort ();
> > +    }
> > +
> > +  return 0;
> > +}
> > +
> > +int main (void)
> > +{
> > +  check_vect ();
> > +
> > +  return main1 ();
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
> > +/* { dg-final { cleanup-tree-dump "vect" } } */
> > Index: testsuite/gcc.dg/vect/vect-floatint-conversion-1.c
> > ===================================================================
> > --- testsuite/gcc.dg/vect/vect-floatint-conversion-1.c   (revision 0)
> > +++ testsuite/gcc.dg/vect/vect-floatint-conversion-1.c   (revision 0)
> > @@ -0,0 +1,40 @@
> > +/* { dg-require-effective-target vect_float } */
> > +
> > +#include <stdarg.h>
> > +#include "tree-vect.h"
> > +
> > +#define N 32
> > +
> > +int
> > +main1 ()
> > +{
> > +  int i;
> > +  float fb[N] = {0.4,3.5,6.6,9.4,12.5,15.6,18.4,21.5,24.6,27.4,30.5,33.6,36.4,39.5,42.6,45.4,0.5,3.6,6.4,9.5,12.6,15.4,18.5,21.6,24.4,27.5,30.6,33.4,36.5,39.6,42.4,45.5};
> > +  int ia[N];
> > +
> > +  /* float -> int */
> > +  for (i = 0; i < N; i++)
> > +    {
> > +      ia[i] = (int) fb[i];
> > +    }
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +    {
> > +      if (ia[i] != (int) fb[i])
> > +   abort ();
> > +    }
> > +
> > +  return 0;
> > +}
> > +
> > +int
> > +main (void)
> > +{
> > +  check_vect ();
> > +
> > +  return main1 ();
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target i?86-*-* x86_64-*-* } } } */
> > +/* { dg-final { cleanup-tree-dump "vect" } } */
>
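
For readers who want to see what the two conversions in the quoted i386
hook compute per vector, here is a sketch using the GCC/Clang vector
extension __builtin_convertvector (an assumption on my part: it needs
GCC 9 or later and did not exist when this patch was written). The
function names int_to_float_x4 and float_to_int_x4 are made up for
illustration. This is not what the vectorizer emits; it only mirrors the
FLOAT_EXPR on V4SImode and FIX_TRUNC_EXPR on V4SFmode cases, on four
32-bit lanes at a time.

```c
#include <string.h>

/* Four 32-bit lanes in a 16-byte vector (GCC/Clang vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* int -> float on four lanes, the FLOAT_EXPR case.  */
void
int_to_float_x4 (const int *in, float *out)
{
  v4si v;
  v4sf r;

  memcpy (&v, in, sizeof v);		/* Unaligned-safe load.  */
  r = __builtin_convertvector (v, v4sf);
  memcpy (out, &r, sizeof r);
}

/* float -> int on four lanes, truncating toward zero as a C cast
   does, the FIX_TRUNC_EXPR case.  */
void
float_to_int_x4 (const float *in, int *out)
{
  v4sf v;
  v4si r;

  memcpy (&v, in, sizeof v);
  r = __builtin_convertvector (v, v4si);
  memcpy (out, &r, sizeof r);
}
```

Calling float_to_int_x4 on four elements of an array like fb in the
float-int testcase gives the same values as the scalar (int) casts in its
check loop, since both truncate toward zero.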
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: induc.feb11.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20070211/3888d2c7/attachment.txt>