This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch, vectorizer] Fix PR tree-optimization/37539 - hang in vect_transform_strided_load
- From: "Richard Guenther" <richard dot guenther at gmail dot com>
- To: "Ira Rosen" <IRAR at il dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Thu, 18 Sep 2008 13:13:24 +0200
- Subject: Re: [patch, vectorizer] Fix PR tree-optimization/37539 - hang in vect_transform_strided_load
- References: <OF32CCABF8.035FF919-ONC22574C8.0033A3FF-C22574C8.003B288E@il.ibm.com>
On Thu, Sep 18, 2008 at 12:46 PM, Ira Rosen <IRAR@il.ibm.com> wrote:
>
> Hi,
>
> In case of loops with multiple types the vectorizer generates "copies" of
> statements of the bigger type. Regular vectorized statements are stored in
> vectorized_stmt field of struct _stmt_vec_info of the original scalar
> statement, and the copies are stored in related_stmt field of the same
> structure. If there is more than one copy, they are kept in the chain of
> related_stmt fields. In loops with strided loads that contain loads from
> the same location, we generate only one vector statement for several scalar
> loads (for the loads with the same memory ref). The combination of all
> those cases caused the same vector statement to be stored twice in the
> related statements chain (once for each original scalar load), causing an
> infinite loop in vect_transform_strided_load.
>
> This patch prevents inserting vector statements in related chains of
> several scalar loads; we now create only one such chain, and all the scalar
> loads have access to it through their vectorized_stmt.
>
> Even though the loop in the testcase is not vectorizable with 4.3 (because
> of unsupported extract operations for x86 and unsupported type conversion
> on power), the problem exists in 4.3 too.
>
> Bootstrapped with vectorization enabled and now testing on x86_64-linux.
> O.K. to apply to 4.3 branch and trunk once the testing completes?
Ok, but ...
> Thanks,
> Ira
>
> ChangeLog:
>
> PR tree-optimization/37539
> * tree-vect-transform.c (vect_transform_strided_load): Save vector
> statement in related statement field only for the first load of the
> group of loads with the same data reference.
>
> testsuite/ChangeLog:
>
> PR tree-optimization/37539
> * gcc.dg/vect/pr37539.c: New test.
>
> Index: tree-vect-transform.c
> ===================================================================
> --- tree-vect-transform.c (revision 140444)
> +++ tree-vect-transform.c (working copy)
> @@ -5947,17 +5947,24 @@ vect_transform_strided_load (gimple stmt
> STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt)) = new_stmt;
> else
> {
> - gimple prev_stmt =
> - STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt));
> - gimple rel_stmt =
> - STMT_VINFO_RELATED_STMT (vinfo_for_stmt (prev_stmt));
> - while (rel_stmt)
> - {
> - prev_stmt = rel_stmt;
> - rel_stmt = STMT_VINFO_RELATED_STMT (vinfo_for_stmt
> (rel_stmt));
> - }
> - STMT_VINFO_RELATED_STMT (vinfo_for_stmt (prev_stmt)) =
> new_stmt;));
extra parens here? ...
> + if (!DR_GROUP_SAME_DR_STMT (vinfo_for_stmt (next_stmt)))
> + {
> + gimple prev_stmt =
> + STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt));))
and here.
> + gimple rel_stmt =
> + STMT_VINFO_RELATED_STMT (vinfo_for_stmt (prev_stmt));
> + while (rel_stmt)
> + {
> + prev_stmt = rel_stmt;
> + rel_stmt =
> + STMT_VINFO_RELATED_STMT (vinfo_for_stmt
> (rel_stmt));
> + }
> +
> + STMT_VINFO_RELATED_STMT (vinfo_for_stmt (prev_stmt)) =
> + new_stmt;
> + }
> }
> +
> next_stmt = DR_GROUP_NEXT_DR (vinfo_for_stmt (next_stmt));))
and here.
> gap_count = 1;
> /* If NEXT_STMT accesses the same DR as the previous statement,
> Index: testsuite/gcc.dg/vect/pr37539.c
> ===================================================================
> --- testsuite/gcc.dg/vect/pr37539.c (revision 0)
> +++ testsuite/gcc.dg/vect/pr37539.c (revision 0)
> @@ -0,0 +1,45 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include <stdarg.h>
> +#include "tree-vect.h"
> +
> +__attribute__ ((noinline)) void
> +ayuv2yuyv_ref (int *d, int *src, int n)
> +{
> + char *dest = (char *)d;
> + int i;
> +
> + for(i=0;i<n/2;i++){
> + dest[i*4 + 0] = (src[i*2 + 0])>>16;
> + dest[i*4 + 1] = (src[i*2 + 1])>>8;
> + dest[i*4 + 2] = (src[i*2 + 0])>>16;
> + dest[i*4 + 3] = (src[i*2 + 0])>>0;
> + }
> +
> + /* Check results. */
> + for(i=0;i<n/2;i++){
> + if (dest[i*4 + 0] != (src[i*2 + 0])>>16
> + || dest[i*4 + 1] != (src[i*2 + 1])>>8
> + || dest[i*4 + 2] != (src[i*2 + 0])>>16
> + || dest[i*4 + 3] != (src[i*2 + 0])>>0)
> + abort();
> + }
> +}
> +
> +int main ()
> +{
> + int d[256], src[128], i;
> +
> + for (i = 0; i < 128; i++)
> + src[i] = i;
> +
> + ayuv2yuyv_ref(d, src, 128);
> +
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2
> "vect" { target vect_strided_wide } } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> +
> +
> +
>
>
>
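The hang described in the quoted message can be illustrated outside of GCC
with a small model. This is only a sketch under stated assumptions: struct
vec_stmt, chain_append and simulate are hypothetical names that do not exist
in tree-vect-transform.c, the step limit stands in for the actual hang, and
the two-loads/two-copies setup is an assumed minimal scenario. The walk in
chain_append mirrors the while (rel_stmt) loop from the patch; the
guard_same_dr flag plays the role of the DR_GROUP_SAME_DR_STMT check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vector statement; RELATED models the
   related_stmt link that chains the copies together.  */
struct vec_stmt
{
  struct vec_stmt *related;
};

/* Walk to the tail of the chain starting at HEAD and link NEW_STMT
   there.  Returns 0 on success, or -1 if the walk takes more than
   LIMIT steps, which in this model means the chain contains a cycle.  */
static int
chain_append (struct vec_stmt *head, struct vec_stmt *new_stmt, int limit)
{
  struct vec_stmt *prev = head;
  struct vec_stmt *rel;
  int steps = 0;

  assert (head != NULL);
  rel = head->related;
  while (rel)
    {
      if (++steps > limit)
        return -1;
      prev = rel;
      rel = rel->related;
    }
  prev->related = new_stmt;
  return 0;
}

/* Two scalar loads share one data reference, so both see the same
   first vector statement V0, and each later copy (V1, V2) is created
   once for both of them.  With GUARD_SAME_DR == 0 every load appends
   each copy (the old behavior as described in the message); otherwise
   only the first load of the group appends.  */
int
simulate (int guard_same_dr)
{
  struct vec_stmt v0 = { NULL }, v1 = { NULL }, v2 = { NULL };
  struct vec_stmt *copies[2] = { &v1, &v2 };
  int same_dr[2] = { 0, 1 };   /* Load 1 repeats the ref of load 0.  */
  int c, l;

  for (c = 0; c < 2; c++)
    for (l = 0; l < 2; l++)
      {
        if (guard_same_dr && same_dr[l])
          continue;
        if (chain_append (&v0, copies[c], 8) != 0)
          return -1;
      }
  return 0;
}
```

Without the guard, the second append of V1 links V1 to itself, so the walk
for V2 never reaches a tail and simulate (0) reports the cycle. With the
guard, the chain is V0 -> V1 -> V2 and simulate (1) succeeds.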