This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch, vectorizer] Fix PR tree-optimization/36648


On Mon, Jun 30, 2008 at 11:00 AM, Ira Rosen <IRAR@il.ibm.com> wrote:
>
> Hi,
>
> This patch fixes a bug in calculation of number of prolog loop iterations
> (used to align data accesses) in case of non-unit-stride access.
>
> Bootstrapped with vectorization enabled and tested on x86_64-linux. O.K.
> for 4.3 branch and mainline?

Ok.

Thanks,
Richard.

> Thanks,
> Ira
>
> ChangeLogs:
>
>      PR tree-optimization/36648
>      * tree-vect-transform.c (vect_gen_niters_for_prolog_loop): Divide
>      number of prolog iterations by step. Fix the comment.
>
>
>      PR tree-optimization/36648
>      * g++.dg/vect/pr36648.cc: New testcase.
>
> Index: tree-vect-transform.c
> ===================================================================
> --- tree-vect-transform.c       (revision 137251)
> +++ tree-vect-transform.c       (working copy)
> @@ -6725,16 +6725,14 @@ vect_do_peeling_for_loop_bound (loop_vec
>    Else, compute address misalignment in bytes:
>      addr_mis = addr & (vectype_size - 1)
>
> -   prolog_niters = min ( LOOP_NITERS , (VF - addr_mis/elem_size)&(VF-1) )
> -
> -   (elem_size = element type size; an element is the scalar element
> -       whose type is the inner type of the vectype)
> -
> -   For interleaving,
> -
> -   prolog_niters = min ( LOOP_NITERS ,
> -                        (VF/group_size - addr_mis/elem_size)&(VF/group_size-1) )
> -        where group_size is the size of the interleaved group.
> +   prolog_niters = min (LOOP_NITERS, ((VF - addr_mis/elem_size)&(VF-1))/step)
> +
> +   (elem_size = element type size; an element is the scalar element whose type
> +   is the inner type of the vectype)
> +
> +   When the step of the data-ref in the loop is not 1 (as in interleaved data
> +   and SLP), the number of iterations of the prolog must be divided by the step
> +   (which is equal to the size of interleaved group).
>
>    The above formulas assume that VF == number of elements in the vector. This
>    may not hold when there are multiple-types in the loop.
> @@ -6756,18 +6754,12 @@ vect_gen_niters_for_prolog_loop (loop_ve
>   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>   int vectype_align = TYPE_ALIGN (vectype) / BITS_PER_UNIT;
>   tree niters_type = TREE_TYPE (loop_niters);
> -  int group_size = 1;
> +  int step = 1;
>   int element_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (DR_REF (dr))));
>   int nelements = TYPE_VECTOR_SUBPARTS (vectype);
>
>   if (STMT_VINFO_STRIDED_ACCESS (stmt_info))
> -    {
> -      /* For interleaved access element size must be multiplied by the size of
> -        the interleaved group.  */
> -      group_size = DR_GROUP_SIZE (vinfo_for_stmt (
> -                                              DR_GROUP_FIRST_DR (stmt_info)));
> -      element_size *= group_size;
> -    }
> +    step = DR_GROUP_SIZE (vinfo_for_stmt (DR_GROUP_FIRST_DR (stmt_info)));
>
>   pe = loop_preheader_edge (loop);
> @@ -6778,8 +6770,9 @@ vect_gen_niters_for_prolog_loop (loop_ve
>
>       if (vect_print_dump_info (REPORT_DETAILS))
>         fprintf (vect_dump, "known alignment = %d.", byte_misalign);
> -      iters = build_int_cst (niters_type,
> -                            (nelements - elem_misalign)&(nelements/group_size-1));
> +
> +      iters = build_int_cst (niters_type,
> +                     (((nelements - elem_misalign) & (nelements - 1)) / step));
>     }
>   else
>     {
> Index: testsuite/g++.dg/vect/pr36648.cc
> ===================================================================
> --- testsuite/g++.dg/vect/pr36648.cc    (revision 0)
> +++ testsuite/g++.dg/vect/pr36648.cc    (revision 0)
> @@ -0,0 +1,24 @@
> +/* { dg-require-effective-target vect_float } */
> +
> +struct vector
> +{
> +  vector() : x(0), y(0), z(0) { }
> +  float x,y,z;
> +};
> +
> +struct Foo
> +{
> +  int dummy;
> +  /* Misaligned access.  */
> +  vector array_of_vectors[4];
> +};
> +
> +Foo foo;
> +
> +int main() { }
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } }*/
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> +
> +
>
>
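
For readers following the arithmetic, the formula in the patched comment can
be sketched as a small standalone function. This is only an illustration of
the known-misalignment case: prolog_niters below is a hypothetical helper,
not a GCC function, and its parameter names are taken from the comment
(VF, addr_mis, elem_size, step) rather than from the vectorizer's data
structures. It assumes VF is a power of two, as the mask (VF-1) requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of GCC): evaluates
     min (LOOP_NITERS, ((VF - addr_mis/elem_size) & (VF-1)) / step)
   for a misalignment that is known at compile time.  */
static unsigned long
prolog_niters (unsigned long loop_niters, unsigned vf,
               size_t addr_mis, size_t elem_size, unsigned step)
{
  /* The mask below only works when VF is a power of two.  */
  assert (vf != 0 && (vf & (vf - 1)) == 0);
  assert (elem_size != 0 && step != 0);

  /* Scalar elements between the first access and the next
     vector-aligned address.  */
  unsigned long elems_to_boundary = (vf - addr_mis / elem_size) & (vf - 1);

  /* Each scalar iteration advances by `step` elements, so fewer
     iterations are needed to reach the boundary.  */
  unsigned long iters = elems_to_boundary / step;

  return iters < loop_niters ? iters : loop_niters;
}
```

As a worked case, assume 4-byte floats, a 16-byte vector type (VF = 4), an
access that starts 4 bytes past a 16-byte boundary, and a group of three
floats per iteration (step = 3), which is roughly the shape of the testcase
if foo itself is 16-byte aligned. Then prolog_niters (4, 4, 4, 4, 3)
evaluates ((4 - 1) & 3) / 3 = 1: one scalar iteration stores 12 bytes and
leaves the next access aligned.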

