This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix PR50969
On Fri, 2012-02-03 at 14:41 +0100, Richard Guenther wrote:
> On Fri, Feb 3, 2012 at 2:24 PM, William J. Schmidt
> <wschmidt@linux.vnet.ibm.com> wrote:
> > This fixes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969 by slightly
> > raising the cost of vector permutes on powerpc64 VSX targets (and
> > ensuring those costs are correctly used). This reverses the performance
> > loss for 168.wupwise, and gives a slight boost to 433.milc as well.
> >
> > In the long run, we will want to model VSX permutes differently, since
> > the real issue is that only one floating-point pipe can hold a permute
> > at a time. Thus the present patch can be overly conservative when
> > permutes are rare compared with other vector instructions.
> >
> > Bootstrapped and regtested on powerpc64-linux-gnu with no failures. OK
> > for trunk?
>
> Note this makes permutes artificially cheap for AMD K8, K10 and
> Bulldozer. Can you change config/i386/i386.c:ix86_builtin_vectorization_cost
> to return ix86_cost->vec_stmt_cost instead of one for vec_perm?
> The cost is otherwise only queried by SLP vectorization it seems.
Sure, will do.
>
> Otherwise this looks ok. Please give other maintainers a chance to
> chime in (other cost hooks might need similar adjustments).
I'll give this until at least late Monday before committing. Thanks for
the quick response!

Bill
>
> Thanks,
> Richard.
>
> > Thanks,
> > Bill
> >
> >
> > 2012-02-03  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
> >
> >         PR tree-optimization/50969
> >         * tree-vect-stmts.c (vect_model_store_cost): Correct statement cost to
> >         use vec_perm rather than vector_stmt.
> >         (vect_model_load_cost): Likewise.
> >         * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Revise
> >         cost of vec_perm for TARGET_VSX.
> >
> >
> > Index: gcc/tree-vect-stmts.c
> > ===================================================================
> > --- gcc/tree-vect-stmts.c (revision 183871)
> > +++ gcc/tree-vect-stmts.c (working copy)
> > @@ -882,7 +882,7 @@ vect_model_store_cost (stmt_vec_info stmt_info, in
> >      {
> >        /* Uses a high and low interleave operation for each needed permute.  */
> >        inside_cost = ncopies * exact_log2(group_size) * group_size
> > -        * vect_get_stmt_cost (vector_stmt);
> > +        * vect_get_stmt_cost (vec_perm);
> >
> >        if (vect_print_dump_info (REPORT_COST))
> >          fprintf (vect_dump, "vect_model_store_cost: strided group_size = %d .",
> > @@ -988,7 +988,7 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
> >      {
> >        /* Uses an even and odd extract operations for each needed permute.  */
> >        inside_cost = ncopies * exact_log2(group_size) * group_size
> > -        * vect_get_stmt_cost (vector_stmt);
> > +        * vect_get_stmt_cost (vec_perm);
> >
> >        if (vect_print_dump_info (REPORT_COST))
> >          fprintf (vect_dump, "vect_model_load_cost: strided group_size = %d .",
> > Index: gcc/config/rs6000/rs6000.c
> > ===================================================================
> > --- gcc/config/rs6000/rs6000.c (revision 183871)
> > +++ gcc/config/rs6000/rs6000.c (working copy)
> > @@ -3540,9 +3540,13 @@ rs6000_builtin_vectorization_cost (enum vect_cost_
> >        case vec_to_scalar:
> >        case scalar_to_vec:
> >        case cond_branch_not_taken:
> > -      case vec_perm:
> >          return 1;
> >
> > +      case vec_perm:
> > +        if (!TARGET_VSX)
> > +          return 1;
> > +        return 2;
> > +
> >        case cond_branch_taken:
> >          return 3;
> >
> >
> >
>
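
For illustration only (not part of the patch): a minimal standalone sketch of
how the quoted change affects the strided load/store cost. It assumes the
formula from the diff, inside_cost = ncopies * log2(group_size) * group_size *
per-statement cost, and the revised rs6000 permute cost of 2 under VSX. The
names cost_kind, stmt_cost, exact_log2_sketch and strided_inside_cost are
hypothetical stand-ins, not GCC's actual functions.

```c
#include <assert.h>

/* Hypothetical stand-in for the two cost kinds touched by the patch.  */
enum cost_kind { VECTOR_STMT, VEC_PERM };

/* Assumed per-statement cost, mirroring the patched rs6000 hook:
   a permute costs 2 when VSX is enabled, otherwise 1.  */
static int
stmt_cost (enum cost_kind kind, int target_vsx)
{
  if (kind == VEC_PERM)
    return target_vsx ? 2 : 1;
  return 1;
}

/* Base-2 logarithm of X if X is a power of two, else -1.  */
static int
exact_log2_sketch (unsigned int x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while (x > 1)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Cost of the permutes needed for a strided access, following the
   formula in the quoted diff.  GROUP_SIZE must be a power of two.  */
static int
strided_inside_cost (int ncopies, unsigned int group_size,
                     enum cost_kind kind, int target_vsx)
{
  int log2_size = exact_log2_sketch (group_size);

  assert (log2_size >= 0);
  return ncopies * log2_size * (int) group_size
         * stmt_cost (kind, target_vsx);
}
```

Under these assumptions, strided_inside_cost (1, 4, VEC_PERM, 1) evaluates to
1 * 2 * 4 * 2 = 16, whereas costing the same access as VECTOR_STMT gives 8, so
a group of four doubles in cost on a VSX target once vec_perm is used.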