This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Fix PR50969


On Fri, Feb 3, 2012 at 2:24 PM, William J. Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> This fixes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969 by slightly
> raising the cost of vector permutes on powerpc64 VSX targets (and
> ensuring those costs are correctly used).  This reverses the performance
> loss for 168.wupwise, and gives a slight boost to 433.milc as well.
>
> In the long run, we will want to model VSX permutes differently, since
> the real issue is that only one floating-point pipe can hold a permute
> at a time.  Thus the present patch can be overly conservative when
> permutes are rare compared with other vector instructions.
>
> Bootstrapped and regtested on powerpc64-linux-gnu with no failures.  OK
> for trunk?

Note this makes permutes artificially cheap for AMD K8, K10 and
Bulldozer.  Can you change config/i386/i386.c:ix86_builtin_vectorization_cost
to return ix86_cost->vec_stmt_cost instead of one for vec_perm?
The cost is otherwise only queried by SLP vectorization it seems.

Otherwise this looks ok.  Please give other maintainers a chance to
chime in (other cost hooks might need similar adjustments).

Thanks,
Richard.

> Thanks,
> Bill
>
>
> 2012-02-03  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         PR tree-optimization/50969
>         * tree-vect-stmts.c (vect_model_store_cost): Correct statement cost to
>         use vec_perm rather than vector_stmt.
>         (vect_model_load_cost): Likewise.
>         * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Revise
>         cost of vec_perm for TARGET_VSX.
>
>
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       (revision 183871)
> +++ gcc/tree-vect-stmts.c       (working copy)
> @@ -882,7 +882,7 @@ vect_model_store_cost (stmt_vec_info stmt_info, in
>      {
>        /* Uses a high and low interleave operation for each needed permute.  */
>        inside_cost = ncopies * exact_log2(group_size) * group_size
> -        * vect_get_stmt_cost (vector_stmt);
> +        * vect_get_stmt_cost (vec_perm);
>
>        if (vect_print_dump_info (REPORT_COST))
>          fprintf (vect_dump, "vect_model_store_cost: strided group_size = %d .",
> @@ -988,7 +988,7 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
>      {
>        /* Uses an even and odd extract operations for each needed permute.  */
>        inside_cost = ncopies * exact_log2(group_size) * group_size
> -        * vect_get_stmt_cost (vector_stmt);
> +        * vect_get_stmt_cost (vec_perm);
>
>        if (vect_print_dump_info (REPORT_COST))
>          fprintf (vect_dump, "vect_model_load_cost: strided group_size = %d .",
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c  (revision 183871)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -3540,9 +3540,13 @@ rs6000_builtin_vectorization_cost (enum vect_cost_
>        case vec_to_scalar:
>        case scalar_to_vec:
>        case cond_branch_not_taken:
> -      case vec_perm:
>          return 1;
>
> +      case vec_perm:
> +        if (!TARGET_VSX)
> +          return 1;
> +        return 2;
> +
>        case cond_branch_taken:
>          return 3;
>
>
>
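
For readers following the arithmetic, here is a minimal standalone sketch of how the patched cost works out for a strided group. It is not the GCC code: exact_log2_sketch, perm_cost_sketch and strided_permute_cost are hypothetical names made up for illustration, and the model assumes only what the diff above shows (ncopies * exact_log2(group_size) * group_size permutes, each costing 1 without VSX and 2 with it).

```c
/* Hypothetical stand-in for GCC's exact_log2: returns log2(x) when x is
   a power of two, and -1 otherwise.  */
static int
exact_log2_sketch (unsigned int x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while (x > 1)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical model of the patched vec_perm case in the rs6000 hook:
   1 when VSX is not enabled, 2 when it is.  */
static int
perm_cost_sketch (int target_vsx)
{
  return target_vsx ? 2 : 1;
}

/* Models the inside_cost expression from the diff for a strided
   load or store.  Only meaningful when group_size is a power of two;
   this sketch does not guard against other values.  */
static int
strided_permute_cost (int ncopies, unsigned int group_size, int target_vsx)
{
  return ncopies * exact_log2_sketch (group_size) * (int) group_size
	 * perm_cost_sketch (target_vsx);
}
```

Under these assumptions, one copy of a group of 4 costs 1 * 2 * 4 * 1 = 8 without VSX and 1 * 2 * 4 * 2 = 16 with it, so the permute part of the strided cost doubles on VSX targets.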

