This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Allow fwprop to undo vectorization harm (PR68961)
On Tue, Jul 12, 2016 at 10:58 AM, Richard Biener <rguenther@suse.de> wrote:
> On Sun, 10 Jul 2016, Uros Bizjak wrote:
>
>> On Wed, Jul 6, 2016 at 3:18 PM, Richard Biener <rguenther@suse.de> wrote:
>>
>> >> > 2016-07-04 Richard Biener <rguenther@suse.de>
>> >> >
>> >> > PR rtl-optimization/68961
>> >> > * fwprop.c (propagate_rtx): Allow SUBREGs of VEC_CONCAT and CONCAT
>> >> > to simplify to a non-constant.
>> >> >
>> >> > * gcc.target/i386/pr68961.c: New testcase.
>> >>
>> >> Thanks, LGTM.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, it causes
>> >
>> > FAIL: gcc.target/i386/sse2-load-multi.c scan-assembler-times movup 2
>> >
>> > as the peephole created for that testcase no longer applies as fwprop
>> > does
>> >
>> > In insn 10, replacing
>> > (vec_concat:V2DF (vec_select:DF (reg:V2DF 91)
>> > (parallel [
>> > (const_int 0 [0])
>> > ]))
>> > (mem:DF (reg/f:DI 95) [0 S8 A128]))
>> > with (vec_concat:V2DF (reg:DF 93 [ MEM[(const double *)&a + 8B] ])
>> > (mem:DF (reg/f:DI 95) [0 S8 A128]))
>> > Changed insn 10
>> >
>> > resulting in
>> >
>> > movsd a+8(%rip), %xmm0
>> > movhpd a+16(%rip), %xmm0
>> >
>> > again rather than movupd.
>> >
>> > Uros, there is probably a missing peephole for the new form - can you
>> > fix this as a followup or should I hold on this patch for a bit longer?
>>
>> No, please proceed with the patch, I'll fix this fallout with a
>> followup patch in a couple of days.
>
> Applied as r238238. Is the following x86 change ok then which
> adjusts the vectorizer vector construction cost to sth more sensible?
> I have adjusted the generic implementation in targhooks.c this way
> a few weeks ago already.
>
> Thanks,
> Richard.
>
> 2016-07-12 Richard Biener <rguenther@suse.de>
>
> * targhooks.c (default_builtin_vectorization_cost): Adjust
> vec_construct cost.
> * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.
Looks OK to me, but let's also give Intel chance to comment.
Uros.
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c (revision 237196)
> +++ gcc/config/i386/i386.c (working copy)
> @@ -49503,8 +49520,6 @@ static int
> ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
> tree vectype, int)
> {
> - unsigned elements;
> -
> switch (type_of_cost)
> {
> case scalar_stmt:
> @@ -49546,8 +49561,7 @@ ix86_builtin_vectorization_cost (enum ve
> return ix86_cost->vec_stmt_cost;
>
> case vec_construct:
> - elements = TYPE_VECTOR_SUBPARTS (vectype);
> - return ix86_cost->vec_stmt_cost * (elements / 2 + 1);
> + return ix86_cost->vec_stmt_cost * (TYPE_VECTOR_SUBPARTS (vectype) - 1);
>
> default:
> gcc_unreachable ();