[i386] scalar ops that preserve the high part of a vector

Marc Glisse marc.glisse@inria.fr
Sun Dec 2 12:30:00 GMT 2012


On Sun, 2 Dec 2012, Uros Bizjak wrote:

> On Sat, Dec 1, 2012 at 6:27 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>
>> here is a patch. If it is accepted, I'll extend it to other vm patterns
>> (mul, div, min, max are likely candidates, but I need to check the doc). It
>> passed bootstrap+testsuite on x86_64-linux.
>>
>>
>> 2012-12-01  Marc Glisse  <marc.glisse@inria.fr>
>>
>>         PR target/54855
>> gcc/
>>         * config/i386/sse.md (<sse>_vm<plusminus_insn><mode>3): Rewrite
>>         pattern.
>>         * config/i386/i386-builtin-types.def: New function types.
>>         * config/i386/i386.c (ix86_expand_args_builtin): Likewise.
>>         (bdesc_args) <__builtin_ia32_addss, __builtin_ia32_subss,
>>         __builtin_ia32_addsd, __builtin_ia32_subsd>: Change prototype.
>>         * config/i386/xmmintrin.h: Adapt to new builtin prototype.
>>         * config/i386/emmintrin.h: Likewise.
>>         * doc/extend.texi (X86 Built-in Functions): Document changed
>> prototype.
>>
>> testsuite/
>>         * gcc.target/i386/pr54855-1.c: New testcase.
>>         * gcc.target/i386/pr54855-2.c: New testcase.
>
> Yes, the approach looks correct to me, but I wonder why we have
> different representations for v4sf and v2df cases? I'd say that we
> should canonicalize patterns somewhere in the middle end (probably to
> vec_merge variant, as IMO vec_dup looks like degenerated vec_merge
> variant), otherwise we will have pattern explosion.

(I assume s/vec_dup/vec_concat/ above)

Note that this comes from ix86_expand_vector_set, which purposely uses 
VEC_CONCAT for V2DF and VEC_MERGE for V4SF. It is true that we could use 
the VEC_MERGE version more widely, but this code that selects the most 
appropriate pattern depending on the mode seems good to me. And I wouldn't 
call the few extra entries in sse.md an explosion quite yet...
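
To make the two representations concrete, here is a rough model in plain C 
of what each RTL form computes when setting one element. This is only a 
sketch of the semantics, not GCC code: the function names are made up for 
illustration, and arrays stand in for vector registers.

```c
#include <assert.h>

/* Hypothetical model of
     (vec_merge:V4SF (vec_duplicate:V4SF val) v (const_int 1 << elt))
   A set mask bit takes the lane from the first operand, a clear bit
   keeps the lane from the second operand.  */
static void
model_set_v4sf_merge (float v[4], float val, int elt)
{
  assert (elt >= 0 && elt < 4);
  float dup[4] = { val, val, val, val };   /* vec_duplicate */
  unsigned mask = 1u << elt;
  for (int i = 0; i < 4; i++)
    v[i] = ((mask >> i) & 1) ? dup[i] : v[i];
}

/* Hypothetical model of the two-element case: vec_select the element
   that is kept, then vec_concat it with the new value in the right
   order.  */
static void
model_set_v2df_concat (double v[2], double val, int elt)
{
  assert (elt == 0 || elt == 1);
  double other = v[1 - elt];               /* vec_select of the kept lane */
  double lo = (elt == 0) ? val : other;
  double hi = (elt == 0) ? other : val;
  v[0] = lo;                               /* vec_concat (lo, hi) */
  v[1] = hi;
}
```

With only two lanes, the concat form names the new scalar directly and 
needs no broadcast or mask, which is presumably why it is preferred for 
V2DF.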

(Also, using VEC_DUPLICATE is quite artificial: in the special case where 
we set the first element of the vector, a subreg should work as well.)
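
For reference, the behaviour that the scalar add patterns in this thread 
have to express can be modelled in plain C as below. Again this is only a 
sketch with a made-up function name, not the pattern itself: the low 
element is the sum and the upper elements are copied unchanged from the 
first operand.

```c
#include <assert.h>

/* Hypothetical model of a scalar single-precision add that preserves
   the high part of the first vector operand.  */
static void
model_addss (float dst[4], const float a[4], const float b[4])
{
  assert (dst && a && b);
  dst[0] = a[0] + b[0];          /* only lane 0 is computed */
  for (int i = 1; i < 4; i++)
    dst[i] = a[i];               /* lanes 1..3 come from a unchanged */
}
```

Only lane 0 of b is read here, so nothing in the result depends on a 
broadcast of the scalar to the other lanes.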


> However, the patch is too late for 4.8,

That's fine, I can hold it for 4.9. I'd like to finalize the patch now 
while it is fresh though (I would still redo a quick bootstrap+testsuite 
before commit when trunk re-opens).

Thanks,

-- 
Marc Glisse


