V3 [PATCH] i386: Use scalar operand in floating point vec_dup patterns
H.J. Lu
hjl.tools@gmail.com
Sun Oct 21 23:13:00 GMT 2018
On 10/21/18, H.J. Lu <hjl.tools@gmail.com> wrote:
> On 10/17/18, H.J. Lu <hjl.tools@gmail.com> wrote:
>> Since vector registers are also used for scalar floating point values,
>> we can use a scalar operand in floating point vec_dup patterns, which
>> enables the combiner to generate
>>
>> (set (reg:V8SF 84)
>> (vec_duplicate:V8SF (mem/c:SF (symbol_ref:DI ("y")))))
>>
>> For AVX512 broadcast instructions from an integer register operand, we
>> only need to broadcast integers to integer vectors.
>>
>> gcc/
>>
>> PR target/87537
>> * config/i386/i386-builtin-types.def: Replace
>> CODE_FOR_avx2_vec_dupv4sf, CODE_FOR_avx2_vec_dupv8sf and
>> CODE_FOR_avx2_vec_dupv4df with CODE_FOR_vec_dupv4sf,
>> CODE_FOR_vec_dupv8sf and CODE_FOR_vec_dupv4df, respectively.
>> * config/i386/i386.c (expand_vec_perm_1): Replace
>> gen_avx512f_vec_dupv16sf_1, gen_avx2_vec_dupv8sf_1 and
>> gen_avx512f_vec_dupv8df_1 with gen_avx512f_vec_dupv16sf,
>> gen_vec_dupv8sf and gen_avx512f_vec_dupv8df, respectively.
>> Duplicate them from scalar operand.
>> * config/i386/i386.md (SF to DF splitter): Replace
>> gen_avx512f_vec_dupv16sf_1 with gen_avx512f_vec_dupv16sf.
>> * config/i386/sse.md (VF48_AVX512VL): New.
>> (avx2_vec_dup<mode>): Removed.
>> (avx2_vec_dupv8sf_1): Likewise.
>> (avx512f_vec_dup<mode>_1): Likewise.
>> (avx2_vec_dupv4df): Likewise.
>> (<avx512>_vec_dup<mode><mask_name>:V48_AVX512VL): Likewise.
>> (<avx512>_vec_dup<mode><mask_name>:VF48_AVX512VL): New.
>> (<avx512>_vec_dup<mode><mask_name>:VI48_AVX512VL): Likewise.
>> (<mask_codefor><avx512>_vec_dup_gpr<mode><mask_name>): Replace
>> V48_AVX512VL with VI48_AVX512VL.
>> (*avx_vperm_broadcast_<mode>): Replace gen_avx2_vec_dupv8sf with
>> gen_vec_dupv8sf.
>>
>> gcc/testsuite/
>>
>> PR target/87537
>> * gcc.target/i386/avx2-vbroadcastss_ps256-1.c: Updated.
>> * gcc.target/i386/avx512vl-vbroadcast-3.c: Likewise.
>
> Here is the updated patch. I added const_vector_duplicate_operand to
> handle constant vector broadcast from memory. OK for trunk?
Here is the updated patch with a testcase for const_vector_duplicate_operand.
We should split

(set (reg:V16SF 86)
     (const_vector:V16SF
       [(const_double:SF 2.0e+0 [0x0.8p+2]) repeated x16]))

into

(set (reg:V16SF 86)
     (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1")))))

only before register allocation, and we shouldn't split special SSE constants.
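
For illustration only (this is not the testcase in the patch), here is a
minimal C sketch of source that could give rise to the two kinds of RTL
discussed in this thread: a broadcast of a scalar loaded from memory, and a
constant vector with all elements equal to 2.0. It assumes GCC's generic
vector extension. The names v8sf, v16sf, broadcast_scalar and broadcast_const
are hypothetical. Whether the compiler actually emits a broadcast from memory
depends on the options used (for example -O2 -mavx512f) and the compiler
version.

```c
/* GCC generic vector types.  With AVX/AVX512F enabled these are expected
   to correspond to the V8SF and V16SF modes; without those ISAs GCC
   lowers the operations to narrower pieces.  */
typedef float v8sf __attribute__ ((vector_size (32)));
typedef float v16sf __attribute__ ((vector_size (64)));

/* Scalar in memory, standing in for the symbol "y" (illustrative value).  */
float y = 1.5f;

/* Duplicate a scalar loaded from memory into all 8 lanes.  */
void
broadcast_scalar (float *out)
{
  v8sf v = { y, y, y, y, y, y, y, y };
  __builtin_memcpy (out, &v, sizeof (v));
}

/* Build the constant vector with 2.0f repeated 16 times.  */
void
broadcast_const (float *out)
{
  v16sf v = { 2.0f, 2.0f, 2.0f, 2.0f, 2.0f, 2.0f, 2.0f, 2.0f,
	      2.0f, 2.0f, 2.0f, 2.0f, 2.0f, 2.0f, 2.0f, 2.0f };
  __builtin_memcpy (out, &v, sizeof (v));
}
```

The results are stored through a pointer rather than returned by value, so
the sketch does not depend on the vector calling convention of the target.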
OK for trunk?
Thanks.
--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-i386-Use-scalar-operand-in-floating-point-vec_dup-pa.patch
Type: text/x-patch
Size: 23277 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181021/6e06cb06/attachment.bin>