This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 1/2, x86] Add palignr support for AVX2.
- From: Evgeny Stupachenko <evstupac at gmail dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>, Uros Bizjak <ubizjak at gmail dot com>
- Date: Thu, 10 Jul 2014 19:29:37 +0400
- Subject: Re: [PATCH 1/2, x86] Add palignr support for AVX2.
- Authentication-results: sourceware.org; auth=none
- References: <CAOvf_xx3-VpgN8YDxJBPvzzNGNykUPoLdU6xThW_QBN7byy5rw at mail dot gmail dot com> <535E909A dot 7040205 at redhat dot com> <CAOvf_xygddE6yxkp=+SZtVz+bvVciH5YfH1Cwy1WxdPHCgrbtg at mail dot gmail dot com> <535EC233 dot 7000500 at redhat dot com> <CAOvf_xxq6PNN2KLpA8yoB2jjgmy-UAOvij_ghgsKPkZnsu2Rkg at mail dot gmail dot com> <537A23F7 dot 2060601 at redhat dot com> <CAOvf_xwYJpQ0H9=MrLzUd40JDksZ8Tfsks2=8ZW4MfR_wCjU1A at mail dot gmail dot com> <53BAB164 dot 7000006 at redhat dot com>
On Mon, Jul 7, 2014 at 6:40 PM, Richard Henderson <rth@redhat.com> wrote:
> On 07/03/2014 02:53 AM, Evgeny Stupachenko wrote:
>> -expand_vec_perm_palignr (struct expand_vec_perm_d *d)
>> +expand_vec_perm_palignr (struct expand_vec_perm_d *d, int insn_num)
>
> insn_num might as well be "bool avx2", since it's only ever set to two values.
Agree. However:
after the alignment, one operand permutation could be just move and
take 2 instructions for AVX2 as well
for AVX2 there could be other cases when the scheme takes 4 or 5 instructions
we can leave it for potential avx512 extension
>
>> - /* Even with AVX, palignr only operates on 128-bit vectors. */
>> - if (!TARGET_SSSE3 || GET_MODE_SIZE (d->vmode) != 16)
>> + /* SSSE3 is required to apply PALIGNR on 16 bytes operands. */
>> + if (GET_MODE_SIZE (d->vmode) == 16)
>> + {
>> + if (!TARGET_SSSE3)
>> + return false;
>> + }
>> + /* AVX2 is required to apply PALIGNR on 32 bytes operands. */
>> + else if (GET_MODE_SIZE (d->vmode) == 32)
>> + {
>> + if (!TARGET_AVX2)
>> + return false;
>> + }
>> + /* Other sizes are not supported. */
>> + else
>> return false;
>
> And you'd better check it up here because...
>
Correct. The following should resolve the issue:
/* For AVX2 we need more than 2 instructions when the alignment
by itself does not produce the desired permutation. */
if (TARGET_AVX2 && insn_num <= 2)
return false;
>> + /* For SSSE3 we need 1 instruction for palignr plus 1 for one
>> + operand permutaoin. */
>> + if (insn_num == 2)
>> + {
>> + ok = expand_vec_perm_1 (&dcopy);
>> + gcc_assert (ok);
>> + }
>> + /* For AVX2 we need 2 instructions for the shift: vpalignr and
>> + vperm plus 4 instructions for one operand permutation. */
>> + else if (insn_num == 6)
>> + {
>> + ok = expand_vec_perm_vpshufb2_vpermq (&dcopy);
>> + gcc_assert (ok);
>> + }
>> + else
>> + ok = false;
>> return ok;
>
> ... down here you'll simply ICE from the gcc_assert.
>
>
> r~