This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH X86, PR62128] Rotate pattern for AVX2
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Evgeny Stupachenko <evstupac at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Henderson <rth at redhat dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, Jakub Jelinek <jakub at redhat dot com>
- Date: Tue, 30 Sep 2014 20:21:06 +0200
- Subject: Re: [PATCH X86, PR62128] Rotate pattern for AVX2
- References: <CAOvf_xytRo3-vFOdbOuDibk3LCsWrAMsqe1Vd_uSg+QwA71+XA at mail dot gmail dot com> <CAFULd4axXUAuf655zX0NTr13NmDSHF-jqZnZkQAQ-VYqefpqCQ at mail dot gmail dot com> <CAFULd4Y5-mUyU0NcXg+vtidbPuGCM=Tw20M5=k88kouCGPMk-g at mail dot gmail dot com>
On Tue, Sep 30, 2014 at 8:08 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Sep 30, 2014 at 7:06 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Tue, Sep 30, 2014 at 6:47 PM, Evgeny Stupachenko <evstupac@gmail.com> wrote:
>>
>>> Patch resubmitted from https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01400.html
>>>
>>> The patch fixes PR62128 and "gcc.target/i386/pr52252-atom.c" in
>>> core-avx2 make check.
>>> The test in pr62128 is exactly TEST 22 from
>>> gcc.dg/torture/vshuf-v32qi.c. It will check if the pattern is correct
>>> or not.
>>> The patch was developed similarly to define_insn_and_split
>>> "*avx_vperm_broadcast_<mode>".
>>> The patch passed x86 bootstrap and make check (+2 new passes for
>>> -march=core-avx2).
>>> Is it ok?
>
> Please try the following (totally untested) expander:
As usual, the wrong version was pasted. This should read:
--cut here--
(define_expand "avx2_rotate<mode>_perm"
  [(set (match_operand:V_256 0 "register_operand")
        (vec_select:V_256
          (match_operand:V_256 1 "register_operand")
          (match_parallel 2 "palignr_operand"
            [(match_operand 3 "const_int_operand" "n")])))]
  "TARGET_AVX2"
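{
  /* Rotate amount in bits: <ssescalarsize> is the element width in bits,
     and the palignr pattern takes its shift count in bits as well.  */
  int shift = INTVAL (operands[3]) * <ssescalarsize>;
  rtx insn;
  rtx op0 = gen_lowpart (V2TImode, operands[0]);
  rtx op1 = gen_lowpart (V4DImode, operands[1]);
  rtx t2 = gen_reg_rtx (V4DImode);

  /* t2 = operands[1] with its two 128-bit lanes swapped.  */
  emit_insn (gen_avx2_permv2ti (t2, op1, op1, GEN_INT (33)));

  op1 = gen_lowpart (V2TImode, operands[1]);
  t2 = gen_lowpart (V2TImode, t2);

  /* Pick the operand order according to whether the rotate amount is
     smaller than one 128-bit lane.  */
  if (shift < GET_MODE_BITSIZE (TImode))
    insn = gen_avx2_palignrv2ti (op0, t2, op1, GEN_INT (shift));
  else
    insn = gen_avx2_palignrv2ti (op0, op1, t2,
                                 GEN_INT (shift - GET_MODE_BITSIZE (TImode)));
  emit_insn (insn);
  DONE;
}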
--cut here--
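
To see why a lane swap followed by a single palignr yields a full 256-bit rotate, the idea can be modelled in plain C on a 32-byte array. This is only a sketch for checking the index arithmetic, not GCC code: swap_lanes, palignr256 and rotate_bytes are hypothetical helper names, the shift is counted in bytes rather than bits, and palignr256 models the instruction only for counts 0..15.

```c
#include <stdint.h>
#include <string.h>

enum { LANE = 16, VEC = 32 };

/* Model of vperm2i128 with immediate 0x21 (33) and both sources equal:
   the two 128-bit lanes of SRC are swapped.  */
static void swap_lanes(uint8_t dst[VEC], const uint8_t src[VEC])
{
    memcpy(dst, src + LANE, LANE);
    memcpy(dst + LANE, src, LANE);
}

/* Model of 256-bit vpalignr for n in 0..15: within each 128-bit lane,
   concatenate hi:lo, shift right by n bytes, keep the low 16 bytes.
   DST must not alias HI or LO.  */
static void palignr256(uint8_t dst[VEC], const uint8_t hi[VEC],
                       const uint8_t lo[VEC], unsigned n)
{
    for (unsigned lane = 0; lane < VEC; lane += LANE)
        for (unsigned i = 0; i < LANE; i++) {
            unsigned k = i + n;
            dst[lane + i] = k < LANE ? lo[lane + k] : hi[lane + k - LANE];
        }
}

/* Byte rotate of a 32-byte vector: dst[i] = src[(i + n) % 32].
   Mirrors the two branches of the expander above.  */
static void rotate_bytes(uint8_t dst[VEC], const uint8_t src[VEC], unsigned n)
{
    uint8_t swapped[VEC];

    n %= VEC;
    swap_lanes(swapped, src);
    if (n < LANE)
        palignr256(dst, swapped, src, n);        /* shift < one lane */
    else
        palignr256(dst, src, swapped, n - LANE); /* shift >= one lane */
}
```

Running rotate_bytes for every n from 0 to 31 against the direct formula src[(i + n) % 32] is a cheap way to convince oneself that the operand order in each branch is right.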
Uros.