This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 2/2, x86] Add palignr support for AVX2.
- From: Evgeny Stupachenko <evstupac at gmail dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>, Uros Bizjak <ubizjak at gmail dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>
- Date: Thu, 5 Jun 2014 01:23:59 +0400
- Subject: Re: [PATCH 2/2, x86] Add palignr support for AVX2.
Thanks. Moving the pattern down helps. make check now passes with the
following patch:
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 2ef1384..8266f3e 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1417,6 +1417,22 @@
   return true;
 })
+;; Return true if OP is a parallel for a palignr permute.
+(define_predicate "palignr_operand"
+  (and (match_code "parallel")
+       (match_code "const_int" "a"))
+{
+  int elt = INTVAL (XVECEXP (op, 0, 0));
+  int i, nelt = XVECLEN (op, 0);
+
+  /* Check that an order in the permutation is suitable for palignr.
+     For example, {5 6 7 0 1 2 3 4} is "palignr 5, xmm, xmm". */
+  for (i = 1; i < nelt; ++i)
+    if (INTVAL (XVECEXP (op, 0, i)) != ((elt + i) % nelt))
+      return false;
+  return true;
+})
+
 ;; Return true if OP is a proper third operand to vpblendw256.
 (define_predicate "avx2_pblendw_operand"
   (match_code "const_int")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c91626b..d907353 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -14551,6 +14551,35 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "<sseinsnmode>")])
+(define_insn "*ssse3_palignr<mode>_perm"
+  [(set (match_operand:V_128 0 "register_operand" "=x,x")
+      (vec_select:V_128
+        (match_operand:V_128 1 "register_operand" "0,x")
+        (match_parallel 2 "palignr_operand"
+          [(match_operand 3 "const_int_operand" "n, n")])))]
+  "TARGET_SSSE3"
+{
+  enum machine_mode imode = GET_MODE_INNER (GET_MODE (operands[0]));
+  operands[2] = GEN_INT (INTVAL (operands[3]) * GET_MODE_SIZE (imode));
+
+  switch (which_alternative)
+    {
+    case 0:
+      return "palignr\t{%2, %1, %0|%0, %1, %2}";
+    case 1:
+      return "vpalignr\t{%2, %1, %1, %0|%0, %1, %1, %2}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set_attr "atom_unit" "sishuf")
+   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "orig,vex")])
+
 (define_expand "avx_vinsertf128<mode>"
   [(match_operand:V_256 0 "register_operand")
    (match_operand:V_256 1 "register_operand")
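
For readers following the patch, the check in palignr_operand and the
immediate computed in the output code can be sketched in plain C. This is
an illustrative model, not GCC code: the helper names (is_rotation,
palignr_imm, palignr_same_src, vec_select_v8hi) are made up here, and the
palignr emulation assumes the 128-bit form with the same register used as
both sources and a byte immediate below 16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the predicate's loop: PERM is a rotation if every index
   follows the first one modulo NELT.  */
static bool
is_rotation (const int *perm, int nelt)
{
  assert (nelt > 0);
  for (int i = 1; i < nelt; ++i)
    if (perm[i] != (perm[0] + i) % nelt)
      return false;
  return true;
}

/* Byte immediate as the output code computes it: the first index times
   the element size in bytes.  */
static int
palignr_imm (int first_elt, int elt_size)
{
  return first_elt * elt_size;
}

/* Model of 128-bit palignr with one register as both sources: the
   32-byte concatenation SRC:SRC is shifted right by IMM bytes and the
   low 16 bytes are kept, which is a byte rotation for IMM < 16.  */
static void
palignr_same_src (uint8_t out[16], const uint8_t src[16], int imm)
{
  uint8_t cat[32];

  assert (imm >= 0 && imm < 16);
  memcpy (cat, src, 16);
  memcpy (cat + 16, src, 16);
  memcpy (out, cat + imm, 16);
}

/* Reference vec_select on eight 16-bit elements: out[i] = src[perm[i]].  */
static void
vec_select_v8hi (uint16_t out[8], const uint16_t src[8], const int perm[8])
{
  for (int i = 0; i < 8; ++i)
    out[i] = src[perm[i]];
}
```

For {5 6 7 0 1 2 3 4} on 16-bit elements, palignr_imm (5, 2) gives a byte
immediate of 10, and palignr_same_src with that immediate yields the same
bytes as vec_select_v8hi with the permutation. The "palignr 5" in the
predicate comment counts elements; the insn pattern scales it to bytes.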
On Wed, Jun 4, 2014 at 11:04 PM, Richard Henderson <rth@redhat.com> wrote:
> On 06/04/2014 10:06 AM, Evgeny Stupachenko wrote:
>> Is it ok to use the following pattern?
>>
>> The patch passed bootstrap and make check, but one test failed:
>> gcc/testsuite/gcc.target/i386/vect-rebuild.c
>> It failed on /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
>> which is now palignr. However, both palignr and permilpd cost 1 tick
>> and take 6 bytes to encode.
>> I vote for modifying the test to scan for palignr:
>> /* { dg-final { scan-assembler-times "\tv?palignr\[ \t\]" 1 } } */
>>
>> 2014-06-04 Evgeny Stupachenko <evstupac@gmail.com>
>>
>> * config/i386/sse.md (*ssse3_palignr<mode>_perm): New.
>> * config/i386/predicates.md (palignr_operand): New.
>> Indicates if permutation is suitable for palignr instruction.
>
> Surely permilpd avoids some sort of reformatting penalty when actually using
> doubles.
>
> If you move this pattern down below the other vec_select patterns, we'll prefer
> the others for matching masks.
>
>
> r~
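
Richard's point about ordering can be illustrated with a toy first-match
table. This is only a model of "earlier patterns win"; it is not how recog
is implemented, and the matcher names and the permilpd-style check below
are hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*mask_pred) (const int *perm, int nelt);

struct pattern
{
  const char *name;
  mask_pred accepts;
};

/* Hypothetical stand-in for a permilpd-style mask: two elements, each
   index selecting element 0 or 1.  */
static bool
permilpd_like (const int *perm, int nelt)
{
  return nelt == 2
         && perm[0] >= 0 && perm[0] < 2
         && perm[1] >= 0 && perm[1] < 2;
}

/* Same rotation test as the palignr_operand predicate in the patch.  */
static bool
rotation_like (const int *perm, int nelt)
{
  for (int i = 1; i < nelt; ++i)
    if (perm[i] != (perm[0] + i) % nelt)
      return false;
  return true;
}

/* Return the first table entry whose matcher accepts PERM, or NULL.  */
static const struct pattern *
first_match (const struct pattern *pats, size_t npats,
             const int *perm, int nelt)
{
  assert (pats != NULL);
  for (size_t i = 0; i < npats; ++i)
    if (pats[i].accepts (perm, nelt))
      return &pats[i];
  return NULL;
}
```

A two-element swap {1, 0} is accepted by both matchers, so the table order
decides which one is chosen. With the rotation matcher listed last, the
swap goes to the permilpd-style entry, while a mask that only the rotation
matcher accepts, such as {5 6 7 0 1 2 3 4}, still reaches it.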