[PATCH] Add AVX512 k-mask intrinsics
H.J. Lu
hjl.tools@gmail.com
Mon Dec 5 17:19:00 GMT 2016
On Mon, Dec 5, 2016 at 6:59 AM, Andrew Senkevich
<andrew.n.senkevich@gmail.com> wrote:
> 2016-12-02 21:31 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>> On Fri, Dec 2, 2016 at 6:44 PM, Andrew Senkevich
>> <andrew.n.senkevich@gmail.com> wrote:
>>> 2016-11-11 22:14 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>>> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
>>>> <andrew.n.senkevich@gmail.com> wrote:
>>>>> 2016-11-11 20:56 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>>>>> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>>>>>> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>>>>>>> <andrew.n.senkevich@gmail.com> wrote:
>>>>>>>> 2016-11-11 17:34 GMT+03:00 Uros Bizjak <ubizjak@gmail.com>:
>>>>>>>>> Some quick remarks:
>>>>>>>>>
>>>>>>>>> +(define_insn "kmovb"
>>>>>>>>> + [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>>>>>>>>> + (unspec:QI
>>>>>>>>> + [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>>>>>>>> + UNSPEC_KMOV))]
>>>>>>>>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>>>>>>>>> + "@
>>>>>>>>> + kmovb\t{%k1, %0|%0, %k1}
>>>>>>>>> + kmovb\t{%1, %0|%0, %1}";
>>>>>>>>> + [(set_attr "mode" "QI")
>>>>>>>>> + (set_attr "type" "mskmov")
>>>>>>>>> + (set_attr "prefix" "vex")])
>>>>>>>>> +
>>>>>>>>> +(define_insn "kmovd"
>>>>>>>>> + [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>>>>>>>>> + (unspec:SI
>>>>>>>>> + [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>>>>>>>> + UNSPEC_KMOV))]
>>>>>>>>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>>>>>>>> + "@
>>>>>>>>> + kmovd\t{%k1, %0|%0, %k1}
>>>>>>>>> + kmovd\t{%1, %0|%0, %1}";
>>>>>>>>> + [(set_attr "mode" "SI")
>>>>>>>>> + (set_attr "type" "mskmov")
>>>>>>>>> + (set_attr "prefix" "vex")])
>>>>>>>>> +
>>>>>>>>> +(define_insn "kmovq"
>>>>>>>>> + [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>>>>>>>>> + (unspec:DI
>>>>>>>>> + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>>>>>>>> + UNSPEC_KMOV))]
>>>>>>>>> + "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>>>>>>>> + "@
>>>>>>>>> + kmovq\t{%k1, %0|%0, %k1}
>>>>>>>>> + kmovq\t{%1, %0|%0, %1}
>>>>>>>>> + kmovq\t{%1, %0|%0, %1}";
>>>>>>>>> + [(set_attr "mode" "DI")
>>>>>>>>> + (set_attr "type" "mskmov")
>>>>>>>>> + (set_attr "prefix" "vex")])
>>>>>>>>>
>>>>>>>>> - kmovd (and existing kmovw) should be using register_operand for
>>>>>>>>> opreand 0. In this case, there is no need for MEM_P checks at all.
>>>>>>>>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>>>>>>>>> - please put these definitions above corresponding *mov??_internal patterns.
>>>>>>>>
>>>>>>>> Do you mean put below *mov??_internal patterns? Attached corrected such way.
>>>>>>>
>>>>>>> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
>>>>>>> etc. It doesn't matter if they are above or below their respective
>>>>>>> *mov??_internal patterns, as long as they are positioned in some
>>>>>>> consistent way. IOW, new patterns shouldn't be grouped together, as is
>>>>>>> the case with your patch.
>>>>>>
>>>>>> +(define_insn "kmovb"
>>>>>> + [(set (match_operand:QI 0 "register_operand" "=k,k")
>>>>>> + (unspec:QI
>>>>>> + [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>>>>> + UNSPEC_KMOV))]
>>>>>> + "TARGET_AVX512DQ && !MEM_P (operands[1])"
>>>>>>
>>>>>> There is no need for !MEM_P, this will prevent memory operand, which
>>>>>> is allowed by constraint "m".
>>>>>>
>>>>>> +(define_insn "kmovq"
>>>>>> + [(set (match_operand:DI 0 "register_operand" "=k,k,km")
>>>>>> + (unspec:DI
>>>>>> + [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>>>>> + UNSPEC_KMOV))]
>>>>>> + "TARGET_AVX512BW && !MEM_P (operands[1])"
>>>>>>
>>>>>> Operand 0 should have "nonimmediate_operand" predicate. And here you
>>>>>> need && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
>>>>>> mem->mem moves.
>>>>>
>>>>> Changed according your comments and attached.
>>>>
>>>> Still not good.
>>>>
>>>> +(define_insn "kmovd"
>>>> + [(set (match_operand:SI 0 "register_operand" "=k,k")
>>>> + (unspec:SI
>>>> + [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>>> + UNSPEC_KMOV))]
>>>> + "TARGET_AVX512BW && !MEM_P (operands[1])"
>>>>
>>>> Remove !MEM_P in the above pattern.
>>>>
>>>> (define_insn "kmovw"
>>>> - [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
>>>> + [(set (match_operand:HI 0 "register_operand" "=k,k")
>>>> (unspec:HI
>>>> [(match_operand:HI 1 "nonimmediate_operand" "r,km")]
>>>> UNSPEC_KMOV))]
>>>> - "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
>>>> + "TARGET_AVX512F && !MEM_P (operands[1])"
>>>>
>>>> Also remove !MEM_P here.
>>>>
>>>> +(define_insn "kadd<mode>"
>>>> + [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k")
>>>> + (plus:SWI1248x
>>>> + (not:SWI1248x
>>>> + (match_operand:SWI1248x 1 "register_operand" "r,0,k"))
>>>> + (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
>>>> + (clobber (reg:CC FLAGS_REG))]
>>>> + "TARGET_AVX512F"
>>>> +{
>>>> + switch (which_alternative)
>>>> + {
>>>> + case 0:
>>>> + return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}";
>>>> + case 1:
>>>> + return "#";
>>>> + case 2:
>>>> + if (TARGET_AVX512BW && <MODE>mode == DImode)
>>>> + return "kaddq\t{%2, %1, %0|%0, %1, %2}";
>>>> + else if (TARGET_AVX512BW && <MODE>mode == SImode)
>>>> + return "kaddd\t{%2, %1, %0|%0, %1, %2}";
>>>> + else if (TARGET_AVX512DQ && <MODE>mode == QImode)
>>>> + return "kaddb\t{%2, %1, %0|%0, %1, %2}";
>>>> + else
>>>> + return "kaddw\t{%2, %1, %0|%0, %1, %2}";
>>>> +
>>>>
>>>> The above pattern is wrong. Is there really a NOT RTX present,
>>>> implying effectively a kaddn?
>>>>
>>>> If this is plain add, then you need to change other add patterns, see
>>>> how logic patterns are amended with "k" constraint, added pattern
>>>> should look like *k<logic><mode> pattern.
>>>>
>>>> (define_insn "kandn<mode>"
>>>> - [(set (match_operand:SWI12 0 "register_operand" "=r,&r,!k")
>>>> - (and:SWI12
>>>> - (not:SWI12
>>>> - (match_operand:SWI12 1 "register_operand" "r,0,k"))
>>>> - (match_operand:SWI12 2 "register_operand" "r,r,k")))
>>>> + [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k")
>>>> + (and:SWI1248x
>>>> + (not:SWI1248x
>>>> + (match_operand:SWI1248x 1 "register_operand" "r,0,k"))
>>>> + (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
>>>> (clobber (reg:CC FLAGS_REG))]
>>>> "TARGET_AVX512F"
>>>> {
>>>> @@ -8319,10 +8358,50 @@
>>>> case 1:
>>>> return "#";
>>>> case 2:
>>>> - if (TARGET_AVX512DQ && <MODE>mode == QImode)
>>>> + if (TARGET_AVX512BW && <MODE>mode == DImode)
>>>> + return "kandnq\t{%2, %1, %0|%0, %1, %2}";
>>>> + else if (TARGET_AVX512BW && <MODE>mode == SImode)
>>>> + return "kandnd\t{%2, %1, %0|%0, %1, %2}";
>>>> + else if (TARGET_AVX512DQ && <MODE>mode == QImode)
>>>> return "kandnb\t{%2, %1, %0|%0, %1, %2}";
>>>> else
>>>> return "kandnw\t{%2, %1, %0|%0, %1, %2}";
>>>>
>>>> The above should use SWI1248_AVX512BW mode iterator, see
>>>> *k<logic><mode> pattern.
>>>
>>> I split this patch after last updates in md files, here is the first
>>> part which doesn't change md files.
>>> Regtested on x86_64-linux-gnu. Is this part ok?
>>
>> There is no point to scan for kmovX insn in e.g.:
>>
>> +/* { dg-final { scan-assembler-times "kmovq" 2 } } */
>> +
>> +#include <immintrin.h>
>> +
>> +void
>> +avx512bw_test ()
>> +{
>> + __mmask64 k1, k2, k3;
>> + volatile __m512i x = _mm512_setzero_si512 ();
>> +
>> + __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
>> + __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
>>
>> since you emit it from inline asm.
>>
>> Please remove these pointles kmovX scan-asm-times directives from the
>> testcases, and please also remove it from avx512f-kandnw-1.c
>> testcase.
>>
>> The patch is OK with this change.
>
> Attached fixed with updated ChangeLogs.
>
> HJ, could you commit please?
>
Done.
--
H.J.
More information about the Gcc-patches
mailing list