[PATCH] Add AVX512 k-mask intrinsics

Andrew Senkevich andrew.n.senkevich@gmail.com
Tue Jan 17 13:03:00 GMT 2017


2017-01-17 15:30 GMT+03:00 Kirill Yukhin <kirill.yukhin@gmail.com>:
> Hi Andrey,
> On 17 Jan 14:04, Andrew Senkevich wrote:
>> 2017-01-17 1:55 GMT+03:00 Jakub Jelinek <jakub@redhat.com>:
>> > On Tue, Jan 17, 2017 at 01:30:11AM +0300, Andrew Senkevich wrote:
>> >> here is one more part of the intrinsics, for k-mask register shifts:
>> >
>> > The software developer manuals describe KSHIFT{L,R}* like:
>> > KSHIFTLW
>> > COUNT <- imm8[7:0]
>> > DEST[MAX_KL-1:0] <- 0
>> > IF COUNT <=15
>> > THEN DEST[15:0] <- SRC1[15:0] << COUNT;
>> > FI;
>> >
>> > What is the behavior when src1 == dest, like:
>> >   kshiftld $3, %k3, %k3
>> > ?  Is it just a bug in the SDM and will it actually do the expected thing
>> > (set %k3 to %k3 << 3 and clear just the upper bits), or do we need
>> > an early-clobber on the destination to make sure GCC never emits these
>> > insns with the same register as both input and output?
>>
>> Indeed, they should be different registers. How do we do that?
> Are you sure?
>
> I've played a bit with SDE, and it looks like the operands are not early-clobber:
> TID0: INS 0x00000000004003ee             AVX512VEX kmovd k0, eax
> TID0:   k0 := 00000000_ffffffff
> ...
> TID0: INS 0x00000000004003f4             AVX512VEX kshiftlw k0, k0, 0x3
> TID0:   k0 := 00000000_0000fff8
>
> You can see that using the same register as dest and source works just fine.

Hmm, I had only looked at what ICC generates, and that was not the right way to check.
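
For reference, here is a minimal C sketch of the SDM pseudocode quoted above, read the way the SDE trace suggests: the source is read in full before the destination is written, so the initial "DEST <- 0" step does not destroy the input when both operands are the same register. The name model_kshiftlw is hypothetical and used only for illustration. This is a software model, not the hardware definition and not part of the patch.

```c
#include <stdint.h>

/* Hypothetical software model of KSHIFTLW (illustration only).
   SRC1 is passed by value, so it is fully read before the result
   is produced.  This mirrors the non-early-clobber behaviour seen
   in the SDE trace.  */
static uint64_t
model_kshiftlw (uint64_t src1, unsigned int imm8)
{
  unsigned int count = imm8 & 0xff;   /* COUNT <- imm8[7:0] */
  uint64_t dest = 0;                  /* DEST[MAX_KL-1:0] <- 0 */

  if (count <= 15)
    /* DEST[15:0] <- SRC1[15:0] << COUNT; bits above 15 stay clear.  */
    dest = (uint16_t) ((src1 & 0xffff) << count);

  return dest;
}
```

Under this model, k0 = model_kshiftlw (k0, 3) with k0 = 0x00000000ffffffff yields 0x000000000000fff8, which agrees with the "k0 := 00000000_0000fff8" line in the trace.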

Thanks Kirill!


--
WBR,
Andrew


