This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix __mmask* types on many AVX512 intrinsics
- From: Jeff Law <law at redhat dot com>
- To: Jakub Jelinek <jakub at redhat dot com>, Kirill Yukhin <kirill dot yukhin at gmail dot com>, Uros Bizjak <ubizjak at gmail dot com>, Grazvydas Ignotas <notasas at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 11 Jul 2018 13:59:46 -0600
- Subject: Re: [PATCH] Fix __mmask* types on many AVX512 intrinsics
- References: <1530811827-12303-1-git-send-email-notasas@gmail.com> <20180705182819.GL7166@tucnak> <CANOLnOM2ZAsZ8RjpsjuDAQBRyyFj1TeVOVgjXhOGY74XE69ZWw@mail.gmail.com> <20180706104707.GM7166@tucnak> <20180707081550.GQ7166@tucnak>
On 07/07/2018 02:15 AM, Jakub Jelinek wrote:
> Hi!
>
> On Fri, Jul 06, 2018 at 12:47:07PM +0200, Jakub Jelinek wrote:
>> On Thu, Jul 05, 2018 at 11:57:26PM +0300, Grazvydas Ignotas wrote:
>>> I think it would be more efficient if you took care of it. I won't
>>> have time for at least a few days anyway.
>
> Here is the complete patch, I found two further issues where
> the __mmask mismatch was in between the return type and what was used
> in the rest of the intrinsic, so not caught by my earlier greps.
>
> I've added (except for the avx512bitalg which seems to have no runtime
> test coverage whatsoever) tests that cover the real bugs and further
> fixed the avx512*-vpcmp{,u}b-2.c test because (rel) << i triggered UB
> if i could go up to 63.
>
> I don't have AVX512* hw, so I've just bootstrapped/regtested the patch
> normally on i686-linux and x86_64-linux AVX2 hw and tried the affected
> tests without the config/i386/ changes and with them under SDE.
> The patch should fix these FAILs:
>
> FAIL: gcc.target/i386/avx512bw-vpcmpb-2.c execution test
> FAIL: gcc.target/i386/avx512bw-vpcmpub-2.c execution test
> FAIL: gcc.target/i386/avx512f-vinsertf32x4-3.c execution test
> FAIL: gcc.target/i386/avx512f-vinserti32x4-3.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgeb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgeub-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgeuw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgew-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpleb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpleub-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpleuw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmplew-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltub-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltuw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpneqb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpnequb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpnequw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpneqw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpub-2.c execution test
>
> Ok for trunk?
>
> I guess we want to backport it soon, but would appreciate somebody testing
> it on real AVX512-{BW,VL} hw before doing the backports.
>
> Another thing to consider is whether we shouldn't add those grep/sed checks
> I've been doing (at least the easy ones that don't cross-check the
> i386-builtins.def against the uses in the intrin headers) to config/i386/t-*
> some way.
>
> 2018-07-07 Jakub Jelinek <jakub@redhat.com>
>
> * config/i386/avx512bitalgintrin.h (_mm512_mask_bitshuffle_epi64_mask):
> Use __mmask64 type instead of __mmask8 for __M argument.
> * config/i386/avx512fintrin.h (_mm512_mask_xor_epi64,
> _mm512_maskz_xor_epi64): Use __mmask8 type instead of __mmask16 for
> __U argument.
> (_mm512_mask_cmpneq_epi64_mask): Use __mmask8 type instead of
> __mmask16 for __M argument.
> (_mm512_maskz_insertf32x4, _mm512_maskz_inserti32x4,
> _mm512_mask_insertf32x4, _mm512_mask_inserti32x4): Cast last argument
> to __mmask16 instead of __mmask8.
> * config/i386/avx512vlintrin.h (_mm_mask_add_ps, _mm_maskz_add_ps,
> _mm256_mask_add_ps, _mm256_maskz_add_ps, _mm_mask_sub_ps,
> _mm_maskz_sub_ps, _mm256_mask_sub_ps, _mm256_maskz_sub_ps,
> _mm256_maskz_cvtepi32_ps, _mm_maskz_cvtepi32_ps): Use __mmask8 type
> instead of __mmask16 for __U argument.
> * config/i386/avx512vlbwintrin.h (_mm_mask_cmp_epi8_mask): Use
> __mmask16 instead of __mmask8 for __U argument.
> (_mm256_mask_cmp_epi8_mask): Use __mmask32 instead of __mmask16 for
> __U argument.
> (_mm256_cmp_epi8_mask): Use __mmask32 return type instead of
> __mmask16.
> (_mm_mask_cmp_epu8_mask): Use __mmask16 instead of __mmask8 for __U
> argument.
> (_mm256_mask_cmp_epu8_mask): Use __mmask32 instead of __mmask16 for
> __U argument.
> (_mm256_cmp_epu8_mask): Use __mmask32 return type instead of
> __mmask16.
> (_mm_mask_cmp_epi16_mask): Cast last argument to __mmask8 instead
> of __mmask16.
> (_mm256_mask_cvtepi8_epi16): Use __mmask16 instead of __mmask32 for
> __U argument.
> (_mm_mask_cvtepi8_epi16): Use __mmask8 instead of __mmask32 for
> __U argument.
> (_mm256_mask_cvtepu8_epi16): Use __mmask16 instead of __mmask32 for
> __U argument.
> (_mm_mask_cvtepu8_epi16): Use __mmask8 instead of __mmask32 for
> __U argument.
> (_mm256_mask_cmpneq_epu8_mask, _mm256_mask_cmplt_epu8_mask,
> _mm256_mask_cmpge_epu8_mask, _mm256_mask_cmple_epu8_mask): Change
> return type as well as __M argument type and all casts from __mmask8
> to __mmask32.
> (_mm256_mask_cmpneq_epu16_mask, _mm256_mask_cmplt_epu16_mask,
> _mm256_mask_cmpge_epu16_mask, _mm256_mask_cmple_epu16_mask): Change
> return type as well as __M argument type and all casts from __mmask8
> to __mmask16.
> (_mm256_mask_cmpneq_epi8_mask, _mm256_mask_cmplt_epi8_mask,
> _mm256_mask_cmpge_epi8_mask, _mm256_mask_cmple_epi8_mask): Change
> return type as well as __M argument type and all casts from __mmask8
> to __mmask32.
> (_mm256_mask_cmpneq_epi16_mask, _mm256_mask_cmplt_epi16_mask,
> _mm256_mask_cmpge_epi16_mask, _mm256_mask_cmple_epi16_mask): Change
> return type as well as __M argument type and all casts from __mmask8
> to __mmask16.
> * config/i386/avx512vbmi2vlintrin.h (_mm_mask_shrdi_epi32,
> _mm_mask_shldi_epi32): Cast last argument to __mmask8 instead of
> __mmask16.
>
> * gcc.target/i386/avx512bw-vpcmpb-2.c (CMP): Use SIZE macro instead
> of hardcoding size. Cast (rel) to MASK_TYPE.
> * gcc.target/i386/avx512bw-vpcmpub-2.c (CMP): Likewise.
> * gcc.target/i386/avx512f-vinserti32x4-3.c: New test.
> * gcc.target/i386/avx512f-vinsertf32x4-3.c: New test.
> * gcc.target/i386/avx512vl-vpcmpnequb-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpgeub-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpleb-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpgeb-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpltb-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpltub-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpleub-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpneqb-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpnequw-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpgeuw-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmplew-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpgew-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpltw-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpltuw-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpleuw-2.c: New test.
> * gcc.target/i386/avx512vl-vpcmpneqw-2.c: New test.
>
> 2018-07-07 Grazvydas Ignotas <notasas@gmail.com>
>
> * config/i386/avx512bwintrin.h: (_mm512_mask_cmp_epi8_mask,
> _mm512_mask_cmp_epu8_mask): Use __mmask64 type instead of __mmask32
> for __U argument.
>
> * gcc.target/i386/avx512bw-vpcmpb-2.c (SIZE): Define to
> (AVX512F_LEN / 8) instead of (AVX512F_LEN / 16).
> * gcc.target/i386/avx512bw-vpcmpub-2.c (SIZE): Likewise.
OK.
FWIW, we have plenty of avx512 machines available in beaker.
You can do queries based on the cpuflags. Select "Key/Value" for the
table. "CPUFLAGS" for the Keyvalue "contains" for Operation and
"avx512" for Value.
Jeff
Do a search on Key/Value for CPUFLAGS contains avx512.
jeff