The avx512fintrin.h header sometimes uses different implementations depending on whether __OPTIMIZE__ is defined, but many functions are missing if __OPTIMIZE__ is not defined. Here is a trivial test case: #include <immintrin.h> __mmask16 foo(__m512 a, __m512 b) { return _mm512_cmplt_ps_mask(a, b); } On Compiler Explorer: https://godbolt.org/z/83jP63 I ran into this with _mm512_cmplt_ps_mask, but it looks like this all the _mm512_cmp*_{pd,ps}_mask functions have the same problem.
Created attachment 48865 [details] gcc11-pr96174.patch Untested fix.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:12d69dbfff9dd5ad4a30b20d1636f5cab6425e8c commit r11-2104-g12d69dbfff9dd5ad4a30b20d1636f5cab6425e8c Author: Jakub Jelinek <jakub@redhat.com> Date: Wed Jul 15 11:34:44 2020 +0200 fix _mm512_{,mask_}cmp*_p[ds]_mask at -O0 [PR96174] The _mm512_{,mask_}cmp_p[ds]_mask and also _mm_{,mask_}cmp_s[ds]_mask intrinsics have an argument which must have a constant passed to it and so use an inline version only for ifdef __OPTIMIZE__ and have a #define for -O0. But the _mm512_{,mask_}cmp*_p[ds]_mask intrinsics don't need a constant argument, they are essentially the first set with the constant added to them implicitly based on the comparison name, and so there is no #define version for them (correctly). But their inline versions are defined in between the first and s[ds] set and so inside of ifdef __OPTIMIZE__, which means that with -O0 they aren't defined at all. This patch fixes that by moving those after the #ifdef __OPTIMIZE #else use #define #endif block. 2020-07-15 Jakub Jelinek <jakub@redhat.com> PR target/96174 * config/i386/avx512fintrin.h (_mm512_cmpeq_pd_mask, _mm512_mask_cmpeq_pd_mask, _mm512_cmplt_pd_mask, _mm512_mask_cmplt_pd_mask, _mm512_cmple_pd_mask, _mm512_mask_cmple_pd_mask, _mm512_cmpunord_pd_mask, _mm512_mask_cmpunord_pd_mask, _mm512_cmpneq_pd_mask, _mm512_mask_cmpneq_pd_mask, _mm512_cmpnlt_pd_mask, _mm512_mask_cmpnlt_pd_mask, _mm512_cmpnle_pd_mask, _mm512_mask_cmpnle_pd_mask, _mm512_cmpord_pd_mask, _mm512_mask_cmpord_pd_mask, _mm512_cmpeq_ps_mask, _mm512_mask_cmpeq_ps_mask, _mm512_cmplt_ps_mask, _mm512_mask_cmplt_ps_mask, _mm512_cmple_ps_mask, _mm512_mask_cmple_ps_mask, _mm512_cmpunord_ps_mask, _mm512_mask_cmpunord_ps_mask, _mm512_cmpneq_ps_mask, _mm512_mask_cmpneq_ps_mask, _mm512_cmpnlt_ps_mask, _mm512_mask_cmpnlt_ps_mask, _mm512_cmpnle_ps_mask, _mm512_mask_cmpnle_ps_mask, _mm512_cmpord_ps_mask, _mm512_mask_cmpord_ps_mask): Move outside of __OPTIMIZE__ guarded section. * gcc.target/i386/avx512f-vcmppd-3.c: New test. * gcc.target/i386/avx512f-vcmpps-3.c: New test.
The releases/gcc-10 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:9a9e1ed88614b96944d2e5e92e932f65dcf2d920 commit r10-8500-g9a9e1ed88614b96944d2e5e92e932f65dcf2d920 Author: Jakub Jelinek <jakub@redhat.com> Date: Wed Jul 15 11:34:44 2020 +0200 fix _mm512_{,mask_}cmp*_p[ds]_mask at -O0 [PR96174] The _mm512_{,mask_}cmp_p[ds]_mask and also _mm_{,mask_}cmp_s[ds]_mask intrinsics have an argument which must have a constant passed to it and so use an inline version only for ifdef __OPTIMIZE__ and have a #define for -O0. But the _mm512_{,mask_}cmp*_p[ds]_mask intrinsics don't need a constant argument, they are essentially the first set with the constant added to them implicitly based on the comparison name, and so there is no #define version for them (correctly). But their inline versions are defined in between the first and s[ds] set and so inside of ifdef __OPTIMIZE__, which means that with -O0 they aren't defined at all. This patch fixes that by moving those after the #ifdef __OPTIMIZE #else use #define #endif block. 2020-07-15 Jakub Jelinek <jakub@redhat.com> PR target/96174 * config/i386/avx512fintrin.h (_mm512_cmpeq_pd_mask, _mm512_mask_cmpeq_pd_mask, _mm512_cmplt_pd_mask, _mm512_mask_cmplt_pd_mask, _mm512_cmple_pd_mask, _mm512_mask_cmple_pd_mask, _mm512_cmpunord_pd_mask, _mm512_mask_cmpunord_pd_mask, _mm512_cmpneq_pd_mask, _mm512_mask_cmpneq_pd_mask, _mm512_cmpnlt_pd_mask, _mm512_mask_cmpnlt_pd_mask, _mm512_cmpnle_pd_mask, _mm512_mask_cmpnle_pd_mask, _mm512_cmpord_pd_mask, _mm512_mask_cmpord_pd_mask, _mm512_cmpeq_ps_mask, _mm512_mask_cmpeq_ps_mask, _mm512_cmplt_ps_mask, _mm512_mask_cmplt_ps_mask, _mm512_cmple_ps_mask, _mm512_mask_cmple_ps_mask, _mm512_cmpunord_ps_mask, _mm512_mask_cmpunord_ps_mask, _mm512_cmpneq_ps_mask, _mm512_mask_cmpneq_ps_mask, _mm512_cmpnlt_ps_mask, _mm512_mask_cmpnlt_ps_mask, _mm512_cmpnle_ps_mask, _mm512_mask_cmpnle_ps_mask, _mm512_cmpord_ps_mask, _mm512_mask_cmpord_ps_mask): Move outside of __OPTIMIZE__ guarded section. * gcc.target/i386/avx512f-vcmppd-3.c: New test. * gcc.target/i386/avx512f-vcmpps-3.c: New test. (cherry picked from commit 12d69dbfff9dd5ad4a30b20d1636f5cab6425e8c)
I've checked it in as obvious to both trunk and 10.2.
The releases/gcc-9 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:fdcb6dae610aba75e23c1fd2d31b491691e54091 commit r9-8904-gfdcb6dae610aba75e23c1fd2d31b491691e54091 Author: Jakub Jelinek <jakub@redhat.com> Date: Wed Jul 15 11:34:44 2020 +0200 fix _mm512_{,mask_}cmp*_p[ds]_mask at -O0 [PR96174] The _mm512_{,mask_}cmp_p[ds]_mask and also _mm_{,mask_}cmp_s[ds]_mask intrinsics have an argument which must have a constant passed to it and so use an inline version only for ifdef __OPTIMIZE__ and have a #define for -O0. But the _mm512_{,mask_}cmp*_p[ds]_mask intrinsics don't need a constant argument, they are essentially the first set with the constant added to them implicitly based on the comparison name, and so there is no #define version for them (correctly). But their inline versions are defined in between the first and s[ds] set and so inside of ifdef __OPTIMIZE__, which means that with -O0 they aren't defined at all. This patch fixes that by moving those after the #ifdef __OPTIMIZE #else use #define #endif block. 2020-07-15 Jakub Jelinek <jakub@redhat.com> PR target/96174 * config/i386/avx512fintrin.h (_mm512_cmpeq_pd_mask, _mm512_mask_cmpeq_pd_mask, _mm512_cmplt_pd_mask, _mm512_mask_cmplt_pd_mask, _mm512_cmple_pd_mask, _mm512_mask_cmple_pd_mask, _mm512_cmpunord_pd_mask, _mm512_mask_cmpunord_pd_mask, _mm512_cmpneq_pd_mask, _mm512_mask_cmpneq_pd_mask, _mm512_cmpnlt_pd_mask, _mm512_mask_cmpnlt_pd_mask, _mm512_cmpnle_pd_mask, _mm512_mask_cmpnle_pd_mask, _mm512_cmpord_pd_mask, _mm512_mask_cmpord_pd_mask, _mm512_cmpeq_ps_mask, _mm512_mask_cmpeq_ps_mask, _mm512_cmplt_ps_mask, _mm512_mask_cmplt_ps_mask, _mm512_cmple_ps_mask, _mm512_mask_cmple_ps_mask, _mm512_cmpunord_ps_mask, _mm512_mask_cmpunord_ps_mask, _mm512_cmpneq_ps_mask, _mm512_mask_cmpneq_ps_mask, _mm512_cmpnlt_ps_mask, _mm512_mask_cmpnlt_ps_mask, _mm512_cmpnle_ps_mask, _mm512_mask_cmpnle_ps_mask, _mm512_cmpord_ps_mask, _mm512_mask_cmpord_ps_mask): Move outside of __OPTIMIZE__ guarded section. * gcc.target/i386/avx512f-vcmppd-3.c: New test. * gcc.target/i386/avx512f-vcmpps-3.c: New test. (cherry picked from commit 12d69dbfff9dd5ad4a30b20d1636f5cab6425e8c)