This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/88461] AVX512: gcc should keep value in kN registers if possible
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 12 Dec 2018 10:32:17 +0000
- Subject: [Bug target/88461] AVX512: gcc should keep value in kN registers if possible
- Auto-submitted: auto-generated
- References: <bug-88461-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88461
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|X86_64                      |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-12-12
     Ever confirmed|0                           |1
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
<bb 2> [local count: 1073741824]:
_13 = MEM[(const __m128i * {ref-all})data_3(D)];
_11 = VIEW_CONVERT_EXPR<vector(8) short int>(_13);
_12 = __builtin_ia32_ptestnmw128 (_11, _11, 255);
_1 = (int) _12;
_10 = __builtin_ia32_kshiftlihi (_1, 1);
_14 = a_6(D) & 65535;
_5 = _10 & 255;
_4 = (int) _5;
_9 = __builtin_ia32_kandnhi (_4, _14);
m_7 = (__mmask8) _9;
_8 = (int) m_7;
return _8;
probably an artifact of C promoting __mmask8 to int:
__m128i v = _mm_load_si128 ((const __m128i * {ref-all}) data);
__mmask8 m = _mm_testn_epi16_mask (v, v);
__m128i v = _mm_load_si128 ((const __m128i * {ref-all}) data);
__mmask8 m = _mm_testn_epi16_mask (v, v);
m = (__mmask8) _kshiftli_mask16 ((int) m, 1);
m = (__mmask8) _mm512_kandn ((int) m, (int) (__mmask16) a);
return (int) m;
and
;; Function _kshiftli_mask16 (null)
;; enabled by -tree-original
{
  return (__mmask16) __builtin_ia32_kshiftlihi ((int) __A, (int) (unsigned char) __B);
}
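To make the extra conversions in the dumps above easier to follow, here is a minimal scalar sketch in plain C. The model_* helpers and the two wrapper functions are hypothetical names introduced for illustration only; they assume the usual semantics of the mask instructions (a left shift yields 0 once the count reaches the mask width, and kandn computes ~a & b). The sketch suggests that running mask16 operations on an 8-bit mask gives the same value as the mask8 operations, but only after truncating back to 8 bits at each step, which is where the "& 255" and the int round trips come from.

```c
#include <stdint.h>

/* Scalar models of the 16-bit mask operations (assumed semantics). */
uint16_t model_kshiftl16(uint16_t k, unsigned n)
{
    return n > 15 ? 0 : (uint16_t)(k << n);
}

uint16_t model_kandn16(uint16_t a, uint16_t b)
{
    return (uint16_t)(~a & b);
}

/* Scalar models of the 8-bit mask operations (assumed semantics). */
uint8_t model_kshiftl8(uint8_t k, unsigned n)
{
    return n > 7 ? 0 : (uint8_t)(k << n);
}

uint8_t model_kandn8(uint8_t a, uint8_t b)
{
    return (uint8_t)(~a & b);
}

/* mask16 operations on an 8-bit mask: every result is truncated back. */
int mixed_width(uint8_t m, int a)
{
    m = (uint8_t)model_kshiftl16(m, 1);          /* the "& 255" in the GIMPLE */
    m = (uint8_t)model_kandn16(m, (uint16_t)a);  /* "a & 65535", then truncate */
    return m;
}

/* mask8 operations throughout: no widening, no truncation of results. */
int mask8_only(uint8_t m, int a)
{
    m = model_kshiftl8(m, 1);
    m = model_kandn8(m, (uint8_t)a);
    return m;
}
```

Both functions return the same value for every input, so the widening is redundant work at the value level; it exists only because the source mixes mask widths.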
Btw, why are you using mask16 intrinsics on mask8 types? When using
_kshiftli_mask8 and _kandn_mask8 instead, I get:
vmovdqa64 (%rdi), %xmm0
kmovb %esi, %k3
vptestnmw %xmm0, %xmm0, %k1
kshiftlb $1, %k1, %k0
kandnb %k3, %k0, %k2
kmovb %k2, %eax
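For reference, a source function along the following lines would be expected to produce a sequence like the one above. This is a hedged reconstruction, not the reporter's test case: the function name and signature are assumptions inferred from the register usage (%rdi for the pointer, %esi for the mask operand), and the target attribute stands in for compiling with -mavx512bw -mavx512vl -mavx512dq.

```c
#include <immintrin.h>

/* Hypothetical reconstruction; name and signature are assumptions. */
__attribute__((target("avx512bw,avx512vl,avx512dq")))
int testn_shift_andn_mask8(const void *data, int a)
{
    /* Aligned 128-bit load; data must be 16-byte aligned. */
    __m128i v = _mm_load_si128((const __m128i *)data);

    /* Bit i of m is set when 16-bit element i of v is zero. */
    __mmask8 m = _mm_testn_epi16_mask(v, v);

    /* Stay in 8-bit mask operations so no widening is needed. */
    m = _kshiftli_mask8(m, 1);
    m = _kandn_mask8(m, (__mmask8)a);   /* computes ~m & a */

    return m;
}
```

Running it requires a CPU with AVX512BW, AVX512VL and AVX512DQ; on other hardware the call should be guarded, for example with __builtin_cpu_supports.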