[Bug target/93670] ICE for _mm256_extractf32x4_ps (unrecognized insn)

cvs-commit at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Feb 13 22:33:00 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93670

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:20ac13c895c5abe7a350de0b664abf190aa28a16

commit r9-8224-g20ac13c895c5abe7a350de0b664abf190aa28a16
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Feb 12 11:58:35 2020 +0100

    i386: Fix up vec_extract_lo* patterns [PR93670]

    The VEXTRACT* insns have way too many different CPUID feature flags
    (AT&T syntax):
    vextractf128 $imm, %ymm, %xmm/mem                  AVX
    vextracti128 $imm, %ymm, %xmm/mem                  AVX2
    vextract{f,i}32x4 $imm, %ymm, %xmm/mem {k}{z}      AVX512VL+AVX512F
    vextract{f,i}32x4 $imm, %zmm, %xmm/mem {k}{z}      AVX512F
    vextract{f,i}64x2 $imm, %ymm, %xmm/mem {k}{z}      AVX512VL+AVX512DQ
    vextract{f,i}64x2 $imm, %zmm, %xmm/mem {k}{z}      AVX512DQ
    vextract{f,i}32x8 $imm, %zmm, %ymm/mem {k}{z}      AVX512DQ
    vextract{f,i}64x4 $imm, %zmm, %ymm/mem {k}{z}      AVX512F
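
The table above can be sketched as a small lookup in portable C, which may
help when checking a pattern condition against the required flags.  This is
only an illustration: the FEAT_* bits, the struct and the function names are
hypothetical and are not GCC's OPTION_MASK_ISA_* or TARGET_* machinery, and
"vextract32x4" etc. stand for both the f and i variants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits for this sketch (not GCC's ISA masks).  */
enum {
  FEAT_AVX      = 1u << 0,
  FEAT_AVX2     = 1u << 1,
  FEAT_AVX512F  = 1u << 2,
  FEAT_AVX512VL = 1u << 3,
  FEAT_AVX512DQ = 1u << 4
};

struct vextract_form {
  const char *mnemonic;  /* "NNxM" names cover both the f and i variants */
  int src_bits;          /* width of the source register: 256 or 512 */
  unsigned required;     /* every one of these bits must be available */
};

/* One row per line of the table in the commit message.  */
static const struct vextract_form vextract_forms[] = {
  { "vextractf128", 256, FEAT_AVX },
  { "vextracti128", 256, FEAT_AVX2 },
  { "vextract32x4", 256, FEAT_AVX512VL | FEAT_AVX512F },
  { "vextract32x4", 512, FEAT_AVX512F },
  { "vextract64x2", 256, FEAT_AVX512VL | FEAT_AVX512DQ },
  { "vextract64x2", 512, FEAT_AVX512DQ },
  { "vextract32x8", 512, FEAT_AVX512DQ },
  { "vextract64x4", 512, FEAT_AVX512F },
};

/* True if the given form exists in the table and all of its required
   feature bits are present in HAVE.  */
static bool
vextract_available (const char *mnemonic, int src_bits, unsigned have)
{
  assert (mnemonic != NULL);
  for (size_t i = 0; i < sizeof vextract_forms / sizeof vextract_forms[0]; i++)
    {
      const struct vextract_form *f = &vextract_forms[i];
      if (f->src_bits == src_bits && strcmp (f->mnemonic, mnemonic) == 0)
        return (have & f->required) == f->required;
    }
  return false;
}

/* Usage example: with only AVX512F and AVX512VL enabled, the 256-bit
   32x4 form is usable but the DQ-only forms are not.  */
static void
check_vextract_table (void)
{
  unsigned have = FEAT_AVX512F | FEAT_AVX512VL;
  assert (vextract_available ("vextract32x4", 256, have));
  assert (!vextract_available ("vextract64x2", 256, have));
  assert (!vextract_available ("vextract32x8", 512, have));
  assert (vextract_available ("vextract64x4", 512, have));
}
```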

    As both the testcase and the patch show, we didn't get it right in all
    cases.

    The first hunk is about avx512vl_vextractf128v8s[if] incorrectly
    requiring TARGET_AVX512DQ.  The corresponding insn is the first
    vextract{f,i}32x4 above, so it requires VL+F, and the builtins have it
    correct (TARGET_AVX512VL implies TARGET_AVX512F):
    BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8sf,
           "__builtin_ia32_extractf32x4_256_mask", IX86_BUILTIN_EXTRACTF32X4_256,
           UNKNOWN, (int) V4SF_FTYPE_V8SF_INT_V4SF_UQI)
    BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8si,
           "__builtin_ia32_extracti32x4_256_mask", IX86_BUILTIN_EXTRACTI32X4_256,
           UNKNOWN, (int) V4SI_FTYPE_V8SI_INT_V4SI_UQI)
    We only need TARGET_AVX512DQ for avx512vl_vextractf128v4d[if].
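
For readers unfamiliar with the masked 256-bit extract, here is a scalar
reference model in portable C of what the V4SF_FTYPE_V8SF_INT_V4SF_UQI
operation computes, assuming merge masking (elements whose mask bit is
clear are taken from the pass-through vector).  The function name is
hypothetical and the code is only a sketch of the semantics, not GCC's
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a merge-masked vextractf32x4 from a 256-bit source:
   select 128-bit lane IMM (0 or 1) of A, then for each of the four
   32-bit elements keep the extracted value if the matching bit of K is
   set, otherwise keep the element from SRC.  */
static void
extractf32x4_256_mask_model (float dst[4], const float src[4], uint8_t k,
                             const float a[8], int imm)
{
  assert (imm == 0 || imm == 1);
  const float *lane = a + 4 * imm;
  for (int i = 0; i < 4; i++)
    dst[i] = ((k >> i) & 1) ? lane[i] : src[i];
}

/* Usage example: upper lane, mask 0b0101.  */
static void
check_extract_model (void)
{
  const float a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  const float src[4] = { -1, -1, -1, -1 };
  float dst[4];
  extractf32x4_256_mask_model (dst, src, 0x5, a, 1);
  assert (dst[0] == 4 && dst[1] == -1 && dst[2] == 6 && dst[3] == -1);
}
```

The mask works at 32-bit element granularity here, which is why the
element size of the insn matters once masking is involved.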

    The second hunk is about vec_extract_lo_v16s[if]{,_mask}.  These use
    the vextract{f,i}32x8 insns (AVX512DQ above), but we weren't requiring
    that; instead the condition was incorrectly && 1 for the non-masked
    insns and && (64 == 64 && TARGET_AVX512VL) for the masked ones.  This is
    extraction from ZMM, so it doesn't need VL for anything.  The hunk
    requires TARGET_AVX512DQ only when the insn is masked.  If it is not
    masked and TARGET_AVX512DQ isn't available, we can use
    vextract{f,i}64x4 instead, which is already available with
    TARGET_AVX512F and does the same thing: it extracts the low 256 bits
    from a 512-bit vector (often we split it into just nothing, but there
    are some special cases, like when using xmm16+, where we can't without
    AVX512VL).
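
The choice described for the second hunk can be sketched as a small
decision function in C.  It assumes AVX512F is already enabled (the source
is a ZMM register) and shows only the float variant; the function name and
the NULL-means-disabled convention are hypothetical, not the actual sse.md
output template.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Pick the insn for extracting the low 256 bits of a V16SF value.
   Returns NULL when the pattern must not be enabled at all.  */
static const char *
lo_v16sf_extract_insn (bool masked, bool have_avx512dq)
{
  if (have_avx512dq)
    return "vextractf32x8";   /* exact match for 32-bit elements */
  if (masked)
    return NULL;              /* masking needs 32-bit granularity, i.e. DQ */
  return "vextractf64x4";     /* unmasked: same 256 bits, only needs AVX512F */
}

/* Usage example covering the cases from the commit message.  */
static void
check_lo_v16sf_choice (void)
{
  assert (strcmp (lo_v16sf_extract_insn (false, true), "vextractf32x8") == 0);
  assert (strcmp (lo_v16sf_extract_insn (true, true), "vextractf32x8") == 0);
  assert (strcmp (lo_v16sf_extract_insn (false, false), "vextractf64x4") == 0);
  assert (lo_v16sf_extract_insn (true, false) == NULL);
}
```

The fallback is limited to the unmasked case because a mask on
vextractf64x4 would select 64-bit elements, which is not what a mask over
32-bit elements means.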

    The last hunk is about vec_extract_lo_v8s[if]{,_mask}.  The non-_mask
    suffixed ones are ok already and just split into nothing (lowpart subreg).
    The masked ones were incorrectly requiring TARGET_AVX512VL and
    TARGET_AVX512DQ, when we only need TARGET_AVX512VL.

    2020-02-12  Jakub Jelinek  <jakub@redhat.com>

        PR target/93670
        * config/i386/sse.md (VI48F_256_DQ): New mode iterator.
        (avx512vl_vextractf128<mode>): Use it instead of VI48F_256.  Remove
        TARGET_AVX512DQ from condition.
        (vec_extract_lo_<mode><mask_name>): Use <mask_avx512dq_condition>
        instead of <mask_mode512bit_condition> in condition.  If
        TARGET_AVX512DQ is false, emit vextract*64x4 instead of
        vextract*32x8.
        (vec_extract_lo_<mode><mask_name>): Drop <mask_avx512dq_condition>
        from condition.

        * gcc.target/i386/avx512vl-pr93670.c: New test.

