Bug 105791 - [13 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -mxop
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 13.0
Importance: P3 normal
Target Milestone: 13.0
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2022-05-31 11:47 UTC by Zdenek Sojka
Modified: 2022-06-07 06:51 UTC
CC List: 1 user

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build:
Known to work: 12.1.1
Known to fail: 13.0
Last reconfirmed: 2022-05-31 00:00:00


Attachments
reduced testcase (145 bytes, text/plain)
2022-05-31 11:47 UTC, Zdenek Sojka

Description Zdenek Sojka 2022-05-31 11:47:11 UTC
Created attachment 53057
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O -mxop testcase.c 
testcase.c: In function 'foo':
testcase.c:11:1: error: unrecognizable insn:
   11 | }
      | ^
(insn 36 35 40 2 (set (reg:V1TI 90 [ <retval> ])
        (if_then_else:V1TI (reg:V1TI 115)
            (reg:V1TI 116)
            (reg:V1TI 117))) "testcase.c":10:48 -1
     (nil))
during RTL pass: vregs
testcase.c:11:1: internal compiler error: in extract_insn, at recog.cc:2791
0x76f806 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /repo/gcc-trunk/gcc/rtl-error.cc:108
0x76f882 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        /repo/gcc-trunk/gcc/rtl-error.cc:116
0x75e550 extract_insn(rtx_insn*)
        /repo/gcc-trunk/gcc/recog.cc:2791
0x1019419 instantiate_virtual_regs_in_insn
        /repo/gcc-trunk/gcc/function.cc:1611
0x1019419 instantiate_virtual_regs
        /repo/gcc-trunk/gcc/function.cc:1985
0x1019419 execute
        /repo/gcc-trunk/gcc/function.cc:2034
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r13-861-20220531001632-g0f4df800b15-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r13-861-20220531001632-g0f4df800b15-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20220531 (experimental) (GCC)
Comment 1 Roger Sayle 2022-05-31 13:59:37 UTC
I believe this would be fixed by:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595382.html
but Richard Biener insists that the middle-end doesn't/shouldn't create a VEC_COND_EXPR if it is not natively supported by the target.
Comment 2 Roger Sayle 2022-05-31 14:39:25 UTC
Doh! V1TI needs to be added to V_128_256.  I'll spin a patch.
Comment 3 GCC Commits 2022-06-02 17:48:28 UTC
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:37e4e7f77d8f7b7e911bf611a0f8edbc3a850c7a

commit r13-961-g37e4e7f77d8f7b7e911bf611a0f8edbc3a850c7a
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Thu Jun 2 18:46:37 2022 +0100

    PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti on x86_64.
    
    This patch resolves PR target/105791, which is a regression that was
    accidentally introduced by my workaround for PR tree-optimization/10566
    (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
    shouldn't).  The latest issue is that by providing a vcond_mask_v1tiv1ti
    pattern in sse.md, the backend now calls ix86_expand_sse_movcc with
    V1TImode operands, which has a special case for TARGET_XOP to generate
    a vpcmov instruction.  Unfortunately, there wasn't previously a V1TImode
    variant, xop_pcmov_v1ti, so we'd ICE.
    
    This is easily fixed by adding V1TImode (and V2TImode) to V_128_256
    which is only used for defining XOP's vpcmov instruction.  This in turn
    requires V1TI (and V2TI) to be supported by <avxsizesuffix> (though
    the use of <avxsizesuffix> in the names xop_pcmov_<mode><avxsizesuffix>
    seems unnecessary; the mode makes the name unique).
    
    2022-06-02  Roger Sayle  <roger@nextmovesoftware.com>
    
    gcc/ChangeLog
            PR target/105791
            * config/i386/sse.md (V_128_256): Add V1TI and V2TI.
            (define_mode_attr avxsizesuffix): Add support for V1TI and V2TI.
    
    gcc/testsuite/ChangeLog
            PR target/105791
            * gcc.target/i386/pr105791.c: New test case.
Comment 4 Roger Sayle 2022-06-04 09:16:23 UTC
This should now be fixed on mainline.  Sorry for the breakage.
Comment 5 GCC Commits 2022-06-07 06:51:17 UTC
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:c4320bde42c6497b701e2e6b8f1c5069bed19818

commit r13-998-gc4320bde42c6497b701e2e6b8f1c5069bed19818
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Tue Jun 7 07:49:40 2022 +0100

    Recognize vpcmov in combine with -mxop on x86.
    
    By way of an apology for causing PR target/105791, where I'd overlooked
    the need to support V1TImode in TARGET_XOP's vpcmov instruction, this
    patch further improves support for TARGET_XOP's vpcmov instruction, by
    recognizing it in combine.
    
    Currently, the test case:
    
    typedef int v4si __attribute__ ((vector_size (16)));
    v4si foo(v4si c, v4si t, v4si f)
    {
        return (c&t)|(~c&f);
    }
    
    on x86_64 with -O2 -mxop generates:
            vpxor   %xmm2, %xmm1, %xmm1
            vpand   %xmm0, %xmm1, %xmm1
            vpxor   %xmm2, %xmm1, %xmm0
            ret
    
    but with this patch now generates:
            vpcmov  %xmm0, %xmm2, %xmm1, %xmm0
            ret
    
    On its own, the new combine splitter works fine on TARGET_64BIT, but
    alas with -m32 combine incorrectly thinks the replacement instruction
    is more expensive, as IF_THEN_ELSE isn't currently/correctly handled
    in ix86_rtx_costs.  So to avoid the need for a target selector in the
    new testcase, I've updated ix86_rtx_costs to report that AMD's vpcmov
    has a latency of two cycles [it's now an obsolete instruction set
    extension and there's unlikely to ever be a processor where this
    instruction has a different timing], and while there I also added
    rtx_costs for x86_64's integer conditional move instructions (which
    have single cycle latency).
    
    2022-06-07  Roger Sayle  <roger@nextmovesoftware.com>
    
    gcc/ChangeLog
            * config/i386/i386.cc (ix86_rtx_costs): Add a new case for
            IF_THEN_ELSE, and provide costs for TARGET_XOP's vpcmov and
            TARGET_CMOVE's (scalar integer) conditional moves.
            * config/i386/sse.md (define_split): Recognize XOP's vpcmov
            from its equivalent (canonical) pxor;pand;pxor sequence.
    
    gcc/testsuite/ChangeLog
            * gcc.target/i386/xop-pcmov3.c: New test case.
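
The new splitter relies on the bitwise-select expression (c&t)|(~c&f) from the test case being equivalent to the pxor;pand;pxor sequence that was emitted before the patch. A minimal C sketch of that equivalence follows. It uses plain 32-bit scalars rather than vector types, and the function names (select_and_or, select_xor_and_xor, count_mismatches) are hypothetical names chosen for illustration; none of this is taken from the GCC sources or from the new test case.

```c
#include <stdint.h>

/* Bitwise select as written in the commit's test case: take the bits of t
   where c is set and the bits of f where c is clear.  */
static uint32_t select_and_or(uint32_t c, uint32_t t, uint32_t f)
{
    return (c & t) | (~c & f);
}

/* The same select written as xor, and, xor, mirroring the
   vpxor/vpand/vpxor sequence shown in the commit message.  */
static uint32_t select_xor_and_xor(uint32_t c, uint32_t t, uint32_t f)
{
    return ((t ^ f) & c) ^ f;
}

/* Compare both forms over every triple of 8-bit operands and return the
   number of disagreements.  Because both forms act on each bit
   independently, this covers every per-bit input combination.  */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint32_t c = 0; c < 256; c++)
        for (uint32_t t = 0; t < 256; t++)
            for (uint32_t f = 0; f < 256; f++)
                if (select_and_or(c, t, f) != select_xor_and_xor(c, t, f))
                    mismatches++;
    return mismatches;
}
```

Calling count_mismatches() from a small driver should return 0: where a bit of c is set, (t^f)^f leaves t, and where it is clear, the and zeroes the term and leaves f. That is the per-bit behaviour the commit message attributes to a single vpcmov.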