Created attachment 43319 [details] reduced testcase Output: $ aarch64-unknown-linux-gnu-gcc testcase.c -static -O $ ./a.out qemu: uncaught target signal 6 (Aborted) - core dumped Aborted The most-significant 64bit are zeros, but should be ones. $ aarch64-unknown-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-aarch64/bin/aarch64-unknown-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-257303-checking-yes-rtl-df-extra-aarch64/bin/../libexec/gcc/aarch64-unknown-linux-gnu/8.0.1/lto-wrapper Target: aarch64-unknown-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --with-sysroot=/usr/aarch64-unknown-linux-gnu --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=aarch64-unknown-linux-gnu --with-ld=/usr/bin/aarch64-unknown-linux-gnu-ld --with-as=/usr/bin/aarch64-unknown-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-257303-checking-yes-rtl-df-extra-aarch64 Thread model: posix gcc version 8.0.1 20180201 (experimental) (GCC)
Confirmed
I think combine is at fault here. The faulty assembly is: foo: ldrb w0, [sp, 16] neg x0, x0, lsl 4 csetm x3, ne <snip> The csetm is supposed to set the top 64 bits of the TImode value (in x3) to ones, based on the CC content, but the CC-setting instruction has disappeared. The correct sequence (by hacking the assembly output and checking that it passes) would be: ldrb w0, [sp, 16] negs x0, x0, lsl 4 // NEG sets the condition code as well. csetm x3, ne There's a few complex combinations going on, but before combine we have (some RTL edited to tone down verbosity): (insn 14 12 17 2 (set (reg:DI 87) (lshiftrt:DI (reg:DI 103) (const_int 60 [0x3c]))) "wrong.c":9 676 {*aarch64_lshr_sisd_or_int_di3} (insn 19 17 20 2 (parallel [ (set (reg:CC 66 cc) (compare:CC (reg:DI 85) (reg:DI 105))) (set (reg:DI 90) (minus:DI (reg:DI 85) (reg:DI 105))) ]) "wrong.c":9 270 {subdi3_compare1} (insn 20 19 23 2 (set (reg:DI 91) (minus:DI (minus:DI (reg:DI 85) (reg:DI 87)) (ltu:DI (reg:CC 66 cc) (const_int 0 [0])))) "wrong.c":9 327 {*subdi3_carryin} where insn 19 sets the CC and insn 20 uses it. After combine we have: (insn 14 12 17 2 (set (reg:DI 87) (lshiftrt:DI (reg:DI 103) (const_int 60 [0x3c]))) "wrong.c":9 676 {*aarch64_lshr_sisd_or_int_di3} (note 19 17 20 2 NOTE_INSN_DELETED) (insn 20 19 23 2 (set (reg:DI 91) (neg:DI (ne:DI (reg:CC 66 cc) (const_int 0 [0])))) "wrong.c":9 441 {cstoredi_neg} with insn 20 still using the CC but the CC setting instruction being deleted. There's a number of exciting things going on in combine so my analysis may not be entirely accurate, but I see things like: Trying 14 -> 20: 14: r87:DI=r103:DI 0>>0x3c 20: r91:DI=r85:DI-r87:DI-ltu(cc:CC,0) REG_DEAD r87:DI REG_DEAD r85:DI REG_DEAD cc:CC REG_EQUAL -r87:DI-ltu(cc:CC,0) Successfully matched this instruction: (set (reg:DI 91) (neg:DI (ltu:DI (reg:CC 66 cc) (const_int 0 [0])))) and Trying 19 -> 26: 19: cc:CC=cmp(r103:DI,0) REG_DEAD r103:DI 26: {cc:CC_C=zero_extend(r90:DI)+zero_extend(r111:DI)!=zero_extend(r90:DI+r111:DI);r95:DI=r90:DI+r111:DI;} REG_DEAD r111:DI REG_DEAD r90:DI Successfully matched this instruction: (parallel [ (set (reg:CC_C 66 cc) (ne:CC_C (plus:TI (zero_extend:TI (reg:DI 90)) (zero_extend:TI (reg:DI 111 [ b ]))) (zero_extend:TI (plus:DI (reg:DI 90) (reg:DI 111 [ b ]))))) (set (reg:DI 95) (plus:DI (reg:DI 90) (reg:DI 111 [ b ]))) ]) deferring deletion of insn with uid = 19. So insn 19 ends up being combined into a subsequent instruction and ends up being deleted even though insn 20 depends on it's CC-setting effects
Confirmed. Does not happen with a compiler a few months old.
This starts to fail with r249850 (don't know yet what the problem is).
Author: segher Date: Tue Feb 13 22:12:55 2018 New Revision: 257644 URL: https://gcc.gnu.org/viewcvs?rev=257644&root=gcc&view=rev Log: combine: Update links correctly for new I2 (PR84169) If there is a LOG_LINK between two insns, this means those two insns can be combined, as far as dataflow is concerned. There never should be a LOG_LINK between two unrelated insns. If there is one, combine will try to combine the insns without doing all the needed checks if the earlier destination is used before the later insn, etc. Unfortunately we do not update the LOG_LINKs correctly in some cases. This patch fixes at least some of those cases. PR rtl-optimization/84169 * combine.c (try_combine): New variable split_i2i3. Set it to true if we generated a parallel as new i3 and we split that to new i2 and i3 instructions. Handle split_i2i3 similar to swap_i2i3: scan the LOG_LINKs of i3 to see which of those need to link to i2 now. Link those to i2, not i1. Partially rewrite this scan code. gcc/testsuite/ PR rtl-optimization/84169 * gcc.c-torture/execute/pr84169.c: New. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr84169.c Modified: trunk/gcc/ChangeLog trunk/gcc/combine.c trunk/gcc/testsuite/ChangeLog
Fixed.
Yeah, not planning to do backports, although the core problem is ancient.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:c23c899aedf11069e992eed7358802b262d62f98 commit r10-7607-gc23c899aedf11069e992eed7358802b262d62f98 Author: Jakub Jelinek <jakub@redhat.com> Date: Tue Apr 7 21:30:12 2020 +0200 combine: Fix split_i2i3 ICE [PR94291] The following testcase ICEs on armv7hl-linux-gnueabi. try_combine is called on: (gdb) p debug_rtx (i3) (insn 20 12 22 2 (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp) (const_int -4 [0xfffffffffffffffc])) [1 x+0 S4 A32]) (reg:SI 125)) "pr94291.c":7:8 241 {*arm_movsi_insn} (expr_list:REG_DEAD (reg:SI 125) (nil))) (gdb) p debug_rtx (i2) (insn 12 7 20 2 (parallel [ (set (reg:CC 100 cc) (compare:CC (reg:SI 121 [ <retval> ]) (const_int 0 [0]))) (set (reg:SI 125) (reg:SI 121 [ <retval> ])) ]) "pr94291.c":7:8 248 {*movsi_compare0} (expr_list:REG_UNUSED (reg:CC 100 cc) (nil))) and tries to recognize cc = r121 cmp 0; [sfp-4] = r121 parallel, but that isn't recognized, so it splits it into two: split_i2i3 [sfp-4] = r121 followed by cc = r121 cmp 0 which is recognized, but ICEs because the code below insist that the SET_DEST of newi2pat (or first set in PARALLEL thereof) must be a REG or SUBREG of REG, but it is a MEM in this case. I don't see any condition that would guarantee that, perhaps for the swap_i2i3 case it was somehow guaranteed. As the code just wants to update LOG_LINKS and LOG_LINKS are only for registers, not for MEM or anything else, the patch just doesn't update those if it isn't a REG or SUBREG of REG. 2020-04-07 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/94291 PR rtl-optimization/84169 * combine.c (try_combine): For split_i2i3, don't assume SET_DEST must be a REG or SUBREG of REG; if it is not one of these, don't update LOG_LINKs. * gcc.dg/pr94291.c: New test.