Created attachment 37365 [details] reduced testcase Output (qemu userspace emulation): $ aarch64-unknown-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-aarch64/bin/aarch64-unknown-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-232232-checking-yes-rtl-df-nographite-aarch64/bin/../libexec/gcc/aarch64-unknown-linux-gnu/6.0.0/lto-wrapper Target: aarch64-unknown-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-checking=yes,rtl,df --without-cloog --without-ppl --without-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=aarch64-unknown-linux-gnu --with-ld=/usr/bin/aarch64-unknown-linux-gnu-ld --with-as=/usr/bin/aarch64-unknown-linux-gnu-as --with-sysroot=/usr/aarch64-unknown-linux-gnu --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-232232-checking-yes-rtl-df-nographite-aarch64 Thread model: posix gcc version 6.0.0 20160111 (experimental) (GCC) $ aarch64-unknown-linux-gnu-gcc -O testcase.c $ ./a.out 46000000000000000100000000000000 qemu: uncaught target signal 6 (Aborted) - core dumped Aborted Tested revisions: trunk r232232 - FAIL 5-branch r231950 - FAIL 4_9-branch r231949 - OK 4_8-branch r224828 - OK
Dup. *** This bug has been marked as a duplicate of bug 69306 ***
(In reply to H.J. Lu from comment #1) > typedef unsigned long int uint64_t; > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ It should be long long. "unsigned long int" is 64bit on aarch64 (contrary to x32)
Looks like aarch64 backend bug to me. unsigned __int128 f1 (unsigned __int128 x, unsigned __int128 y) { return x + y; } unsigned __int128 f2 (unsigned __int128 x, unsigned __int128 y) { return x - y; } unsigned __int128 f3 (unsigned __int128 x, unsigned long long y) { return x + y; } unsigned __int128 f4 (unsigned __int128 x, unsigned long long y) { return x - y; } unsigned __int128 f5 (unsigned __int128 x, unsigned long long y) { return x + (unsigned long long) -y; } This case is f5 from the above, where the value should be negated first in DImode and then added to the TImode value. But combiner sees (insn 7 4 10 2 (set (reg:DI 79) (neg:DI (reg/v:DI 77 [ y ]))) pr69305-1.c:5 331 {negdi2} (expr_list:REG_DEAD (reg/v:DI 77 [ y ]) (nil))) (insn 10 7 11 2 (parallel [ (set (reg:CC_NZ 66 cc) (compare:CC_NZ (plus:DI (reg:DI 79) (reg:DI 85 [ x ])) (const_int 0 [0]))) (set (reg:DI 81) (plus:DI (reg:DI 79) (reg:DI 85 [ x ]))) ]) pr69305-1.c:5 100 {adddi3_compare0} (expr_list:REG_DEAD (reg:DI 85 [ x ]) (expr_list:REG_DEAD (reg:DI 79) (nil)))) (insn 11 10 25 2 (set (reg:DI 82) (plus:DI (geu:DI (reg:CC 66 cc) (const_int 0 [0])) (reg:DI 86 [ x+8 ]))) pr69305-1.c:5 452 {*csinc2di_insn} (expr_list:REG_DEAD (reg:DI 86 [ x+8 ]) (expr_list:REG_DEAD (reg:CC 66 cc) (nil)))) and attempts to merge insn 7 into insn 10 by having a minus in there instead, and that pattern is then recognized as subdi3_compare0. But probably csinc2di_insn actually relies on the preceeding insn being result of adddi3_compare instead of subdi3_compare. Seems on arm32 or x86_64 we have a DImode (arm32) or TImode pattern for the addition that is only lowered later on, so this is something that really can't happen on those targets.
Reduced testcase: __attribute__((noinline, noclone)) unsigned __int128 foo (unsigned __int128 x, unsigned long long y) { return x + (unsigned long long) -y; } int main () { if (foo (70, 0) != 70) __builtin_abort (); return 0; }
Confirmed as well. If combine changed the plus-compare into a minus-compare, shouldn't it also go into the condition code usage and update that too though?
(In reply to ktkachov from comment #5) > Confirmed as well. > If combine changed the plus-compare into a minus-compare, shouldn't it also > go into the condition code usage and update that too though? scratch that, I was thinking of something else
The patterns are just weird. (insn 10 7 11 2 (parallel [ (set (reg:CC_NZ 66 cc) (compare:CC_NZ (plus:DI (reg:DI 79) (reg:DI 85 [ x ])) (const_int 0 [0]))) (set (reg:DI 81) (plus:DI (reg:DI 79) (reg:DI 85 [ x ]))) ]) pr69305-1.c:5 100 {adddi3_compare0} (expr_list:REG_DEAD (reg:DI 85 [ x ]) (expr_list:REG_DEAD (reg:DI 79) (nil)))) (insn 11 10 25 2 (set (reg:DI 82) (plus:DI (geu:DI (reg:CC 66 cc) (const_int 0 [0])) (reg:DI 86 [ x+8 ]))) pr69305-1.c:5 452 {*csinc2di_insn} (expr_list:REG_DEAD (reg:DI 86 [ x+8 ]) (expr_list:REG_DEAD (reg:CC 66 cc) (nil)))) This suggests that that you compare (a + b) cmp 0 and add 1 in the second insn if (a + b) u>= 0, which is always, while what you are actually interested in is whether (a + b) u< a (or equivalently (a + b) u< b), i.e. if there is an overflow in the addition. E.g. on x86_64 (though, only after split2) it is represented that way: (insn 33 13 34 2 (parallel [ (set (reg:CCC 17 flags) (compare:CCC (plus:DI (reg:DI 0 ax [97]) (reg:DI 38 r9 [orig:90 x ] [90])) (reg:DI 0 ax [97]))) (set (reg:DI 0 ax [95]) (plus:DI (reg:DI 0 ax [97]) (reg:DI 38 r9 [orig:90 x ] [90]))) ]) pr69305-1.c:5 306 {*adddi3_cc_overflow_1} (nil)) (insn 34 33 20 2 (parallel [ (set (reg:DI 1 dx [+8 ]) (plus:DI (plus:DI (ltu:DI (reg:CC 17 flags) (const_int 0 [0])) (reg:DI 1 dx [+8 ])) (reg:DI 39 r10 [ x+8 ]))) (clobber (reg:CC 17 flags)) ]) pr69305-1.c:5 284 {adddi3_carry} (nil))
(In reply to Jakub Jelinek from comment #7) > The patterns are just weird. All that comes from the addti3 expander in aarch64.md If I delete it the testcase doesn't abort. I'll have a closer look there.
(In reply to ktkachov from comment #8) > (In reply to Jakub Jelinek from comment #7) > > The patterns are just weird. > > All that comes from the addti3 expander in aarch64.md > If I delete it the testcase doesn't abort. > I'll have a closer look there. Every approach I've tried to implement addti3 properly (that is, comparing (a + b) u< a instead of (a + b) u>= 0) doesn't give better code than 4.9 which didn't have the custom expanders. Richard, you added these expanders. Any ideas on how to implement them properly and better than the fallback? Or should we just delete them?
Mine.
Created attachment 37391 [details] Extended adddi3_compare pattern (In reply to Richard Henderson from comment #10) > Mine. oops - I had just started looking at this one too. :-) I have uploaded a possible patch, although I have not got very far in testing it yet. Maybe it will be of some use ? Cheers Nick
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01829.html
Author: rth Date: Thu Jan 28 17:48:22 2016 New Revision: 232936 URL: https://gcc.gnu.org/viewcvs?rev=232936&root=gcc&view=rev Log: PR target/69305 * config/aarch64/aarch64-modes.def (CC_Cmode): New * config/aarch64/aarch64-protos.h: Update. * config/aarch64/aarch64.c (aarch64_zero_extend_const_eq): New. (aarch64_select_cc_mode): Add check for use of CC_Cmode. (aarch64_get_condition_code_1): Handle CC_Cmode. * config/aarch64/aarch64.md (addti3): Use adddi3_compareC. (*add<mode>3_compareC_cconly_imm): New. (*add<mode>3_compareC_cconly): New. (*add<mode>3_compareC_imm): New. (add<mode>3_compareC): New. (add<mode>3_carryin, *addsi3_carryin_uxtw): Sort compare operand to be first. Use aarch64_carry_operation. (*add<mode>3_carryin_alt1, *addsi3_carryin_alt1_uxtw): Remove. (*add<mode>3_carryin_alt2, *addsi3_carryin_alt2_uxtw): Remove. (*add<mode>3_carryin_alt3, *addsi3_carryin_alt3_uxtw): Remove. (subti3): Use subdi3_compare1. (*sub<mode>3_compare0): Rename from sub<mode>3_compare0. (sub<mode>3_compare1): New. (*sub<mode>3_carryin0, *subsi3_carryin_uxtw): New. (*sub<mode>3_carryin): Use aarch64_borrow_operation. (*subsi3_carryin_uxtw): Likewise. (*ngc<mode>, *ngcsi_uxtw): Likewise. (*sub<mode>3_carryin_alt, *subsi3_carryin_alt_uxtw): New. * config/aarch64/iterators.md (DWI): New. * config/aarch64/predicates.md (aarch64_carry_operation): New. (aarch64_borrow_operation): New. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64-modes.def trunk/gcc/config/aarch64/aarch64-protos.h trunk/gcc/config/aarch64/aarch64.c trunk/gcc/config/aarch64/aarch64.md trunk/gcc/config/aarch64/iterators.md trunk/gcc/config/aarch64/predicates.md trunk/gcc/testsuite/gcc.target/aarch64/ccmp_1.c trunk/gcc/testsuite/gcc.target/aarch64/tst_3.c
Fixed for gcc-6.
Author: rth Date: Mon Feb 1 07:06:53 2016 New Revision: 233031 URL: https://gcc.gnu.org/viewcvs?rev=233031&root=gcc&view=rev Log: PR target/69305 * config/aarch64/aarch64-modes.def (CC_Cmode): New * config/aarch64/aarch64-protos.h: Update. * config/aarch64/aarch64.c (aarch64_zero_extend_const_eq): New. (aarch64_select_cc_mode): Add check for use of CC_Cmode. (aarch64_get_condition_code_1): Handle CC_Cmode. * config/aarch64/aarch64.md (addti3): Use adddi3_compareC. (*add<mode>3_compareC_cconly_imm): New. (*add<mode>3_compareC_cconly): New. (*add<mode>3_compareC_imm): New. (add<mode>3_compareC): New. (add<mode>3_carryin, *addsi3_carryin_uxtw): Sort compare operand to be first. Use aarch64_carry_operation. (*add<mode>3_carryin_alt1, *addsi3_carryin_alt1_uxtw): Remove. (*add<mode>3_carryin_alt2, *addsi3_carryin_alt2_uxtw): Remove. (*add<mode>3_carryin_alt3, *addsi3_carryin_alt3_uxtw): Remove. (subti3): Use subdi3_compare1. (*sub<mode>3_compare0): Rename from sub<mode>3_compare0. (sub<mode>3_compare1): New. (*sub<mode>3_carryin0, *subsi3_carryin_uxtw): New. (*sub<mode>3_carryin): Use aarch64_borrow_operation. (*subsi3_carryin_uxtw): Likewise. (*ngc<mode>, *ngcsi_uxtw): Likewise. (*sub<mode>3_carryin_alt, *subsi3_carryin_alt_uxtw): New. * config/aarch64/iterators.md (DWI): New. * config/aarch64/predicates.md (aarch64_carry_operation): New. (aarch64_borrow_operation): New. Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/aarch64/aarch64-modes.def branches/gcc-5-branch/gcc/config/aarch64/aarch64-protos.h branches/gcc-5-branch/gcc/config/aarch64/aarch64.c branches/gcc-5-branch/gcc/config/aarch64/aarch64.md branches/gcc-5-branch/gcc/config/aarch64/iterators.md branches/gcc-5-branch/gcc/config/aarch64/predicates.md
Fixed for gcc-5 too.