x86 intrinsics _addcarry_u32 and _addcarry_u64 generate silly code. For example, the following function to get the result of a 64-bit addition (the XOR is to the output clearer): u64 testcarry(u64 a, u64 b, u64 c, u64 d) { u64 result0, result1; _addcarry_u64(_addcarry_u64(0, a, c, &result0), b, d, &result1); return result0 ^ result1; } This is the code generated with -O1, -O2 and -O3: xor r8d, r8d add r8b, -1 adc rdx, rdi setc r8b mov rax, rdx add r8b, -1 adc rcx, rsi xor rax, rcx ret The first sillyness is that _addcarry_u64 does not optimize a compile-time constant 0 being the first carry parameter. Instead of "adc", it should just use "add". The second sillyness is with the use of r8b to store the carry flag, then using "add r8b, -1" to put the result back into carry. Instead, the code should be something like this: add rdx, rdi mov rax, rdx adc rcx, rsi xor rax, rcx ret Naturally, for something this simple, I'd use unsigned __int128, but this came up in large number math.
Can you provide the full preprocessed source?
Self-contained: typedef unsigned long long u64; u64 testcarry(u64 a, u64 b, u64 c, u64 d) { u64 result0, result1; __builtin_ia32_addcarryx_u64(__builtin_ia32_addcarryx_u64(0, a, c, &result0), b, d, &result1); return result0 ^ result1; }
These things would normally be taken care of by combine, but combine of course does not know what the UNSPEC_ADD_CARRY's mean. Does this need to be an unspec at all?
(In reply to Segher Boessenkool from comment #3) > Does this need to be an unspec at all? Of course not. We are looking to replace unspecs with standard RTXes. Do you have any recommendation on how we can represent this carry-setting insn to satisfy combine?
Combine can handle most RTL expressions, although it sometimes simplifies (or "simplifies") more than you want. I think in this case what is already done in *add<mode>3_cc_overflow will work well, but I do not know x86 CC modes terribly well. You'll need to experiment a bit to find something that works well in all interesting cases.
Created attachment 36258 [details] Proposed patch This patch changes expanders of carry-handling builtins to use addqi3_cconly_overflow and canonicalizes carry hanling insn to what combine can process: (define_insn "addcarry<mode>" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI48 (plus:SWI48 (match_operator:SWI48 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (match_operand:SWI48 1 "nonimmediate_operand" "%0")) (match_operand:SWI48 2 "nonimmediate_operand" "rm")) (match_dup 1))) (set (match_operand:SWI48 0 "register_operand" "=r") (plus:SWI48 (plus:SWI48 (match_op_dup 4 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)" "adc{<imodesuffix>}\t{%2, %0|%0, %2}" [(set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "<MODE>")]) and (define_insn "subborrow<mode>" [(set (reg:CCC FLAGS_REG) (compare:CCC (match_operand:SWI48 1 "nonimmediate_operand" "0") (plus:SWI48 (match_operator:SWI48 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (match_operand:SWI48 2 "nonimmediate_operand" "rm")))) (set (match_operand:SWI48 0 "register_operand" "=r") (minus:SWI48 (minus:SWI48 (match_dup 1) (match_op_dup 4 [(match_dup 3) (const_int 0)])) (match_dup 2)))] "ix86_binary_operator_ok (MINUS, <MODE>mode, operands)" "sbb{<imodesuffix>}\t{%2, %0|%0, %2}" [(set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "<MODE>")]) The patch also rewrites expander to fix a bug, where carry-clobbering insns were emitted inside carry-flag def-use chain. For the testcase, patched gcc generates: testcarry_u64: addq %rdi, %rdx adcq %rsi, %rcx movq %rdx, %rax xorq %rcx, %rax ret
Author: uros Date: Thu Aug 27 18:29:37 2015 New Revision: 227271 URL: https://gcc.gnu.org/viewcvs?rev=227271&root=gcc&view=rev Log: PR target/67317 * config/i386/i386.md (*add<mode>3_cc): Remove insn pattern. (addqi3_cc): Ditto. (UNSPEC_ADD_CARRY): Remove. (addqi3_cconly_overflow): New expander. (*add<dwi>3_doubleword): Split to add<mode>3_cconly_overflow. Adjust for changed add<mode>3_carry. (*neg<dwi>2_doubleword): Adjust for changed add<mode>3_carry. (*sub<dwi>3_doubleword): Adjust for changed sub<mode>3_carry. (<plusminus_insn><mode>3_carry): Remove expander. (*<plusminus_insn><mode>3_carry): Split insn pattern to add<mode>3_carry and sub<mode>3_carry. (plusminus_carry_mnemonic): Remove code attribute. (add<mode>3_carry): Canonicalize insn pattern. (*addsi3_carry_zext): Ditto. (sub<mode>3_carry): Ditto. (*subsi3_carry_zext): Ditto. (adcx<mode>3): Remove insn pattern. (addcarry<mode>): New insn pattern. (subborrow<mode>): Ditto. * config/i386/i386.c (ix86_expand_strlensi_unroll_1): Use gen_addqi3_cconly_overflow instead of gen_addqi3_cc. (ix86_expand_builtin) <case IX86_BUILTIN_SBB32, case IX86_BUILTIN_SBB64, case IX86_BUILTIN_ADDCARRY32, case IX86_BUILTIN_ADDCARRY64>: Use CODE_FOR_subborrowsi, CODE_FOR_subborrowdi, CODE_FOR_addcarrysi and CODE_FOR_addcarrydi. Rewrite expander to not clobber carry flag chains. testsuite/ChangeLog: PR target/67317 * gcc.target/i386/pr67317-1.c: New test. * gcc.target/i386/pr67317-2.c: Ditto. * gcc.target/i386/pr67317-3.c: Ditto. * gcc.target/i386/pr67317-4.c: Ditto. * gcc.target/i386/adx-addcarryx32-1.c: Also scan for adcl. * gcc.target/i386/adx-addcarryx32-2.c: Also scan for adcq. Added: trunk/gcc/testsuite/gcc.target/i386/pr67317-1.c trunk/gcc/testsuite/gcc.target/i386/pr67317-2.c trunk/gcc/testsuite/gcc.target/i386/pr67317-3.c trunk/gcc/testsuite/gcc.target/i386/pr67317-4.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c trunk/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
Author: uros Date: Wed Sep 2 15:06:56 2015 New Revision: 227405 URL: https://gcc.gnu.org/viewcvs?rev=227405&root=gcc&view=rev Log: Backport from mainline: 2015-08-27 Uros Bizjak <ubizjak@gmail.com> PR target/67317 * config/i386/i386.md (*add<mode>3_cc): Remove insn pattern. (addqi3_cc): Ditto. (UNSPEC_ADD_CARRY): Remove. (addqi3_cconly_overflow): New expander. (*add<dwi>3_doubleword): Split to add<mode>3_cconly_overflow. Adjust for changed add<mode>3_carry. (*neg<dwi>2_doubleword): Adjust for changed add<mode>3_carry. (*sub<dwi>3_doubleword): Adjust for changed sub<mode>3_carry. (<plusminus_insn><mode>3_carry): Remove expander. (*<plusminus_insn><mode>3_carry): Split insn pattern to add<mode>3_carry and sub<mode>3_carry. (plusminus_carry_mnemonic): Remove code attribute. (add<mode>3_carry): Canonicalize insn pattern. (*addsi3_carry_zext): Ditto. (sub<mode>3_carry): Ditto. (*subsi3_carry_zext): Ditto. (adcx<mode>3): Remove insn pattern. (addcarry<mode>): New insn pattern. (subborrow<mode>): Ditto. * config/i386/i386.c (ix86_expand_strlensi_unroll_1): Use gen_addqi3_cconly_overflow instead of gen_addqi3_cc. (ix86_expand_builtin) <case IX86_BUILTIN_SBB32, case IX86_BUILTIN_SBB64, case IX86_BUILTIN_ADDCARRY32, case IX86_BUILTIN_ADDCARRY64>: Use CODE_FOR_subborrowsi, CODE_FOR_subborrowdi, CODE_FOR_addcarrysi and CODE_FOR_addcarrydi. Rewrite expander to not clobber carry flag chains. testsuite/ChangeLog: Backport from mainline: 2015-08-27 Uros Bizjak <ubizjak@gmail.com> PR target/67317 * gcc.target/i386/pr67317-1.c: New test. * gcc.target/i386/pr67317-2.c: Ditto. * gcc.target/i386/pr67317-3.c: Ditto. * gcc.target/i386/pr67317-4.c: Ditto. * gcc.target/i386/adx-addcarryx32-1.c: Also scan for adcl. * gcc.target/i386/adx-addcarryx32-2.c: Also scan for adcq. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr67317-1.c branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr67317-2.c branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr67317-3.c branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr67317-4.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/i386.c branches/gcc-5-branch/gcc/config/i386/i386.md branches/gcc-5-branch/gcc/testsuite/ChangeLog branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
Fixed for 5.3+.
I'm not sure why Bugzilla did not return this result when I searched for the instructions earlier. Adding search terms that might help others find this: ADCX adcx ADOX adox