[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow
ubizjak at gmail dot com
gcc-bugzilla@gcc.gnu.org
Wed Jun 5 20:42:51 GMT 2024
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
Uroš Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
--- Comment #11 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jonathan Wakely from comment #0)
> These two implementations of C++26 saturating addition
> (std::add_sat<unsigned>) have equivalent behaviour:
>
> unsigned
> add_sat(unsigned x, unsigned y) noexcept
> {
> unsigned z;
> if (!__builtin_add_overflow(x, y, &z))
> return z;
> return -1u;
> }
[...]
> For -O3 on x86_64 GCC uses a branch for the first one:
>
> add_sat(unsigned int, unsigned int):
> add edi, esi
> jc .L3
> mov eax, edi
> ret
> .L3:
> or eax, -1
> ret
If-conversion to cmove fails because of the "weird" compare arguments, which
are a consequence of the addsi3_cc_overflow_1 definition:
(insn 9 4 10 2 (parallel [
(set (reg:CCC 17 flags)
(compare:CCC (plus:SI (reg:SI 106)
(reg:SI 107))
(reg:SI 106)))
(set (reg:SI 104)
(plus:SI (reg:SI 106)
(reg:SI 107)))
]) "sadd.c":7:12 477 {addsi3_cc_overflow_1}
(expr_list:REG_DEAD (reg:SI 107)
(expr_list:REG_DEAD (reg:SI 106)
(nil))))
The noce_try_cmove path then fails in noce_emit_cmove:
Breakpoint 1, noce_emit_cmove (if_info=0x7fffffffd750, x=0x7fffe9fe4e40,
code=LTU, cmp_a=0x7fffe9fe4a20, cmp_b=0x7fffe9feb9a8, vfalse=0x7fffe9fe49d8,
vtrue=0x7fffe9e09480, cc_cmp=0x0, rev_cc_cmp=0x0) at
../../git/gcc/gcc/ifcvt.cc:1774
1774 return NULL_RTX;
(gdb) list
1766 /* Don't even try if the comparison operands are weird
1767 except that the target supports cbranchcc4. */
1768 if (! general_operand (cmp_a, GET_MODE (cmp_a))
1769 || ! general_operand (cmp_b, GET_MODE (cmp_b)))
1770 {
1771 if (!have_cbranchcc4
1772 || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
1773 || cmp_b != const0_rtx)
1774 return NULL_RTX;
1775 }
1776
1777 target = emit_conditional_move (x, { code, cmp_a, cmp_b, VOIDmode },
1778 vtrue, vfalse, GET_MODE (x),
(gdb) bt
#0 noce_emit_cmove (if_info=0x7fffffffd750, x=0x7fffe9fe4e40, code=LTU,
cmp_a=0x7fffe9fe4a20, cmp_b=0x7fffe9feb9a8, vfalse=0x7fffe9fe49d8,
vtrue=0x7fffe9e09480, cc_cmp=0x0, rev_cc_cmp=0x0) at
../../git/gcc/gcc/ifcvt.cc:1774
#1 0x00000000020d995b in noce_try_cmove (if_info=0x7fffffffd750) at
../../git/gcc/gcc/ifcvt.cc:1884
#2 0x00000000020dec37 in noce_process_if_block (if_info=0x7fffffffd750) at
../../git/gcc/gcc/ifcvt.cc:4149
#3 0x00000000020e0248 in noce_find_if_block (test_bb=0x7fffe9fb5d80,
then_edge=0x7fffe9fd7cc0, else_edge=0x7fffe9fd7c60, pass=1)
at ../../git/gcc/gcc/ifcvt.cc:4716
#4 0x00000000020e08e9 in find_if_header (test_bb=0x7fffe9fb5d80, pass=1) at
../../git/gcc/gcc/ifcvt.cc:4921
#5 0x00000000020e3255 in if_convert (after_combine=true) at
../../git/gcc/gcc/ifcvt.cc:6068
(gdb) p debug_rtx (cmp_a)
(plus:SI (reg:SI 106)
(reg:SI 107))
$1 = void
(gdb) p debug_rtx (cmp_b)
(reg:SI 106)
$2 = void
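The above cmp_a RTX is a (plus:SI ...) expression, so it fails the
general_operand check. The cbranchcc4 exception does not apply either, because
cmp_a is not in a MODE_CC mode and cmp_b is not const0_rtx.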
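For readers less familiar with the CCCmode pattern: the compare in
addsi3_cc_overflow_1 expresses the carry as the unsigned test
"(x + y) < x". The following is only a minimal C sketch of that equivalence,
not anything taken from the GCC sources; the helper names (carry_by_compare,
carry_by_builtin, carry_mismatches) are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Carry out of x + y, written the way the RTL compare states it:
   the wrapped sum is unsigned-less-than one of the addends.  */
static bool carry_by_compare(unsigned x, unsigned y)
{
    return (unsigned)(x + y) < x;
}

/* The same carry as reported by the GCC/Clang builtin.  */
static bool carry_by_builtin(unsigned x, unsigned y)
{
    unsigned z;
    return __builtin_add_overflow(x, y, &z);
}

/* Hypothetical helper: count disagreements between the two forms
   over a handful of boundary values.  */
static int carry_mismatches(void)
{
    static const unsigned v[] = {
        0u, 1u, 2u, UINT_MAX / 2, UINT_MAX - 1, UINT_MAX
    };
    const int n = (int)(sizeof v / sizeof v[0]);
    int bad = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (carry_by_compare(v[i], v[j]) != carry_by_builtin(v[i], v[j]))
                bad++;
    return bad;
}
```

The first operand of that comparison is the sum itself, which is why cmp_a
shows up as a PLUS expression rather than a plain register.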
Please note that a similar testcase:
unsigned
sub_sat(unsigned x, unsigned y)
{
unsigned z;
return __builtin_sub_overflow(x, y, &z) ? 0 : z;
}
results in the expected:
subl %esi, %edi # 52 [c=4 l=2] *subsi_3/0
movl $0, %eax # 53 [c=4 l=5] *movsi_internal/0
cmovnb %edi, %eax # 54 [c=4 l=3] *movsicc_noc/0
ret # 50 [c=0 l=1] simple_return_internal
due to:
(insn 9 4 10 2 (parallel [
(set (reg:CC 17 flags)
(compare:CC (reg:SI 106)
(reg:SI 107)))
(set (reg:SI 104)
(minus:SI (reg:SI 106)
(reg:SI 107)))
]) "sadd.c":28:12 416 {*subsi_3}
(expr_list:REG_DEAD (reg:SI 107)
(expr_list:REG_DEAD (reg:SI 106)
(nil))))
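In the subtraction case the borrow is simply "x < y" on the two original
operands, so the compare sees two plain registers and the general_operand
check passes. A minimal C sketch of that equivalence follows; the helper
names (sub_sat_ref, sub_sat_builtin, sub_sat_mismatches) are hypothetical and
used only for illustration.

```c
#include <limits.h>

/* Saturating subtraction with the borrow written as a plain compare
   of the two inputs.  */
static unsigned sub_sat_ref(unsigned x, unsigned y)
{
    return x < y ? 0u : x - y;
}

/* The testcase form, using the GCC/Clang builtin.  */
static unsigned sub_sat_builtin(unsigned x, unsigned y)
{
    unsigned z;
    return __builtin_sub_overflow(x, y, &z) ? 0u : z;
}

/* Hypothetical helper: count disagreements between the two forms
   over a handful of boundary values.  */
static int sub_sat_mismatches(void)
{
    static const unsigned v[] = {
        0u, 1u, 2u, UINT_MAX / 2, UINT_MAX - 1, UINT_MAX
    };
    const int n = (int)(sizeof v / sizeof v[0]);
    int bad = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (sub_sat_ref(v[i], v[j]) != sub_sat_builtin(v[i], v[j]))
                bad++;
    return bad;
}
```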
So, either the addsi3_cc_overflow_1 RTX is not correct, or noce_emit_cmove
should be improved to handle the above "weird" operand form.
Let's ask Jakub.