bool f(unsigned a, unsigned b)
{
    return (b != 0) && (a >= b);
}

This can be optimized to `return (b != 0) & (a >= b);`, which in turn can be optimized to `return (b - 1) < a;`.

GCC outputs code equivalent to `return (b != 0) & (a >= b);` (at least on x86), whereas when that form is compiled directly it outputs the equivalent of `return (b - 1) < a;`. LLVM has no trouble directly outputting the optimal code.
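To make the claimed equivalence concrete, here is a minimal sketch that compares the three source forms on a handful of boundary values. The folded form is written as `(b - 1) < a`, which is the same as `a > b - 1` with unsigned wraparound. The helper name `count_mismatches` and the chosen sample values are assumptions for illustration only, not part of the report.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Short-circuit form from the report. */
static bool f_logical(unsigned a, unsigned b) { return (b != 0) && (a >= b); }

/* Non-short-circuit form; valid because neither operand has side effects. */
static bool f_bitwise(unsigned a, unsigned b) { return (b != 0) & (a >= b); }

/* Single-comparison form. For b == 0, b - 1 wraps to UINT_MAX, which is
   never less than a, so the result is false as required. */
static bool f_folded(unsigned a, unsigned b) { return (b - 1) < a; }

/* Hypothetical helper: counts disagreements over a grid of boundary values. */
static unsigned count_mismatches(void)
{
    static const unsigned v[] = {
        0u, 1u, 2u, 3u, UINT_MAX / 2, UINT_MAX / 2 + 1, UINT_MAX - 1, UINT_MAX
    };
    const size_t n = sizeof v / sizeof v[0];
    unsigned bad = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            bool expected = f_logical(v[i], v[j]);
            if (f_bitwise(v[i], v[j]) != expected
                || f_folded(v[i], v[j]) != expected)
                bad++;
        }
    }
    return bad;
}
```

If the three forms agree, `count_mismatches()` returns 0. This is only a spot check on edge cases, not a proof.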
aarch64 GCC is able to compile it to:

f(unsigned int, unsigned int):
        cmp     w1, 0
        ccmp    w1, w0, 2, ne
        cset    w0, ls
        ret

While aarch64 LLVM does:

        sub     w8, w1, #1
        cmp     w8, w0
        cset    w0, lo
        ret

Depending on the pipeline, they might be the same, or the ccmp version might be slightly better.
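For readers less familiar with `ccmp`, the following is a rough C model of the two aarch64 sequences, tracking only the C and Z flags that the `ls` and `lo` conditions read. The `struct flags` type and the function names are made up for this sketch, and it deliberately ignores the N and V flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the two flags that the `ls` and `lo` conditions depend on. */
struct flags { bool c; bool z; };

/* Flags after `cmp x, y`: C is set when x >= y (no borrow), Z when x == y. */
static struct flags cmp(uint32_t x, uint32_t y)
{
    struct flags f = { x >= y, x == y };
    return f;
}

/* Model of: cmp w1, 0 ; ccmp w1, w0, 2, ne ; cset w0, ls */
static bool gcc_sequence(uint32_t w0, uint32_t w1)
{
    struct flags f = cmp(w1, 0);
    if (!f.z) {
        f = cmp(w1, w0);      /* `ne` held: perform the real comparison */
    } else {
        f.c = true;           /* `ne` failed: NZCV := 2, i.e. only C set */
        f.z = false;
    }
    return !f.c || f.z;       /* `ls`: unsigned lower or same */
}

/* Model of: sub w8, w1, #1 ; cmp w8, w0 ; cset w0, lo */
static bool llvm_sequence(uint32_t w0, uint32_t w1)
{
    uint32_t w8 = w1 - 1;     /* wraps to UINT32_MAX when w1 == 0 */
    struct flags f = cmp(w8, w0);
    return !f.c;              /* `lo`: unsigned lower */
}
```

Here `w0` holds `a` and `w1` holds `b`, so both models should return the same value as `(b != 0) && (a >= b)`.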
Confirmed, the issue is that GCC does not even handle:

bool f(unsigned a, unsigned b)
{
    bool t = (b != 0);
    bool t1 = (a >= b);
    return t & t1;
}

I suspect this is a fold-const.cc transformation which has not been moved over to match.pd yet.
On GENERIC, what optimizes this is:

/* y == XXX_MIN || x < y --> x <= y - 1 */
(simplify
 (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
  (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))

/* y != XXX_MIN && x >= y --> x > y - 1 */
(simplify
 (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
  (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))

in match.pd, when & is used instead of &&.
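As an illustration of the first (`bit_ior`) pattern, here is a small sketch for `unsigned`, where `XXX_MIN` is 0 and overflow wraps, which is the case the `TYPE_OVERFLOW_WRAPS` guard admits. The function names and sample values are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Left-hand side of the pattern: y == XXX_MIN || x < y, with XXX_MIN == 0. */
static bool ior_before(unsigned x, unsigned y) { return (y == 0) | (x < y); }

/* Right-hand side: x <= y - 1. For y == 0, y - 1 wraps to UINT_MAX, so the
   comparison is always true, matching the y == 0 arm of the original. */
static bool ior_after(unsigned x, unsigned y) { return x <= y - 1; }

/* Hypothetical helper: counts disagreements over a grid of boundary values. */
static unsigned ior_mismatches(void)
{
    static const unsigned v[] = { 0u, 1u, 2u, UINT_MAX / 2, UINT_MAX - 1, UINT_MAX };
    const size_t n = sizeof v / sizeof v[0];
    unsigned bad = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (ior_before(v[i], v[j]) != ior_after(v[i], v[j]))
                bad++;
    return bad;
}
```

The rewrite relies on `y - 1` wrapping when `y` is the minimum value. For a signed type with undefined overflow that subtraction would not be valid, which is presumably why the pattern is guarded.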
--- gcc/match.pd.jj	2022-06-15 12:52:04.640981511 +0200
+++ gcc/match.pd	2022-06-15 15:28:55.916225336 +0200
@@ -2379,14 +2379,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* y == XXX_MIN || x < y --> x <= y - 1 */
 (simplify
- (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
+ (bit_ior:c (eq:s @1 min_value) (lt:cs @0 @1))
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
   (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
 
 /* y != XXX_MIN && x >= y --> x > y - 1 */
 (simplify
- (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
+ (bit_and:c (ne:s @1 min_value) (ge:cs @0 @1))
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
   (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))

fixes this.
(In reply to Jakub Jelinek from comment #4)
> --- gcc/match.pd.jj	2022-06-15 12:52:04.640981511 +0200
> +++ gcc/match.pd	2022-06-15 15:28:55.916225336 +0200
> @@ -2379,14 +2379,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>
>  /* y == XXX_MIN || x < y --> x <= y - 1 */
>  (simplify
> - (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
> + (bit_ior:c (eq:s @1 min_value) (lt:cs @0 @1))
>    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
>         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
>    (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
>
>  /* y != XXX_MIN && x >= y --> x > y - 1 */
>  (simplify
> - (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
> + (bit_and:c (ne:s @1 min_value) (ge:cs @0 @1))
>    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
>         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
>    (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
>
> fixes this.

But doesn't that regress

bool f(unsigned a, unsigned b)
{
    return (b != 0) & (a >= b);
}
(In reply to Richard Earnshaw from comment #5)
> (In reply to Jakub Jelinek from comment #4)
> > --- gcc/match.pd.jj	2022-06-15 12:52:04.640981511 +0200
> > +++ gcc/match.pd	2022-06-15 15:28:55.916225336 +0200
> > @@ -2379,14 +2379,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >
> >  /* y == XXX_MIN || x < y --> x <= y - 1 */
> >  (simplify
> > - (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
> > + (bit_ior:c (eq:s @1 min_value) (lt:cs @0 @1))
> >    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
> >         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
> >    (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
> >
> >  /* y != XXX_MIN && x >= y --> x > y - 1 */
> >  (simplify
> > - (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
> > + (bit_and:c (ne:s @1 min_value) (ge:cs @0 @1))
> >    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
> >         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
> >    (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
> >
> > fixes this.
>
> But doesn't that regress
>
> bool f(unsigned a, unsigned b)
> {
>     return (b != 0) & (a >= b);
> }

Ignore that - I'm confusing reports.
Created attachment 53146
gcc13-pr105983.patch

Full untested patch.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:9642d07c35f14b9917cd115e8a9f0210fbcdcf4f

commit r13-1134-g9642d07c35f14b9917cd115e8a9f0210fbcdcf4f
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Jun 16 14:37:06 2022 +0200

    match.pd: Improve y == MIN || x < y optimization [PR105983]

    On the following testcase, we only optimize bar where this optimization
    is performed at GENERIC folding time, but on GIMPLE it doesn't trigger
    anymore, as we actually don't see
      (bit_and (ne @1 min_value) (ge @0 @1))
    but
      (bit_and (ne @1 min_value) (le @1 @0))
    genmatch handles :c modifier not just on commutative operations, but
    also comparisons and in that case it means it swaps the comparison.

    2022-06-16  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/105983
            * match.pd (y == XXX_MIN || x < y -> x <= y - 1,
            y != XXX_MIN && x >= y -> x > y - 1): Use :cs instead of :s
            on non-equality comparisons.

            * gcc.dg/tree-ssa/pr105983.c: New test.
Fixed.