Bug 105983 - Failure to optimize (b != 0) && (a >= b) as well as the same pattern with binary and
Summary: Failure to optimize (b != 0) && (a >= b) as well as the same pattern with bin...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 13.0
: P3 enhancement
Target Milestone: ---
Assignee: Jakub Jelinek
URL:
Keywords: missed-optimization
Depends on:
Blocks: 19987
  Show dependency treegraph
 
Reported: 2022-06-14 21:37 UTC by Gabriel Ravier
Modified: 2023-09-21 12:35 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-06-14 00:00:00


Attachments
gcc13-pr105983.patch (709 bytes, patch)
2022-06-15 14:50 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gabriel Ravier 2022-06-14 21:37:26 UTC
bool f(unsigned a, unsigned b)
{
    return (b != 0) && (a >= b);
}

This can be optimized to `return (b != 0) & (a >= b);`, which is itself optimized to `return (b - 1) > a;`. GCC outputs code equivalent to `return (b != 0) & (a >= b);` (at least on x86) whereas if that code is compiled it would output `return (b - 1) > a;`, while LLVM has no trouble directly outputting the optimal code.
Comment 1 Andrew Pinski 2022-06-14 21:41:48 UTC
aarch64 GCC is able to compile it to:
f(unsigned int, unsigned int):
        cmp     w1, 0
        ccmp    w1, w0, 2, ne
        cset    w0, ls
        ret

While aarch64 LLVM does:
        sub     w8, w1, #1
        cmp     w8, w0
        cset    w0, lo
        ret

depending on the pipeline, they might be the same or the ccmp might be better slightly.
Comment 2 Andrew Pinski 2022-06-14 21:45:56 UTC
Confirmed, the issue is GCC does not even handle:
bool f(unsigned a, unsigned b)
{
    bool t = (b != 0);
    bool t1 = (a >= b);
    return t & t1;
}

I suspect this is a fold-const.cc which has not been moved over to match.pd yet.
Comment 3 Jakub Jelinek 2022-06-15 12:26:18 UTC
On generic, what opimizes this is:
/* y == XXX_MIN || x < y --> x <= y - 1 */
(simplify
 (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
  (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))

/* y != XXX_MIN && x >= y --> x > y - 1 */
(simplify
 (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
  (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
in match.pd when & is used instead of &&.
Comment 4 Jakub Jelinek 2022-06-15 13:32:28 UTC
--- gcc/match.pd.jj	2022-06-15 12:52:04.640981511 +0200
+++ gcc/match.pd	2022-06-15 15:28:55.916225336 +0200
@@ -2379,14 +2379,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* y == XXX_MIN || x < y --> x <= y - 1 */
 (simplify
- (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
+ (bit_ior:c (eq:s @1 min_value) (lt:cs @0 @1))
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
   (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
 
 /* y != XXX_MIN && x >= y --> x > y - 1 */
 (simplify
- (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
+ (bit_and:c (ne:s @1 min_value) (ge:cs @0 @1))
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
   (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))

fixes this.
Comment 5 Richard Earnshaw 2022-06-15 14:13:56 UTC
(In reply to Jakub Jelinek from comment #4)
> --- gcc/match.pd.jj	2022-06-15 12:52:04.640981511 +0200
> +++ gcc/match.pd	2022-06-15 15:28:55.916225336 +0200
> @@ -2379,14 +2379,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  
>  /* y == XXX_MIN || x < y --> x <= y - 1 */
>  (simplify
> - (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
> + (bit_ior:c (eq:s @1 min_value) (lt:cs @0 @1))
>    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
>         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
>    (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
>  
>  /* y != XXX_MIN && x >= y --> x > y - 1 */
>  (simplify
> - (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
> + (bit_and:c (ne:s @1 min_value) (ge:cs @0 @1))
>    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
>         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
>    (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
> 
> fixes this.

But doesn't that regress

bool f(unsigned a, unsigned b)
{
    return (b != 0) & (a >= b);
}
Comment 6 Richard Earnshaw 2022-06-15 14:20:12 UTC
(In reply to Richard Earnshaw from comment #5)
> (In reply to Jakub Jelinek from comment #4)
> > --- gcc/match.pd.jj	2022-06-15 12:52:04.640981511 +0200
> > +++ gcc/match.pd	2022-06-15 15:28:55.916225336 +0200
> > @@ -2379,14 +2379,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  
> >  /* y == XXX_MIN || x < y --> x <= y - 1 */
> >  (simplify
> > - (bit_ior:c (eq:s @1 min_value) (lt:s @0 @1))
> > + (bit_ior:c (eq:s @1 min_value) (lt:cs @0 @1))
> >    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
> >         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
> >    (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
> >  
> >  /* y != XXX_MIN && x >= y --> x > y - 1 */
> >  (simplify
> > - (bit_and:c (ne:s @1 min_value) (ge:s @0 @1))
> > + (bit_and:c (ne:s @1 min_value) (ge:cs @0 @1))
> >    (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
> >         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
> >    (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
> > 
> > fixes this.
> 
> But doesn't that regress
> 
> bool f(unsigned a, unsigned b)
> {
>     return (b != 0) & (a >= b);
> }

Ignore that - I'm confusing reports.
Comment 7 Jakub Jelinek 2022-06-15 14:50:21 UTC
Created attachment 53146 [details]
gcc13-pr105983.patch

Full untested patch.
Comment 8 GCC Commits 2022-06-16 12:37:48 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:9642d07c35f14b9917cd115e8a9f0210fbcdcf4f

commit r13-1134-g9642d07c35f14b9917cd115e8a9f0210fbcdcf4f
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Jun 16 14:37:06 2022 +0200

    match.pd: Improve y == MIN || x < y optimization [PR105983]
    
    On the following testcase, we only optimize bar where this optimization
    is performed at GENERIC folding time, but on GIMPLE it doesn't trigger
    anymore, as we actually don't see
      (bit_and (ne @1 min_value) (ge @0 @1))
    but
      (bit_and (ne @1 min_value) (le @1 @0))
    genmatch handles :c modifier not just on commutative operations, but
    also comparisons and in that case it means it swaps the comparison.
    
    2022-06-16  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/105983
            * match.pd (y == XXX_MIN || x < y -> x <= y - 1,
            y != XXX_MIN && x >= y -> x > y - 1): Use :cs instead of :s
            on non-equality comparisons.
    
            * gcc.dg/tree-ssa/pr105983.c: New test.
Comment 9 Jakub Jelinek 2022-06-16 12:39:02 UTC
Fixed.