Bug 100874 - [12 Regression] slight missed optimization with min<a,CST>-CST on aarch64 (subs_compare_2.c)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 12.0
Importance: P3 normal
Target Milestone: 12.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, testsuite-fail
Duplicates: 100873
Depends on:
Blocks:
 
Reported: 2021-06-02 09:32 UTC by Andrew Pinski
Modified: 2022-02-15 18:15 UTC
CC List: 2 users

See Also:
Host:
Target: aarch64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Andrew Pinski 2021-06-02 09:32:35 UTC
Take:
int f(int a) { return a < 4 ? a - 4 : 0; }
int f1(int a) { int x = a - 4; if (a < 4) return x; return 0; }
int f2(int a) { int t = a < 4 ? a : 4; return t - 4; }

In GCC 11, we produce:
f(int):
        mov     w1, 4
        cmp     w0, w1
        csel    w0, w0, w1, le
        sub     w0, w0, #4
        ret
f1(int):
        subs    w0, w0, 4
        csel    w0, w0, wzr, lt
        ret
f2(int):
        mov     w1, 4
        cmp     w0, w1
        csel    w0, w0, w1, le
        sub     w0, w0, #4
        ret

On trunk, all three functions now produce the same code (because PHI-OPT has been improved), but it is the longer sequence that f and f2 used to produce.
All three should instead produce what f1 used to produce.
This is gcc.target/aarch64/subs_compare_2.c
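The claim that the three functions are interchangeable can be checked with a minimal sketch like the one below. The helper check_same_result is hypothetical and not part of the report; it assumes the caller keeps the range small and well away from INT_MIN and INT_MAX, since a - 4 would otherwise overflow a signed int.

```c
#include <assert.h>

/* The three forms from the description. */
int f(int a)  { return a < 4 ? a - 4 : 0; }
int f1(int a) { int x = a - 4; if (a < 4) return x; return 0; }
int f2(int a) { int t = a < 4 ? a : 4; return t - 4; }

/* Hypothetical helper (not part of the report): asserts that all three
   forms agree for every a in [lo, hi].  The caller is assumed to keep
   lo >= INT_MIN + 4 and hi < INT_MAX so that neither a - 4 nor the loop
   increment overflows, which would be undefined behaviour for int. */
void check_same_result(int lo, int hi)
{
    for (int a = lo; a <= hi; a++) {
        assert(f(a) == f1(a));
        assert(f(a) == f2(a));
    }
}
```

Calling check_same_result(-1000, 1000) from a test driver returns silently if the forms agree on that range, which is why the same SUBS+CSEL sequence would be valid for all of them.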
Comment 1 Andrew Pinski 2021-06-02 09:33:58 UTC
Note that clang does produce the same code gen for all three:
https://godbolt.org/z/hq357jdMd

I also described how we could fix this in https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571699.html
Comment 2 Andrew Pinski 2021-06-02 09:35:24 UTC
*** Bug 100873 has been marked as a duplicate of this bug. ***
Comment 3 Andrew Pinski 2021-06-02 09:39:58 UTC
Here is a fourth variant of the function:
int f1a(int a) { int x = a - 4; return (a < 4) ? x : 0; }
Comment 4 GCC Commits 2022-02-15 18:10:22 UTC
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:8e84b2b37a541b27feea69769fc314d534464ebd

commit r12-7249-g8e84b2b37a541b27feea69769fc314d534464ebd
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Tue Feb 15 18:09:35 2022 +0000

    aarch64: Fix subs_compare_2.c regression [PR100874]
    
    subs_compare_2.c tests that we can use a SUBS+CSEL sequence for:
    
    unsigned int
    foo (unsigned int a, unsigned int b)
    {
      unsigned int x = a - 4;
      if (a < 4)
        return x;
      else
        return 0;
    }
    
    As Andrew notes in the PR, this is effectively MIN (x, 4) - 4,
    and it is now recognised as such by phiopt.  Previously it was
    if-converted in RTL instead.
    
    I tried to look for ways to generalise this to other situations
    and to other ?:-style operations, not just max and min.  However,
    for general ?: we tend to push an outer "- CST" into the arms of
    the ?: -- at least if one of them simplifies -- so I didn't find
    any useful abstraction.
    
    This patch therefore adds a pattern specifically for
    max/min(a,cst)-cst.  I'm not thrilled at having to do this,
    but it seems like the least worst fix in the circumstances.
    Also, max(a,cst)-cst for unsigned a is a useful saturating
    subtraction idiom and so is arguably worth its own code
    for that reason.
    
    gcc/
            PR target/100874
            * config/aarch64/aarch64-protos.h (aarch64_maxmin_plus_const):
            Declare.
            * config/aarch64/aarch64.cc (aarch64_maxmin_plus_const): New function.
            * config/aarch64/aarch64.md (*aarch64_minmax_plus): New pattern.
    
    gcc/testsuite/
            * gcc.target/aarch64/max_plus_1.c: New test.
            * gcc.target/aarch64/max_plus_2.c: Likewise.
            * gcc.target/aarch64/max_plus_3.c: Likewise.
            * gcc.target/aarch64/max_plus_4.c: Likewise.
            * gcc.target/aarch64/max_plus_5.c: Likewise.
            * gcc.target/aarch64/max_plus_6.c: Likewise.
            * gcc.target/aarch64/max_plus_7.c: Likewise.
            * gcc.target/aarch64/min_plus_1.c: Likewise.
            * gcc.target/aarch64/min_plus_2.c: Likewise.
            * gcc.target/aarch64/min_plus_3.c: Likewise.
            * gcc.target/aarch64/min_plus_4.c: Likewise.
            * gcc.target/aarch64/min_plus_5.c: Likewise.
            * gcc.target/aarch64/min_plus_6.c: Likewise.
            * gcc.target/aarch64/min_plus_7.c: Likewise.
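The commit message's remark that max(a,cst)-cst for unsigned a is a saturating subtraction idiom can be illustrated with a small sketch. The function names sat_sub_max and sat_sub_ref are hypothetical, chosen here for illustration only; they are not taken from the patch or its tests.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical illustration: saturating a - cst written as
   max(a, cst) - cst.  Because max(a, cst) >= cst, the unsigned
   subtraction can never wrap around. */
static unsigned int sat_sub_max(unsigned int a, unsigned int cst)
{
    unsigned int m = a > cst ? a : cst;
    return m - cst;
}

/* Reference form with an explicit clamp to zero. */
static unsigned int sat_sub_ref(unsigned int a, unsigned int cst)
{
    return a >= cst ? a - cst : 0u;
}
```

Both forms return 0 whenever a <= cst and a - cst otherwise, so a compiler that recognises the max form can presumably handle either spelling with the same compare-and-select sequence.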
Comment 5 Richard Sandiford 2022-02-15 18:15:43 UTC
Fixed.