Bug 96674 - Failure to optimize combination of comparisons to dec+compare
Summary: Failure to optimize combination of comparisons to dec+compare
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 11.0
: P3 enhancement
Target Milestone: 11.0
Assignee: Not yet assigned to anyone
URL:
Keywords: easyhack, missed-optimization
Depends on:
Blocks: 19987
  Show dependency treegraph
 
Reported: 2020-08-18 10:30 UTC by Gabriel Ravier
Modified: 2023-09-03 18:59 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-08-25 00:00:00


Attachments
[PATCH] Optimize combination of comparisons to dec+compare (1.18 KB, patch)
2021-01-11 22:53 UTC, Eugene Rozenfeld
Details | Diff
Optimize combination of comparisons to dec+compare (1.29 KB, patch)
2021-01-14 21:06 UTC, Eugene Rozenfeld
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gabriel Ravier 2020-08-18 10:30:46 UTC
bool f(unsigned a, unsigned b)
{
    return (b == 0) | (a < b);
}

This can be optimized to `return a <= (b - 1);`. This transformation is done by LLVM, but not by GCC.
Comment 1 Eugene Rozenfeld 2021-01-11 22:53:18 UTC
Created attachment 49940 [details]
[PATCH] Optimize combination of comparisons to dec+compare

The patch has been approved by Richard Biener.
Comment 2 Gabriel Ravier 2021-01-12 11:52:04 UTC
Isn't __attribute__((noipa)) usually used instead of __attribute__((noinline)) ?
Comment 3 Eugene Rozenfeld 2021-01-12 23:12:46 UTC
Both are used but it looks like __attribute__((noinline)) is used more frequently.
Under gcc/testsuite there are 1537 instances of __attribute__((noipa)) and 3794 instances of __attribute__((noinline)).
Comment 4 Gabriel Ravier 2021-01-12 23:34:29 UTC
I'd assume those are for older test cases: __attribute__((noipa)) makes more sense (at least to me) considering it's made specifically to prevent inter-procedural optimization (which __attribute__((noinline)) does not, afaik) which is helpful for testing specific optimizations on specific functions and seeing whether the optimizations make those functions non-functional without potentially having problems with inter-procedural optimization doing stuff like constant-propagation or anything that could interfere with the proper testing of a specific optimization.
Comment 5 Jakub Jelinek 2021-01-12 23:45:39 UTC
/* x < y || x == XXX_MIN --> x <= y - 1 */
(simplify
 (bit_ior (eq @1 min_value) (lt @0 @1))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
  (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))
The comment doesn't match what the simplification implements (x == XXX_MIN should be y == XXX_MIN).
Furthermore, bit_ior is commutative and for the optimization no specific order is needed, so probably bit_ior:c is needed.  Also, the optimization doesn't seem to be worth if either eq or lt has multiple uses, so both should have :s
suffixes.  When x < y || y == min can be simplified into x <= y - 1, can't
its negation, i.e. x >= y && y != min be simplified into x > y - 1 ?
And agree on the noipa attribute, most of the tests you're citing just predate the noipa attribute.  We had noinline for many years, later added noclone and have been using noinline, noclone and when we started adding further IPA optimizations, noipa has been added.
Comment 6 Eugene Rozenfeld 2021-01-14 21:06:02 UTC
Created attachment 49969 [details]
Optimize combination of comparisons to dec+compare
Comment 7 Eugene Rozenfeld 2021-01-14 21:07:11 UTC
Thank you for the feedback, Gabriel and Jakub. I re-worked the patch based on your suggestions. I attached the new patch and also sent it to gcc-patches.
Comment 8 Uroš Bizjak 2021-01-14 21:30:54 UTC
Comment on attachment 49969 [details]
Optimize combination of comparisons to dec+compare

>+/* y == XXX_MIN || x < y --> x <= y - 1 */

Can we use TYPE_MIN instead of XXX_MIN?
Comment 9 Eugene Rozenfeld 2021-01-15 00:32:24 UTC
I used XXX_MIN for consistency with comments on other patterns. If TYPE_MIN is preferable, the change should be made in all of those comments as well.
Comment 10 GCC Commits 2021-01-20 15:32:22 UTC
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:49e8c14ef6f1f968602a04c8499a672182590e87

commit r11-6817-g49e8c14ef6f1f968602a04c8499a672182590e87
Author: Eugene Rozenfeld <erozen@microsoft.com>
Date:   Wed Dec 9 16:44:25 2020 -0800

    Optimize combination of comparisons to dec+compare
    
    This patch adds patterns for optimizing
    x < y || y == XXX_MIN to x <= y-1
    x >= y && y != XXX_MIN to x > y-1
    if y is an integer with TYPE_OVERFLOW_WRAPS.
    
    This fixes pr96674.
    
    Tested on x86_64-pc-linux-gnu.
    
    For this function
    
    bool f(unsigned a, unsigned b)
    {
        return (b == 0) | (a < b);
    }
    
    the code without the patch is
    
    test   esi,esi
    sete   al
    cmp    esi,edi
    seta   dl
    or     eax,edx
    ret
    
    the code with the patch is
    
    sub    esi,0x1
    cmp    esi,edi
    setae  al
    ret
    
            PR tree-optimization/96674
    gcc/
            * match.pd: New patterns: x < y || y == XXX_MIN --> x <= y - 1
            x >= y && y != XXX_MIN --> x > y - 1
    
    gcc/testsuite
            * gcc.dg/pr96674.c: New tests.
Comment 11 Jeffrey A. Law 2021-06-02 18:59:10 UTC
Resolved by Eugene's patch on the trunk.