Bug 91029 - missed optimization regarding value of modulo operation
Summary: missed optimization regarding value of modulo operation
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 9.1.0
: P3 normal
Target Milestone: 11.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: VRP
  Show dependency treegraph
 
Reported: 2019-06-28 12:57 UTC by Bruno Haible
Modified: 2023-11-26 13:16 UTC (History)
2 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build: x86_64-pc-linux-gnu
Known to work:
Known to fail: 9.1.0
Last reconfirmed: 2019-07-01 00:00:00


Attachments
gcc11-pr91029-2.patch (921 bytes, patch)
2020-11-19 12:22 UTC, Jakub Jelinek
Details | Diff
op2_range implementation (624 bytes, patch)
2020-11-19 15:32 UTC, Andrew Macleod
Details | Diff
gcc11-pr91029.patch (921 bytes, patch)
2020-11-19 19:55 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Bruno Haible 2019-06-28 12:57:35 UTC
GCC's optimizers apparently don't know that, for b an integer >= 0, a % b > 0 implies that a >= 0 and a % b < 0 implies that a <= 0.

Test case:
============= foo.c =============
int xx;

void f (int i)
{
  if ((i % 7) == 3)
    xx = (i < 0);
}
=================================
$ gcc -O2 -m32 -S foo.c && fgrep -v .cfi foo.s
        .file   "foo.c"
        .text
        .p2align 4
        .globl  f
        .type   f, @function
f:
.LFB0:
        movl    4(%esp), %ecx
        movl    $-1840700269, %edx
        movl    %ecx, %eax
        imull   %edx
        movl    %ecx, %eax
        sarl    $31, %eax
        addl    %ecx, %edx
        sarl    $2, %edx
        subl    %eax, %edx
        leal    0(,%edx,8), %eax
        subl    %edx, %eax
        movl    %ecx, %edx
        subl    %eax, %edx
        cmpl    $3, %edx
        je      .L4
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        shrl    $31, %ecx
        movl    %ecx, xx
        ret
.LFE0:
        .size   f, .-f
        .comm   xx,4,4
        .ident  "GCC: (GNU) 9.1.0"
        .section        .note.GNU-stack,"",@progbits

The two instructions after .L4 could be optimized to
        movl    $0, xx
by observing that for i < 0, i % 7 is <= 0 and therefore never == 3.
Comment 1 Bruno Haible 2019-06-28 12:58:55 UTC
Related to bug 66552.
Comment 2 Richard Biener 2019-07-01 07:06:31 UTC
Confirmed.
Comment 3 GCC Commits 2020-11-17 22:01:02 UTC
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:1e27e7a582a9b86bcf86f5c103cd947672746e97

commit r11-5111-g1e27e7a582a9b86bcf86f5c103cd947672746e97
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Tue Nov 17 14:47:58 2020 -0500

    recognize implied ranges for modulo.
    
    implement op1_range for modulo with implied positive and negative ranges.
    
            gcc/
            PR tree-optimization/91029
            * range-op.cc (operator_trunc_mod::op1_range): New.
            gcc/testsuite/
            * gcc.dg/pr91029.c: New.
Comment 4 Andrew Macleod 2020-11-17 23:07:25 UTC
I adjusted range-ops to properly calculate those ranges for 'a' in 
    LHS = a % b
when LHS and b match what was described.

I also added a test case that confirms the conditions are tracked and branches folded.
Comment 5 Bruno Haible 2020-11-18 00:50:46 UTC
Nice! Thank you.
Comment 6 GCC Commits 2020-11-18 21:17:39 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:71c9d2b088c9d409a1bd3b50523ac4623a5bf1b4

commit r11-5150-g71c9d2b088c9d409a1bd3b50523ac4623a5bf1b4
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Nov 18 22:13:06 2020 +0100

    vrp: Fix operator_trunc_mod::op1_range [PR97888]
    
    As mentioned in the PR, in (x % y) >= 0 && y >= 0, we can't deduce
    x's range to be x >= 0, as e.g. -7 % 7 is 0.  But we can deduce it
    from (x % y) > 0.  The patch also fixes up the comments.
    
    2020-11-18  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/91029
            PR tree-optimization/97888
            * range-op.cc (operator_trunc_mod::op1_range): Only set op1
            range to >= 0 if lhs is > 0, rather than >= 0.  Fix up comments.
    
            * gcc.dg/pr91029.c: Add comment with PR number.
            (f2): Use > 0 rather than >= 0.
            * gcc.c-torture/execute/pr97888-1.c: New test.
            * gcc.c-torture/execute/pr97888-2.c: New test.
Comment 7 Jakub Jelinek 2020-11-18 22:23:49 UTC
Actually, if (a % b) > 0 && b >= 0, can't we infer that a > 0?  For a == 0 the
(a % b) expression would be 0.
Similarly, if say (a % b) > 2 && b >= 0, can't we infer that a > 2, generally
(a % b) > x && x >= 0 && b >= 0 implies a > x
(a % b) < x && x <= 0 && b >= 0 implies a < x

Also, what is the reason to require that b >= 0 in all of this?
Isn't a % -b == a % b (except for b equal to INT_MIN, in that case
a % INT_MIN is a == INT_MIN ? 0 : a, but that also satisfies a % INT_MIN > 0
implies a > 0, a % INT_MIN < 0 implies a < 0, or say a % INT_MIN > 30 implies a > 30 or a % INT_MIN < -42 implies a < -42.

So, shouldn't the rules be
(a % b) > x && x >= 0 implies a > x
(a % b) < x && x <= 0 implies a < x
(a % b) > x && x >= 0 implies b > x || b < -x
(a % b) < x && x <= 0 implies b > -x || b < x
?
Comment 8 Bruno Haible 2020-11-19 01:16:26 UTC
> what is the reason to require that b >= 0 in all of this?

In the 1990ies there were portability problems with a%b, b < 0. ANSI C said that the result was machine-dependent if a < 0 or b < 0. Fortunately the result is formally specified now, since ISO C 99.

You're right: Since GCC emits the instructions for the % operation, and supposedly in compliance with ISO C and ISO C++, it can assume that negative operands behave as specified.

> So, shouldn't the rules be

Yes, these 4 rules look correct.
Comment 9 Jakub Jelinek 2020-11-19 12:22:03 UTC
Created attachment 49595 [details]
gcc11-pr91029-2.patch

Untested patch implementing the op1 rules.  Dunno what to do for op2, one needs to create a union of two ranges and I have no idea how to create that.
Comment 10 Andrew Macleod 2020-11-19 15:32:09 UTC
Created attachment 49597 [details]
op2_range implementation

I think this does what you want for op2_range.

I tried it with:

void f1 (int i, int j)
{
  if ((i % j) == 3)
    {
      xx = ((j <= 3) && (j >= -3));
      if (xx)
        kill ();
    }
}

and it eliminates the call. the listing shows:
    if (_1 == 3)
      goto <bb 3>; [INV]
    else
      goto <bb 5>; [INV]

_1 : int [-2147483647, +INF]
2->3  (T) _1 :  int [3, 3]
2->3  (T) i_8(D) :      int [3, +INF]
2->3  (T) j_9(D) :      int [-INF, -4][4, +INF]
2->5  (F) _1 :  int [-2147483647, 2][4, +INF]

so its got int [-INF, -4][4, +INF] for the range which I think is correct?

I got a headache trying to write a testcase for the different cases, and I'd probably get it wrong like I did with the initial implementation..  So I leave this with you to do what you will :-) 

I get all tangled up with the negatives.
Comment 11 Jakub Jelinek 2020-11-19 19:55:08 UTC
Created attachment 49599 [details]
gcc11-pr91029.patch

Here is what I meant.
Comment 12 GCC Commits 2020-11-19 23:03:04 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:d3f293348768667c07770e433ff00af51fee73a2

commit r11-5186-gd3f293348768667c07770e433ff00af51fee73a2
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Nov 20 00:02:21 2020 +0100

    ranger: Improve a % b operand ranges [PR91029]
    
    As mentioned in the PR, the previous PR91029 patch was testing
    op2 >= 0 which is unnecessary, even negative op2 values will work the same,
    furthermore, from if a % b > 0 we can deduce a > 0 rather than just a >= 0
    (0 % b would be 0), and it actually valid even for other constants than 0,
    a % b > 5 means a > 5 (a % b has the same sign as a and a in [0, 5] would
    result in a % b in [0, 5].  Also, we can deduce a range for the other
    operand, if we know
    a % b >= 20, then b must be (in absolute value for signed modulo) > 20,
    for a % [0, 20] the result would be [0, 19].
    
    2020-11-19  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/91029
            * range-op.cc (operator_trunc_mod::op1_range): Don't require signed
            types, nor require that op2 >= 0.  Implement (a % b) >= x && x > 0
            implies a >= x and (a % b) <= x && x < 0 implies a <= x.
            (operator_trunc_mod::op2_range): New method.
    
            * gcc.dg/tree-ssa/pr91029-1.c: New test.
            * gcc.dg/tree-ssa/pr91029-2.c: New test.