91029 – missed optimization regarding value of modulo operation

Bug 91029 - missed optimization regarding value of modulo operation

Summary: missed optimization regarding value of modulo operation

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	9.1.0

Importance:	P3 normal
Target Milestone:	11.0
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:	VRP
	Show dependency tree / graph

Reported:	2019-06-28 12:57 UTC by Bruno Haible
Modified:	2023-11-26 13:16 UTC (History)
CC List:	2 users (show)

See Also:
Host:	x86_64-pc-linux-gnu
Target:	x86_64-pc-linux-gnu
Build:	x86_64-pc-linux-gnu
Known to work:
Known to fail:	9.1.0
Last reconfirmed:	2019-07-01 00:00:00

Attachments
gcc11-pr91029-2.patch (921 bytes, patch) 2020-11-19 12:22 UTC, Jakub Jelinek	Details \| Diff
op2_range implementation (624 bytes, patch) 2020-11-19 15:32 UTC, Andrew Macleod	Details \| Diff
gcc11-pr91029.patch (921 bytes, patch) 2020-11-19 19:55 UTC, Jakub Jelinek	Details \| Diff
Show Obsolete (1) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Bruno Haible 2019-06-28 12:57:35 UTC

GCC's optimizers apparently don't know that, for b an integer >= 0, a % b > 0 implies that a >= 0 and a % b < 0 implies that a <= 0.

Test case:
============= foo.c =============
int xx;

void f (int i)
{
  if ((i % 7) == 3)
    xx = (i < 0);
}
=================================
$ gcc -O2 -m32 -S foo.c && fgrep -v .cfi foo.s
        .file   "foo.c"
        .text
        .p2align 4
        .globl  f
        .type   f, @function
f:
.LFB0:
        movl    4(%esp), %ecx
        movl    $-1840700269, %edx
        movl    %ecx, %eax
        imull   %edx
        movl    %ecx, %eax
        sarl    $31, %eax
        addl    %ecx, %edx
        sarl    $2, %edx
        subl    %eax, %edx
        leal    0(,%edx,8), %eax
        subl    %edx, %eax
        movl    %ecx, %edx
        subl    %eax, %edx
        cmpl    $3, %edx
        je      .L4
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        shrl    $31, %ecx
        movl    %ecx, xx
        ret
.LFE0:
        .size   f, .-f
        .comm   xx,4,4
        .ident  "GCC: (GNU) 9.1.0"
        .section        .note.GNU-stack,"",@progbits

The two instructions after .L4 could be optimized to
        movl    $0, xx
by observing that for i < 0, i % 7 is <= 0 and therefore never == 3.

Comment 1 Bruno Haible 2019-06-28 12:58:55 UTC

Related to bug 66552.

Comment 2 Richard Biener 2019-07-01 07:06:31 UTC

Confirmed.

Comment 3 GCC Commits 2020-11-17 22:01:02 UTC

The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:1e27e7a582a9b86bcf86f5c103cd947672746e97

commit r11-5111-g1e27e7a582a9b86bcf86f5c103cd947672746e97
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Tue Nov 17 14:47:58 2020 -0500

    recognize implied ranges for modulo.
    
    implement op1_range for modulo with implied positive and negative ranges.
    
            gcc/
            PR tree-optimization/91029
            * range-op.cc (operator_trunc_mod::op1_range): New.
            gcc/testsuite/
            * gcc.dg/pr91029.c: New.

Comment 4 Andrew Macleod 2020-11-17 23:07:25 UTC

I adjusted range-ops to properly calculate those ranges for 'a' in 
    LHS = a % b
when LHS and b match what was described.

I also added a test case that confirms the conditions are tracked and branches folded.

Comment 5 Bruno Haible 2020-11-18 00:50:46 UTC

Nice! Thank you.

Comment 6 GCC Commits 2020-11-18 21:17:39 UTC

The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:71c9d2b088c9d409a1bd3b50523ac4623a5bf1b4

commit r11-5150-g71c9d2b088c9d409a1bd3b50523ac4623a5bf1b4
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Nov 18 22:13:06 2020 +0100

    vrp: Fix operator_trunc_mod::op1_range [PR97888]
    
    As mentioned in the PR, in (x % y) >= 0 && y >= 0, we can't deduce
    x's range to be x >= 0, as e.g. -7 % 7 is 0.  But we can deduce it
    from (x % y) > 0.  The patch also fixes up the comments.
    
    2020-11-18  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/91029
            PR tree-optimization/97888
            * range-op.cc (operator_trunc_mod::op1_range): Only set op1
            range to >= 0 if lhs is > 0, rather than >= 0.  Fix up comments.
    
            * gcc.dg/pr91029.c: Add comment with PR number.
            (f2): Use > 0 rather than >= 0.
            * gcc.c-torture/execute/pr97888-1.c: New test.
            * gcc.c-torture/execute/pr97888-2.c: New test.

Comment 7 Jakub Jelinek 2020-11-18 22:23:49 UTC

Actually, if (a % b) > 0 && b >= 0, can't we infer that a > 0?  For a == 0 the
(a % b) expression would be 0.
Similarly, if say (a % b) > 2 && b >= 0, can't we infer that a > 2, generally
(a % b) > x && x >= 0 && b >= 0 implies a > x
(a % b) < x && x <= 0 && b >= 0 implies a < x

Also, what is the reason to require that b >= 0 in all of this?
Isn't a % -b == a % b (except for b equal to INT_MIN, in that case
a % INT_MIN is a == INT_MIN ? 0 : a, but that also satisfies a % INT_MIN > 0
implies a > 0, a % INT_MIN < 0 implies a < 0, or say a % INT_MIN > 30 implies a > 30 or a % INT_MIN < -42 implies a < -42.

So, shouldn't the rules be
(a % b) > x && x >= 0 implies a > x
(a % b) < x && x <= 0 implies a < x
(a % b) > x && x >= 0 implies b > x || b < -x
(a % b) < x && x <= 0 implies b > -x || b < x
?

Comment 8 Bruno Haible 2020-11-19 01:16:26 UTC

> what is the reason to require that b >= 0 in all of this?

In the 1990ies there were portability problems with a%b, b < 0. ANSI C said that the result was machine-dependent if a < 0 or b < 0. Fortunately the result is formally specified now, since ISO C 99.

You're right: Since GCC emits the instructions for the % operation, and supposedly in compliance with ISO C and ISO C++, it can assume that negative operands behave as specified.

> So, shouldn't the rules be

Yes, these 4 rules look correct.

Comment 9 Jakub Jelinek 2020-11-19 12:22:03 UTC

Created attachment 49595 [details]
gcc11-pr91029-2.patch

Untested patch implementing the op1 rules.  Dunno what to do for op2, one needs to create a union of two ranges and I have no idea how to create that.

Comment 10 Andrew Macleod 2020-11-19 15:32:09 UTC

Created attachment 49597 [details]
op2_range implementation

I think this does what you want for op2_range.

I tried it with:

void f1 (int i, int j)
{
  if ((i % j) == 3)
    {
      xx = ((j <= 3) && (j >= -3));
      if (xx)
        kill ();
    }
}

and it eliminates the call. the listing shows:
    if (_1 == 3)
      goto <bb 3>; [INV]
    else
      goto <bb 5>; [INV]

_1 : int [-2147483647, +INF]
2->3  (T) _1 :  int [3, 3]
2->3  (T) i_8(D) :      int [3, +INF]
2->3  (T) j_9(D) :      int [-INF, -4][4, +INF]
2->5  (F) _1 :  int [-2147483647, 2][4, +INF]

so its got int [-INF, -4][4, +INF] for the range which I think is correct?

I got a headache trying to write a testcase for the different cases, and I'd probably get it wrong like I did with the initial implementation..  So I leave this with you to do what you will :-) 

I get all tangled up with the negatives.

Comment 11 Jakub Jelinek 2020-11-19 19:55:08 UTC

Created attachment 49599 [details]
gcc11-pr91029.patch

Here is what I meant.

Comment 12 GCC Commits 2020-11-19 23:03:04 UTC

The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:d3f293348768667c07770e433ff00af51fee73a2

commit r11-5186-gd3f293348768667c07770e433ff00af51fee73a2
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Nov 20 00:02:21 2020 +0100

    ranger: Improve a % b operand ranges [PR91029]
    
    As mentioned in the PR, the previous PR91029 patch was testing
    op2 >= 0 which is unnecessary, even negative op2 values will work the same,
    furthermore, from if a % b > 0 we can deduce a > 0 rather than just a >= 0
    (0 % b would be 0), and it actually valid even for other constants than 0,
    a % b > 5 means a > 5 (a % b has the same sign as a and a in [0, 5] would
    result in a % b in [0, 5].  Also, we can deduce a range for the other
    operand, if we know
    a % b >= 20, then b must be (in absolute value for signed modulo) > 20,
    for a % [0, 20] the result would be [0, 19].
    
    2020-11-19  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/91029
            * range-op.cc (operator_trunc_mod::op1_range): Don't require signed
            types, nor require that op2 >= 0.  Implement (a % b) >= x && x > 0
            implies a >= x and (a % b) <= x && x < 0 implies a <= x.
            (operator_trunc_mod::op2_range): New method.
    
            * gcc.dg/tree-ssa/pr91029-1.c: New test.
            * gcc.dg/tree-ssa/pr91029-2.c: New test.