Bug 107888 - [12/13 Regression] Missed min/max transformation in phiopt due to VRP
Summary: [12/13 Regression] Missed min/max transformation in phiopt due to VRP
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 13.0
: P2 normal
Target Milestone: 12.4
Assignee: Andrew Pinski
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2022-11-27 19:28 UTC by Andrew Pinski
Modified: 2024-05-08 03:42 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-12-21 00:00:00


Attachments
Patch which adds what I Mentioned (1.28 KB, text/plain)
2023-05-09 10:14 UTC, Andrew Pinski
Details
testcases (155 bytes, text/plain)
2023-05-09 10:19 UTC, Andrew Pinski
Details
Patch that actually works (2.36 KB, patch)
2023-05-10 19:10 UTC, Andrew Pinski
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Pinski 2022-11-27 19:28:54 UTC
Take:
```
#define bool _Bool
int maxbool(bool ab, bool bb)
{
  int a = ab;
  int b = bb;
  int c;
  if (a > b)
    c = a;
  else
    c = b;
  return c;
}
```

We miss that c is max of a and b because VRP decides to change the phi.
We get out of VRP:
```
  if (a_3 > b_5)
    goto <bb 4>; [INV]
  else
    goto <bb 3>; [INV]

  <bb 3> :

  <bb 4> :
  # c_1 = PHI <1(2), b_5(3)>
```

What VRP is doing is correct just is harder to optimize to a max (and then a | ).

In the above case we could optimize `bool0 ? 1 : bool1` to `bool0 | bool1` But then we end up with PR 107887 too.

You can also end up with the above issue where you know the only overlap between the two arguments is [5,6] :
```
int max(int ab, int bb)
{
  if (ab < 5)  __builtin_trap();
  if (bb > 6)  __builtin_trap();
  int a = ab;
  int b = bb;
  int c;
  if (a >= b)
    c = a;
  else
    c = b;
  return c;
}
```
Which we cannot optimize based on zero/one any more. (note this version of max has been an issue since at least GCC 4.1, I suspect since VRP was added).
Comment 1 Richard Biener 2022-11-28 07:57:12 UTC
which means we fail to optimize a > b ? 1 : b as well, no?
Comment 2 Andrew Pinski 2022-11-28 17:26:55 UTC
(In reply to Richard Biener from comment #1)
> which means we fail to optimize a > b ? 1 : b as well, no?

Yes that is correct.

Even for max, "a >= b ? a : 6;" would need to be "reverted" 6 back to b.

  <bb 5> [local count: 1073741824]:
  if (ab_2(D) >= bb_3(D))
    goto <bb 6>; [65.00%]
  else
    goto <bb 7>; [35.00%]

  <bb 6> [local count: 697932184]:

  <bb 7> [local count: 1073741824]:
  # c_1 = PHI <ab_2(D)(6), 6(5)>


The min/max patterns inside match needs to handle CST if the ranges of the two operands overlap with one/two values.

Even though this is a regression, I don't know if this shows up in real code and is a small optimization really so I would suspect a P4 for this really as it requires a bigger change that most likely won't be backported. I filed it as it was showing up while I was working on the patch for PR 101805 (which won't be submitted until GCC 14 anyways).
Comment 3 Richard Biener 2022-12-21 13:35:33 UTC
Confirmed at least.
Comment 4 Andrew Pinski 2023-05-06 00:38:35 UTC
So for the second testcase in comment #0 (with __builtin_trap replaced with __builtin_unreachable so at least we have ranges):

  # RANGE [irange] int [5, +INF] NONZERO 0x7fffffff
  intD.9 ab_2(D) = abD.2773;
  # RANGE [irange] int [-INF, 6]
  intD.9 bb_3(D) = bbD.2774;
  intD.9 cD.2779;

__BB(2):
  if (ab_2(D) >= bb_3(D))
    goto __BB4;
  else
    goto __BB3;

__BB(3):

__BB(4):
  # RANGE [irange] int [5, +INF] NONZERO 0x7fffffff
  # c_1 = PHI <ab_2(D)(2), 6(3)>

We could detect (cond (a ge b) a CST) and look at CST to see if it is the same as the max range of b and one more than the min of a. I think that will work.

That will also fix I think the first testcase.

Let me try to do that.
We won't get the __builtin_trap case correctly unless we do the full range information inside phiopt; I will have to look into that later.
Comment 5 Andrew Pinski 2023-05-08 05:35:52 UTC
I think this should not be hard to add to minmax_from_comparison I think. Though right now we don't call it for the non-constant case but that should be easy to fix I think.
Comment 6 Richard Biener 2023-05-08 12:26:07 UTC
GCC 12.3 is being released, retargeting bugs to GCC 12.4.
Comment 7 Andrew Pinski 2023-05-09 10:14:33 UTC
Created attachment 55026 [details]
Patch which adds what I Mentioned

I still need to add the testcases.
Comment 8 Andrew Pinski 2023-05-09 10:19:44 UTC
Created attachment 55027 [details]
testcases

max is optimized with this, max1 was already handled.
min was already handled, min1 is optimized with this.
Note at -O1, all 4 are done at phiopt1,
At -O2, only max1/min are done at phiopt1 and max/min1 are handled at phiopt2 due to now having the range.

Note I think having phiopt enable ranges overall might be too much overhead I so we might need to leave it this way.
Comment 9 Andrew Pinski 2023-05-10 18:11:53 UTC
By the way this does show up in GCC itself.
in worse_state in ipa-pure-const.cc where it does MAX of bools

and for x86's internal_min_issue_delay in insn-automata.cc
The below is the similar code to what is there:
unsigned f(unsigned temp1, unsigned temp2)
{
  unsigned temp, res;
  temp = temp1 & 3;
  res = temp;

  temp = temp2 & 1;
  if (temp > res)
    res = temp;
  return res;
}
Comment 10 Andrew Pinski 2023-05-10 19:10:00 UTC
Created attachment 55041 [details]
Patch that actually works

Here is the patch which actually works. The two testcases that I pushed recently have been falls out of messing up on the patch.
Comment 11 GCC Commits 2023-05-16 03:50:48 UTC
The trunk branch has been updated by Andrew Pinski <pinskia@gcc.gnu.org>:

https://gcc.gnu.org/g:b06cfb62229f17eca59fa4aabf853d7e17e2327b

commit r14-868-gb06cfb62229f17eca59fa4aabf853d7e17e2327b
Author: Andrew Pinski <apinski@marvell.com>
Date:   Mon May 15 21:44:27 2023 +0000

    MATCH: [PR109424] Simplify min/max of boolean arguments
    
    This is version 2 of https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577394.html
    which does not depend on adding gimple_truth_valued_p at this point.
    Instead will use zero_one_valued_p which is already used for mult simplifications
    to make sure that we only have [0,1] rather having the mistake of maybe having [-1,0]
    as the range for signed bools.
    
    This shows up in a few places in GCC itself but only at -O1, we miss the min/max conversion
    because of PR 107888 (which I will be testing seperately).
    
    OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
    
    Thanks,
    Andrew Pinski
    
            PR tree-optimization/109424
    
    gcc/ChangeLog:
    
            * match.pd: Add patterns for min/max of zero_one_valued
            values to `&`/`|`.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.dg/tree-ssa/bool-12.c: New test.
            * gcc.dg/tree-ssa/bool-13.c: New test.
            * gcc.dg/tree-ssa/minmax-20.c: New test.
            * gcc.dg/tree-ssa/minmax-21.c: New test.