107888 – [12/13 Regression] Missed min/max transformation in phiopt due to VRP

Bug 107888 - [12/13 Regression] Missed min/max transformation in phiopt due to VRP

Summary: [12/13 Regression] Missed min/max transformation in phiopt due to VRP

Status:	ASSIGNED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	13.0

Importance:	P2 normal
Target Milestone:	12.4
Assignee:	Andrew Pinski

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2022-11-27 19:28 UTC by Andrew Pinski
Modified:	2024-05-08 03:42 UTC (History)
CC List:	2 users (show)

See Also:	101805 114967
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2022-12-21 00:00:00

Attachments
Patch which adds what I Mentioned (1.28 KB, text/plain) 2023-05-09 10:14 UTC, Andrew Pinski	Details
testcases (155 bytes, text/plain) 2023-05-09 10:19 UTC, Andrew Pinski	Details
Patch that actually works (2.36 KB, patch) 2023-05-10 19:10 UTC, Andrew Pinski	Details \| Diff
Show Obsolete (1) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Andrew Pinski 2022-11-27 19:28:54 UTC

Take:
```
#define bool _Bool
int maxbool(bool ab, bool bb)
{
  int a = ab;
  int b = bb;
  int c;
  if (a > b)
    c = a;
  else
    c = b;
  return c;
}
```

We miss that c is max of a and b because VRP decides to change the phi.
We get out of VRP:
```
  if (a_3 > b_5)
    goto <bb 4>; [INV]
  else
    goto <bb 3>; [INV]

  <bb 3> :

  <bb 4> :
  # c_1 = PHI <1(2), b_5(3)>
```

What VRP is doing is correct just is harder to optimize to a max (and then a | ).

In the above case we could optimize `bool0 ? 1 : bool1` to `bool0 | bool1` But then we end up with PR 107887 too.

You can also end up with the above issue where you know the only overlap between the two arguments is [5,6] :
```
int max(int ab, int bb)
{
  if (ab < 5)  __builtin_trap();
  if (bb > 6)  __builtin_trap();
  int a = ab;
  int b = bb;
  int c;
  if (a >= b)
    c = a;
  else
    c = b;
  return c;
}
```
Which we cannot optimize based on zero/one any more. (note this version of max has been an issue since at least GCC 4.1, I suspect since VRP was added).

Comment 1 Richard Biener 2022-11-28 07:57:12 UTC

which means we fail to optimize a > b ? 1 : b as well, no?

Comment 2 Andrew Pinski 2022-11-28 17:26:55 UTC

(In reply to Richard Biener from comment #1)
> which means we fail to optimize a > b ? 1 : b as well, no?

Yes that is correct.

Even for max, "a >= b ? a : 6;" would need to be "reverted" 6 back to b.

  <bb 5> [local count: 1073741824]:
  if (ab_2(D) >= bb_3(D))
    goto <bb 6>; [65.00%]
  else
    goto <bb 7>; [35.00%]

  <bb 6> [local count: 697932184]:

  <bb 7> [local count: 1073741824]:
  # c_1 = PHI <ab_2(D)(6), 6(5)>


The min/max patterns inside match needs to handle CST if the ranges of the two operands overlap with one/two values.

Even though this is a regression, I don't know if this shows up in real code and is a small optimization really so I would suspect a P4 for this really as it requires a bigger change that most likely won't be backported. I filed it as it was showing up while I was working on the patch for PR 101805 (which won't be submitted until GCC 14 anyways).

Comment 3 Richard Biener 2022-12-21 13:35:33 UTC

Confirmed at least.

Comment 4 Andrew Pinski 2023-05-06 00:38:35 UTC

So for the second testcase in comment #0 (with __builtin_trap replaced with __builtin_unreachable so at least we have ranges):

  # RANGE [irange] int [5, +INF] NONZERO 0x7fffffff
  intD.9 ab_2(D) = abD.2773;
  # RANGE [irange] int [-INF, 6]
  intD.9 bb_3(D) = bbD.2774;
  intD.9 cD.2779;

__BB(2):
  if (ab_2(D) >= bb_3(D))
    goto __BB4;
  else
    goto __BB3;

__BB(3):

__BB(4):
  # RANGE [irange] int [5, +INF] NONZERO 0x7fffffff
  # c_1 = PHI <ab_2(D)(2), 6(3)>

We could detect (cond (a ge b) a CST) and look at CST to see if it is the same as the max range of b and one more than the min of a. I think that will work.

That will also fix I think the first testcase.

Let me try to do that.
We won't get the __builtin_trap case correctly unless we do the full range information inside phiopt; I will have to look into that later.

Comment 5 Andrew Pinski 2023-05-08 05:35:52 UTC

I think this should not be hard to add to minmax_from_comparison I think. Though right now we don't call it for the non-constant case but that should be easy to fix I think.

Comment 6 Richard Biener 2023-05-08 12:26:07 UTC

GCC 12.3 is being released, retargeting bugs to GCC 12.4.

Comment 7 Andrew Pinski 2023-05-09 10:14:33 UTC

Created attachment 55026 [details]
Patch which adds what I Mentioned

I still need to add the testcases.

Comment 8 Andrew Pinski 2023-05-09 10:19:44 UTC

Created attachment 55027 [details]
testcases

max is optimized with this, max1 was already handled.
min was already handled, min1 is optimized with this.
Note at -O1, all 4 are done at phiopt1,
At -O2, only max1/min are done at phiopt1 and max/min1 are handled at phiopt2 due to now having the range.

Note I think having phiopt enable ranges overall might be too much overhead I so we might need to leave it this way.

Comment 9 Andrew Pinski 2023-05-10 18:11:53 UTC

By the way this does show up in GCC itself.
in worse_state in ipa-pure-const.cc where it does MAX of bools

and for x86's internal_min_issue_delay in insn-automata.cc
The below is the similar code to what is there:
unsigned f(unsigned temp1, unsigned temp2)
{
  unsigned temp, res;
  temp = temp1 & 3;
  res = temp;

  temp = temp2 & 1;
  if (temp > res)
    res = temp;
  return res;
}

Comment 10 Andrew Pinski 2023-05-10 19:10:00 UTC

Created attachment 55041 [details]
Patch that actually works

Here is the patch which actually works. The two testcases that I pushed recently have been falls out of messing up on the patch.

Comment 11 GCC Commits 2023-05-16 03:50:48 UTC

The trunk branch has been updated by Andrew Pinski <pinskia@gcc.gnu.org>:

https://gcc.gnu.org/g:b06cfb62229f17eca59fa4aabf853d7e17e2327b

commit r14-868-gb06cfb62229f17eca59fa4aabf853d7e17e2327b
Author: Andrew Pinski <apinski@marvell.com>
Date:   Mon May 15 21:44:27 2023 +0000

    MATCH: [PR109424] Simplify min/max of boolean arguments
    
    This is version 2 of https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577394.html
    which does not depend on adding gimple_truth_valued_p at this point.
    Instead will use zero_one_valued_p which is already used for mult simplifications
    to make sure that we only have [0,1] rather having the mistake of maybe having [-1,0]
    as the range for signed bools.
    
    This shows up in a few places in GCC itself but only at -O1, we miss the min/max conversion
    because of PR 107888 (which I will be testing seperately).
    
    OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
    
    Thanks,
    Andrew Pinski
    
            PR tree-optimization/109424
    
    gcc/ChangeLog:
    
            * match.pd: Add patterns for min/max of zero_one_valued
            values to `&`/`|`.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.dg/tree-ssa/bool-12.c: New test.
            * gcc.dg/tree-ssa/bool-13.c: New test.
            * gcc.dg/tree-ssa/minmax-20.c: New test.
            * gcc.dg/tree-ssa/minmax-21.c: New test.