Take: ``` #define bool _Bool int maxbool(bool ab, bool bb) { int a = ab; int b = bb; int c; if (a > b) c = a; else c = b; return c; } ``` We miss that c is max of a and b because VRP decides to change the phi. We get out of VRP: ``` if (a_3 > b_5) goto <bb 4>; [INV] else goto <bb 3>; [INV] <bb 3> : <bb 4> : # c_1 = PHI <1(2), b_5(3)> ``` What VRP is doing is correct just is harder to optimize to a max (and then a | ). In the above case we could optimize `bool0 ? 1 : bool1` to `bool0 | bool1` But then we end up with PR 107887 too. You can also end up with the above issue where you know the only overlap between the two arguments is [5,6] : ``` int max(int ab, int bb) { if (ab < 5) __builtin_trap(); if (bb > 6) __builtin_trap(); int a = ab; int b = bb; int c; if (a >= b) c = a; else c = b; return c; } ``` Which we cannot optimize based on zero/one any more. (note this version of max has been an issue since at least GCC 4.1, I suspect since VRP was added).
which means we fail to optimize a > b ? 1 : b as well, no?
(In reply to Richard Biener from comment #1) > which means we fail to optimize a > b ? 1 : b as well, no? Yes that is correct. Even for max, "a >= b ? a : 6;" would need to be "reverted" 6 back to b. <bb 5> [local count: 1073741824]: if (ab_2(D) >= bb_3(D)) goto <bb 6>; [65.00%] else goto <bb 7>; [35.00%] <bb 6> [local count: 697932184]: <bb 7> [local count: 1073741824]: # c_1 = PHI <ab_2(D)(6), 6(5)> The min/max patterns inside match needs to handle CST if the ranges of the two operands overlap with one/two values. Even though this is a regression, I don't know if this shows up in real code and is a small optimization really so I would suspect a P4 for this really as it requires a bigger change that most likely won't be backported. I filed it as it was showing up while I was working on the patch for PR 101805 (which won't be submitted until GCC 14 anyways).
Confirmed at least.
So for the second testcase in comment #0 (with __builtin_trap replaced with __builtin_unreachable so at least we have ranges): # RANGE [irange] int [5, +INF] NONZERO 0x7fffffff intD.9 ab_2(D) = abD.2773; # RANGE [irange] int [-INF, 6] intD.9 bb_3(D) = bbD.2774; intD.9 cD.2779; __BB(2): if (ab_2(D) >= bb_3(D)) goto __BB4; else goto __BB3; __BB(3): __BB(4): # RANGE [irange] int [5, +INF] NONZERO 0x7fffffff # c_1 = PHI <ab_2(D)(2), 6(3)> We could detect (cond (a ge b) a CST) and look at CST to see if it is the same as the max range of b and one more than the min of a. I think that will work. That will also fix I think the first testcase. Let me try to do that. We won't get the __builtin_trap case correctly unless we do the full range information inside phiopt; I will have to look into that later.
I think this should not be hard to add to minmax_from_comparison I think. Though right now we don't call it for the non-constant case but that should be easy to fix I think.
GCC 12.3 is being released, retargeting bugs to GCC 12.4.
Created attachment 55026 [details] Patch which adds what I Mentioned I still need to add the testcases.
Created attachment 55027 [details] testcases max is optimized with this, max1 was already handled. min was already handled, min1 is optimized with this. Note at -O1, all 4 are done at phiopt1, At -O2, only max1/min are done at phiopt1 and max/min1 are handled at phiopt2 due to now having the range. Note I think having phiopt enable ranges overall might be too much overhead I so we might need to leave it this way.
By the way this does show up in GCC itself. in worse_state in ipa-pure-const.cc where it does MAX of bools and for x86's internal_min_issue_delay in insn-automata.cc The below is the similar code to what is there: unsigned f(unsigned temp1, unsigned temp2) { unsigned temp, res; temp = temp1 & 3; res = temp; temp = temp2 & 1; if (temp > res) res = temp; return res; }
Created attachment 55041 [details] Patch that actually works Here is the patch which actually works. The two testcases that I pushed recently have been falls out of messing up on the patch.
The trunk branch has been updated by Andrew Pinski <pinskia@gcc.gnu.org>: https://gcc.gnu.org/g:b06cfb62229f17eca59fa4aabf853d7e17e2327b commit r14-868-gb06cfb62229f17eca59fa4aabf853d7e17e2327b Author: Andrew Pinski <apinski@marvell.com> Date: Mon May 15 21:44:27 2023 +0000 MATCH: [PR109424] Simplify min/max of boolean arguments This is version 2 of https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577394.html which does not depend on adding gimple_truth_valued_p at this point. Instead will use zero_one_valued_p which is already used for mult simplifications to make sure that we only have [0,1] rather having the mistake of maybe having [-1,0] as the range for signed bools. This shows up in a few places in GCC itself but only at -O1, we miss the min/max conversion because of PR 107888 (which I will be testing seperately). OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. Thanks, Andrew Pinski PR tree-optimization/109424 gcc/ChangeLog: * match.pd: Add patterns for min/max of zero_one_valued values to `&`/`|`. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/bool-12.c: New test. * gcc.dg/tree-ssa/bool-13.c: New test. * gcc.dg/tree-ssa/minmax-20.c: New test. * gcc.dg/tree-ssa/minmax-21.c: New test.