g++-10 -std=gu++2a -O3 #include <compare> bool k(int i){ auto c=i<=>0; return c>0; } <bb 2> [local count: 1073741824]: if (i_1(D) != 0) goto <bb 3>; [50.00%] else goto <bb 4>; [50.00%] <bb 3> [local count: 536870913]: _2 = i_1(D) >= 0; <bb 4> [local count: 1073741824]: # prephitmp_6 = PHI <_2(3), 0(2)> return prephitmp_6; For most comparisons @ we do optimize (i<=>0)@0 to just i@0, but not for > and <=. Spaceship operator<=> is very painful to use, but I expect we will end up seeing a lot of it with C++20, and comparing its result with 0 is almost the only way to use its output, so it seems important to optimize this common case. (there is probably a very old dup, but I couldn't find it)
Hmm. I would have said phiopt but then there's the missing opportunity to handle PHIs as COND_EXPRs and simplify them via match.pd patterns. In this case simplify i_1(D) != 0 ? i_1(D) >= 0 : 0 that could be forwprops job then. The only pecularity is when the PHI isn't the only thing controlled by the condition, thus the control flow not vanishing with replacing the PHI. In this case we'd turn the code into <bb 2> [local count: 1073741824]: if (i_1(D) != 0) goto <bb 3>; [50.00%] else goto <bb 4>; [50.00%] <bb 3> [local count: 536870913]: _2 = i_1(D) >= 0; <bb 4> [local count: 1073741824]: prephitmp_6 = i_1(D) > 0; return prephitmp_6; and leave the rest to DCE. Obviously if the other two compares are not dead afterwards this isn't going to be a good idea. And CFG pattern matching is done by phiopt - which for the actual simplification could resort to match.pd simplifying a COND_EXPR as well. And yeah, I think there's a dup about PHIs and "folding".
Note that gcc looks bad even on the example from Microsoft's blog post: https://godbolt.org/z/Jc7TcN The fact that MSVC also looks bad on example from Microsoft is not really relevant, it's MSVC, after all. Blogpost itself is here: https://devblogs.microsoft.com/cppblog/simplify-your-code-with-rocket-science-c20s-spaceship-operator/
*** Bug 100322 has been marked as a duplicate of this bug. ***
PR 100322 shows that this missed-optimization causes a regression for code using std::chrono::duration types. Since C++20 their comparisons use operator<=> and so produce much worse code than the same source compiled as C++17.
(In reply to Marc Glisse from comment #0) > For most comparisons @ we do optimize (i<=>0)@0 to just i@0, but not for > > and <=. Spaceship operator<=> is very painful to use, but I expect we will > end up seeing a lot of it with C++20, and comparing its result with 0 is > almost the only way to use its output, so it seems important to optimize > this common case. Maybe even common enough to transform it in the FE. (In reply to Victor Khimenko from comment #2) > Note that gcc looks bad even on the example from Microsoft's blog post: > > https://godbolt.org/z/Jc7TcN Right, this doesn't only affect (i<=>0)@0 but also (i<=>j)@0 which is very common, because it's what the FE synthesizes for operator@ when the type only defines operator<=> https://godbolt.org/z/19dM8PdaM #include <compare> struct X { int i = 0; auto operator<=>(const X&) const = default; }; bool lt(X l, X r) { return l<r; } struct Y { int i = 0; bool operator<(Y rhs) const { return i < rhs.i; } }; bool lt(Y l, Y r) { return l<r; } Defining <=> once is easier than defining all of < > <= >= but the code is bad: lt(X, X): xor eax, eax cmp edi, esi je .L2 setge al lea eax, [rax-1+rax] .L2: shr al, 7 ret lt(Y, Y): cmp edi, esi setl al ret It seems like it should be possible for the FE to recognize that in this trivial case the defaulted <=> is just comparing integers, and therefore the synthesized op< could be transformed from (l.i <=> r.i) < 0 to simply l.i < r.i The FE is not required to synthesize exactly (l.i<=>r.i) @ 0 as long as the result is correct, so l.i @ r.i would be OK (and avoids the poor codegen until the missed-optimization gets done).
Note the original testcase is optimized with GCC 11 <bb 2> [local count: 1073741824]: _2 = i_1(D) > 0; return _2; but not on the GCC 10 branch. Not sure what fixed it there.
Some key steps in the optimization: PRE turns PHI<-1,0,1> > 0 into PHI<0,0,1> reassoc then combines the operations (it didn't in gcc-10) forwprop+phiopt cleans up (i>0)!=0?1:0 into just i>0. Having to wait until phiopt4 to get the simplified form is still very long, and most likely causes missed optimizations in earlier passes. But nice progress!
PR96480 would be my guess.
(In reply to Richard Biener from comment #6) > Not sure what fixed it there. Seems to be r11-2593
(In reply to Marc Glisse from comment #8) > PR96480 would be my guess. Yes
On Thu, 29 Apr 2021, glisse at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589 > > --- Comment #7 from Marc Glisse <glisse at gcc dot gnu.org> --- > Some key steps in the optimization: > PRE turns PHI<-1,0,1> > 0 into PHI<0,0,1> > reassoc then combines the operations (it didn't in gcc-10) > forwprop+phiopt cleans up (i>0)!=0?1:0 into just i>0. > > Having to wait until phiopt4 to get the simplified form is still very > long, and most likely causes missed optimizations in earlier passes. But > nice progress! Agreed - requiring PRE (and thus -O2) is also less than optimal for such a core feature. But the IL we get is simply awkward ;) ifcombine/phiopt2 see <bb 2> [local count: 1073741824]: if (i_1(D) == 0) goto <bb 5>; [50.00%] else goto <bb 3>; [50.00%] <bb 3> [local count: 536870913]: if (i_1(D) < 0) goto <bb 5>; [41.00%] else goto <bb 4>; [59.00%] <bb 4> [local count: 316753840]: <bb 5> [local count: 1073741824]: # c$_M_value_2 = PHI <0(2), -1(3), 1(4)> _4 = c$_M_value_2 > 0; I guess that's what we should try to work with. For PR99997 I have prototyped a forwprop patch to try constant folding stmts with all-constant PHIs, thus in this case c$_M_value_2 > 0, when there's only a single use of it (that basically does what PRE later does but earlier and for a very small subset of cases).
(In reply to rguenther@suse.de from comment #11) > For PR99997 > I have prototyped a forwprop patch to try constant folding > stmts with all-constant PHIs, thus in this case c$_M_value_2 > 0, > when there's only a single use of it Maybe we could handle any case where trying to fold the single use (counting x*x as a single use of x) with each possible value satisfies is_gimple_val (or whatever the condition is to be allowed in a PHI, and without introducing a use of a ssa_name before it is defined), so that things like PHI<X,0> & X would simplify. But the constant case is indeed the most important, and should allow the optimization in this PR before the vectorizer using reassoc1.
Created attachment 50707 [details] prototype For reference this is the prototype patch I mentioned. I wasn't entirely happy and wanted to explore the ??? in the commit message.
#include <compare> bool k(int a, int b){ auto c = (a <=> b); return c>0; } Produces <bb 2> [local count: 1073741824]: if (a_1(D) == b_3(D)) goto <bb 5>; [34.00%] else goto <bb 3>; [66.00%] <bb 3> [local count: 708669601]: if (a_1(D) < b_3(D)) goto <bb 5>; [50.00%] else goto <bb 4>; [50.00%] <bb 4> [local count: 354334801]: <bb 5> [local count: 1073741824]: # prephitmp_8 = PHI <0(2), 0(3), 1(4)> return prephitmp_8; from PRE (or the proposed patch) where it is not matched further. This kind of patters could be handled by phiopt.
C testcase (though surely we need C++ one too because with C++ there are aggregates and inline functions involved): int f1 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == 0; } int f2 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a < 0; } int f3 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a > 0; } int f4 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != 0; } int f5 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a <= 0; } int f6 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a >= 0; } int f7 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == -1; } int f8 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != -1; } int f9 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == 1; } int f10 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != 1; } int f11 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == 0; } int f12 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a < 0; } int f13 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a > 0; } int f14 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != 0; } int f15 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a <= 0; } int f16 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a >= 0; } int f17 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == -1; } int f18 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != -1; } int f19 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == 1; } int f20 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != 1; }
Writing a phiopt patch now.
Created attachment 50710 [details] gcc12-pr94589-wip.patch WIP patch that just matches those spaceship comparisons followed by single use comparison of that with -1/0/1, but doesn't yet perform any of the optimizations.
Created attachment 50719 [details] gcc12-pr94589-wip.patch Updated patch. This already does the optimizations, though needs testcase coverage and perhaps something better for debug uses of the PHI_RESULT besides the single non-debug use. And, it won't work together with the above mentioned forwprop patch, I'm afraid it would need to virtually undo that change but it would be hard.
Created attachment 50741 [details] gcc12-pr94589.patch Full untested patch.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:ad96c867e173c1ebcfc201b201adac5095683a08 commit r12-559-gad96c867e173c1ebcfc201b201adac5095683a08 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu May 6 10:15:40 2021 +0200 phiopt: Optimize (x <=> y) cmp z [PR94589] genericize_spaceship genericizes i <=> j to approximately ({ int c; if (i == j) c = 0; else if (i < j) c = -1; else c = 1; c; }) for strong ordering and ({ int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; c; }) for partial ordering. The C++ standard supports then == or != comparisons of that against strong/partial ordering enums, or </<=/==/!=/>/>= comparisons of <=> result against literal 0. In some cases we already optimize that but in many cases we keep performing all the 2 or 3 comparisons, compute the spaceship value and then compare that. The following patch recognizes those patterns if the <=> operands are integral types or floating point (the latter only for -ffast-math) and optimizes it to the single comparison that is needed (plus adds debug stmts if needed for the spaceship result). There is one thing I'd like to address in a follow-up: the pr94589-2.C testcase should be matching just 12 times each, but runs into operator>=(partial_ordering, unspecified) being defined as (_M_value&1)==_M_value rather than _M_value>=0. When not honoring NaNs, the 2 case should be unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0, but is not a single use but two uses. I'll need to pattern match that case specially. 2021-05-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 * tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Call spaceship_replacement. (cond_only_block_p, spaceship_replacement): New functions. * gcc.dg/pr94589-1.c: New test. * gcc.dg/pr94589-2.c: New test. * gcc.dg/pr94589-3.c: New test. * gcc.dg/pr94589-4.c: New test. * g++.dg/opt/pr94589-1.C: New test. * g++.dg/opt/pr94589-2.C: New test. * g++.dg/opt/pr94589-3.C: New test. * g++.dg/opt/pr94589-4.C: New test.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:f5f1838435400b837c8677c53a611e2dc6d56442 commit r12-733-gf5f1838435400b837c8677c53a611e2dc6d56442 Author: Jakub Jelinek <jakub@redhat.com> Date: Wed May 12 09:46:03 2021 +0200 match.pd: Optimize (x & y) == x into (x & ~y) == 0 [PR94589] > Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0, > that could be worth doing already in GIMPLE. Apparently it is /* Simplify eq/ne (and/ior x y) x/y) for targets with a BICS instruction or constant folding if x/y is a constant. */ if ((code == EQ || code == NE) && (op0code == AND || op0code == IOR) && !side_effects_p (op1) && op1 != CONST0_RTX (cmp_mode)) { /* Both (eq/ne (and x y) x) and (eq/ne (ior x y) y) simplify to (eq/ne (and (not y) x) 0). */ ... /* Both (eq/ne (and x y) y) and (eq/ne (ior x y) x) simplify to (eq/ne (and (not x) y) 0). */ Yes, doing that on GIMPLE for the case where the not argument is constant would simplify the phiopt follow-up (it would be single imm use then). On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote: > We can probably do it in 2 steps, first something like > > (for cmp (eq ne) > (simplify > (cmp (bit_and:c @0 @1) @0) > (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); }))) > > to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a > mask 111...000 (I thought we already had a function to detect such masks, or > the 000...111, but I can't find them anymore). Ok, here is the first step then. 2021-05-12 Jakub Jelinek <jakub@redhat.com> Marc Glisse <marc.glisse@inria.fr> PR tree-optimization/94589 * match.pd ((X & Y) == X -> (X & ~Y) == 0, (X | Y) == Y -> (X & ~Y) == 0): New GIMPLE simplifications. * gcc.dg/tree-ssa/pr94589-1.c: New test.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:65061ea287a80cfb214e402cfd2373a14bfec95a commit r12-864-g65061ea287a80cfb214e402cfd2373a14bfec95a Author: Jakub Jelinek <jakub@redhat.com> Date: Tue May 18 10:08:51 2021 +0200 phiopt: Optimize partial_ordering spaceship >= 0 -ffinite-math-only [PR94589] As mentioned earlier, spaceship_replacement didn't optimize partial_ordering >= 0 comparisons, because the possible values are -1, 0, 1, 2 and the >= comparison is implemented as (res & 1) == res to choose the 0 and 1 cases from that. As we optimize that only with -ffinite-math-only, the 2 case is assumed not to happen and my earlier match.pd change optimizes (res & 1) == res into (res & ~1) == 0, so this patch pattern matches that case and handles it like res >= 0. 2021-05-18 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 * tree-ssa-phiopt.c (spaceship_replacement): Pattern match phi result used in (res & ~1) == 0 comparison as res >= 0 as res == 2 would be UB with -ffinite-math-only. * g++.dg/opt/pr94589-2.C: Adjust scan-tree-dump count from 14 to 12.
Fixed.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:2b536797f7e43c55072a3215735f5833f1d6d218 commit r12-935-g2b536797f7e43c55072a3215735f5833f1d6d218 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu May 20 09:10:38 2021 +0200 phiopt: Simplify (X & Y) == X -> (X & ~Y) == 0 even in presence of integral conversions [PR94589] On Wed, May 19, 2021 at 10:15:53AM +0200, Christophe Lyon via Gcc-patches wrote: > After this update, the test fails on arm and aarch64: according to the > logs, the optimization is still performed 14 times. Seems this is because if (change && !flag_syntax_only && (load_extend_op (TYPE_MODE (TREE_TYPE (and0))) == ZERO_EXTEND)) { tree uns = unsigned_type_for (TREE_TYPE (and0)); and0 = fold_convert_loc (loc, uns, and0); and1 = fold_convert_loc (loc, uns, and1); } in fold-const.c adds on these targets extra casts that prevent the optimizations. 2021-05-20 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 * match.pd ((X & Y) == X -> (X & ~Y) == 0): Simplify even in presence of integral conversions.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:f1c777f40aa0b6941efc7440495a8d7e0cc2a1bb commit r12-964-gf1c777f40aa0b6941efc7440495a8d7e0cc2a1bb Author: Jakub Jelinek <jakub@redhat.com> Date: Fri May 21 10:39:50 2021 +0200 tree-optimization: Improve spaceship_replacement [PR94589] On Wed, May 19, 2021 at 01:30:31PM -0400, Jason Merrill via Gcc-patches wrote: > Here, when genericizing lexicographical_compare_three_way, we haven't yet > walked the operands, so (a == a) still sees ADDR_EXPR <a>, but this is after > we've changed the type of a to REFERENCE_TYPE. When we try to fold (a == a) > by constexpr evaluation, the constexpr code doesn't understand trying to > take the address of a reference, and we end up crashing. > > Fixed by avoiding constexpr evaluation in genericize_spaceship, by using > fold_build2 instead of build_new_op on scalar operands. Class operands > should have been expanded during parsing. Unfortunately this slightly changed the IL and spaceship_replacement no longer pattern matches it. Here are 3 improvements that make it match: 1) as mentioned in the comment above spaceship_replacement, for strong_ordering, we are pattern matching something like: x == y ? 0 : x < y ? -1 : 1; and for partial_ordering x == y ? 0 : x < y ? -1 : x > y ? 1 : 2; but given the == comparison done first and the other comparisons only if == was false, we actually don't care if the other comparisons are < vs. <= (or > vs. >=), provided the operands of the comparison are the same; we know == is false when doing those and < vs. <= or > vs. >= have the same behavior for NaNs too 2) when y is an integral constant, we should treat x < 5 equivalently to x <= 4 etc. 3) the code punted if cond2_phi_edge wasn't a EDGE_TRUE_VALUE edge, but as the new IL shows, that isn't really needed; given 1) that > and >= are equivalent in the code, any of swapping the comparison operands, changing L[TE]_EXPR to G[TE]_EXPR or vice versa or swapping the EDGE_TRUE_VALUE / EDGE_FALSE_VALUE bits on the edges reverses one of the two comparisons 2021-05-21 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 * tree-ssa-phiopt.c (spaceship_replacement): For integral rhs1 and rhs2, treat x <= 4 equivalently to x < 5 etc. In cmp1 and cmp2 (if not the same as cmp3) treat <= the same as < and >= the same as >. Don't require that cond2_phi_edge is true edge, instead take false/true edges into account based on cmp1/cmp2 comparison kinds.
Created attachment 55893 [details] Here is my idea around the patch for prototype for doing the constant prop idea This is like the one in comment #13 but handles it in phiopt rather than forwprop.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:ca7430f145f5c7960f67ec77f585f3a2b58c7d10 commit r15-5556-gca7430f145f5c7960f67ec77f585f3a2b58c7d10 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Nov 21 09:38:01 2024 +0100 phiopt: Fix a pasto in spaceship_replacement [PR117612] When working on the PR117612 fix, I've noticed a pasto in tree-ssa-phiopt.cc (spaceship_replacement). The code is if (absu_hwi (tree_to_shwi (arg2)) != 1) return false; if (e1->flags & EDGE_TRUE_VALUE) { if (tree_to_shwi (arg0) != 2 || absu_hwi (tree_to_shwi (arg1)) != 1 || wi::to_widest (arg1) == wi::to_widest (arg2)) return false; } else if (tree_to_shwi (arg1) != 2 || absu_hwi (tree_to_shwi (arg0)) != 1 || wi::to_widest (arg0) == wi::to_widest (arg1)) return false; where arg{0,1,2,3} are PHI args and wants to ensure that if e1 is a true edge, then arg0 is 2 and one of arg{1,2} is -1 and one is 1, otherwise arg1 is 2 and one of arg{0,2} is -1 and one is 1. But due to pasto in the latte case doesn't verify that arg0 is different from arg2, it could be both -1 or both 1 and we wouldn't punt. The wi::to_widest (arg0) == wi::to_widest (arg1) test is always false when we've made sure in the earlier conditions that arg1 is 2 and arg0 is -1 or 1, so never 2. 2024-11-21 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 PR tree-optimization/117612 * tree-ssa-phiopt.cc (spaceship_replacement): Fix up a pasto in check when arg1 is 2.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:05ab9447fe80e5b1450192e21f3116890d38ecc7 commit r15-5557-g05ab9447fe80e5b1450192e21f3116890d38ecc7 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Nov 21 09:39:06 2024 +0100 phiopt: Improve spaceship_replacement for HONOR_NANS [PR117612] The following patch optimizes spaceship followed by comparisons of the spaceship value even for floating point spaceship when NaNs can appear. operator<=> for this emits roughly signed char c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; and I believe the /* The optimization may be unsafe due to NaNs. */ comment just isn't true. Sure, the i == j comparison doesn't raise exceptions on qNaNs, but if one of the operands is qNaN, then i == j is false and i < j or i > j is then executed and raises exceptions even on qNaNs. And we can safely optimize say c == -1 comparison after the above into i < j, that also raises exceptions like before and handles NaNs the same way as the original. The only unsafe transormation would be c == 0 or c != 0, turning it into i == j or i != j wouldn't raise exception, so I'm not doing that optimization (but other parts of the compiler optimize the i < j comparison away anyway). Anyway, to match the HONOR_NANS case, we need to verify that the second comparison has true edge to the phi_bb (yielding there -1 or 1), it can't be the false edge because when NaNs are honored, the false edge is for both the case where the inverted comparison is true or when one of the operands is NaN. Similarly we need to ensure that the two non-equality comparisons are the opposite, while for -ffast-math we can in some cases get one comparison x >= 5.0 and the other x > 5.0 and it is fine, because NaN is UB, when NaNs are honored, they must be different to leave the unordered case with 2 value as the last one remaining. The patch also punts if HONOR_NANS and the phi has just 3 arguments instead of 4. When NaNs are honored, we also in some cases need to perform some comparison and then invert its result (so that exceptions are properly thrown and we get the correct result). 2024-11-21 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 PR tree-optimization/117612 * tree-ssa-phiopt.cc (spaceship_replacement): Handle HONOR_NANS (TREE_TYPE (lhs1)) case when possible. * gcc.dg/pr94589-5.c: New test. * gcc.dg/pr94589-6.c: New test. * g++.dg/opt/pr94589-5.C: New test. * g++.dg/opt/pr94589-6.C: New test.
The releases/gcc-14 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:3190d6263c90f8e859b738a9b39f5e650a4a3a16 commit r14-11142-g3190d6263c90f8e859b738a9b39f5e650a4a3a16 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Nov 21 09:38:01 2024 +0100 phiopt: Fix a pasto in spaceship_replacement [PR117612] When working on the PR117612 fix, I've noticed a pasto in tree-ssa-phiopt.cc (spaceship_replacement). The code is if (absu_hwi (tree_to_shwi (arg2)) != 1) return false; if (e1->flags & EDGE_TRUE_VALUE) { if (tree_to_shwi (arg0) != 2 || absu_hwi (tree_to_shwi (arg1)) != 1 || wi::to_widest (arg1) == wi::to_widest (arg2)) return false; } else if (tree_to_shwi (arg1) != 2 || absu_hwi (tree_to_shwi (arg0)) != 1 || wi::to_widest (arg0) == wi::to_widest (arg1)) return false; where arg{0,1,2,3} are PHI args and wants to ensure that if e1 is a true edge, then arg0 is 2 and one of arg{1,2} is -1 and one is 1, otherwise arg1 is 2 and one of arg{0,2} is -1 and one is 1. But due to pasto in the latte case doesn't verify that arg0 is different from arg2, it could be both -1 or both 1 and we wouldn't punt. The wi::to_widest (arg0) == wi::to_widest (arg1) test is always false when we've made sure in the earlier conditions that arg1 is 2 and arg0 is -1 or 1, so never 2. 2024-11-21 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 PR tree-optimization/117612 * tree-ssa-phiopt.cc (spaceship_replacement): Fix up a pasto in check when arg1 is 2. (cherry picked from commit ca7430f145f5c7960f67ec77f585f3a2b58c7d10)