In test() gcc is not able to determine that for a==b it does not have to evaluate 2nd comparison and can use value of a if 1st comparison is true. When operators are swapped like in test2() or are the same, code is optimized. [code] double test(double a, double b) { if (a <= b) return a < b ? a : b; return 0.0; } double test2(double a, double b) { if (a < b) return a <= b ? a : b; return 0.0; } [/code] [asm] test(double, double): vcomisd xmm1, xmm0 jnb .L10 vxorpd xmm0, xmm0, xmm0 ret .L10: vminsd xmm0, xmm0, xmm1 ret test2(double, double): vcmpnltsd xmm1, xmm0, xmm1 vxorpd xmm2, xmm2, xmm2 vblendvpd xmm0, xmm0, xmm2, xmm1 ret [/asm]
On Sat, 22 Dec 2018, bugzilla@poradnik-webmastera.com wrote: > In test() gcc is not able to determine that for a==b it does not have to > evaluate 2nd comparison and can use value of a if 1st comparison is true. When > operators are swapped like in test2() or are the same, code is optimized. > > [code] > double test(double a, double b) > { > if (a <= b) > return a < b ? a : b; > return 0.0; > } You didn't give compilation options, but if a and b are +0 and -0 in some order, the first comparison is true but b must be returned instead of a (in the absence of -fno-signed-zeros or an option implying it).
Code was compiled with -O3 -march=skylake. I have tried to add -fno-signed-zeros and -fsigned-zeros, and got the same output for both cases.
I have tried to compile with -O3 -march=skylake -ffast-math and got this: [asm] test(double, double): vminsd xmm2, xmm0, xmm1 vcmplesd xmm0, xmm0, xmm1 vxorpd xmm1, xmm1, xmm1 vblendvpd xmm0, xmm1, xmm2, xmm0 ret test2(double, double): vminsd xmm2, xmm0, xmm1 vcmpltsd xmm0, xmm0, xmm1 vxorpd xmm1, xmm1, xmm1 vblendvpd xmm0, xmm1, xmm2, xmm0 ret [/asm] And this is for -O3 -march=skylake -funsafe-math-optimizations. As you can see, one instruction was eliminated from test2(). For some reason it was not eliminated from test() function. I checked that -ffinite-math-only present in -ffast-math prevented elimination of this extra instruction. [asm] test(double, double): vminsd xmm2, xmm0, xmm1 vcmplesd xmm0, xmm0, xmm1 vxorpd xmm1, xmm1, xmm1 vblendvpd xmm0, xmm1, xmm2, xmm0 ret test2(double, double): vcmpnltsd xmm1, xmm0, xmm1 vxorpd xmm2, xmm2, xmm2 vblendvpd xmm0, xmm0, xmm2, xmm1 ret [/asm]
Confirmed. Even with -ffast-math on GIMPLE we fail to elide the MIN_EXPR in both functions (test has a_2(D) < b_3(D)). When doing integer operations VRP manages to elide those. Given we have no such thing as value-range propagation for floats the closest we have is DOM/VN registering the conditions and MIN_EXPR VN "failing" to lookup whether a<=b is known. <bb 2> [local count: 1073741825]: if (a_2(D) <= b_3(D)) goto <bb 3>; [34.00%] else goto <bb 4>; [66.00%] <bb 3> [local count: 365072220]: _4 = MIN_EXPR <a_2(D), b_3(D)>; <bb 4> [local count: 1073741825]: # _1 = PHI <_4(3), 0.0(2)> return _1;
So VRP handles floating point now but it still does not optimize it: Folding statement: if (a_2(D) <= b_3(D)) Registering value_relation (a_2(D) <= b_3(D)) on (2->3) Visiting conditional with predicate: if (a_2(D) <= b_3(D)) With known ranges a_2(D): [frange] double VARYING b_3(D): [frange] double VARYING Predicate evaluates to: DON'T KNOW Not folded Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; Not folded While the int does this: Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; folding with relation a_2(D) <= b_3(D) Global Exported: _4 = [irange] int [-INF, 2147483646] Not folded Which is a regression from GCC 12 ....
(In reply to Andrew Pinski from comment #5) > So VRP handles floating point now but it still does not optimize it: > Folding statement: if (a_2(D) <= b_3(D)) > Registering value_relation (a_2(D) <= b_3(D)) on (2->3) > > Visiting conditional with predicate: if (a_2(D) <= b_3(D)) > > With known ranges > a_2(D): [frange] double VARYING b_3(D): [frange] double VARYING > > Predicate evaluates to: DON'T KNOW > Not folded > Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; > Not folded > > > While the int does this: > Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; > folding with relation a_2(D) <= b_3(D) > Global Exported: _4 = [irange] int [-INF, 2147483646] > Not folded > > Which is a regression from GCC 12 .... I filed PR 110199 for that. Maybe once that is fixed the FP case will work too.
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>: https://gcc.gnu.org/g:b0eeb540497c7b9dee01f8724f9a4978b53a12ae commit r15-6815-gb0eeb540497c7b9dee01f8724f9a4978b53a12ae Author: Andrew MacLeod <amacleod@redhat.com> Date: Fri Jan 10 13:33:01 2025 -0500 Use relations when simplifying MIN and MAX. Query for known relations between the operands, and pass that to fold_range to help simplify MIN and MAX relations. Make it type agnostic as well. Adapt testcases from DOM to EVRP (e suffix) and test floats (f suffix). PR tree-optimization/88575 gcc/ * vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Query relation between op0 and op1 and utilize it. (simplify_using_ranges::simplify): Do not eliminate float checks. gcc/testsuite/ * gcc.dg/tree-ssa/minmax-27.c: Disable VRP. * gcc.dg/tree-ssa/minmax-27e.c: New. * gcc.dg/tree-ssa/minmax-27f.c: New. * gcc.dg/tree-ssa/minmax-28.c: Disable VRP. * gcc.dg/tree-ssa/minmax-28e.c: New. * gcc.dg/tree-ssa/minmax-28f.c: New.
I believe this is fixed now.