| Summary: | gcc got confused by different comparison operators | ||
|---|---|---|---|
| Product: | gcc | Reporter: | Daniel Fruzynski <bugzilla> |
| Component: | tree-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
| Status: | RESOLVED FIXED | ||
| Severity: | enhancement | CC: | amacleod, rguenth |
| Priority: | P3 | Keywords: | missed-optimization |
| Version: | 9.0 | ||
| Target Milestone: | 15.0 | ||
| See Also: |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110199 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114559 |
||
| Host: | Target: | ||
| Build: | Known to work: | ||
| Known to fail: | Last reconfirmed: | 2023-06-09 00:00:00 | |
| Bug Depends on: | 110199 | ||
| Bug Blocks: | 85316 | ||
On Sat, 22 Dec 2018, bugzilla@poradnik-webmastera.com wrote: > In test() gcc is not able to determine that for a==b it does not have to > evaluate 2nd comparison and can use value of a if 1st comparison is true. When > operators are swapped like in test2() or are the same, code is optimized. > > [code] > double test(double a, double b) > { > if (a <= b) > return a < b ? a : b; > return 0.0; > } You didn't give compilation options, but if a and b are +0 and -0 in some order, the first comparison is true but b must be returned instead of a (in the absence of -fno-signed-zeros or an option implying it). Code was compiled with -O3 -march=skylake. I have tried to add -fno-signed-zeros and -fsigned-zeros, and got the same output for both cases. I have tried to compile with -O3 -march=skylake -ffast-math and got this:
[asm]
test(double, double):
vminsd xmm2, xmm0, xmm1
vcmplesd xmm0, xmm0, xmm1
vxorpd xmm1, xmm1, xmm1
vblendvpd xmm0, xmm1, xmm2, xmm0
ret
test2(double, double):
vminsd xmm2, xmm0, xmm1
vcmpltsd xmm0, xmm0, xmm1
vxorpd xmm1, xmm1, xmm1
vblendvpd xmm0, xmm1, xmm2, xmm0
ret
[/asm]
And this is for -O3 -march=skylake -funsafe-math-optimizations. As you can see, one instruction was eliminated from test2(). For some reason it was not eliminated from test() function. I checked that -ffinite-math-only present in -ffast-math prevented elimination of this extra instruction.
[asm]
test(double, double):
vminsd xmm2, xmm0, xmm1
vcmplesd xmm0, xmm0, xmm1
vxorpd xmm1, xmm1, xmm1
vblendvpd xmm0, xmm1, xmm2, xmm0
ret
test2(double, double):
vcmpnltsd xmm1, xmm0, xmm1
vxorpd xmm2, xmm2, xmm2
vblendvpd xmm0, xmm0, xmm2, xmm1
ret
[/asm]
Confirmed. Even with -ffast-math on GIMPLE we fail to elide the MIN_EXPR in
both functions (test has a_2(D) < b_3(D)). When doing integer operations
VRP manages to elide those. Given we have no such thing as value-range
propagation for floats the closest we have is DOM/VN registering the
conditions and MIN_EXPR VN "failing" to lookup whether a<=b is known.
<bb 2> [local count: 1073741825]:
if (a_2(D) <= b_3(D))
goto <bb 3>; [34.00%]
else
goto <bb 4>; [66.00%]
<bb 3> [local count: 365072220]:
_4 = MIN_EXPR <a_2(D), b_3(D)>;
<bb 4> [local count: 1073741825]:
# _1 = PHI <_4(3), 0.0(2)>
return _1;
So VRP handles floating point now but it still does not optimize it: Folding statement: if (a_2(D) <= b_3(D)) Registering value_relation (a_2(D) <= b_3(D)) on (2->3) Visiting conditional with predicate: if (a_2(D) <= b_3(D)) With known ranges a_2(D): [frange] double VARYING b_3(D): [frange] double VARYING Predicate evaluates to: DON'T KNOW Not folded Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; Not folded While the int does this: Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; folding with relation a_2(D) <= b_3(D) Global Exported: _4 = [irange] int [-INF, 2147483646] Not folded Which is a regression from GCC 12 .... (In reply to Andrew Pinski from comment #5) > So VRP handles floating point now but it still does not optimize it: > Folding statement: if (a_2(D) <= b_3(D)) > Registering value_relation (a_2(D) <= b_3(D)) on (2->3) > > Visiting conditional with predicate: if (a_2(D) <= b_3(D)) > > With known ranges > a_2(D): [frange] double VARYING b_3(D): [frange] double VARYING > > Predicate evaluates to: DON'T KNOW > Not folded > Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; > Not folded > > > While the int does this: > Folding statement: _4 = MIN_EXPR <a_2(D), b_3(D)>; > folding with relation a_2(D) <= b_3(D) > Global Exported: _4 = [irange] int [-INF, 2147483646] > Not folded > > Which is a regression from GCC 12 .... I filed PR 110199 for that. Maybe once that is fixed the FP case will work too. The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>: https://gcc.gnu.org/g:b0eeb540497c7b9dee01f8724f9a4978b53a12ae commit r15-6815-gb0eeb540497c7b9dee01f8724f9a4978b53a12ae Author: Andrew MacLeod <amacleod@redhat.com> Date: Fri Jan 10 13:33:01 2025 -0500 Use relations when simplifying MIN and MAX. Query for known relations between the operands, and pass that to fold_range to help simplify MIN and MAX relations. Make it type agnostic as well. Adapt testcases from DOM to EVRP (e suffix) and test floats (f suffix). PR tree-optimization/88575 gcc/ * vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Query relation between op0 and op1 and utilize it. (simplify_using_ranges::simplify): Do not eliminate float checks. gcc/testsuite/ * gcc.dg/tree-ssa/minmax-27.c: Disable VRP. * gcc.dg/tree-ssa/minmax-27e.c: New. * gcc.dg/tree-ssa/minmax-27f.c: New. * gcc.dg/tree-ssa/minmax-28.c: Disable VRP. * gcc.dg/tree-ssa/minmax-28e.c: New. * gcc.dg/tree-ssa/minmax-28f.c: New. I believe this is fixed now. |
In test() gcc is not able to determine that for a==b it does not have to evaluate 2nd comparison and can use value of a if 1st comparison is true. When operators are swapped like in test2() or are the same, code is optimized. [code] double test(double a, double b) { if (a <= b) return a < b ? a : b; return 0.0; } double test2(double a, double b) { if (a < b) return a <= b ? a : b; return 0.0; } [/code] [asm] test(double, double): vcomisd xmm1, xmm0 jnb .L10 vxorpd xmm0, xmm0, xmm0 ret .L10: vminsd xmm0, xmm0, xmm1 ret test2(double, double): vcmpnltsd xmm1, xmm0, xmm1 vxorpd xmm2, xmm2, xmm2 vblendvpd xmm0, xmm0, xmm2, xmm1 ret [/asm]