This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/80237] float to double conversion is not optimized away
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 29 Mar 2017 08:03:59 +0000
- Subject: [Bug tree-optimization/80237] float to double conversion is not optimized away
- Auto-submitted: auto-generated
- References: <bug-80237-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80237
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2017-03-29
          Component|c                           |tree-optimization
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
We have (after inlining g into foo):
foo (float x)
{
  float _3;
  float _4;
  double iftmp.0_5;
  double _6;
  double iftmp.0_7;
  double iftmp.0_8;

  <bb 2> [100.00%]:
  if (x_2(D) > 0.0)
    goto <bb 3>; [60.23%]
  else
    goto <bb 4>; [39.77%]

  <bb 3> [60.23%]:
  _3 = f (x_2(D));
  iftmp.0_5 = (double) _3;
  goto <bb 5>; [100.00%]

  <bb 4> [39.77%]:
  _6 = (double) x_2(D);
  iftmp.0_7 = _6 + 1.0e+0;

  <bb 5> [100.00%]:
  # iftmp.0_8 = PHI <iftmp.0_5(3), iftmp.0_7(4)>
  _4 = (float) iftmp.0_8;
  return _4;

thus the tailcall is not exposed at GIMPLE level.
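
The testcase itself is not quoted in this message, so the following is only a guess at C source that would produce GIMPLE of the shape above once g is inlined into foo. The body given to f is a hypothetical stand-in so the sketch links and runs (in the dump f is just an external call), and check_foo is an illustrative helper, not part of the report.

```c
#include <assert.h>

/* Hypothetical stand-in for the external function f called in the dump. */
float f(float x)
{
    return x * 2.0f;
}

/* Assumed shape of g: the conditional expression has type double, so
   f(x) is widened to double in one arm and x + 1.0 is computed in
   double in the other. */
static double g(float x)
{
    return x > 0 ? f(x) : x + 1.0;
}

/* foo narrows the double result of g back to float.  After inlining,
   this is the single (float) conversion that follows the PHI in bb 5. */
float foo(float x)
{
    return g(x);
}

/* Illustrative sanity checks; every value is exactly representable. */
void check_foo(void)
{
    assert(foo(2.0f) == 4.0f);   /* x > 0: result of f(x), round-tripped */
    assert(foo(-1.0f) == 0.0f);  /* x <= 0: x + 1.0 computed in double */
    assert(foo(0.0f) == 1.0f);
}
```

On the x > 0 path the value returned by f is widened to double and immediately narrowed back to float, which is a no-op for any float value. That round trip is what sits between the call and the return and hides the tail call.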
For the GIMPLE above, PRE fails to optimize the partial redundancy of the
conversion. During PHI translation we do see that _4 is equal to _3 on the
3->5 edge, but PHI translation fails because _3 is not in ANTIC_IN of bb 5.
That is not really relevant here; I think the check is somewhat bogus. With
that fixed we do detect the tailcall:

  <bb 2> [100.00%]:
  if (x_2(D) > 0.0)
    goto <bb 3>; [60.23%]
  else
    goto <bb 4>; [39.77%]

  <bb 3> [60.23%]:
  _3 = f (x_2(D)); [tail call]
  goto <bb 5>; [100.00%]

  <bb 4> [39.77%]:
  _6 = (double) x_2(D);
  iftmp.0_7 = _6 + 1.0e+0;
  _9 = (float) iftmp.0_7;

  <bb 5> [100.00%]:
  # prephitmp_10 = PHI <_3(3), _9(4)>
  return prephitmp_10;

foo:
.LFB2:
        .cfi_startproc
        ucomiss .LC0(%rip), %xmm0
        jbe     .L7
        jmp     f
        .p2align 4,,10
        .p2align 3
.L7:
        cvtss2sd        %xmm0, %xmm0
        addsd   .LC1(%rip), %xmm0
        cvtsd2ss        %xmm0, %xmm0
        ret

Mine.