[Bug tree-optimization/63537] [4.9/5 Regression] Missed optimization: Loop unrolling adds extra copy when returning aggregate
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Oct 15 08:38:00 GMT 2014
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63537
Richard Biener <rguenth at gcc dot gnu.org> changed:
            What|Removed                     |Added
----------------------------------------------------------------------------
        Keywords|                            |missed-optimization
          Status|UNCONFIRMED                 |NEW
Last reconfirmed|                            |2014-10-15
   Known to work|                            |4.7.3
Target Milestone|---                         |4.9.2
         Summary|Missed optimization: Loop   |[4.9/5 Regression] Missed
                |unrolling adds extra copy   |optimization: Loop
                |when returning aggregate    |unrolling adds extra copy
                |                            |when returning aggregate
  Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is because the outer loop is unrolled only after SRA gets a chance to
scalarize away the local aggregate. With GCC 4.7 we unroll the loop during
early unrolling even at -O2.
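For orientation, a loop of the shape that appears in the dumps below might look roughly like this in C. This is only a sketch reconstructed from the GIMPLE statements; it is not the reporter's test case, and the struct name, function name and int element type are assumptions.

```c
/* Hypothetical reconstruction from the dump; all names and the
   element type are assumptions, not the original test case.  */
struct vec3 { int n[3]; };

struct vec3 scale(struct vec3 lhs, int rhs)
{
    struct vec3 ret;                 /* local aggregate SRA could scalarize */
    for (int i = 0; i <= 2; i++)     /* corresponds to "if (i_1 <= 2)" */
        ret.n[i] = lhs.n[i] * rhs;   /* load, multiply, store into ret.n[i] */
    return ret;                      /* corresponds to "<retval> = ret" */
}
```

As long as the loop survives, ret.n[i] is indexed by a variable, so the aggregate cannot be split into scalars and the copy into the return slot remains.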
With 4.9 we conclude:
Estimating sizes for loop 1
BB: 4, after_exit: 0
size: 2 if (i_1 <= 2)
Exit condition will be eliminated in peeled copies.
BB: 3, after_exit: 1
size: 1 _4 = lhs.n[i_1];
size: 1 _6 = _4 * rhs_5(D);
size: 1 ret.n[i_1] = _6;
size: 1 i_8 = i_1 + 1;
Induction variable computation will be folded away.
size: 6-3, last_iteration: 2-0
Loop size: 6
Estimated size after unrolling: 7
Not unrolling loop 1: size would grow.
while 4.7 had:
Estimating sizes for loop 1
BB: 4, after_exit: 0
size: 2 if (i_1 <= 2)
Exit condition will be eliminated.
BB: 3, after_exit: 1
size: 1 D.1593_3 = lhs.n[i_1];
size: 1 D.1594_5 = D.1593_3 * rhs_4(D);
size: 1 ret.n[i_1] = D.1594_5;
size: 1 i_6 = i_1 + 1;
Induction variable computation will be folded away.
size: 6-3, last_iteration: 2-2
Loop size: 6
Estimated size after unrolling: 6
so the difference is in last_iteration handling.
Honza?
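The two estimates above can be reproduced by hand. The sketch below assumes the estimate is n_unroll times the per-iteration size that survives peeling, plus the surviving part of the last iteration, scaled by 2/3 and clamped to at least 1. It also assumes n_unroll is 3, since the loop runs for i = 0..2. The struct and function names are made up for illustration and are not GCC's interface.

```c
/* Sketch of the unrolled-size estimate under the assumptions stated
   above; names are hypothetical, not taken from GCC.  */
struct loop_size_sketch {
    long overall;                    /* "size: 6-3" -> 6 */
    long eliminated_by_peeling;      /* "size: 6-3" -> 3 */
    long last_iteration;             /* "last_iteration: 2-x" -> 2 */
    long last_iteration_eliminated;  /* "last_iteration: 2-x" -> x */
};

long estimated_unrolled_size_sketch(struct loop_size_sketch s, long n_unroll)
{
    /* Statements that remain in each of the n_unroll copies.  */
    long insns = n_unroll * (s.overall - s.eliminated_by_peeling);
    /* Statements of the last iteration that are not eliminated.  */
    insns += s.last_iteration - s.last_iteration_eliminated;
    /* Assumed discount for later cleanup: scale by 2/3.  */
    insns = insns * 2 / 3;
    return insns <= 0 ? 1 : insns;
}
```

With the 4.9 numbers (6-3, 2-0) this gives (3*3 + 2) * 2 / 3 = 7, which exceeds the loop size of 6. With the 4.7 numbers (6-3, 2-2) it gives (3*3 + 0) * 2 / 3 = 6, which does not.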
Otherwise this is an optimization pass ordering issue.
Eventually a simple pass could handle
<retval> = ret;
ret ={v} {CLOBBER};
return <retval>;
and back-propagate <retval> into all stores/loads of ret.
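At the source level, the effect of such a back-propagation could be pictured as follows. This is only an illustration with the return slot modelled as an explicit pointer; a real pass would work on GIMPLE, and every name here is hypothetical.

```c
/* Source-level illustration only; names are hypothetical.  */
struct triple { int n[3]; };

/* Before: fill a local aggregate, then copy it into the return slot.  */
void scale_copy(struct triple *retval, struct triple lhs, int rhs)
{
    struct triple ret;
    for (int i = 0; i <= 2; i++)
        ret.n[i] = lhs.n[i] * rhs;
    *retval = ret;                   /* "<retval> = ret"; ret is dead afterwards */
}

/* After: every store to ret goes directly to the return slot.  */
void scale_direct(struct triple *retval, struct triple lhs, int rhs)
{
    for (int i = 0; i <= 2; i++)
        retval->n[i] = lhs.n[i] * rhs;
}
```

Both functions produce the same result; the second one simply has no local aggregate left to copy.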