[Bug tree-optimization/91975] worse code for small array copy using pointer arithmetic than array indexing
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Fri Oct 4 07:28:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91975
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2019-10-04
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
     Ever confirmed|0                             |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
f1 and g1 are detected as memcpy by loop-distribution. f0 is unrolled
completely by late unrolling:
Loop size: 10
Estimated size after unrolling: 10
while g0 is not:
Loop size: 8
Estimated size after unrolling: 20
so the size estimation doesn't quite "work" here. f0 body before unrolling:
<bb 3> [local count: 954449108]:
# i_14 = PHI <0(2), i_10(4)>
# prephitmp_19 = PHI <0(2), pretmp_18(4)>
# ivtmp_3 = PHI <8(2), ivtmp_13(4)>
_1 = (long unsigned int) i_14;
_2 = _1 * 4;
_4 = &b + _2;
*_4 = prephitmp_19;
i_10 = i_14 + 1;
ivtmp_13 = ivtmp_3 - 1;
if (ivtmp_13 != 0)
goto <bb 4>; [87.50%]
else
goto <bb 5>; [12.50%]
<bb 4> [local count: 835156388]:
_12 = (long unsigned int) i_10;
_11 = _12 * 4;
_16 = &a + _11;
pretmp_18 = MEM[(const int *)_16];
goto <bb 3>; [100.00%]
g0 body:
<bb 3> [local count: 954449108]:
# s_16 = PHI <&a(2), s_7(4)>
# d_17 = PHI <&b(2), d_8(4)>
# i_18 = PHI <0(2), i_10(4)>
# prephitmp_4 = PHI <0(2), pretmp_5(4)>
# ivtmp_3 = PHI <8(2), ivtmp_1(4)>
s_7 = s_16 + 4;
d_8 = d_17 + 4;
*d_17 = prephitmp_4;
i_10 = i_18 + 1;
ivtmp_1 = ivtmp_3 - 1;
if (ivtmp_1 != 0)
goto <bb 4>; [87.50%]
else
goto <bb 5>; [12.50%]
<bb 4> [local count: 835156388]:
pretmp_5 = MEM[(const int *)s_16 + 4B];
goto <bb 3>; [100.00%]
For g0 we do not think that s_7 = s_16 + 4 is going to be optimized "away",
but for f0 we think that _4 = &b + _2 will be. Those are actually the same.
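
For readers without the bug report at hand, here is a rough sketch of source
that could produce loops of these two shapes. It is a guess based only on the
dumps above (eight int elements, arrays a and b, one indexed loop and one
pointer-bumping loop); the function bodies and the nonzero initializer are
assumptions, not the reporter's test case.

```c
/* Hypothetical reconstruction; the real test case is attached to the
   bug report and may differ.  */
#define N 8

/* Nonzero values so the copy is observable.  The dumps suggest the
   original array is a zero-initialized constant (the PHI starts at 0).  */
const int a[N] = { 1, 2, 3, 4, 5, 6, 7, 8 };
int b[N];

/* Indexed form: each address is recomputed from i, which matches
   _2 = _1 * 4; _4 = &b + _2 in the f0 dump.  */
void
f0 (void)
{
  const int *s = a;
  int *d = b;
  for (unsigned i = 0; i != N; ++i)
    d[i] = s[i];
}

/* Pointer-bumping form: the pointers themselves are the induction
   variables, which matches s_7 = s_16 + 4 in the g0 dump.  */
void
g0 (void)
{
  const int *s = a;
  int *d = b;
  for (unsigned i = 0; i != N; ++i)
    *d++ = *s++;
}
```

Both functions perform the same copy, which is why the two address
computations ought to be costed the same way by the unroller.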
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index 5952cad7bba..d38959c3aa2 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -195,9 +195,8 @@ constant_after_peeling (tree op, gimple *stmt, class loop *loop)
/* Induction variables are constants when defined in loop. */
if (loop_containing_stmt (stmt) != loop)
return false;
- tree ev = analyze_scalar_evolution (loop, op);
- if (chrec_contains_undetermined (ev)
- || chrec_contains_symbols (ev))
+ tree ev = instantiate_parameters (loop, analyze_scalar_evolution (loop, op));
+ if (chrec_contains_undetermined (ev))
return false;
return true;
}
fixes this but we still end up with
size: 8-6, last_iteration: 7-6
Loop size: 8
Estimated size after unrolling: 10
Not unrolling loop 1: size would grow.
and we do not unroll because the not-unrolled estimate for g0 (8) is lower than
that for f0 (10): f0 costs &a + i * 4 as 2 while g0 has IV + 4.
I'm testing the above anyway.
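
The numbers in the dump can be reproduced with a small model of the size
estimate. This is a simplified sketch, not the code in tree-ssa-loop-ivcanon.c:
the struct and function names are hypothetical, the 2/3 scaling is how I
understand the estimate to discount later cleanups, and the unroll count of 7
(eight iterations, so the latch runs seven times) is an assumption. Under those
assumptions "size: 8-6, last_iteration: 7-6" comes out as 10, which matches the
dump.

```c
/* Simplified model of the complete-unrolling size check.
   All names here are hypothetical.  */
struct size_estimate
{
  int overall;          /* statements in one iteration */
  int eliminated;       /* of those, expected to fold away after peeling */
  int last_overall;     /* statements in the last (partial) iteration */
  int last_eliminated;  /* of those, expected to fold away */
};

/* Estimated statement count after unrolling n_unroll copies.  */
int
model_unrolled_size (struct size_estimate s, int n_unroll)
{
  int insns = n_unroll * (s.overall - s.eliminated)
	      + (s.last_overall - s.last_eliminated);
  insns = insns * 2 / 3;	/* assume later passes remove about a third */
  return insns > 0 ? insns : 1;
}

/* Without permission to grow code, unroll only if the estimate does
   not exceed the size of the loop as it stands.  */
int
model_unroll_p (int loop_size, int unrolled_size)
{
  return unrolled_size <= loop_size;
}
```

With these definitions f0 (10 versus 10) passes the check, while g0 fails it
both before the patch (8 versus 20) and after it (8 versus 10).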