This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
- From: "matz at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 14 Jan 2013 15:55:51 +0000
- Subject: [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
- Auto-submitted: auto-generated
- References: <bug-53342-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342
--- Comment #7 from Michael Matz <matz at gcc dot gnu.org> 2013-01-14 15:55:51 UTC ---
The patch should lead to wrong code in some places (whenever peeling for
alignment actually does something). The problem is that you calculate base
and step before peeling and cache them. Caching the step is fine, but
caching the base is not, because the peeling specifically changes the
initial value of the accessed pointer. For instance, in the testcase
of pr53185.c we have this loop after peeling:
bb_6 (preds = {bb_8 bb_26 }, succs = {bb_8 bb_23 })
{
  # .MEM_27 = PHI <.MEM_21(8), .MEM_51(26)>
  # e.1_29 = PHI <e.4_22(8), e.1_52(26)>
  _10 = (long unsigned int) e.1_29;
  _11 = _10 * 4;
  _12 = f_5(D) + _11;
  _14 = (int) e.1_29;
  _16 = _14 * pretmp_38;
  _17 = (long unsigned int) _16;
  _18 = _17 * 4;
  _19 = pretmp_35 + _18;
  # VUSE <.MEM_27>
  _20 = *_19;
  # .MEM_21 = VDEF <.MEM_27>
  *_12 = _20;
  e.4_22 = e.1_29 + 1;
  if (e.4_22 < a.5_26)
    goto <bb 8>;
  else
    goto <bb 23>;
}
Note the initial value e.1_52 for e.1_29. But your cached information sets

  iv.base = pretmp_35
  iv.step = (long unsigned int) pretmp_38 * 4

It actually should be

  iv.base = pretmp_35 + 4 * ((long unsigned int) (pretmp_38 * (int) e.1_52))
The casts here are actually the reason simple_iv doesn't work in this
case. This expression would have to be calculated outside the loop
and used as stride_base. I don't see where this could easily be done
from the existing places (like where the peeling loop is generated).