This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
- From: "matz at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 14 Jan 2013 15:55:51 +0000
- Subject: [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
- Auto-submitted: auto-generated
- References: <bug-53342-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342
--- Comment #7 from Michael Matz <matz at gcc dot gnu.org> 2013-01-14 15:55:51 UTC ---
The patch should lead to wrong code in some places (whenever peeling for
alignment actually does something). The problem is that you calculate base
and step before peeling and cache them. Caching the step is fine, but
caching the base is not, because the peeling specifically changes the
initial value of the accessed pointer. For instance, in the testcase
of pr53185.c we have this loop after peeling:
bb_6 (preds = {bb_8 bb_26 }, succs = {bb_8 bb_23 })
{
  # .MEM_27 = PHI <.MEM_21(8), .MEM_51(26)>
  # e.1_29 = PHI <e.4_22(8), e.1_52(26)>
  _10 = (long unsigned int) e.1_29;
  _11 = _10 * 4;
  _12 = f_5(D) + _11;
  _14 = (int) e.1_29;
  _16 = _14 * pretmp_38;
  _17 = (long unsigned int) _16;
  _18 = _17 * 4;
  _19 = pretmp_35 + _18;
  # VUSE <.MEM_27>
  _20 = *_19;
  # .MEM_21 = VDEF <.MEM_27>
  *_12 = _20;
  e.4_22 = e.1_29 + 1;
  if (e.4_22 < a.5_26)
    goto <bb 8>;
  else
    goto <bb 23>;
}
Note the initial value e.1_52 for e.1_29. But your cached information sets

  iv.base = pretmp_35
  iv.step = (long unsigned int) pretmp_38 * 4

It actually should be

  iv.base = pretmp_35 + 4 * ((long unsigned int) (pretmp_38 * (int) e.1_52))
The casts here are actually the reason simple_iv doesn't work in this
case. This expression would have to be calculated outside the loop
and used as stride_base. I don't see where this could easily be done
from the existing places (like where the peeling loop is generated).