This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Make try_unroll_loop_completely to use loop bounds recorded

From: Jan Hubicka <hubicka at ucw dot cz>
To: Richard Biener <rguenther at suse dot de>
Cc: Jan Hubicka <hubicka at ucw dot cz>, "H.J. Lu" <hjl dot tools at gmail dot com>, gcc-patches at gcc dot gnu dot org
Date: Thu, 25 Oct 2012 14:05:49 +0200
Subject: Re: Make try_unroll_loop_completely to use loop bounds recorded
References: <20121011162109.GC32179@kam.mff.cuni.cz> <alpine.LNX.2.00.1210120959051.4063@zhemvz.fhfr.qr> <20121012132430.GA18634@kam.mff.cuni.cz> <alpine.LNX.2.00.1210121522320.4063@zhemvz.fhfr.qr> <20121013205922.GA2254@kam.mff.cuni.cz> <20121014194308.GA19314@kam.mff.cuni.cz> <20121016173226.GA1373@kam.mff.cuni.cz> <CAMe9rOqo2YmG6F-cbtHZ1xwoEa8E_+cFhbdC4uc9hjwXyquMcw@mail.gmail.com> <20121025000957.GA5388@kam.mff.cuni.cz> <alpine.LNX.2.00.1210251232000.4063@zhemvz.fhfr.qr>

> On Thu, 25 Oct 2012, Jan Hubicka wrote:
> 
> > > >         * tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Add edge_to_cancel
> > > >         parameter and use it to estimate code optimized out in the final iteration.
> > > >         (loop_edge_to_cancel): New function.
> > > >         (try_unroll_loop_completely): New IRRED_IVALIDATED parameter;
> > > >         handle unrolling loops with bounds given via max_loop_iteratins;
> > > >         handle unrolling non-inner loops when code size shrinks;
> > > >         tidy dump output; when the last iteration loop still stays
> > > >         as loop in the CFG forcongly redirect the latch to
> > > >         __builtin_unreachable.
> > > >         (canonicalize_loop_induction_variables): Add irred_invlaidated
> > > >         parameter; record niter bound derrived; dump
> > > >         max_loop_iterations bounds; call try_unroll_loop_completely
> > > >         even if no niter bound is given.
> > > >         (canonicalize_induction_variables): Handle irred_invalidated.
> > > >         (tree_unroll_loops_completely): Handle non-innermost loops;
> > > >         handle irred_invalidated.
> > > >         * cfgloop.h (unlop): Declare.
> > > >         * cfgloopmanip.c (unloop): Export.
> > > >         * tree.c (build_common_builtin_nodes): Build BULTIN_UNREACHABLE.
> > > >
> > > 
> > > This caused:
> > > 
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55051
> > Oops, I did not really anticipated this to happen during profiledbootstrap, but I
> > discussed the problem in
> > http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01103.html
> > I suppose we will need to find porper fix for this long lasting issue more
> > urgently now :(
> 
> Please try to construct a testcase I can look at.
Like this:
int a[3];
int b[4];
main()
{
  int i;
  for (i=0;i<4;i++)
    if (b[i]==2)
     a[i]++;
}
here we deirve the bound of the loop by IV variable to be 4, a[i] does not count because
it does not dominate latch edge.  We unroll 4 times and the 4th copy gets out of range access.
(before or after my patch; with my patch we trigger this more often because more often we derive
some bounds)

I was thinking of the following:

 1) make loop->bounds to collect all statements having undefined effect on given iteration, not only
    those dominating latch
 2) make record_estimate to ignore those not dominating latch for loop bound
 3) Once done make walk through the loop determining upper bound by Dijiskra's algorithm
 4) Make complette unroling to patch in __builtin_trap into every peeled copy when given statement
    is known to not be executed.

3) will make us to handle correctly the following:
int a[3];
int b[4];
int c[3];
main()
{
  int i;
  for (i=0;b[i];i++)
    if (b[i]==2)
     a[i]++;
   else
     c[i]++;
}

Here bound is 3 by a[i] and c[i] access but it is not discovered because there
is no dominance relationship.

4) should get rid of those warings and moreover improve codegen since
__builtin_trap is less confusing than impossible path in the CFG.  We will need
to update not only the last iteratin of the loop, but all peeled copies, so we
will need change in duplicate_loop_to_header_edge.

We can also patch in __builtin_unreachable but that will make bugs with out
overflows/ out of bound accesses hard in loops to debug at -O levels (as seen
by do-1.f90 testcase and the new PR with integer overflow), so perhaps we could
do that at -Ofast or only or with -funsafe-loop-optimization or so even if it
is standard conforming.  I think friendly trap showing proper line number info
is better.

What do you think?
Honza

> 
> Richard.

Follow-Ups:
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Richard Biener

References:
- Make try_unroll_loop_completely to use loop bounds recorded
  - From: Jan Hubicka
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Richard Biener
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Jan Hubicka
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Richard Biener
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Jan Hubicka
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Jan Hubicka
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Jan Hubicka
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: H.J. Lu
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Jan Hubicka
- Re: Make try_unroll_loop_completely to use loop bounds recorded
  - From: Richard Biener

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]