This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: unrolling and peeling decisions


> Hello.
> 
> > And found interesting thing that we predict it to iterate 5 times, peel it 1
> > time and don't unroll even with -funroll-all-loops.  This is somewhat creazy,
> > as I would like to see it preconditionoed over iter, unrolled and probably not
> > peeled.
> > 
> > About the number of iterations, without profile, I think we can't do much better.
> > We predict all loops to iterate 5 or 11 times (depending whether they do have
> > multiple exits or not) and that corresponds to the average number of iterations
> > as measured by Andrea's tester.
> 
> Well, the problem is that I (wrongly) compare predicted number of iterations
> with maximum number of unrollings/peelings, not the actual maximum for given
> loop; the fix is easy (in unroll-new.c:732 and 818 replace with npeel/nunroll).
> I can't do it here; if you want to, change it (or I will do it on Tuesday).

I noticed that too.  I didn't understand the logic behind the test so I wanted
to discuss that first.  I will commit the change.
> 
> > Anyway I think deciding to peel 1 time is sick.  The decision is based on number
> > of instructions in the loop.  I think better we should try hether we can peel
> > the expected number of iterations of the loop w/o exceeding the maximal
> > amount of instructions.  If so, then peel.  I am not sure if we can do that without
> > profile feedback, but for feedback, this is definitly the way.
> 
> And this (aproximately) is what it would do, if I haven't made the mistake.

Good that we are in tune. I am not quite sure whehter my vision is optimal tought,
so I am happy to see the agreement :)
> 
> > Without feedback, I guess we need better analysis than we do now and peel only
> > when we can prove fixed amount of iterations to be low or the loop simplifies
> > by peeling as first few iterations are exceptional.  Lets postnote this at the moment.
> > 
> > Also the unrolling heuristics I think needs to be tweaked.  Why it disqualify
> > the loop from unrolling?  I see that expected number of iterations is low, but
> > w/o profile when we are asked for unrolling I guess we should unroll anyway.
> 
> Do as you wish.

I guess if user specify -funroll-all-loops he wants it. I, unfortunately, don't
see any way how we can decide it without any aditional help from user side.
The average number of iterations of loop in program is, surprisingly, quite low
and we can't predict each loop to have above average number of iterations.

But perhaps even this will work, as we will optimize properly the very hot
regiions loosing a bit in not so hot ones.  I remmeber seeing paper about trace
formation that did use "wrong" heuristics to produce better performaing code.
> 
> > Also I would like to see the loop to match simple_loop.  Why do you prohibit
> > the loop to have multiple exits?  Even when it has multiple of them and it has
> > one, that is "well behaved" - postdominated by latch (executed each iteration)
> > and executed maximally once (how can we easilly determine this?), we still
> > can compute maximal number of iterations and use it as base for unrolling and
> > preconditioning.
> 
> Simple_loop is used to check that we are able to calculate number of iterations
> exactly; this loop does not fit into the cathegory. I don't see use for upper
> bound on number of iterations, please explain more precisely.

What old loop unroller does is to test last BB in the loop (this should be
genralized).  It it sees the iteration counter pattern, it does compute number
of iterations ignoring other exits from loop.  This is correct, as this is
upper bound on number of iterations.  Then it does unroll same way as you do,
but keeps other exits at place. This again works, as loop will just terminate
earlier, but we still save some work for extra exit condition and incrementing
of the counter.

We can do basically everything we do with "simple" loops, only we need to take
care we cleanup at those extra exit edges if we do some more advanced tricks,
like induction variable splitting, we don't do at the moment. Both unrolling
and precondtiioning should be just happy as it is.

Honza
> 
> Zdenek


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]