This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

unrolling and peeling decisions


Zdenek,
I've tried how our cfg-branch loop code groks the following mandelrbot set loop:

int maxiter;
typedef double number_t;
static int
mand_calc (number_t cre,
           number_t cim,
           number_t pre,
           number_t pim)
{
  number_t rp = 0, ip = 0;
  unsigned long iter = maxiter;
  number_t zre, zim;
  zre = cre;
  zim = cim;
  while ((iter) && (rp + ip < 4))
    {
      ip = (zim * zim);
      zim = (zim * zre) * 2 + pim;
      rp = (zre * zre);
      zre = rp - ip + pre;
      iter--;

    }
  iter = maxiter - iter;
  return (iter);
}

And found interesting thing that we predict it to iterate 5 times, peel it 1
time and don't unroll even with -funroll-all-loops.  This is somewhat creazy,
as I would like to see it preconditionoed over iter, unrolled and probably not
peeled.

About the number of iterations, without profile, I think we can't do much better.
We predict all loops to iterate 5 or 11 times (depending whether they do have
multiple exits or not) and that corresponds to the average number of iterations
as measured by Andrea's tester.

One possibility can be to try distinguish exit tests that are executed each
iteration. Perhaps these tests do have higher predictability, as tests previously
guearded by other conditional don't need to high predictability to maitain
high amount of iterations of the loop.  I will check.

Anyway I think deciding to peel 1 time is sick.  The decision is based on number
of instructions in the loop.  I think better we should try hether we can peel
the expected number of iterations of the loop w/o exceeding the maximal
amount of instructions.  If so, then peel.  I am not sure if we can do that without
profile feedback, but for feedback, this is definitly the way.
Later we may want to extend profiling to bring histogram of first few iterations
of loop to figure out whether the loop tends to execute very few (usually 1)
or very many iterations, but that impossible to do before Pavel's file format
changes gets to the mainline.

Without feedback, I guess we need better analysis than we do now and peel only
when we can prove fixed amount of iterations to be low or the loop simplifies
by peeling as first few iterations are exceptional.  Lets postnote this at the moment.

Also the unrolling heuristics I think needs to be tweaked.  Why it disqualify
the loop from unrolling?  I see that expected number of iterations is low, but
w/o profile when we are asked for unrolling I guess we should unroll anyway.

Also I would like to see the loop to match simple_loop.  Why do you prohibit
the loop to have multiple exits?  Even when it has multiple of them and it has
one, that is "well behaved" - postdominated by latch (executed each iteration)
and executed maximally once (how can we easilly determine this?), we still
can compute maximal number of iterations and use it as base for unrolling and
preconditioning.

Except for few problems, the code seems to work just perfectly now, once
it decides to do the transfromation :)  Good job!

Honza


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]