This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Proposal for the coarse grain unrolling heuristics and renaming for the enablement of better fine grain Loop transformation.

15.03.2015 17:44, Ajit Kumar Agarwal writes:
> Hello All:
> The below example is taken from the article by Albert Cohen
> "Deep Jam : Conversion of Coarse grain Parallelism to Fine grain vector parallelism"

It is not clear from this article, whether such optimization can be
implemented in practice (i.e. automatically and within reasonable time).

Authors implemented some proof-of-concept, but from their description it
looks like they had to tweak their algorithm in each particular case
(and they hope, that later this could be automated using FDO). Also
nothing definite is said about compilation time.

> I would like to propose the above heuristics for unroll and jam and renaming which enables the loop fusion and the IF-merging 
> to achieve the optimize code.

AFAIK, most of loop transformations in GCC are applicable to rather
simple loops, which, at least have an induction variable. So, loops like

while (q)

for some arbitrary condition q, e.g.

while (*a++ = *b++)

are unlikely to be optimized well. The proposed transformation

> For ( I = 0 ; I < 10; i++)
>   a1 = phi(a1,a2);
>   If ( p1 && p2)
>      a1 = ...;
>      a2 = ...;
>  Else
>    If (p1)
>       a1= ....;
>   Else
>      If (p2)
>        a2 = ...;
>    While (q1 && q2)
>    {
>        ... = a1 + ...;
>        ... = a2 + .....;
>    }
>   While (q1)
>      ....= a1 + ...;
>   While ( q2)
>         ..... = a2 + ..;

involves some sort of unroll-and-jam, nontrival code motion, loop
fusion. And it obviously requires analysis of loop-carried data
dependence, that can deal with nested loops having complex conditions
inside them, which GCC does not yet provide.

GCC includes Graphite framework - it allows to perform rather complex
transformations of loops (perhaps similar to what you described), but it
only works for loops which have canonical induction variables and
involve only affine memory access (i.e. addresses must be represented as
linear combinations of induction variables and constants) - no if's, no
while's with non-trivial exit conditions.

So, I think this idea is, unfortunately, not very realistic in terms of
required efforts.

P.S. I'm new in GCC community, but I hope that in general terms my
description is close to reality. Please correct me, if something is wrong.

    Mikhail Maltsev

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]