[GRAPHITE, PATCH] Loop unroll and jam optimization

Richard Biener richard.guenther@gmail.com
Sat Nov 8 12:04:00 GMT 2014


On Sat, Nov 8, 2014 at 12:32 AM, Mircea Namolaru
<mircea.namolaru@inria.fr> wrote:
> Hello,
>
> This is the patch for unroll and jam optimizations. It is based on the
> code written by Tobias from graphite-optimize-isl.c (the code was
> unreachable till now) extended and enabled it via a new option -floop-unroll-jam.

Why not -floop-unroll-and-jam?

> The patch takes advantage of the new isl based code generator introduced recently
> in GCC (in fact of the possible options for building the AST).
>
> The code generated for this optimization in the case of non-constant loop bounds
> initially looks as below. This is not very useful because the standard GCC
> unrolling don't succeed to unroll the most inner loop.
>
> ISL AST generated by ISL:
> for (int c0 = 0; c0 < HEIGHT; c0 += 4)
>   for (int c1 = 0; c1 < LENGTH - 3; c1 += 1)
>     for (int c2 = c0; c2 <= min(HEIGHT - 1, c0 + 3); c2 += 1)

Hmm, so this iterates at most 4 times, right?  Eventually the body is considered
too large by GCC or it fails to compute an upper bound for the number
of iterations.
Is that (an upper bound for the number of iterations) available readily from ISL
at code-generation time?  If so you can transfer this knowledge to the GCC loop
information.

I'm curious to see a testcase (and a way to generate the above form) to see what
is actually the problem.

Thanks,
Richard.

>       S_4(c2, c1);
>
> Now, the "separating class" option (set for unroll and jam) produces this nice loop
> structure:
> ISL AST generated by ISL:
> for (int c0 = 0; c0 < HEIGHT; c0 += 4)
>   for (int c1 = 0; c1 < LENGTH - 3; c1 += 1)
>     if (HEIGHT >= c0 + 4) {
>       for (int c2 = c0; c2 <= c0 + 3; c2 += 1)
>         S_4(c2, c1);
>     } else
>       for (int c2 = c0; c2 < HEIGHT; c2 += 1)
>         S_4(c2, c1);
>
> The "unroll" option (set for unroll and jam) produces:
> ISL AST generated by ISL:
> for (int c0 = 0; c0 < HEIGHT; c0 += 4)
>   for (int c1 = 0; c1 < LENGTH - 3; c1 += 1)
>     if (HEIGHT >= c0 + 4) {
>       S_4(c0, c1);
>       S_4(c0 + 1, c1);
>       S_4(c0 + 2, c1);
>       S_4(c0 + 3, c1);
>     } else {
>       S_4(c0, c1);
>       if (HEIGHT >= c0 + 2) {
>         S_4(c0 + 1, c1);
>         if (4 * floord(HEIGHT - 3, 4) + 3 == HEIGHT && c0 + 3 == HEIGHT)
>           S_4(HEIGHT - 1, c1);
>       }
>     }
>
> The "separate" option (set by default for all dimensions for the new isl based code generator)
> don't succeed to remove the ifs from the loops and generate two loop structures (this would
> have been highly desirable).
>
> As the stage 1 is going to close soon, quick feedback to this patch is greatly appreciated.
> Many thanks, Mircea Namolaru



More information about the Gcc-patches mailing list