[patch] Ping: loop distribution for single nested loops

Sebastian Pop sebpop@gmail.com
Wed Mar 5 12:02:00 GMT 2008


On 02 Mar 2008 01:19:18 +0100, Andi Kleen <andi@firstfloor.org> wrote:
> "Sebastian Pop" <sebpop@gmail.com> writes:
>
>  > +@item -ftree-loop-distribution
>  > +Perform loop distribution.  This flag can improve cache performance on
>  > +big loop bodies and allow further loop optimizations, like
>  > +parallelization or vectorization, to take place.
>  > +
>
>  Very brief
>
>  > +/* This pass performs loop distribution: for example, the loop
>  > +
>  > +   |DO I = 2, N
>  > +   |    A(I) = B(I) + C
>  > +   |    D(I) = A(I-1)*E
>  > +   |ENDDO
>  > +
>  > +   is transformed to
>  > +
>  > +   |DOALL I = 2, N
>  > +   |   A(I) = B(I) + C
>  > +   |ENDDO
>  > +   |
>  > +   |DOALL I = 2, N
>  > +   |   D(I) = A(I-1)*E
>  > +   |ENDDO
>
>  It would be nice if this example was in the info file as part of the
>  flag description so that normal users can figure out what the
>  flag actually does.
>

I don't like this example: I had to tune the ldist code for
sequential machines, and this particular loop is no longer
distributed.  The reason is that it is better to keep the data for A in
the cache, so it is better to keep both statements in the same loop.

Here is a patch that improves the documentation with an example that
should still be distributed:

Index: invoke.texi
===================================================================
--- invoke.texi	(revision 132834)
+++ invoke.texi	(working copy)
@@ -5932,7 +5932,22 @@ is used for debugging the data dependenc
 @item -ftree-loop-distribution
 Perform loop distribution.  This flag can improve cache performance on
 big loop bodies and allow further loop optimizations, like
-parallelization or vectorization, to take place.
+parallelization or vectorization, to take place.  For example, the loop
+@smallexample
+DO I = 1, N
+  A(I) = B(I) + C
+  D(I) = E(I) * F
+ENDDO
+@end smallexample
+is transformed to
+@smallexample
+DO I = 1, N
+   A(I) = B(I) + C
+ENDDO
+DO I = 1, N
+   D(I) = E(I) * F
+ENDDO
+@end smallexample

 @item -ftree-loop-im
 @opindex ftree-loop-im
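
For readers who think in C rather than Fortran, the documentation
example above can be sketched as follows.  This is only an illustration:
the function names, the size N, and the sample values are made up here
and are not part of the patch, and indices run from 0 instead of 1.
The two statements touch disjoint arrays, so splitting the loop does not
change the result.

```c
#include <stddef.h>

#define N 8  /* illustrative size, not taken from the patch */

/* Original form: both statements share one loop body. */
void fused(double *a, double *d, const double *b, const double *e,
           double c, double f, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c;
        d[i] = e[i] * f;
    }
}

/* Distributed form: one loop per statement, same iteration range. */
void distributed(double *a, double *d, const double *b, const double *e,
                 double c, double f, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c;
    for (size_t i = 0; i < n; i++)
        d[i] = e[i] * f;
}

/* Returns 1 when both forms produce the same A and D on a small input. */
int forms_agree(void)
{
    double b[N], e[N], a1[N], d1[N], a2[N], d2[N];

    for (size_t i = 0; i < N; i++) {
        b[i] = (double)i;
        e[i] = (double)(2 * i + 1);
    }

    fused(a1, d1, b, e, 3.0, 0.5, N);
    distributed(a2, d2, b, e, 3.0, 0.5, N);

    for (size_t i = 0; i < N; i++)
        if (a1[i] != a2[i] || d1[i] != d2[i])
            return 0;
    return 1;
}
```

The distributed function is written out by hand here only to show the
shape of the transformation.  Whether the compiler actually splits a
given loop under -ftree-loop-distribution depends on its own analysis.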



Thanks for reviewing,
Sebastian
-- 
AMD - GNU Tools


