[patch] Ping: loop distribution for single nested loops
Sebastian Pop
sebpop@gmail.com
Wed Mar 5 12:02:00 GMT 2008
On 02 Mar 2008 01:19:18 +0100, Andi Kleen <andi@firstfloor.org> wrote:
> "Sebastian Pop" <sebpop@gmail.com> writes:
>
> > +@item -ftree-loop-distribution
> > +Perform loop distribution. This flag can improve cache performance on
> > +big loop bodies and allow further loop optimizations, like
> > +parallelization or vectorization, to take place.
> > +
>
> Very brief
>
> > +/* This pass performs loop distribution: for example, the loop
> > +
> > + |DO I = 2, N
> > + | A(I) = B(I) + C
> > + | D(I) = A(I-1)*E
> > + |ENDDO
> > +
> > + is transformed to
> > +
> > + |DOALL I = 2, N
> > + | A(I) = B(I) + C
> > + |ENDDO
> > + |
> > + |DOALL I = 2, N
> > + | D(I) = A(I-1)*E
> > + |ENDDO
>
> It would be nice if this example was in the info file as part of the
> flag description so that normal users can figure out what the
> flag actually does.
>
I don't like this example: I had to tune the ldist code for
sequential machines, and this particular loop is no longer
distributed. The reason is that the second statement reads A just after
the first one writes it, so it is better to keep both statements in the
same loop and keep the data for A in the cache.
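When the two statements share no data, splitting the loop gives up no cache reuse. A minimal C sketch of that independent case follows. The function names `fused` and `distributed` are hypothetical, and `restrict` encodes the assumption that the arrays do not overlap. It only illustrates that both forms compute the same values; whether the compiler actually distributes a given loop depends on its own analysis.

```c
#include <stddef.h>

/* Original form: both statements live in one loop body. */
void fused(size_t n, double *restrict a, const double *restrict b, double c,
           double *restrict d, const double *restrict e, double f)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c;
        d[i] = e[i] * f;
    }
}

/* Distributed form: one loop per statement, same iteration space.
   This is valid because neither statement reads what the other
   writes (assumption: the arrays do not alias). */
void distributed(size_t n, double *restrict a, const double *restrict b, double c,
                 double *restrict d, const double *restrict e, double f)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c;
    for (size_t i = 0; i < n; i++)
        d[i] = e[i] * f;
}
```

Each of the two resulting loops touches fewer arrays per iteration, and each is a simpler candidate for later vectorization or parallelization.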
Here is a patch that improves the documentation with an example that
should still be distributed:
Index: invoke.texi
===================================================================
--- invoke.texi (revision 132834)
+++ invoke.texi (working copy)
@@ -5932,7 +5932,22 @@ is used for debugging the data dependenc
@item -ftree-loop-distribution
Perform loop distribution. This flag can improve cache performance on
big loop bodies and allow further loop optimizations, like
-parallelization or vectorization, to take place.
+parallelization or vectorization, to take place. For example, the loop
+@smallexample
+DO I = 1, N
+ A(I) = B(I) + C
+ D(I) = E(I) * F
+ENDDO
+@end smallexample
+is transformed to
+@smallexample
+DO I = 1, N
+ A(I) = B(I) + C
+ENDDO
+DO I = 1, N
+ D(I) = E(I) * F
+ENDDO
+@end smallexample
@item -ftree-loop-im
@opindex ftree-loop-im
Thanks for reviewing,
Sebastian
--
AMD - GNU Tools