[PATCH] Implementing Swing Modulo Scheduling in GCC
Vladimir Makarov
vmakarov@redhat.com
Thu Apr 22 18:39:00 GMT 2004
Mark Mitchell wrote:
> Mostafa Hagog wrote:
>
>> We addressed the comments below. The troubled code (unrolling and
>> renaming) was removed and replaced by a direct dependence computation
>> using df.c. We also added support for loops with unknown bounds.
>>
>> Here is the revised patch relative to mainline. Passed regression
>> and bootstrap on powerpc-apple-darwin7.2.0 target.
>>
> I'm not going to approve the patch as it stands.
> However, I think it looks very good; it's certainly tidy and has
> better documentation than many patches. Furthermore, the algorithm
> looks like a good choice.
>
> Before check-in the patch should be tested on three architectures.
> I'd suggest IA32 GNU/Linux and IA64 GNU/Linux in addition to OS X.
> Also, are you able to post SPEC 2000 numbers with and without the
> patch on these platforms? That would help to demonstrate that the
> patch is doing useful stuff on code that a lot of people believe
> should benefit from these kinds of improvements. Finally, you should
> post compile-time performance with and without the patch. It's
> reasonable for the compile-time performance to get a little worse if
> the SPEC numbers are getting better, but the impact should hopefully
> be minimal.
Sorry, Mark. I had just finished reviewing the new version of the patch
and sent my comments before reading your email.
Software pipelining is quite a specific optimization. I remember that
a professor from NCSU who specialized in insn scheduling told us to stay
away from implementing SP (it is too complicated and expensive an
optimization). I believe it will not give an improvement for SPEC2000,
although it could improve code for small benchmarks like sorting and
matrix multiplication. So I'd expect a small benchmark demonstrating the
improvement. Mostafa and Ayal gave such an example. Software pipelining
is also a very expensive optimization (from the compilation-time point
of view). I would not recommend enabling it by default, even for -O3.
IMHO the current implementation is mainly oriented to RISC
architectures. I'd not expect benefits from using it on x86. But you are
right, it should be checked for regressions on ia32 too.
This implementation is a good start. There are many opportunities to
improve it, and I hope people will start to work on the improvements
once it is on the mainline.
Vlad