[PATCH] Merge the "ISL optimizer" from the graphite branch

Tue Jul 3 12:06:00 GMT 2012

On 07/03/2012 01:56 PM, Richard Guenther wrote:
> On Tue, 3 Jul 2012, Tobias Grosser wrote:
>
>> On 07/03/2012 01:15 PM, Richard Guenther wrote:
>>>
>>> This merges the last bit from the graphite ISL branch - an
>>> integrated optimizer based on ISL.  To quote Tobias:
>>>
>>> "The isl scheduling optimizer implements the scheduling algorithm first
>>> developed in Pluto [1]. Pluto has shown significant speedups and is
>>> nowadays even implemented in the IBM XL-C compiler. The implementation of
>>> this pass is a first draft and was copied largely from Polly. We still
>>> need to adapt the code to the gcc coding style and we need to tune the
>>> isl scheduler. At the moment we get reasonable compile times (at most
>>> 2x-3x slowdown) and first speedups.  We now need to tune the compile time
>>> and start to investigate which optimizations and heuristics need to be
>>> tuned in our reimplementation.
>>>
>>> [1] http://pluto-compiler.sourceforge.net/"
>>>
>>> Micha kindly did the code adaptation to gcc coding style and I renamed
>>> the flag to -floop-nest-optimize (from -floop-optimize-isl).  We
>>> both agree that such an integrated LNO is the way to go, superseding
>>> the individual graphite transforms we have now.  We might even be able
>>> to drop the separate blocking & strip-mining transforms we have
>>> right now in favor of this?
>>
>> Thanks Micha for adapting the style to gcc.
>>
>> I would like to point out that this pass is still very experimental and not
>> tuned at all. Specifically, it was only tested on polybench with one specific
>> set of flags. Even there we did not only get speedups, but due to missing
>> heuristics some benchmarks also got large slowdowns. When using it on even
>> slightly different benchmarks or with slightly different flags, infinite
>> compile time or large performance regressions may show up! This optimizer may
>> obviously also contain bugs that lead to miscompiles.
>>
>> Also, the loop nest optimizer will not be very effective as long as PRE and
>> LICM are scheduled before graphite.
>
> I have noticed the change to disable those on the graphite-isl branch,
> but I fail to see how we cannot handle PREd/LIMd loops from within
> polyhedral optimizations.  In fact even the user may have performed
> PRE or LIM at the source level, thus the point to address this issue
> is surely within graphite/ISL.

You can still handle those loops; however, PRE and LIM will introduce a
lot of additional dependences, which will block transformations. Such
dependences can be removed, e.g. with array expansion or by undoing some
of the PRE/LIM transformations, but this is a difficult problem,
especially if you don't want to increase memory usage too much.
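
As a hypothetical illustration (not taken from the patch or from graphite itself), consider a row-sum loop nest. The sketch below assumes that LIM/store motion keeps the accumulator in a single scalar `s`. Every outer iteration then reuses `s`, which adds output and anti dependences between iterations of `i` and makes the nest imperfect, so loop interchange is no longer straightforward. Expanding `s` into an array `s[N]` removes those dependences at the cost of N extra elements of memory. All function names, sizes and the test data are made up for the example.

```c
#include <assert.h>
#include <string.h>

#define N 4  /* rows, outer loop i (illustrative size) */
#define M 3  /* columns, inner loop j (illustrative size) */

/* Form 1: accumulate directly in memory. Iterations with different i
   touch different elements sum[i], so i and j can be interchanged. */
void row_sums_in_memory(int a[N][M], int sum[N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            sum[i] += a[i][j];
}

/* Form 2: roughly what LIM/store motion might produce (an assumption).
   The single scalar s is written and read in every i iteration, which
   introduces additional dependences between outer iterations. */
void row_sums_scalar(int a[N][M], int sum[N])
{
    int s;
    for (int i = 0; i < N; i++) {
        s = sum[i];
        for (int j = 0; j < M; j++)
            s += a[i][j];
        sum[i] = s;
    }
}

/* Form 3: array expansion of s into s[N]. Each i has its own copy,
   so the j loop can become the outer loop again. */
void row_sums_expanded(int a[N][M], int sum[N])
{
    int s[N];  /* expanded scalar: N extra elements of memory */
    for (int i = 0; i < N; i++)
        s[i] = sum[i];
    for (int j = 0; j < M; j++)        /* interchanged loop order */
        for (int i = 0; i < N; i++)
            s[i] += a[i][j];
    for (int i = 0; i < N; i++)
        sum[i] = s[i];
}

/* Runs all three forms on the same input and checks that they agree. */
void check_variants_agree(void)
{
    int a[N][M];
    int r1[N] = {0}, r2[N] = {0}, r3[N] = {0};

    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = i * M + j;

    row_sums_in_memory(a, r1);
    row_sums_scalar(a, r2);
    row_sums_expanded(a, r3);

    assert(memcmp(r1, r2, sizeof r1) == 0);
    assert(memcmp(r1, r3, sizeof r1) == 0);
}
```

All three forms compute the same result; they differ only in which dependences an optimizer has to respect and in how much temporary memory is used.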

I do not see any benefit from running PRE and LIM before graphite, as at
the very best their transformations will be reverted (if such an
optimization is ever written). However, scheduling PRE and LIM after
graphite makes perfect sense, especially as there are then probably a lot
of new opportunities for them.

Moving such passes behind graphite obviously does not solve the case of
manual PRE and LIM done by the user. However, it will allow us to
optimize code that was not manually PREd a lot better.

Cheers
Tobi