Scheduling an early complete loop unrolling pass?

Dorit Nuzman DORIT@il.ibm.com
Mon Feb 5 19:13:00 GMT 2007


Hi Richard,

Richard Guenther <rguenther@suse.de> wrote on 05/02/2007 17:27:03:

...
>
> ...
>
> and we are later not able to do constant propagation to the
> second loop which we can do if we first unroll such small loops.
>
> As we also only vectorize innermost loops

By the way, we are working on vectorization of outer loops.

> I believe doing a
> complete unrolling pass early will help in general (I pushed
> for this some time ago).
>
> Thoughts?
>

My initial thought was that it's probably not the right thing to do
because:

1. The problems with constant propagation and aliasing really call for
improving/extending these optimizations (and probably also for having the
vectorizer convey more of the information it has), rather than working
around the problem.
2. The vectorizer should be able to decide, based on a cost model, whether
it is profitable to vectorize a loop, and if not - as in the case above -
leave it unvectorized.
3. In the meantime, you can use --param min-vect-loop-bound=2 to disallow
vectorization of this loop. In fact, maybe we should make that the default
(instead of the current default - 0).
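
For illustration only, here is a minimal sketch of the kind of loop that
points 2 and 3 are about. It is not the loop from the quoted mail (that one
is elided above); the names N and scale_rows and the 128-bit vector width
are assumptions made for the example.

```c
/* Hypothetical example; N, scale_rows and its arguments are illustrative. */
enum { N = 3 };  /* inner trip count, known at compile time */

/* out and in hold rows * N floats each, stored row by row. */
void scale_rows(float *out, const float *in, const float *w, int rows)
{
    for (int r = 0; r < rows; r++) {
        /* Only N = 3 iterations: fewer than the 4 floats that fit in a
           128-bit vector, so vectorizing this loop is unlikely to pay off. */
        for (int i = 0; i < N; i++)
            out[r * N + i] = in[r * N + i] * w[i];
    }
}
```

A cost model would ideally reject the inner loop on its own. Until then,
building with -ftree-vectorize plus the --param min-vect-loop-bound=2
setting from point 3 is meant to have the same effect for loops this short.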

However...,

I have seen cases in which complete unrolling before vectorization enabled
constant propagation, which in turn enabled significant simplification of
the code. This turned a previously unvectorizable loop (unvectorizable at
least on some targets, due to the presence of divisions, which are
unsupported in the vector unit) into a loop that can be vectorized (because
the divisions were replaced with constants).
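
One way such a situation can arise is sketched below. This is a
hypothetical example, not the loop referred to above; TAPS, SCALE and both
function names are made up for the illustration. The division depends only
on the inner induction variable, so once the inner loop is completely
unrolled, every division folds to a constant.

```c
/* Hypothetical example; all names and constants are illustrative. */
enum { TAPS = 4, SCALE = 120 };

/* in must hold at least n + TAPS - 1 elements. */
void weighted_sum(int *out, const int *in, int n)
{
    for (int i = 0; i < n; i++) {
        int acc = 0;
        /* The divisor changes with k, so the division stays in the loop. */
        for (int k = 0; k < TAPS; k++)
            acc += in[i + k] * (SCALE / (k + 1));
        out[i] = acc;
    }
}

/* What the code looks like after complete unrolling of the k loop and
   constant propagation, written out by hand: SCALE / 1, SCALE / 2,
   SCALE / 3 and SCALE / 4 have become 120, 60, 40 and 30. */
void weighted_sum_unrolled(int *out, const int *in, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = in[i] * 120 + in[i + 1] * 60
               + in[i + 2] * 40 + in[i + 3] * 30;
}
```

The second form contains only loads, multiplications by constants and
additions in a single loop over i, which is a much easier candidate for a
vectorizer on a target without vector division.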

Also, given that we are working on "SLP" kind of technology (straight line
code vectorization), which would make vectorization less sensitive to
unrolling, I think maybe it's not such a bad idea after all... One option
is to increase the default value of --param min-vect-loop-bound for now,
and when SLP is incorporated, go ahead and schedule early complete
unrolling. However, since SLP implementation may take some time (hopefully
within the time frame of 4.3 though) - we could just go ahead and schedule
early complete unrolling right now. (I can't believe I'm in favor of this
idea, but that loop I was talking about before improved by a factor of over
20x when early complete unrolling + subsequent vectorization were
applied...)
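
As a rough illustration of why straight-line code vectorization makes the
order of the two passes matter less, consider what a completely unrolled
4-iteration loop leaves behind. The function add4 below is a hypothetical
example, and the sketch shows only the general idea of SLP, not the planned
GCC implementation.

```c
/* Hypothetical example: the body left after completely unrolling
   for (i = 0; i < 4; i++) a[i] = b[i] + c[i]; */
void add4(float *restrict a, const float *restrict b,
          const float *restrict c)
{
    /* Four isomorphic statements on adjacent memory locations. There is
       no loop left for a loop vectorizer, but an SLP-style pass could
       pack these into one vector load pair, one add and one store. */
    a[0] = b[0] + c[0];
    a[1] = b[1] + c[1];
    a[2] = b[2] + c[2];
    a[3] = b[3] + c[3];
}
```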

dorit


> Thanks,
> Richard.
>
> --
> Richard Guenther <rguenther@suse.de>
> Novell / SUSE Labs


