This is the mail archive of the
mailing list for the GCC project.
Re: auto vectorization in gcc
- From: "Dorit Naishlos" <DORIT at il dot ibm dot com>
- To: law at redhat dot com
- Cc: David Edelsohn <dje at watson dot ibm dot com>, Diego Novillo <dnovillo at redhat dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Mon, 21 Jul 2003 16:19:01 +0300
- Subject: Re: auto vectorization in gcc
> I don't see the value in doing the data dependency analysis at the tree
> level, then trying to propagate that down into the RTL optimizers when
> we can take advantage of that information and do the vectorization directly
> at the tree level.
That depends on how much machine dependent information you are willing
to expose to the tree level. If the tree level will have access to all
the vital target information, then you are right - the value is arguable.
However, if that's not the case, then by deferring the actual vectorization
to a later phase, in which all the required information is available, you
avoid (or at least minimize) a lot of undoing of transformations, which may
not be trivial at all (reverse unrolling, alignment checks, if
We should try to evaluate what makes more sense (or is less painful...) -
pass information from the tree level to the RTL level (+ maintain it until
the vectorization pass, and do the transformations at the RTL level),
or try to undo transformations that were applied at the tree level, when we
discover that the target can't support it efficiently.
By the way, doesn't the tree level propagate information into other
17/07/2003 22:42
Please respond to law
To: Diego Novillo <firstname.lastname@example.org>
cc: Dorit Naishlos/Haifa/IBM@IBMIL, "email@example.com" <firstname.lastname@example.org>, David Edelsohn <email@example.com>
Subject: Re: auto vectorization in gcc
In message <firstname.lastname@example.org>, Diego
>On Thu, 2003-07-17 at 08:45, Dorit Naishlos wrote:
>> We are planning to implement auto-vectorization targeting vector
>> extensions such as AltiVec.
>> (a) the tree-SSA branch is preparing the infrastructure for
>> (b) the task of auto-vectorization will probably be divided between
>> the RTL level and the tree level
>> (c) since the tree-level is machine independent, actual vectorization
>> will take place in the RTL level
>> (d) the tree-SSA branch will perform data-dependence analysis and
>> loop transformations to increase vectorizability of loops,
>> and somehow pass on the information to the RTL level.
>> How far is this from what has really been envisioned?
>That's pretty much the idea we had in mind.
More correctly, I had envisioned having the vectorization happen at the
tree level rather than at the RTL level.
I don't see the value in doing the data dependency analysis at the tree
level, then trying to propagate that down into the RTL optimizers when
we can take advantage of that information and do the vectorization directly
at the tree level.
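
As an illustration of what the data dependency analysis has to decide, here
is a minimal sketch in plain C. The function names are hypothetical and are
not taken from any GCC branch. The first loop has independent iterations
(assuming dst and src do not overlap), so it is a vectorization candidate;
the second carries a dependence from one iteration to the next and cannot be
vectorized as written.

```c
#include <assert.h>
#include <stddef.h>

/* Independent iterations: dst[i] depends only on src[i].
   Assumes dst and src do not overlap (hence restrict). */
void scale_by_two(int *restrict dst, const int *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = 2 * src[i];
}

/* Loop-carried dependence: a[i] needs the a[i - 1] computed by the
   previous iteration, so the iterations cannot simply run side by side. */
void prefix_sum(int *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```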
>> Is it indeed the case that no target specific information will be
>> available at the tree level, deferring transformations that rely on
>> such information to the RTL level?
>No. I think we will need to have some target information at the tree
>level. But it's not clear how much or how detailed, yet.
At some point we'll probably need information about the memory
subsystem. But for basic vectorization that information isn't necessary.
My thought was to generate vectors as wide as possible, then let the
tree->rtl conversion phase break those vectors down into whatever the
target actually supports. We've already proven the ops are independent,
so breaking them down into smaller vectors (or even scalars) is easy.
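
A rough sketch of that breakdown, in plain C rather than in any real GCC
representation: one conceptual "wide" add over n elements is split into
chunks of target_width lanes, with a scalar tail for the remainder. The
name add_lowered and the target_width parameter are made up for
illustration; each inner loop stands in for one native vector operation.

```c
#include <assert.h>
#include <stddef.h>

/* Lower one conceptual n-lane add into target_width-lane pieces.
   Valid only because the element operations are already known to be
   independent of each other. */
void add_lowered(int *dst, const int *a, const int *b,
                 size_t n, size_t target_width)
{
    assert(target_width > 0);
    size_t i = 0;

    /* Full chunks: one pass of the inner loop models one vector op. */
    for (; i + target_width <= n; i += target_width)
        for (size_t lane = 0; lane < target_width; lane++)
            dst[i + lane] = a[i + lane] + b[i + lane];

    /* Leftover elements fall back to scalar operations. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

With target_width set to 1 this degenerates into the fully scalar case, which
is the "or even scalars" end of the spectrum.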
>> like "subtract and saturate" if they are supported, and unrolling
>> or blocking the iterations according to the vector length, etc)?
>Something along those lines, yes.
I haven't thought much about saturation in years :-) I thought when I
left the embedded world behind saturation wouldn't be a big issue :-)
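
For reference, this is roughly the scalar pattern a vectorizer would have to
recognize as "subtract and saturate". It is only a sketch for unsigned 8-bit
data, with hypothetical function names; a target that has a saturating
vector subtract could map the loop body to that instruction, while others
would keep the compare and select.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned saturating subtract: clamp at 0 instead of wrapping around. */
static uint8_t sub_sat_u8(uint8_t x, uint8_t y)
{
    return x > y ? (uint8_t)(x - y) : 0;
}

/* Elementwise saturating subtract over n bytes; iterations are
   independent as long as dst does not overlap a or b. */
void sub_sat_array(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = sub_sat_u8(a[i], b[i]);
}
```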
>> We would like to start making progress on auto-vectorization
>> right away, and we're trying to figure out what's the best way to do
>> that. On one hand, we don't want to duplicate work, and we want to
>> take a path that could take advantage of the infrastructure that is
>> being developed (tree-SSA branch, rtlopt branch), (and hopefully
>> would be merged into the main trunk in the future).
>I don't think there's much risk in duplicating effort at the moment.
>I'm not aware of anybody working on vectorization today.
Agreed. Hell, we really don't have anyone even working on the loop
optimizer for trees yet. Given that we're going to need an unroller
and basic loop opts, that might be a great place to start.
>Since you want to start from the bottom up, I think the rtlopt branch
>may be a better starting point. At some point both tree-ssa and rtlopt
>should merge, but I don't really know what the status of rtlopt is.
I disagree. I'd prefer to see the work happen on the tree-ssa branch --
SLP needs basic loop opts and those should be happening on the tree
structures. Thus using the rtlopt branch seems rather inappropriate.