This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: auto vectorization in gcc

From: "Dorit Naishlos" <DORIT at il dot ibm dot com>
To: law at redhat dot com
Cc: David Edelsohn <dje at watson dot ibm dot com>, Diego Novillo <dnovillo at redhat dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
Date: Mon, 21 Jul 2003 16:19:01 +0300
Subject: Re: auto vectorization in gcc

> I don't see the value in doing the data dependency analysis at the tree
> level, then trying to propagate that down into the RTL optimizers when
> we can take advantage of that information and do the vectorization
> directly at the tree level.

That depends on how much machine-dependent information you are willing
to expose to the tree level. If the tree level has access to all the
vital target information, then you are right - the value is arguable.

However, if that's not the case, then by deferring the actual transformation
to a later phase, in which all the required information is available, you can
avoid (or at least minimize) a lot of undoing of transformations, which may
not be trivial at all (reverse unrolling, alignment checks, if-conversion...).

We should try to evaluate which makes more sense (or is less painful...):
passing information from the tree level to the RTL level (maintaining it
until the vectorization pass, and doing the transformations at the RTL
level), or trying to undo transformations that were applied at the tree
level when we discover that the target can't support them efficiently.

By the way, doesn't the tree level propagate information into other
RTL passes?

thanks,
dorit



                                                                                                                                       
From:              law@redhat.com
Date:              17/07/2003 22:42
To:                Diego Novillo <dnovillo@redhat.com>
cc:                Dorit Naishlos/Haifa/IBM@IBMIL, "gcc@gcc.gnu.org" <gcc@gcc.gnu.org>,
                   David Edelsohn <dje@watson.ibm.com>
Please respond to: law
Subject:           Re: auto vectorization in gcc




In message <1058450952.3307.60.camel@frodo.toronto.redhat.com>, Diego
Novillo writes:
 >On Thu, 2003-07-17 at 08:45, Dorit Naishlos wrote:
 >
 >> We are planning to implement auto-vectorization targeting vector
 >> extensions such as AltiVec.
 >>
 >Cool!
 >
 >> (a) the tree-SSA branch is preparing the infrastructure for
 >>     auto-vectorization
 >> (b) the task of auto-vectorization will probably be divided between
 >>     the RTL level and the tree level
 >> (c) since the tree-level is machine independent, actual vectorization
 >>     will take place in the RTL level
 >> (d) the tree-SSA branch will perform data-dependence analysis and
 >>     loop transformations to increase vectorizability of loops,
 >>     and somehow pass on the information to the RTL level.
 >>
 >> How far is this from what has really been envisioned?
 >>
 >That's pretty much the idea we had in mind.
More correctly, I had envisioned having the vectorization happen at the
tree level rather than at the RTL level.

I don't see the value in doing the data dependency analysis at the tree
level, then trying to propagate that down into the RTL optimizers when
we can take advantage of that information and do the vectorization directly
at the tree level.

 >> Is it indeed the case that no target specific information will be
 >> available at the tree level, deferring transformations that rely on
 >> such information to the RTL level?
 >>
 >No.  I think we will need to have some target information at the tree
 >level.  But it's not clear how much or how detailed, yet.
At some point we'll probably need information about the memory
subsystem.  But for basic vectorization that information isn't necessary.

My thought was to generate vectors as wide as possible, then let the
tree->rtl conversion phase break those vectors down into whatever the
target actually supports.  We've already proved the ops are independent,
so breaking them down into smaller vectors (or even scalars) is easy.
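
As an illustration of the point above (not GCC code): once the lanes of a
wide vector operation are known to be independent, the operation can be split
into narrower chunks, or into scalars, without changing the result. A minimal
sketch in C, where add_lanes_lowered and target_width are hypothetical names
introduced only for this example, and the inner loop merely stands in for one
native vector instruction:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: element-wise add over `lanes` independent
   elements, processed in chunks of `target_width` lanes (standing in for
   whatever vector width the target supports), with a scalar tail for the
   leftover elements. */
static void add_lanes_lowered(int *dst, const int *a, const int *b,
                              size_t lanes, size_t target_width)
{
    size_t i = 0;

    assert(target_width > 0);

    /* Full chunks: each inner loop models one native vector operation. */
    for (; i + target_width <= lanes; i += target_width) {
        for (size_t j = 0; j < target_width; j++)
            dst[i + j] = a[i + j] + b[i + j];
    }

    /* Remaining elements that do not fill a whole chunk: scalar ops. */
    for (; i < lanes; i++)
        dst[i] = a[i] + b[i];
}
```

Calling this with target_width equal to lanes, to a smaller width, or to 1
produces the same dst contents, which is the property that would make such a
breakdown straightforward during tree->rtl conversion.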

 >> like "subtract and saturate" if they are supported, and unrolling
 >> or blocking the iterations according to the vector length, etc)?
 >>
 >Something along those lines, yes.
I haven't thought much about saturation in years :-)  I thought when I
left the embedded world behind saturation wouldn't be a big issue :-)
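
For readers unfamiliar with the idiom mentioned in the quoted text, the
following is a sketch of a scalar loop that a vectorizer might recognize as
"subtract and saturate" on unsigned bytes. The function name sub_sat_u8 is
hypothetical and chosen only for this example; how (or whether) GCC would
detect such a pattern is not something this thread settles.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: unsigned saturating subtraction. Instead of
   wrapping around when b[i] > a[i], the result is clamped to 0. */
static void sub_sat_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                       size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (a[i] > b[i]) ? (uint8_t)(a[i] - b[i]) : 0;
}
```

On a target with a saturating vector subtract, each group of iterations could
in principle map to a single instruction; on other targets the compare and
select would have to be kept.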


 >> We would like to start making progress on auto-vectorization
 >> right away, and we're trying to figure out what's the best way to do
 >> that. On one hand, we don't want to duplicate work, and we want to
 >> take a path that could take advantage of the infrastructure that is
 >> being developed (tree-SSA branch, rtlopt branch), (and hopefully
 >> would be merged into the main trunk in the future).
 >>
 >I don't think there's much risk in duplicating effort at the moment.
 >I'm not aware of anybody working on vectorization today.
Agreed.  Hell, we really don't have anyone even working on the loop
optimizer for trees yet.  Given that we're going to need an unroller
and basic loop opts, that might be a great place to start.

 >>
 >Since you want to start from the bottom up, I think the rtlopt branch
 >may be a better starting point.  At some point both tree-ssa and rtlopt
 >should merge, but I don't really know what the status of rtlopt is.
 >Jan?  Zdenek?
I disagree.  I'd prefer to see the work happen on the tree-ssa branch --
SLP needs basic loop opts and those should be happening on the tree
structures.  Thus using the rtlopt branch seems rather inappropriate.


Jeff
