This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: auto vectorization in gcc


>In short, if there is target information tree-ssa level optimizations need
>to know in order to perform optimizations effectively, this information
>should be provided to them.

The best scenario is if the tree-level has all the information
it needs, in which case we can do the entire vectorization there.
My concern was that there had been a design decision to keep the tree level
with minimal dependencies on target information. If this is not the case,
I think we can all agree that vectorization will take place in the tree-ssa
branch, and we can move on to the next step.

>Doing auto-vectorization at the RTL level is a bad idea.
>No compiler I know of attempts to perform these high-level optimizations
>at such a low level, through passing information.

By the way, there are a couple of backend compilers that do
auto-vectorization on a low-level IR:
- Vizer:
  http://www.cs.rice.edu/~anshuman/LacsiPaper.pdf
- Chameleon:
  http://www.research.ibm.com/elite/Compiler/compiler.html

Indeed, these compilers do not pass information from a front end, because
they do not have one (they compile a "disassembled" object file), so
they need to recover all the information themselves. We've implemented
vectorization in Chameleon's low-level IR, and, yes, you are right,
it can get a bit messy...

So having been there, done that, we support a decision to vectorize in
tree-ssa :-)

dorit




                                                                                                                                   
From:     Daniel Berlin <dberlin@dberlin.org>
To:       Dorit Naishlos/Haifa/IBM@IBMIL
cc:       Richard Henderson <rth@redhat.com>, dje@watson.ibm.com,
          dnovillo@redhat.com, aldyh@redhat.com, law@redhat.com,
          joern.rennecke@superh.com, gcc-mail@the-meissners.org,
          gcc@gcc.gnu.org
Date:     21/07/2003 19:45
Subject:  Re: auto vectorization in gcc






On Mon, 21 Jul 2003, Dorit Naishlos wrote:

>
> Thanks very much for the responsiveness!
>
> > The tree level is *more* capable than the rtl level at representing
> > vector types (and thus operations).  I think all we need is some
> > small amount of info from the target about vector widths and memory
> > blocking, and then the transformation should happen at the tree level.
>
> I wonder if the target info that you suggest to expose to the tree level
> would suffice. In many cases code sequences that are perfectly
> parallelizable with respect to data dependences, will not benefit from
> vectorization. In order to avoid making really poor decisions, you want
> to have at least the following information exposed:

The tree-ssa level should be looked at as more of a mid-level than a high
level, since it is a lowered form of high-level, language-specific trees.
Other compilers perform these optimizations on a form quite like tree-ssa,
*not* on a low-level, RTL-like form.  Thus, if there is target-specific
information necessary to perform optimizations in a "good" manner, this
information should be made available.
The answer is not to try to pass around more information to a lower
level, which necessarily loses information, but instead to provide the
necessary target-specific information to the higher level, which can use
it *without* any loss of information.
In short, if there is target information tree-ssa level optimizations need
to know in order to perform optimizations effectively, this information
should be provided to them.
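
As a rough illustration of what "providing target information" to a
mid-level pass could look like, here is a minimal sketch in C. It is not
GCC code: the struct target_vec_info, its fields, and choose_vf are all
hypothetical names made up for this example. The idea is only that the
target describes its vector width and alignment abilities, and the
mid-level pass uses that description to pick a vectorization factor or
to leave the loop scalar.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target description (an assumption, not a GCC interface). */
struct target_vec_info {
    size_t vector_bytes;      /* vector register width in bytes, 0 if none */
    int    misaligned_loads;  /* nonzero if unaligned vector loads work */
};

/* Return the number of elements processed per vector operation
   (the vectorization factor), or 1 if the loop should stay scalar. */
static size_t choose_vf(const struct target_vec_info *t, size_t elem_bytes,
                        size_t trip_count, int known_aligned)
{
    assert(elem_bytes > 0);

    /* No vector unit, or fewer than two elements fit in one vector. */
    if (t->vector_bytes < 2 * elem_bytes)
        return 1;

    /* Data may be misaligned and the target cannot load it directly. */
    if (!known_aligned && !t->misaligned_loads)
        return 1;

    size_t vf = t->vector_bytes / elem_bytes;

    /* Too few iterations to fill even one vector. */
    if (trip_count < vf)
        return 1;

    return vf;
}
```

With a 16-byte vector unit and 4-byte elements, an aligned loop of 100
iterations gets a factor of 4, while the same loop with unknown alignment
stays scalar on a target that lacks misaligned loads. A real cost model
would of course need much more than these two fields.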


To give another example, we can't perform cache blocking optimizations
effectively without knowing something of the target's cache structure.
This does not mean we should either perform these optimizations on RTL
(which is very nearly certain death), or attempt to pass the necessary
information about which loops should be cache blocked to the RTL level,
which can't make good use of it anyway. The proper solution is to provide
the information about the cache structure of the target to the higher
level.
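
To make the cache blocking example concrete, here is a small sketch in C
of a blocked matrix multiply whose tile size is derived from a target
cache description. The struct target_cache_info, choose_tile, and
matmul_blocked are hypothetical names for illustration only, and the
"three tiles must fit in L1" rule is a simplifying assumption, not a
tuned heuristic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache description supplied by the target (an assumption). */
struct target_cache_info {
    size_t l1_bytes;   /* size of the first-level data cache in bytes */
};

/* Largest tile edge T such that three T x T tiles of doubles
   (one each from A, B and C) fit in the L1 cache; at least 1. */
static size_t choose_tile(const struct target_cache_info *c)
{
    size_t t = 1;
    while (3 * (t + 1) * (t + 1) * sizeof(double) <= c->l1_bytes)
        t++;
    return t;
}

/* C += A * B for n x n row-major matrices, processed tile by tile. */
static void matmul_blocked(size_t n, const double *a, const double *b,
                           double *c, size_t tile)
{
    assert(tile > 0);
    for (size_t ii = 0; ii < n; ii += tile)
        for (size_t kk = 0; kk < n; kk += tile)
            for (size_t jj = 0; jj < n; jj += tile) {
                /* Clamp tile bounds at the matrix edge. */
                size_t i_end = ii + tile < n ? ii + tile : n;
                size_t k_end = kk + tile < n ? kk + tile : n;
                size_t j_end = jj + tile < n ? jj + tile : n;
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

The loop nest itself is target independent; only the tile size comes from
the target, which is the kind of information the higher level needs and
the RTL level cannot use well.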


Doing auto-vectorization at the RTL level is a bad idea.
No compiler I know of attempts to perform these high-level optimizations
at such a low level, through passing information.
It's a recipe for a mess.

--Dan



