This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: auto vectorization in gcc


>I suspect if you look hard at trying to do even the simple vectorization
>algorithms such as SLP on RTL you're going to find that it's exceedingly
>difficult.  You've got to deal with the alignment issues, memory dependency
>analysis, etc.  I looked pretty hard at this a few years back.

That's exactly why I want the tree level to provide me with
all that information. Apply all the array, alignment and
dependence analyses at the tree level, and then all you need
to do at the RTL level is unroll and pack loops/statements
that have been marked as vectorizable.
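
As a rough illustration of what "unroll and pack" means for a simple loop, here is a minimal source-level sketch. It assumes GCC's vector_size extension; the names add_scalar and add_packed are hypothetical, and it does not show tree or RTL internals. It also assumes the trip count is a multiple of four and that dst does not overlap a or b, which is the kind of fact the dependence analysis would have to establish first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC vector extension: four packed floats (16 bytes). */
typedef float v4sf __attribute__((vector_size(16)));

/* The original scalar loop: one addition per iteration. */
void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* The same loop unrolled by four, with the four scalar additions
   packed into one vector addition.  Assumption: n is a multiple of 4
   and dst does not overlap a or b. */
void add_packed(float *dst, const float *a, const float *b, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        v4sf va, vb, vr;
        /* memcpy makes no alignment assumption about a, b or dst. */
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        vr = va + vb;                 /* one packed add replaces four scalar adds */
        memcpy(dst + i, &vr, sizeof vr);
    }
}
```

Both functions should produce identical results on the same inputs. The memcpy calls stand in for unaligned vector loads and stores; with known alignment they could become plain aligned accesses.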

Based on your experience from looking into this issue, and
assuming that it is feasible to develop a mechanism to
propagate and maintain this information (is it...?),
are there other problems that you foresee with respect to
implementing the remaining transformations (unroll and pack)
at the RTL level?

thanks,
dorit



                                                                                                                                    
From:     law@redhat.com
Date:     17/07/2003 21:25
To:       David Edelsohn <dje@watson.ibm.com>
cc:       Daniel Berlin <dberlin@dberlin.org>, Richard Henderson <rth@redhat.com>,
          Dorit Naishlos/Haifa/IBM@IBMIL, gcc@gcc.gnu.org,
          Diego Novillo <dnovillo@redhat.com>
Subject:  Re: auto vectorization in gcc
Please respond to law
                                                                                                                                    




In message <200307171815.OAA28902@makai.watson.ibm.com>, David Edelsohn writes:
 >>>>>> Daniel Berlin writes:
 >
 >Dan> Especially since it's a matter of just making tree nodes with a vector
 >Dan> type, and performing standard tree ops on them.
 >
 >Dan> IE PLUS_EXPR of two trees with types of V4SF_type_node should work just
 >Dan> fine, and convert to the right vector RTL.
 >
 >Dan> If it doesn't, we should make it work.
 >
 >           I believe part of the concern is alignment issues.  If the code
 >was not originally written with vector types, we cannot guarantee that
 >objects will have the appropriate alignment for SIMD instructions.
But if you're working at the tree level, then you have the opportunity
to fix the alignment of some objects.  It's one of the many advantages
of working at the tree level.

 >We may not know this until the storage layout occurs during RTL generation.
You've got a lot more control over alignment of objects at the tree
level than you do at the RTL level -- largely because you haven't allocated
stack space for the objects when working at the tree level (say for local
arrays).  Thus you get the chance to increase the alignment *before*
you put the object onto the stack.
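
A minimal source-level sketch of the same idea, assuming GCC's aligned attribute: the alignment of a local array is raised in its declaration, before any stack slot exists for it. The function names here are hypothetical, and this only mirrors at the C level what a tree-level pass would do internally.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if p is aligned to a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* A local array whose alignment is raised to 16 bytes in the
   declaration, so the compiler lays it out on the stack accordingly. */
int local_array_is_aligned(void)
{
    float buf[8] __attribute__((aligned(16)));
    buf[0] = 1.0f;              /* touch the array so it is really used */
    return is_aligned16(buf);
}
```

Without the attribute, a float array is only guaranteed the alignment of float, which is typically too weak for aligned SIMD loads and stores.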

 >Once
 >we have committed to auto-vectorization in the trees and possibly skipped
 >some scalar loop optimizations, how do we roll back once we determine that
 >the SIMD instructions cannot be generated efficiently?
You can always break a SIMD instruction down to its components.  In the
extreme case you could re-roll the loop if the loop unrolling is
also unprofitable.
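
To make "break a SIMD instruction down to its components" concrete, here is a small sketch, again assuming GCC's vector_size extension and its element subscripting on vector types. The function names are hypothetical. One function performs a single packed addition; the other performs the same operation as four scalar additions.

```c
#include <assert.h>

/* GCC vector extension: four packed floats (16 bytes). */
typedef float v4sf __attribute__((vector_size(16)));

/* One packed addition on a four-float vector. */
v4sf add_v4sf(v4sf a, v4sf b)
{
    return a + b;
}

/* The same operation broken down into its four scalar components. */
v4sf add_v4sf_lowered(v4sf a, v4sf b)
{
    v4sf r;
    for (int i = 0; i < 4; i++)
        r[i] = a[i] + b[i];     /* element-wise scalar add */
    return r;
}
```

The two functions compute the same result, so a vector operation that turns out to be unprofitable on the target can be replaced by the component-wise form without changing the program's meaning.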


 >           The issue isn't that we *have to* do auto-vectorization in RTL,
 >but how do we deal with reality intervening at later compilation stages?
No, quite the opposite.  It's bloody hard to do in RTL and a hell of a
lot easier to do on trees.

I suspect if you look hard at trying to do even the simple vectorization
algorithms such as SLP on RTL you're going to find that it's exceedingly
difficult.  You've got to deal with the alignment issues, memory dependency
analysis, etc.  I looked pretty hard at this a few years back.



jeff




