This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [patch, vectorizer] Fix PR tree-optimization/37482


On Tue, Sep 16, 2008 at 5:13 AM, Ira Rosen <IRAR@il.ibm.com> wrote:
>
> Hi,
>
> When a group of loads is vectorized together (like in vectorization of
> strided memory accesses or loop-aware SLP), the vector loads must be
> inserted before the first (scalar) load of the group. In SLP, loads of the
> same SLP instance can be distributed between several SLP nodes, and the
> nodes can be scheduled in an order different from the original statement
> order in the loop, causing uses to appear before definitions as in PR
> 37482.
>
> The attached patch solves this by adding a new function
> vect_find_first_load_in_slp_instance, and inserting the vectorized loads
> before the first load of the SLP instance.
>
> Bootstrapped with vectorization enabled and tested on ppc-linux. O.K. for
> mainline?
>

The algorithm you are using may walk the entire program for each SLP
instance.
1. You can cut the search space immensely by taking the
nearest_common_dominator of bb_for_stmt for each stmt and starting
there, following successors.
2. You can eliminate this search entirely by doing your analysis walk
in dominator order and assigning uids to statements in that order
(which we do in other passes), then comparing the uids later on. (What
you consider the "first" load may depend on the order in which you
process siblings in the dominator tree, but that is no more or less
arbitrary than your current algorithm.)
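
To make the two suggestions concrete, here is a rough sketch in C. It
is not GCC code: toy_stmt, toy_bb and all toy_* functions are
hypothetical stand-ins for the real statement, basic-block and
dominator structures, and they assume the dominator tree has already
been computed. The first function finds a common starting point by
walking immediate dominators; the other two number statements in a
preorder walk of the dominator tree and then pick the "first" load of
a group by comparing uids.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT GCC's real data structures. */
struct toy_stmt {
    unsigned uid;                /* filled in by toy_assign_uids */
    struct toy_stmt *next;       /* next statement in the same block */
};

struct toy_bb {
    struct toy_stmt *stmts;      /* statements in block order */
    struct toy_bb *idom;         /* immediate dominator, NULL for the root */
    unsigned dom_depth;          /* depth in the dominator tree, root is 0 */
    struct toy_bb *dom_child;    /* first child in the dominator tree */
    struct toy_bb *dom_sibling;  /* next sibling in the dominator tree */
};

/* Suggestion 1: find the nearest block that dominates both A and B by
   repeatedly moving the deeper block up to its immediate dominator.
   Both blocks must belong to the same dominator tree.  */
struct toy_bb *toy_nearest_common_dominator(struct toy_bb *a, struct toy_bb *b)
{
    while (a != b) {
        if (a->dom_depth >= b->dom_depth)
            a = a->idom;
        else
            b = b->idom;
    }
    return a;
}

/* Suggestion 2, step 1: number statements in a preorder walk of the
   dominator tree, so a statement in a dominating block always gets a
   smaller uid than one in a dominated block.  Returns the next free uid.  */
unsigned toy_assign_uids(struct toy_bb *bb, unsigned next_uid)
{
    for (struct toy_stmt *s = bb->stmts; s != NULL; s = s->next)
        s->uid = next_uid++;
    for (struct toy_bb *c = bb->dom_child; c != NULL; c = c->dom_sibling)
        next_uid = toy_assign_uids(c, next_uid);
    return next_uid;
}

/* Suggestion 2, step 2: the "first" load of a group is the member with
   the smallest uid; no walk over the program is needed.  */
struct toy_stmt *toy_first_load(struct toy_stmt **group, size_t n)
{
    assert(n > 0);
    struct toy_stmt *first = group[0];
    for (size_t i = 1; i < n; i++)
        if (group[i]->uid < first->uid)
            first = group[i];
    return first;
}
```

With uids assigned once per function, finding the first load costs
time linear in the size of the group rather than in the size of the
program. As noted above, the relative order of statements in sibling
blocks depends on the order in which the siblings are visited.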

