This is the mail archive of the
mailing list for the GCC project.
[RFC] BB vectorizer and vectorizer reorganization
- From: Ira Rosen <IRAR at il dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: Dorit Nuzman <DORIT at il dot ibm dot com>
- Date: Mon, 10 Nov 2008 11:09:50 +0200
- Subject: [RFC] BB vectorizer and vectorizer reorganization
We're planning to add basic block vectorization (aka SLP). This is designed
to catch vectorization opportunities in straight-line code sequences
outside of loops (as opposed to the similar capability we already have that
exploits such opportunities within a loop iteration, in a loop-aware
manner). The implementation is pretty simple and mostly reuses the existing
A first version will support only simple cases of code sequences that end
with a group of adjacent stores and contain only aligned and non-aliasing
data-refs of the same type (the first version will, therefore, not include a
cost model, since it will not introduce any overheads). This will later be
extended to support chains of statements that don't end with stores
(along with cost considerations, and probably also an optimization that
considers BB boundaries to avoid redundant vector-scalar data moves).
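As a rough illustration of the kind of code the first version targets, here is a minimal sketch in C. The function names (add4, sum4, check_add4) are hypothetical and are not taken from GCC or its testsuite; restrict and _Alignas are used only as one assumed way of making the non-aliasing and alignment requirements visible to the compiler.

```c
#include <assert.h>

/* Hypothetical candidate for the first version: four isomorphic statements
   that end with a group of adjacent stores out[0..3].  All data-refs have
   the same type (int), and `restrict` promises that the arrays do not
   alias.  Alignment is assumed to be provable from the caller's arrays.  */
void add4(int *restrict out, const int *restrict a, const int *restrict b)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}

/* Hypothetical case left for the later extension: the chain of statements
   ends in a scalar result rather than in a group of stores.  */
int sum4(const int *a, const int *b)
{
    int t0 = a[0] * b[0];
    int t1 = a[1] * b[1];
    int t2 = a[2] * b[2];
    int t3 = a[3] * b[3];
    return t0 + t1 + t2 + t3;
}

/* Small self-check using 16-byte aligned arrays (C11 _Alignas).  */
void check_add4(void)
{
    _Alignas(16) int a[4] = {1, 2, 3, 4};
    _Alignas(16) int b[4] = {10, 20, 30, 40};
    _Alignas(16) int out[4];

    add4(out, a, b);
    assert(out[0] == 11 && out[1] == 22 && out[2] == 33 && out[3] == 44);
    assert(sum4(a, b) == 300);
}
```

Whether a given compiler actually vectorizes add4 depends on the target and options; the sketch only shows the shape of the input.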
Some things to decide on (that people may have feedback on):
- The thought is to invoke the bb-vectorize pass right after the unrolling
pass which follows the loop-vectorizer (before auto-par pass), thereby
letting the loop-aware BB-vectorizer have a go first (within the
loop-vectorizer), taking advantage of the loop-context if possible.
Alternatively, we may want to consider scheduling it after all the loop
optimizations (as it doesn't use the loop context at all), however that
requires rewriting portions of the data-ref analysis.
- We'll introduce a new flag -fbb-vectorize, that we plan to turn on by
default when -ftree-vectorize is set.
- The data-refs analysis is a reduced one that ignores any evolution and
any loop data dependences, if they exist (in other words, we are
completely unaware here of the loop context).
This new BB-vectorization will be the third vectorization technique we will
have in GCC (together with the existing loop-vectorizer and loop-aware
BB-vectorizer). Therefore, the first step towards incorporating this new
functionality would be some code reorganization... We're thinking of having
the following files:
- tree-vectorizer.c - drivers for the three vectorizers: (1) loop
vectorizer (inter-iteration parallelism), (2) loop-aware BB-vectorizer
(intra-iteration parallelism), and (3) BB vectorizer (out-of-loops).
- tree-vect-loop.c - loop specific parts such as loop control-flow
utilities, reductions, etc. (possibly further divided into several files if
needed). These will be used by drivers (1) and (2).
- tree-vect-bb.c - BB vectorization specific analysis and transformation.
This will be used by drivers (2) and (3).
- tree-vect-stmts.c - statements analysis and transformation (to be used by
- tree-vect-data-refs.c - vectorizer specific data-refs analysis and
manipulations (to be used by all).
- tree-vect-patterns.c - untouched.
Here's a poor attempt at illustrating that:
[Diagram: the drivers loop_vect(), loop_aware_bb_vect(), and bb_vect(), each
linked to the files it uses, as listed above.]
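To make the three kinds of parallelism handled by the three drivers more concrete, here is a minimal sketch in C. All function names and constants are hypothetical illustrations, not code from GCC or its testsuite, and whether each function is actually vectorized depends on the target and options.

```c
#include <assert.h>

/* (1) Inter-iteration parallelism (loop vectorizer): the same statement
   from several consecutive iterations is combined into one vector
   operation.  */
void scale(float *restrict dst, const float *restrict src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = 2.0f * src[i];
}

/* (2) Intra-iteration parallelism (loop-aware BB-vectorizer): the two
   isomorphic statements inside one iteration store to adjacent elements
   and can be grouped, still using the loop context.  Here n counts
   pairs, so dst and src hold 2 * n elements.  */
void scale_pairs(float *restrict dst, const float *restrict src, int n)
{
    for (int i = 0; i < n; i++) {
        dst[2 * i]     = 2.0f * src[2 * i];
        dst[2 * i + 1] = 3.0f * src[2 * i + 1];
    }
}

/* (3) Straight-line code outside of any loop (BB vectorizer): no loop
   context exists, only a group of adjacent stores.  */
void scale_two(float *restrict dst, const float *restrict src)
{
    dst[0] = 2.0f * src[0];
    dst[1] = 3.0f * src[1];
}

/* Small self-check; all values are exactly representable as float.  */
void check_kinds(void)
{
    float src[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float dst[4];

    scale(dst, src, 4);
    assert(dst[0] == 2.0f && dst[3] == 8.0f);

    scale_pairs(dst, src, 2);
    assert(dst[0] == 2.0f && dst[1] == 6.0f);
    assert(dst[2] == 6.0f && dst[3] == 12.0f);

    scale_two(dst, src);
    assert(dst[0] == 2.0f && dst[1] == 6.0f);
}
```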
Comments are welcome.
Dorit and Ira