This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Automatic Parallelization & Graphite - future plans
- From: Razya Ladelsky <RAZYA at il dot ibm dot com>
- To: gcc at gcc dot gnu dot org
- Date: Tue, 10 Mar 2009 16:13:08 +0200
- Subject: Automatic Parallelization & Graphite - future plans
Hello,
Described here is the future plan for automatic parallelization in GCC.
The current autopar pass is based on the GOMP infrastructure; it distributes
the iterations of a loop across several threads (the number is specified by
the user) if it determines that the iterations are independent. The only
dependence allowed to exist is a reduction, which is handled as a special
case.
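
As an illustration of the two cases above, here is a minimal sketch written
with OpenMP pragmas. This is only a source-level analogy: autopar works on
GCC's intermediate representation and emits calls into the GOMP runtime
rather than inserting pragmas, and is enabled with
-ftree-parallelize-loops=N, where N is the thread count. The function names
vector_add and vector_sum are made up for the example.

```c
/* Independent iterations: a[i] depends only on b[i] and c[i], so the
   iterations can be handed to different threads in any order. */
void vector_add(double *a, const double *b, const double *c, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Reduction: every iteration updates sum, which is a cross-iteration
   dependence. Each thread accumulates a private partial sum and the
   partial sums are combined when the loop ends. */
double vector_sum(const double *a, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Compiled without -fopenmp the pragmas are ignored and both loops simply run
sequentially, with the same result.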
This pass was initially contributed to GCC 4.3 by Zdenek Dvorak and
Sebastian Pop.
With the integration of Graphite (http://gcc.gnu.org/wiki/Graphite)
into GCC 4.4, a strong loop nest analysis and transformation engine was
introduced, and the notion of using the polyhedral model to expose loop
parallelism in GCC became feasible and relevant.
Our goal is to incrementally integrate autopar and Graphite.
As in autopar, we'll initially focus on synchronization-free
parallelization.
The first step, as we see it, is to teach Graphite that parallel code needs
to be produced.
This means that Graphite will recognize simple parallel loops (using SCoP
detection and data dependence analysis) and pass on that information.
The information to be conveyed states that a loop is parallelizable, and
may also include more detailed annotations, e.g., the shared/private
variables.
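
To make the shared/private distinction concrete, here is a small sketch that
uses OpenMP data-sharing clauses as a stand-in for such annotations. How
Graphite would actually represent this information is not specified here;
the function scale_shift and the temporary t are invented for the example.

```c
/* t is a per-iteration temporary: each thread needs its own copy, so it
   is private. The pointers a and b are shared: all threads use the same
   arrays but touch disjoint elements. The loop counter i is private
   automatically. */
void scale_shift(double *a, const double *b, int n)
{
    double t;
    #pragma omp parallel for private(t) shared(a, b)
    for (int i = 0; i < n; i++) {
        t = 2.0 * b[i];
        a[i] = t + 1.0;
    }
}
```

If t were shared instead, two threads could overwrite each other's value
between the two statements, so the loop is parallel only once t is known to
be privatizable.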
There are two possible models for the code generation:
1. Graphite will annotate parallel loops and pass that information all the
way through CLOOG to the current autopar code generator, which produces
the parallel, GOMP-based code.
2. Graphite will annotate the parallel loops and CLOOG itself will be
responsible for generating the parallel code.
A point to note here is that scalars/reductions are currently not
handled in Graphite.
In the first model, where Graphite calls autopar's code generation,
scalars can be handled.
After Graphite finishes its analysis, it calls autopar's reduction
analysis, and only then is the code generation called (if the scalar
analysis determines that the loop is still parallelizable, of course).
Once the first step is accomplished, the following steps will focus on
teaching Graphite to find loop transformations (such as skewing,
interchange, etc.) that expose coarse-grain, synchronization-free
parallelism.
This will be heavily based on the polyhedral data dependence and
transformation infrastructures.
We have not yet determined which algorithms/techniques we're going to use
for this part.
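
Whatever technique is eventually chosen, the kind of transformation meant
here can be sketched by hand. In the example below (function names and
array sizes are made up), the only dependence is a[i][j] on a[i-1][j], which
is carried by the i loop. Interchanging the loops is legal because the
dependence distance (1,0) becomes (0,1), which is still lexicographically
positive, and it moves the dependence-free j loop to the outside where it
can be parallelized without synchronization.

```c
#define N 4
#define M 5

/* Original nest: the outer i loop carries the dependence, so only the
   inner j loop is parallel (fine-grain: one parallel region per i). */
void sweep_original(int a[N][M])
{
    for (int i = 1; i < N; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = a[i - 1][j] + 1;
}

/* After interchange: each j (column) is an independent sequential chain,
   so the outer loop can be split across threads (coarse-grain). */
void sweep_interchanged(int a[N][M])
{
    #pragma omp parallel for
    for (int j = 0; j < M; j++)
        for (int i = 1; i < N; i++)
            a[i][j] = a[i - 1][j] + 1;
}
```

Both versions compute the same array; only the iteration order and the
granularity of the available parallelism differ.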
Having synchronization-free parallelization integrated in Graphite will
lay the groundwork for handling parallelism that requires a small amount
of synchronization.
This is a rough view of our planned work on autopar in GCC.
Please feel free to ask/comment.
Thanks,
Razya