This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Automatic Parallelization & Graphite - future plans


Hi Razya,

Great to hear about these Graphite plans. Some short comments:

On Tue, 2009-03-10 at 16:13 +0200, Razya Ladelsky wrote:
> [...]
> 
> The first step, as we see it, will teach Graphite that parallel code needs 
> to be produced.
> This means that Graphite will recognize simple parallel loops (using SCoP 
> detection and data dependency analysis), 
> and pass on that information.
>  The information that needs to be conveyed expresses that a loop is 
> parallelizable, and may also include annotations of  more 
> detailed information e.g, the shared/private variables. 
> 
> There are two possible models for the code generation:
> 1. Graphite will annotate parallel loops and pass that information all the 
> way through CLOOG 
> to the current autopar code generator to produce the parallel, GOMP based 
> code.

It might be possible to recognize parallel loops in Graphite, but you
should keep in mind that loops do not yet exist in the Graphite
polyhedral representation. So you would have to predict which loops
CLOOG will produce. This might be possible, depending on how strict the
schedule we give to CLOOG is. Another problem is that CLOOG might split
some loops automatically (where possible) to reduce control flow.
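
To illustrate the splitting problem, here is a hand-written sketch (not
actual CLOOG output; the function names and the guard `i < m` are made
up for the example). Both functions execute the same statement
instances in the same order, but the second one uses two loops where
the source had one, so a "this loop is parallel" mark made before code
generation would have to be mapped onto both of them.

```python
def original_nest(n, m):
    """Source-level view: one loop, with statement S2 guarded by a condition."""
    trace = []
    for i in range(n):
        trace.append(("S1", i))
        if i < m:
            trace.append(("S2", i))
    return trace


def split_nest(n, m):
    """Hypothetical generated code: the guard is removed by splitting
    the iteration range into two loops."""
    trace = []
    split = max(0, min(n, m))  # first iteration where the guard is false
    for i in range(split):  # iterations where both S1 and S2 run
        trace.append(("S1", i))
        trace.append(("S2", i))
    for i in range(split, n):  # remaining iterations, S1 only
        trace.append(("S1", i))
    return trace


if __name__ == "__main__":
    # Same statement instances, same order, different loop structure.
    print(original_nest(5, 3) == split_nest(5, 3))
```

Whether such a split actually happens depends on the domains and the
schedule, which is why predicting the final loops in advance is hard.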

> 2. Graphite will annotate the parallel loops and CLOOG itself will be 
> responsible of generating 
> the parallel code.

The same as above. It will be hard to mark loops, as loops do not yet exist.

> A point to notice here is that scalars/reductions are currently not 
> handled in Graphite.

We are working heavily on this. Expect it to be ready by the end of
March at the latest, hopefully by the end of this week.

> In the first model, where Graphite calls autopar's code generation, 
> scalars can be handled.

3. Wait for CLOOG to generate the new loops. As the polyhedral
information (poly_bb_p) is still available during code generation, we
can try to update the dependency information using the restrictions
CLOOG added, and use the polyhedral dependency analysis to check whether
there are any dependencies in the CLOOG-generated loops. So we can add a
pass between CLOOG and clast-to-gimple that marks parallel loops.

Advantages:
  - Can be 100% exact, no forecasts, as we are working on the actually
    generated loops.
  - Nice splitting of what is done where:
      1. Graphite is in charge of optimizations (generate parallelism).
      2. CodeGen just detects parallel loops and generates code for
         them.
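
A very simplified sketch of what such a marking pass would decide,
assuming the dependences of a generated loop nest can be summarized as
constant, lexicographically non-negative distance vectors (the real
analysis would work on the polyhedral dependence relations instead, and
the function names here are hypothetical): a loop is parallel if no
dependence is carried at its depth.

```python
def carried_level(distance):
    """Return the depth of the loop that carries this dependence, i.e. the
    index of the first non-zero component, or None if it is loop-independent."""
    for level, d in enumerate(distance):
        if d != 0:
            return level
    return None


def parallel_loops(depth, distances):
    """For a nest of `depth` loops, return one flag per loop:
    True if no dependence in `distances` is carried by that loop."""
    carried = {carried_level(d) for d in distances}
    return [level not in carried for level in range(depth)]


if __name__ == "__main__":
    # for i: for j: A[i][j] = A[i-1][j] + 1  -> distance vector (1, 0)
    # The outer loop carries the dependence, the inner loop is parallel.
    print(parallel_loops(2, [(1, 0)]))
```

Because the check runs on the loops that were really generated, it does
not matter whether CLOOG split or reordered anything on the way.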

> After Graphite finishes its analysis, it calls autopar's reduction 
> analysis, and only then the code
> generation is called (if the scalar analysis determines that the loop 
> still parallelizable, of course)
> 
> Once the first step is accomplished, the following steps will focus on 
> teaching Graphite 
> to find loop transformations (such as skewing, interchange etc.) that 
> expose coarse grain synchronization free parallelism.
> This will be heavily based on the polyhedral data dependence and 
> transformation infrastructures.
> We have not determined which algorithm/ techniques we're going to use for 
> this part.
> 
>  Having synchronization free parallelization integrated in Graphite, will 
> set the ground for 
> handling parallelism requiring a small amount of parallelization. 

Yes, great. This will allow us to experiment with advanced
auto-parallelization. I am really looking forward to seeing the first
patches!

> This is a rough view for our planned work on autopar in GCC.
> Please feel free to ask/comment.
> 
> Thanks,
> Razya

