[RFC, PR66873] Use graphite for parloops

Richard Biener richard.guenther@gmail.com
Thu Jul 16 08:48:00 GMT 2015


On Wed, Jul 15, 2015 at 10:26 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> Hi,
>
> I tried to parallelize this Fortran test-case (based on autopar/outer-1.c),
> specifically the outer loop of the first loop nest, using
> -ftree-parallelize-loops=2:
> ...
> program main
>   implicit none
>   integer, parameter         :: n = 500
>   integer, dimension (0:n-1, 0:n-1) :: x
>   integer                    :: i, j, ii, jj
>
>
>   do ii = 0, n - 1
>      do jj = 0, n - 1
>         x(jj, ii) = ii + jj + 3
>      end do
>   end do
>
>   do i = 0, n - 1
>      do j = 0, n - 1
>         if (x(j, i) .ne. i + j + 3) call abort
>      end do
>   end do
>
> end program main
> ...
>
> But autopar fails to parallelize it because its dependency analysis fails.
>
> I then tried to add -floop-parallelize-all, and found that the graphite
> dependency analysis did manage to decide that the iterations are
> independent.
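
For illustration only, here is a minimal C sketch of what "the iterations
are independent" means for the first loop nest. It is a hand transliteration
of the Fortran above, not the autopar/outer-1.c test itself; the names
fill_forward, fill_reverse and outer_iterations_commute are hypothetical.
Running the outer loop in the opposite order yields the same array, which is
consistent with no dependence being carried by the outer loop (a single
reordering is a sanity check, not a proof):

```c
#include <string.h>

#define N 500

static int fwd[N][N];
static int rev[N][N];

/* Same order as the first loop nest of the Fortran test.  Fortran is
   column-major, so x(jj, ii) corresponds to x[ii][jj] here.  */
static void fill_forward(int x[N][N])
{
  for (int ii = 0; ii < N; ii++)
    for (int jj = 0; jj < N; jj++)
      x[ii][jj] = ii + jj + 3;
}

/* Same loop body, but the outer loop runs backwards.  */
static void fill_reverse(int x[N][N])
{
  for (int ii = N - 1; ii >= 0; ii--)
    for (int jj = 0; jj < N; jj++)
      x[ii][jj] = ii + jj + 3;
}

/* Returns 1 if both outer-loop orders produce identical arrays.  */
int outer_iterations_commute(void)
{
  fill_forward(fwd);
  fill_reverse(rev);
  return memcmp(fwd, rev, sizeof fwd) == 0;
}
```

Each outer iteration ii writes only row ii and reads nothing written by any
other iteration, so the order of the outer iterations does not matter.
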
>
> At https://gcc.gnu.org/wiki/Graphite/Parallelization I read:
> ...
> In GCC there already exists an auto-parallelization pass (tree-parloops.c),
> which is based on the lambda framework originally developed by Sebastian.
> Since the Lambda framework is limited to some cases (e.g. triangle loops,
> loops with 'if' conditions), Graphite was developed to handle the loops
> that lambda was not able to handle.
> ...
>
> So I wondered: why not always use the graphite dependency analysis in
> parloops? (Of course you could use -floop-parallelize-all, but that also
> changes the heuristic.) So I wrote a patch for parloops to use graphite
> dependency analysis by default (so without -floop-parallelize-all), but
> while testing I found that all the reduction test-cases started failing
> because the modifications graphite makes to the code mess up the parloops
> reduction analysis.
>
> Then I came up with this patch, which:
> - first runs a parloops pass, restricted to reduction loops only,
> - then runs graphite dependency analysis
> - followed by a normal parloops pass run.
>
> This way, we get to both:
> - compile the reduction testcases as before, and
> - profit from the better graphite dependency analysis otherwise.
>
> A point worth noting is that I stopped running pass_iv_canon before parloops
> (only in case of -ftree-parallelize-loops > 1) because running it before
> graphite makes the graphite scop detection fail.
>
> Bootstrapped and reg-tested on x86_64.
>
> Any comments?

Graphite dependence analysis is too slow to be enabled unconditionally
(read: hours in some simple cases; see Bugzilla).

Richard.

> Thanks,
> - Tom
