This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Preliminary patch for PR23820 and PR24309
- From: Ayal Zaks <ZAKS at il dot ibm dot com>
- To: "Sebastian Pop" <sebpop at gmail dot com>, "Daniel Berlin" <dberlin at dberlin dot org>
- Cc: "Zdenek Dvorak" <rakdver at kam dot mff dot cuni dot cz>, gcc-patches at gcc dot gnu dot org
- Date: Wed, 17 Oct 2007 14:06:58 +0200
- Subject: Re: Preliminary patch for PR23820 and PR24309
> On 10/15/07, Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote:
> > > After this patch, as we won't have other PRs open on the loop
> > > interchange (otherwise ping me on these and I'll give them a look),
> > > would it be possible to enable loop interchange at -O3 or -O2 level?
> >
> > I think enabling it at -O3 for now would be a good idea,
> >
>
> I just bootstrapped and tested a patch with loop interchange
> enabled at -O3, and there were 6 regressions with it:
>
> gfortran.sum gfortran.dg/alloc_comp_assign_2.f90
> gfortran.sum gfortran.dg/pr19928-2.f90
> gfortran.sum gfortran.fortran-torture/execute/scalarize2.f90
> gfortran.sum gfortran.fortran-torture/execute/scalarize.f90
> gfortran.sum gfortran.fortran-torture/execute/where_1.f90
> gfortran.sum gfortran.fortran-torture/execute/where_6.f90
>
> They are all failing the execution test for -O3. I'll have a look at
> these fails before proposing a patch to enable loop interchange
> at -O3.
A couple of points to consider/clarify:
1. Are the statistics gathered by gather_interchange_stats() invariant, in
the sense that you can pre-gather them once per loop and cache them,
instead of re-gathering them twice for each pair of loops? This may save
compile time.
2. The priority function
> if (dependence_steps_i < dependence_steps_j
> || nb_deps_not_carried_by_i > nb_deps_not_carried_by_j
> || double_int_ucmp (access_strides_i, access_strides_j) < 0)
is not 'stable' (i.e. not antisymmetric): you may prefer to interchange i
with j, and if asked again prefer to interchange back, for instance when i
has fewer dependence steps but j has smaller access strides. If point 1
above is true, you are (bubble) sorting the loops according to their
priorities, using unstable comparisons. (Don't iterate until convergence ;-).
3. In
> else if (dist < 0)
> (*dependence_steps) += -dist;
can dist be negative?
4. The following example is not too clear:
> Example: for the following loop,
>
> | loop_1 runs 1335 times
> | loop_2 runs 1335 times
> | A[{{0, +, 1}_1, +, 1335}_2]
> | B[{{0, +, 1}_1, +, 1335}_2]
> | endloop_2
> | A[{0, +, 1336}_1]
> | endloop_1
>
> gather_interchange_stats (in loop_1) will return
> DEPENDENCE_STEPS = 3002
> NB_DEPS_NOT_CARRIED_BY_LOOP = 5
> ACCESS_STRIDES = 10694
>
> gather_interchange_stats (in loop_2) will return
> DEPENDENCE_STEPS = 3000
> NB_DEPS_NOT_CARRIED_BY_LOOP = 7
> ACCESS_STRIDES = 8010
Thanks to Ira and Victor,
Ayal.