This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][GRAPHITE] More TLC
- From: Sebastian Pop <sebpop at gmail dot com>
- To: Richard Biener <rguenther at suse dot de>, Sven Verdoolaege <sven dot verdoolaege at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 26 Sep 2017 09:19:50 -0500
- Subject: Re: [PATCH][GRAPHITE] More TLC
- References: <alpine.LSU.2.20.1709221453010.26836@zhemvz.fhfr.qr> <CAFk3UF_6h1fhLfx_YzBtkhTsNr2A0ppaUvJX2H=AXAZcUmATCQ@mail.gmail.com> <alpine.LSU.2.20.1709251511510.26836@zhemvz.fhfr.qr>
On Mon, Sep 25, 2017 at 8:12 AM, Richard Biener <rguenther@suse.de> wrote:
> On Fri, 22 Sep 2017, Sebastian Pop wrote:
>
> > On Fri, Sep 22, 2017 at 8:03 AM, Richard Biener <rguenther@suse.de> wrote:
> >
> > >
> > > This simplifies canonicalize_loop_closed_ssa and does other minimal
> > > TLC. It also adds a testcase I reduced from a stupid mistake I made
> > > when reworking canonicalize_loop_closed_ssa.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > >
> > > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> > > -Ofast -march=haswell -floop-nest-optimize are
> > >
> > > 61 loop nests "optimized"
> > > 45 loop nest transforms cancelled because of code generation issues
> > > 21 loop nest optimizations timed out the 350000 ISL "operations" we allow
> > >
> > > I say "optimized" because the usual transform I've seen is static tiling
> > > as enforced by GRAPHITE according to --param loop-block-tile-size.
> > > There's no way to automagically figure what kind of transform ISL did
> > >
> >
> > Here is how to automate (without magic) the detection
> > of the transform that isl did.
> >
> > The problem solved by isl is the minimization of strides
> > in memory. To do this, we need to give the isl scheduler
> > the validity dependence graph: in graphite-optimize-isl.c,
> > see the validity (RAW, WAR, WAW) and the proximity
> > (RAR + validity) maps. The proximity map does include the
> > read-after-read dependences, as the isl scheduler needs to
> > minimize strides between consecutive reads.
> >
> > When you apply the schedule to the dependence graph,
> > you can tell the strides in memory from the result. A good
> > way to say whether a transform was beneficial is to sum up
> > all memory strides and make sure that the sum
> > decreases after the transform. We could add a printf with the
> > sum of strides before and after the transform, and have the
> > testcases check for that.
>
> Interesting. Can you perhaps show me in code how to do that?
>
>
Sven, is there already a function that computes the sum of all
strides in a proximity map? Maybe you have code that does
something similar in pet or ppcg?
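
For illustration only, here is a minimal sketch of the sum-of-strides
check in plain C. It does not use isl and is not taken from pet, ppcg,
or graphite. It assumes a toy model in which every array reference is a
linear function of two loop indices and a schedule is just a permutation
of the loops. All names (struct access, sum_of_strides,
print_stride_report, LOOP_DEPTH) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy model only: a two-deep loop nest.  */
#define LOOP_DEPTH 2

/* Hypothetical description of one array reference: its element offset
   is coef[0] * i0 + coef[1] * i1 for the original loop indices.  */
struct access
{
  long coef[LOOP_DEPTH];
};

/* ORDER[k] is the original loop dimension placed at depth k of the
   schedule.  In this model the stride of an access is the absolute
   coefficient of the dimension scheduled innermost; return the sum of
   those strides over all N accesses.  */
long
sum_of_strides (const struct access *acc, int n, const int order[LOOP_DEPTH])
{
  int innermost = order[LOOP_DEPTH - 1];
  assert (innermost >= 0 && innermost < LOOP_DEPTH);

  long sum = 0;
  for (int a = 0; a < n; a++)
    sum += labs (acc[a].coef[innermost]);
  return sum;
}

/* Print the metric before and after a transform, in a form a testcase
   could scan for.  */
void
print_stride_report (const struct access *acc, int n,
                     const int before[LOOP_DEPTH],
                     const int after[LOOP_DEPTH])
{
  printf ("sum of strides: before %ld, after %ld\n",
          sum_of_strides (acc, n, before), sum_of_strides (acc, n, after));
}
```

For C[i][j] = A[i][j] + B[i][j] on row-major arrays with 1024 columns,
each of the three accesses has coefficients {1024, 1}. With i innermost
(order {1, 0}) the sum is 3072; after interchange (order {0, 1}) it is 3.
This only models loop permutation; tiling or skewing would need the
actual schedule applied to the proximity map.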
Thanks,
Sebastian