[PATCH] SLP: Move load/store-lanes check till late

Tamar Christina Tamar.Christina@arm.com
Fri Nov 6 11:33:25 GMT 2020



> -----Original Message-----
> From: Christophe Lyon <christophe.lyon@linaro.org>
> Sent: Friday, November 6, 2020 10:27 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; nd <nd@arm.com>; gcc-
> patches@gcc.gnu.org; ook@ucw.cz
> Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> 
> On Fri, 6 Nov 2020 at 11:18, Tamar Christina <Tamar.Christina@arm.com>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Christophe Lyon <christophe.lyon@linaro.org>
> > > Sent: Friday, November 6, 2020 8:42 AM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: Richard Biener <rguenther@suse.de>; nd <nd@arm.com>; gcc-
> > > patches@gcc.gnu.org; ook@ucw.cz
> > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> > >
> > > On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches <gcc-
> > > patches@gcc.gnu.org> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: rguenther@c653.arch.suse.de <rguenther@c653.arch.suse.de>
> > > > > On Behalf Of Richard Biener
> > > > > Sent: Thursday, November 5, 2020 10:17 AM
> > > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; ook@ucw.cz
> > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > > >
> > > > > On Wed, 4 Nov 2020, Tamar Christina wrote:
> > > > >
> > > > > > Hi Richi,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: rguenther@c653.arch.suse.de
> > > > > > > <rguenther@c653.arch.suse.de> On Behalf Of Richard Biener
> > > > > > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > > > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; ook@ucw.cz
> > > > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till
> > > > > > > late
> > > > > > >
> > > > > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > > > > >
> > > > > > > > Hi Richi,
> > > > > > > >
> > > > > > > > We decided to take the regression in any code-gen this
> > > > > > > > could give and fix it properly next stage-1.  As such
> > > > > > > > here's a new patch based on your previous feedback.
> > > > > > > >
> > > > > > > > Ok for master?
> > > > > > >
> > > > > > > Looks good sofar but be aware that you elide the
> > > > > > >
> > > > > > > -             && vect_store_lanes_supported
> > > > > > > -                  (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size,
> > > > > > > false))
> > > > > > >
> > > > > > > part of the check - that is, you don't verify the store part
> > > > > > > of the instance can use store-lanes.  Btw, this means the
> > > > > > > original code cancelled an instance only when the SLP graph
> > > > > > > entry is a store-lane capable store but your variant would
> > > > > > > also cancel in case there's a load-
> > > > > lane capable reduction.
> > > > > > >
> > > > > >
> > > > > > I do still have it,
> > > > > >
> > > > > >       if (loads_permuted
> > > > > >           && vect_store_lanes_supported (vectype, group_size,
> > > > > > false))
> > > > > >
> > > > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root);
> > > > > > which should be the store if one exists.
> > > > > >
> > > > > > > I think that you eventually want to re-instantiate the
> > > > > > > store-lane check but treat it the same as any of the load
> > > > > > > checks (thus not require all instances to be stores for the
> cancellation).
> > > > > > > But at least when a store cannot use store-lanes we probably
> > > > > > > shouldn't cancel the SLP.
> > > > > >
> > > > > > I did however elide the kind check, that was added as part of
> > > > > > the rebase, it looked like kind wasn't Being stored inside the
> > > > > > SLP instance and
> > > > > I'd have to redo the analysis to find it.
> > > > > >
> > > > > > Does it does reasonable to include kind as a field in the SLP instance?
> > > > > >
> > > > > > >
> > > > > > > Anyway, the patch is OK for master.  The store-lane check
> > > > > > > part can be re- added as followup.
> > > > > > >
> > > > > >
> > > > > > Thanks! Will do.
> > > > >
> > > > > Btw, the patch regressed
> > > > >
> > > > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects
> > > > > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > > > > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect
> > > > > "vectorizing stmts using SLP" 1
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > scan-tree-dump vect "LOAD_LANES"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > scan-tree-dump vect "STORE_LANES"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP
> cancelled:
> > > > > can use load/store-lanes"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"
> > > > >
> > > > > on x86_64.  The slp-11b.c testcase is interesting since there
> > > > > extract_muldiv folding makes the group of four stores not
> > > > > matching so we split into a size of
> > > > > 3 and one remaining store.
> > > > > This causes us to arrive at
> > > > >
> > > > > note:   node 0x441a940 (max_nunits=4, refcnt=2)
> > > > > note:       stmt 0 _2 = in[_1];
> > > > > note:       stmt 1 _6 = in[_5];
> > > > > note:       stmt 2 _10 = in[_8];
> > > > > note:       load permutation { 0 2 1 }
> > > > >
> > > > > which on x86_64 we in the end cannot handle (without SSE4 I
> > > > > think) so it fails to SLP there.  Guess arm can do the permute
> > > > > but not the load-
> > > lane here.
> > > > >
> > > > > For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done for
> > > > > !vect_load_lanes targets.  Not sure if that's possible easily,
> > > > > like with a { target vect_load_lanes } { xfail vect_load_lanes }
> > > > > combo ...?  I suggest to make it xfail everywhere instead and
> > > > > add a comment as to we're expecting those only for vect_load_lanes
> targets.
> > > >
> > > > Yes just fixed these, the change in gcc.dg/vect/slp-11b.c
> > > > shouldn't be there and I updated the target selector properly :)
> > > >
> > >
> > > Hi Tamar,
> > >
> > > This patch (r11-4728) introduced a regression on arm:
> > > FAIL:    gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects
> > > scan-tree-dump vect "vectorized 1 loops"
> > > FAIL:    gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects
> > > scan-tree-dump vect "vectorizing stmts using SLP"
> > > FAIL:    gcc.dg/vect/slp-reduc-9.c scan-tree-dump vect "vectorized 1
> loops"
> > > FAIL:    gcc.dg/vect/slp-reduc-9.c scan-tree-dump vect "vectorizing
> > > stmts using SLP"
> > >
> > > Can you check?
> >
> > Hi Christoph,
> >
> > Which arm target is this?
> > I checked arm-none-linux-gnueabihf and slp-reduc-9.c is passing there.
> >
> 
> Weird, I wasn't more specific because it fails for all the arm targets I'm testing,
> see:
> https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/r11-
> 4728-g4d76079fdfa3d36ebd95ac8b489519945e8ee88f/report-build-info.html
> 
> For instance:
> --target=arm-none-linux-gnueabihf
> --prefix=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools
> --with-sysroot=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-arm-none-linux-
> gnueabihf
> --disable-nls --disable-libgomp --disable-libmudflap --disable-libcilkrts --
> enable-checking --enable-languages=c,c++,fortran --with-float=hard --
> enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a9
> --with-fpu=neon-fp16

Hi Christoph,

Thanks, I was testing trunk, this was already fixed by e3587a2d8b005d72f882a21864fb132f8a966366
which constraints when it aborts.

Regards,
Tamar

> 
> Christophe
> 
> > Regards,
> > Tamar
> > >
> > > > Cheers,
> > > > Tamar
> > > >
> > > > >
> > > > > Richard.
> > > > >
> > > > > > > Thanks,
> > > > > > > Richard.
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Tamar
> > > > > > > >
> > > > > > > > gcc/ChangeLog:
> > > > > > > >
> > > > > > > >         * tree-vect-slp.c (vect_analyze_slp_instance):
> > > > > > > > Moved load/store
> > > > > > > lanes
> > > > > > > >         check to ...
> > > > > > > >         * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > > > > > > >
> > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > >
> > > > > > > >         * gcc.dg/vect/slp-11b.c: Update output scan.
> > > > > > > >         * gcc.dg/vect/slp-perm-6.c: Likewise.
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: rguenther@c653.arch.suse.de
> > > > > > > > > <rguenther@c653.arch.suse.de> On Behalf Of Richard
> > > > > > > > > Biener
> > > > > > > > > Sent: Thursday, October 22, 2020 9:44 AM
> > > > > > > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > > > > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>;
> ook@ucw.cz
> > > > > > > > > Subject: Re: [PATCH] SLP: Move load/store-lanes check
> > > > > > > > > till late
> > > > > > > > >
> > > > > > > > > On Wed, 21 Oct 2020, Tamar Christina wrote:
> > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > > This moves the code that checks for load/store lanes
> > > > > > > > > > further in the pipeline and places it after slp_optimize.
> > > > > > > > > > This would allow us to perform optimizations on the
> > > > > > > > > > SLP tree and only bail out if we really have a
> > > > > > > > > permute.
> > > > > > > > > >
> > > > > > > > > > With this change it allows us to handle permutes such
> > > > > > > > > > as {1,1,1,1} which should be handled by a load and replicate.
> > > > > > > > > >
> > > > > > > > > > This change however makes it all or nothing. Either
> > > > > > > > > > all instances can be handled or none at all.  This is
> > > > > > > > > > why some of the test cases have been
> > > > > > > > > adjusted.
> > > > > > > > >
> > > > > > > > > So this possibly leaves a loop unvectorized in case
> > > > > > > > > there's a ldN/stN opportunity but another SLP instance
> > > > > > > > > with a permutation not handled by interleaving is
> > > > > > > > > present.  What I was originally suggesting is to only
> > > > > > > > > cancel the SLP build if _all_ instances can be handled
> > > > > > > with ldN/stN.
> > > > > > > > >
> > > > > > > > > Of course I'm also happy with completely removing this
> heuristics.
> > > > > > > > >
> > > > > > > > > Note some of the comments look off now, also the
> > > > > > > > > assignment to ok before the goto is pointless and you
> > > > > > > > > should probably turn this into a dump print instead.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Richard.
> > > > > > > > >
> > > > > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > > > > > > > -x86_64-pc-linux-gnu and no issues.
> > > > > > > > > >
> > > > > > > > > > Ok for master?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Tamar
> > > > > > > > > >
> > > > > > > > > > gcc/ChangeLog:
> > > > > > > > > >
> > > > > > > > > >     * tree-vect-slp.c (vect_analyze_slp_instance):
> > > > > > > > > > Moved load/store
> > > > > > > > > lanes
> > > > > > > > > >     check to ...
> > > > > > > > > >     * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > > > > > > > > >
> > > > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > > > >
> > > > > > > > > >     * gcc.dg/vect/slp-11b.c: Update output scan.
> > > > > > > > > >     * gcc.dg/vect/slp-perm-6.c: Likewise.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Richard Biener <rguenther@suse.de> SUSE Software
> > > > > > > > > Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > > > > > > > Nuernberg, Germany;
> > > GF:
> > > > > > > > > Felix Imend
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de> SUSE Software Solutions
> > > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> GF:
> > > > > > > Felix Imend
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de> SUSE Software Solutions
> > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF:
> > > > > Felix Imend


More information about the Gcc-patches mailing list