This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, rs6000] Add expansions for min/max vector reductions
- From: Richard Biener <rguenther at suse dot de>
- To: Segher Boessenkool <segher at kernel dot crashing dot org>
- Cc: Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>, Alan Lawrence <alan dot lawrence at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "dje dot gcc at gmail dot com" <dje dot gcc at gmail dot com>, Alan Hayward <Alan dot Hayward at arm dot com>, "ramana dot gcc at googlemail dot com" <ramana dot gcc at googlemail dot com>
- Date: Fri, 18 Sep 2015 10:38:58 +0200 (CEST)
On Thu, 17 Sep 2015, Segher Boessenkool wrote:
> On Thu, Sep 17, 2015 at 09:18:42AM -0500, Bill Schmidt wrote:
> > On Thu, 2015-09-17 at 09:39 +0200, Richard Biener wrote:
> > > So just to clarify - you need to reduce the vector with max to a scalar
> > > but want the (same) result in all vector elements?
> >
> > Yes. Alan Hayward's cond-reduction patch is set up to perform a
> > reduction to scalar, followed by a scalar broadcast to get the value
> > into all positions. It happens that our most efficient expansion to
> > reduce to scalar will naturally produce the value in all positions.
>
> It also is many insns after expand, so relying on combine to combine
> all that plus the following splat (as Richard suggests below) is not
> really going to work.
>
> If there also are targets where the _scal version is cheaper, maybe
> we should keep both, and have expand expand to whatever the target
> supports?
Wait... so you don't actually have an instruction that implements, say,
REDUC_MAX_EXPR (neither to scalar nor to vector)? Then it's better
to _not_ define such a pattern and to let the vectorizer generate
its fallback code. If the fallback code isn't the best available, then
it is better to find a way to make the vectorizer choose the best
variant out of its available ones (and maybe add another). I think it
currently tests the availability of the building blocks for each
variant and simply picks the first that works, without consulting the
cost model.
Richard.
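
[Illustrative sketch, not GCC source.] The selection behaviour described
above can be pictured with a small, self-contained C example. Everything in
it is an assumption made for illustration: the `reduc_variant` structure, the
variant names, and the cost numbers are hypothetical and do not correspond to
real vectorizer data structures. It contrasts "take the first variant whose
building blocks are available" with "take the cheapest available variant".

```c
#include <stddef.h>

/* Hypothetical description of one way to expand a vector reduction.
   None of these fields exist in GCC; they only model the discussion. */
struct reduc_variant {
    const char *name;   /* label for the expansion strategy */
    int available;      /* nonzero if the target has the building blocks */
    unsigned cost;      /* made-up relative cost, lower is better */
};

/* Models the behaviour described in the mail: return the first variant
   that is available, ignoring cost entirely. */
static const struct reduc_variant *
pick_first_available(const struct reduc_variant *v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (v[i].available)
            return &v[i];
    return NULL;
}

/* Models the suggested improvement: among the available variants,
   return the one with the lowest cost (first one wins on ties). */
static const struct reduc_variant *
pick_cheapest_available(const struct reduc_variant *v, size_t n)
{
    const struct reduc_variant *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (v[i].available && (best == NULL || v[i].cost < best->cost))
            best = &v[i];
    return best;
}
```

With a table such as { direct pattern: unavailable, whole-vector shifts:
available at cost 7, element extracts: available at cost 4 }, the first
function returns the shift variant while the second returns the extract
variant. That difference is the gap a cost-model check would close.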
>
> Segher
>
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)