This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][AArch64] Use conditional negate for abs expansion
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: "'GCC Patches'" <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 14 May 2015 10:39:10 +0100
- Subject: Re: [PATCH][AArch64] Use conditional negate for abs expansion
- Authentication-results: sourceware.org; auth=none
- References: <000301d080f0$05b46840$111d38c0$ at com> <20150427162640 dot GB1993 at arm dot com> <000901d0810b$3d6ab0d0$b8401270$ at com>
On Mon, Apr 27, 2015 at 05:57:26PM +0100, Wilco Dijkstra wrote:
> > James Greenhalgh wrote:
> > On Mon, Apr 27, 2015 at 02:42:36PM +0100, Wilco Dijkstra wrote:
> > > > -----Original Message-----
> > > > From: Wilco Dijkstra [mailto:wdijkstr@arm.com]
> > > > Sent: 03 March 2015 16:19
> > > > To: GCC Patches
> > > > Subject: [PATCH][AArch64] Use conditional negate for abs expansion
> > > >
> > > > Expand abs into a compare and conditional negate. This is the most obvious expansion,
> > enables
> > > > merging of the comparison into ALU instructions and is faster on all implementations.
> > > > Bootstrapped & regression tested.
> > > >
> > > > int f(int x) { return abs (x + 1); }
> > > >
> > > > Before:
> > > > add w0, w0, 1
> > > > sxtw x0, w0
> > > > eor x1, x0, x0, asr 63
> > > > sub x1, x1, x0, asr 63
> > > > mov x0, x1
> > > > ret
> > > >
> > > > After:
> > > > adds w0, w0, 1
> > > > csneg w0, w0, w0, pl
> > > > ret
> > > >
> > > > ChangeLog:
> > > >
> > > > 2015-03-03 Wilco Dijkstra <wdijkstr@arm.com>
> > > >
> > > > * gcc/config/aarch64/aarch64.md (absdi2): optimize abs expansion.
> > > > (csneg3<mode>_insn): enable expansion of pattern.
> > > > * gcc/testsuite/gcc.target/aarch64/abs_1.c (abs64): update test
> > > > for new abs expansion. (abs64_in_dreg): likewise.
> >
> >
> > This looks like it breaks support for abs in a D register (for example
> > at the end of a loop, or extracted from Neon Intrinsics, etc).
> >
> > e.g. (totally contrived...)
> >
> > int64x1_t
> > abs_max (int64x2_t x, int64_t *wb)
> > {
> > int64_t y = vgetq_lane_s64 (x, 0);
> > if (y < 0)
> > y = -y;
> > return vdup_n_s64 (y);
> > }
> >
> > Which currently generates:
> >
> > abs_max:
> > abs d0, d0
> > ret
> >
> > I suppose we don't need to worry too much about that (and the current
> > implementation doesn't seem to catch it reliably anyway), but it would be
> > good if we could keep the support - even if it is rarely used.
>
> Well it may be possible to write a peephole for this rare case, but I hope
> people would use a vabs if that's what they wanted...
OK, I've seen some of the pain of our current abs expansion code, so I
suppose that we want to get the common case right first, rather
than be too concerned about contrived corner cases.
This patch is OK.
Thanks,
James