This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: "pinskia at gmail dot com" <pinskia at gmail dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 1 Oct 2015 15:51:26 +0100
- Subject: Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);
- References: <1443707835-6888-1-git-send-email-james dot greenhalgh at arm dot com> <9593201B-9298-4529-A4DA-41B5DD6DCBFE at gmail dot com>
On Thu, Oct 01, 2015 at 03:28:22PM +0100, pinskia@gmail.com wrote:
> >
> > On Oct 1, 2015, at 6:57 AM, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> >
> >
> > Hi,
> >
> > If it is cheap enough to treat a floating-point value as an integer and
> > to do bitwise arithmetic on it (as it is for AArch64) we can rewrite:
> >
> > x * copysign (1.0, y)
> >
> > as:
> >
> > x ^ (y & (1 << sign_bit_position))
>
> Why not just convert it to copysign (x, y) instead and let expand choose
> the better implementation?
Because that transformation is invalid :-)
let x = -1.0, y = -1.0
x * copysign (1.0, y)
= -1.0 * copysign (1.0, -1.0)
= -1.0 * -1.0
= 1.0
copysign (x, y)
= copysign (-1.0, -1.0)
= -1.0
Or have I completely lost my maths skills :-)
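
For anyone who wants to check the counterexample mechanically, here is a minimal C sketch. It is not the code from the patch: the helper names mul_copysign and xor_sign are made up for illustration, and xor_sign assumes IEEE 754 binary64 doubles with the sign in bit 63.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* The expression the patch wants to simplify: x * copysign (1.0, y).  */
static double mul_copysign(double x, double y)
{
    return x * copysign(1.0, y);
}

/* Hypothetical helper: the proposed rewrite x ^ (y & sign_bit), done on
   the bit patterns.  Assumes IEEE 754 binary64, sign in bit 63.  */
static double xor_sign(double x, double y)
{
    uint64_t xb, yb;

    memcpy(&xb, &x, sizeof xb);      /* reinterpret the doubles as integers */
    memcpy(&yb, &y, sizeof yb);
    xb ^= yb & (UINT64_C(1) << 63);  /* flip x's sign iff y's sign bit is set */
    memcpy(&x, &xb, sizeof x);
    return x;
}
```

With x = -1.0 and y = -1.0, both mul_copysign and xor_sign give 1.0, while copysign (x, y) gives -1.0: the multiplication flips the sign of x when y is negative, whereas copysign overwrites it.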
> Also I think this can only be done for finite and non-trapping types.
That may well be true, I swithered either way and went for no checks, but
I'd happily go back on that and wrap this in something suitably restrictive
if I need to.
Thanks,
James
> >
> > This patch implements that rewriting rule in match.pd, and a testcase
> > expecting the transform.
> >
> > This is worth about 6% in 481.wrf for AArch64. I don't know enough
> > about the x86 microarchitectures to know how productive this transformation
> > is there. In Spec2006FP I didn't see any interesting results in either
> > direction. Looking at code generation for the testcase I add, I think the
> > x86 code generation looks worse, but I can't understand why it doesn't use
> > a vector-side xor and load the mask vector-side. With that fixed up I think
> > the code generation would look better - though as I say, I'm not an expert
> > here...
> >
> > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> >
> > OK for trunk?
> >
> > Thanks,
> > James
> >
> > ---
> > gcc/
> >
> > 2015-10-01 James Greenhalgh <james.greenhalgh@arm.com>
> >
> > * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> >
> > gcc/testsuite/
> >
> > 2015-10-01 James Greenhalgh <james.greenhalgh@arm.com>
> >
> > * gcc.dg/tree-ssa/copysign.c: New.
> >
> > <0001-Patch-match.pd-Add-a-simplify-rule-for-x-copysign-1..patch>
>