This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: James Greenhalgh <james dot greenhalgh at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 2 Oct 2015 11:27:11 +0200
- Subject: Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);
- Authentication-results: sourceware.org; auth=none
- References: <1443707835-6888-1-git-send-email-james dot greenhalgh at arm dot com> <20151001183643 dot GD28276 at tucnak dot redhat dot com> <CAFiYyc1FTirc+cgo1fmqanNRDw=iXSN2uH6LjEaj7yGyr1isqA at mail dot gmail dot com> <20151002090358 dot GE28276 at tucnak dot redhat dot com> <CAFiYyc24XG5L8wUCBd-deBEw1HapcAmxSgyvg8JYnPKPXrxA7g at mail dot gmail dot com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Fri, Oct 02, 2015 at 11:10:58AM +0200, Richard Biener wrote:
> > BTW, it seems wrf also in many places uses MAX <copysign (1.0, y), 0.0>
> > or MIN <copysign (1.0, y), 0.0> (always in pairs), would that be also
> > something to optimize?
>
> Hmm, we'll already CSE copysign so the question is how to optimize
> tem1 = MAX <x, y>; tem2= MIN <x, y>; Turn them back into control-flow?
> What would we like to end up with in assembly?
This wasn't a well thought thing. Just that perhaps depending on how
copysign is expanded we might merge that with the min/max somehow.
> > Also, the x * copysign (1.0, y) in wrf is actually x * (1/12.) * copysign (1.0, y)
> > (or similar - other constants), wouldn't it make more sense to optimize that
> > as x * copysign (1/12., y) first (at least if we can reassociate)?
>
> Yeah, I think CST * copysign (CST, ...) should constant fold to
> copysign (CST', ...)
> if that's always valid. I don't think association comes into play
> here but as always
> you read the fine prints of the standard for FP optimziations...
The reason talking about reassoc is that we might not necessarily see
CST * copysign (CST, ...) in the IL, but something different:
_2051 = __builtin_copysignf (1.0e+0, vel_2000);
...
_2059 = _2051 * _2058;
_2060 = _2059 * 1.666666753590106964111328125e-2;
is before reassoc1 and
_2051 = __builtin_copysignf (1.0e+0, vel_2000);
...
_2047 = _2051 * 1.666666753590106964111328125e-2;
_2060 = _2047 * _2058;
is after reassoc1, so in this case reassoc1 worked fine, but in another spot
I see
_2968 = __builtin_copysignf (1.0e+0, vel_2943);
...
_2973 = _2968 * _2972;
_2974 = _2973 * 8.3333335816860198974609375e-2;
before reassoc1 and
_2968 = __builtin_copysignf (1.0e+0, vel_2943);
...
_3008 = _2972 * 8.3333335816860198974609375e-2;
_2974 = _3008 * _2968;
after reassoc1, so clearly reassoc puts those two together only randomly.
So, either we'd teach reassoc to prefer putting constant and copysign of
constant together, or even perform this optimization, or the match.pd
(or elsewhere) change would need additional smarts.
Note, I won't have time to work on this in the near future (OpenMP work
still on the plate), so if James (or anyone else?) has time for that, it
would be greatly appreciated.
Jakub