This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch match.pd] Add a simplify rule for x * copysign (1.0, y);


On Fri, Oct 2, 2015 at 11:27 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Oct 02, 2015 at 11:10:58AM +0200, Richard Biener wrote:
>> > BTW, it seems wrf also in many places uses MAX <copysign (1.0, y), 0.0>
>> > or MIN <copysign (1.0, y), 0.0> (always in pairs), would that be also
>> > something to optimize?
>>
>> Hmm, we'll already CSE copysign so the question is how to optimize
>> tem1 = MAX <x, y>; tem2= MIN <x, y>;  Turn them back into control-flow?
>> What would we like to end up with in assembly?
>
> This wasn't a well thought thing.  Just that perhaps depending on how
> copysign is expanded we might merge that with the min/max somehow.
>
>> > Also, the x * copysign (1.0, y) in wrf is actually x * (1/12.) * copysign (1.0, y)
>> > (or similar - other constants), wouldn't it make more sense to optimize that
>> > as x * copysign (1/12., y) first (at least if we can reassociate)?
>>
>> Yeah, I think CST * copysign (CST, ...) should constant fold to
>> copysign (CST', ...)
>> if that's always valid.  I don't think association comes into play
>> here but as always
>> you read the fine prints of the standard for FP optimziations...
>
> The reason talking about reassoc is that we might not necessarily see
> CST * copysign (CST, ...) in the IL, but something different:
>   _2051 = __builtin_copysignf (1.0e+0, vel_2000);
> ...
>   _2059 = _2051 * _2058;
>   _2060 = _2059 * 1.666666753590106964111328125e-2;
> is before reassoc1 and
>   _2051 = __builtin_copysignf (1.0e+0, vel_2000);
> ...
>   _2047 = _2051 * 1.666666753590106964111328125e-2;
>   _2060 = _2047 * _2058;
> is after reassoc1, so in this case reassoc1 worked fine, but in another spot
> I see
>   _2968 = __builtin_copysignf (1.0e+0, vel_2943);
> ...
>   _2973 = _2968 * _2972;
>   _2974 = _2973 * 8.3333335816860198974609375e-2;
> before reassoc1 and
>   _2968 = __builtin_copysignf (1.0e+0, vel_2943);
> ...
>   _3008 = _2972 * 8.3333335816860198974609375e-2;
>   _2974 = _3008 * _2968;
> after reassoc1, so clearly reassoc puts those two together only randomly.
> So, either we'd teach reassoc to prefer putting constant and copysign of
> constant together, or even perform this optimization, or the match.pd
> (or elsewhere) change would need additional smarts.

I think for this case reassoc would need to assign the proper 'rank' to
the DEF with the copysign (based on the rank of its first argument).

But yes, this kind of association dependent patterns need to be handled
in reassoc itself.

> Note, I won't have time to work on this in the near future (OpenMP work
> still on the plate), so if James (or anyone else?) has time for that, it
> would be greatly appreciated.

So maybe just open an enhancement bug for now, citing the WRF use.

Richard.

>         Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]