This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
- From: Richard Sandiford <richard dot sandiford at arm dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: David Sherwood <David dot Sherwood at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 19 Aug 2015 10:54:44 +0100
- Subject: Re: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
- Authentication-results: sourceware.org; auth=none
- References: <000001d0d5b0$5da4dbb0$18ee9310$ at arm dot com> <CAFiYyc1bgWwdV4PRLBuUv3yC0X-k5gJVuyyV9V7Vrz3Lte+wZw at mail dot gmail dot com> <000001d0d8cf$2fb42770$8f1c7650$ at arm dot com> <CAFiYyc2XT+iqyRNgp+N2gWsaP-=1xVUWsuUEj+bOq_UmE_1eLw at mail dot gmail dot com> <000001d0d9a6$1efdc350$5cf949f0$ at arm dot com> <CAFiYyc3CLF8beK5GaB86Ad7623gWc9yhc8nTom-ByoaHTEMyOg at mail dot gmail dot com> <87fv3gbs36 dot fsf at e105548-lin dot cambridge dot arm dot com> <CAFiYyc1q7deng7mjnt788RwqkHuvdDCKW=uWpd=goEvTeZiK5Q at mail dot gmail dot com>
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Aug 18, 2015 at 4:15 PM, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Tue, Aug 18, 2015 at 1:07 PM, David Sherwood
>>> <david.sherwood@arm.com> wrote:
>>>>> On Mon, Aug 17, 2015 at 11:29 AM, David Sherwood
>>>>> <david.sherwood@arm.com> wrote:
>>>>> > Hi Richard,
>>>>> >
>>>>> > Thanks for the reply. I'd chosen to add new expressions as this
>>>>> > seemed more
>>>>> > consistent with the existing MAX_EXPR and MIN_EXPR tree codes. In
>>>>> > addition it
>>>>> > would seem to provide more opportunities for optimisation than a
>>>>> > target-specific
>>>>> > builtin implementation would. I accept that optimisation
>>>>> > opportunities will
>>>>> > be more limited for strict math compilation, but that it was still
>>>>> > worth having
>>>>> > them. Also, if we did map it to builtins then the scalar version would go
>>>>> > through the optabs and the vector version would go through the
>>>>> > target's builtin
>>>>> > expansion, which doesn't seem very consistent.
>>>>>
>>>>> On another note ISTR you can't associate STRICT_MIN/MAX_EXPR and thus
>>>>> you can't vectorize anyway? (strict IEEE behavior is about NaNs, correct?)
>>>> I thought for this particular case associativity wasn't an issue?
>>>> We're not doing any
>>>> reductions here, just simply performing max/min operations on each
>>>> pair of elements
>>>> in the vectors. I thought for IEEE-compliant behaviour we just need to
>>>> ensure that for
>>>> each pair of elements if one element is a NaN we return the other one.
>>>
>>> Hmm, true. Ok, my comment still stands - I don't see that using a
>>> tree code is the best thing to do here. You can add fmin/max optabs
>>> and special expansion of BUILT_IN_FMIN/MAX and you can use a target
>>> builtin for the vectorized variant.
>>>
>>> The reason I am pushing against a new tree code is that we'd have an
>>> awful lot of similar codes when pushing other flag related IL
>>> specialities to actual IL constructs. And we still need to find a
>>> consistent way to do that.
>>
>> In this case though the new code is really the "native" min/max operation
>> for fp, rather than some weird flag-dependent behaviour. Maybe it's
>> a bit unfortunate that the non-strict min/max fp operation got mapped
>> to the generic MIN_EXPR and MAX_EXPR when the non-strict version is really
>> the flag-related modification. The STRICT_* prefix is forced by that and
>> might make it seem like more of a special case than it really is.
>
> In some sense. But the "strict" version already has a builtin (just no
> special expander in builtins.c). We usually don't add 1:1 tree codes
> for existing builtins (why have builtins at all then?).
We still need the builtin to match the C function (and to allow direct
calls to __builtin_fmin, etc., which are occasionally useful).
>> If you're still not convinced, how about an internal function instead
>> of a built-in function, so that we can continue to use optabs for all
>> cases? I'd really like to avoid forcing such a generic concept down to
>> target-specific builtins with target-specific expansion code, especially
>> when the same concept is exposed by target-independent code for scalars.
>
> The target builtin is for the vectorized variant - not all targets might have
> that and we'd need to query the target about this. So using a IFN would
> mean adding a target hook for that query.
No, the idea is that if we have a tree code or an internal function, the
decision about whether we have target support is based on a query of the
optabs (just like it is for scalar, and for other vectorisable tree codes).
No new hooks are needed.
The patch checked for target support that way.
> > TBH though I'm not sure why an internal_fn value (or a target-specific
> > builtin enum value) is worse than a tree-code value, unless the limit
> > of the tree_code bitfield is in sight (maybe it is).
>
> I think tree_code is 64bits now.
Even better :-)
Thanks,
Richard