[PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
Dr. Philipp Tomsich
philipp.tomsich@theobroma-systems.com
Wed Jun 24 20:11:00 GMT 2015
Evandro,
Shouldn't ‘execute_cse_reciprocals_1’ take care of this, once the reciprocal-division is implemented?
Do you think there’s additional work needed to catch all cases/opportunities?
Best,
Philipp.
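
For anyone following along, here is a rough sketch of the two ideas under discussion: a division x/y can be rewritten as x * (1/y), and 1/y can be refined from a coarse estimate with Newton-Raphson steps of the form r' = r * (2 - y*r). This is an illustrative model in plain C, not code from any of the patches. The function names, the caller-supplied initial estimate (standing in for a hardware estimate instruction) and the step count are all assumptions.

```c
#include <assert.h>

/* Hypothetical helper: refine an initial estimate r0 of 1/y with `steps`
   Newton-Raphson iterations, r' = r * (2 - y * r).  In a real
   implementation r0 would come from a hardware estimate instruction;
   here the caller supplies it. */
double recip_refine(double y, double r0, int steps)
{
    assert(y != 0.0 && steps >= 0);
    double r = r0;
    for (int i = 0; i < steps; i++)
        r = r * (2.0 - y * r);
    return r;
}

/* Hypothetical helper: compute x / y as x * (1 / y). */
double div_via_recip(double x, double y, double r0, int steps)
{
    return x * recip_refine(y, r0, steps);
}
```

Each step roughly doubles the number of correct bits, so an estimate good to about 8 bits would need three steps to reach double precision. For example, div_via_recip(10.0, 4.0, 0.2509765625, 3) starts from an estimate of 1/4 that is off by one part in 256.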
> On 24 Jun 2015, at 20:19, Evandro Menezes <e.menezes@samsung.com> wrote:
>
> Benedikt,
>
> Are you developing the reciprocal approximation just for 1/x proper or for any division, as in x/y = x * 1/y?
>
> Thank you,
>
> --
> Evandro Menezes Austin, TX
>
>
>> -----Original Message-----
>> From: Benedikt Huber [mailto:benedikt.huber@theobroma-systems.com]
>> Sent: Wednesday, June 24, 2015 12:11
>> To: Dr. Philipp Tomsich
>> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
>> estimation in -ffast-math
>>
>> Evandro,
>>
>> Yes, we also have the 1/x approximation.
>> However, we do not have test cases for it yet, and it also needs some
>> cleanup.
>> I am going to provide a patch for that soon (say next week).
>> Also, for this optimization we have *not* yet found a benchmark with
>> significant improvements.
>>
>> Best Regards,
>> Benedikt
>>
>>
>>> On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich <philipp.tomsich@theobroma-systems.com> wrote:
>>>
>>> Evandro,
>>>
>>> We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar)
>>> reciprocal sqrt.
>>>
>>> Also, the “reciprocal divide” patches are floating around in various
>>> of our git trees, but aren’t ready for public consumption yet… I’ll
>>> leave it to Benedikt to comment on potential timelines for getting
>>> them pushed out.
>>>
>>> Best,
>>> Philipp.
>>>
>>>> On 24 Jun 2015, at 18:42, Evandro Menezes <e.menezes@samsung.com> wrote:
>>>>
>>>> Benedikt,
>>>>
>>>> You beat me to it! :-) Do you have the implementation for dividing
>>>> using the Newton series as well?
>>>>
>>>> I'm not sure that the series is always faster for all data types and on all
>>>> processors. It would be useful to allow each AArch64 processor to
>>>> enable this or not depending on the data type. BTW, do you have some
>>>> tests showing the speed up?
>>>>
>>>> Thank you,
>>>>
>>>> --
>>>> Evandro Menezes Austin, TX
>>>>
>>>>> -----Original Message-----
>>>>> From: gcc-patches-owner@gcc.gnu.org
>>>>> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Benedikt Huber
>>>>> Sent: Thursday, June 18, 2015 7:04
>>>>> To: gcc-patches@gcc.gnu.org
>>>>> Cc: benedikt.huber@theobroma-systems.com;
>>>>> philipp.tomsich@theobroma-systems.com
>>>>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
>>>>> (rsqrt) estimation in -ffast-math
>>>>>
>>>>> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt
>>>>> estimation and a Newton-Raphson step, respectively.
>>>>> There are ARMv8 implementations where this is faster than using fdiv
>>>>> and fsqrt.
>>>>> It runs three steps for double and two steps for float to achieve
>>>>> the needed precision.
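
The refinement described above can be sketched in portable C as follows. This is a minimal model under stated assumptions, not the code from the patch: the initial estimate is passed in by the caller (on AArch64 it would come from frsqrte), the helper names are hypothetical, and the order of the multiplications inside each step is a guess. The step helper models the documented behaviour of frsqrts, which computes (3 - a*b) / 2.

```c
#include <assert.h>

/* Model of the frsqrts correction term: (3 - a * b) / 2. */
static float rsqrt_stepf(float a, float b) { return (3.0f - a * b) / 2.0f; }
static double rsqrt_step(double a, double b) { return (3.0 - a * b) / 2.0; }

/* Refine est ~= 1/sqrt(d) using x' = x * (3 - d * x * x) / 2. */
float rsqrtf_refine(float d, float est, int steps)
{
    assert(d > 0.0f && steps >= 0);
    float x = est;
    for (int i = 0; i < steps; i++)
        x = x * rsqrt_stepf(d * x, x);
    return x;
}

double rsqrt_refine(double d, double est, int steps)
{
    assert(d > 0.0 && steps >= 0);
    double x = est;
    for (int i = 0; i < steps; i++)
        x = x * rsqrt_step(d * x, x);
    return x;
}

/* Step counts as stated in the description: two for float, three for double. */
float approx_rsqrtf(float d, float est) { return rsqrtf_refine(d, est, 2); }
double approx_rsqrt(double d, double est) { return rsqrt_refine(d, est, 3); }
```

Each step roughly doubles the number of correct bits, so starting from an estimate good to about 8 bits, two steps cover the 24-bit float significand and three cover the 53-bit double significand.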
>>>>>
>>>>> There is one caveat and open question.
>>>>> Since -ffast-math enables flush-to-zero, intermediate values between
>>>>> approximation steps will be flushed to zero if they are denormal.
>>>>> E.g., this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
>>>>> The test cases pass, but it is unclear to me whether this is
>>>>> expected behavior with -ffast-math.
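
To make the caveat concrete: the reciprocal square root of DBL_MAX is about 2^-512, and its square, about 2^-1024, lies below DBL_MIN (2^-1022), so it is denormal. Whether such a value actually arises depends on the order in which a refinement step multiplies its operands, which the description above does not state. The two orderings below are assumptions for illustration only, and all function names are hypothetical.

```c
#include <float.h>
#include <stdbool.h>

/* True if v is nonzero but smaller in magnitude than the smallest normal
   value, i.e. a denormal that flush-to-zero would replace with zero. */
static bool is_denormal(double v)
{
    return v != 0.0 && v > -DBL_MIN && v < DBL_MIN;
}

static bool is_denormalf(float v)
{
    return v != 0.0f && v > -FLT_MIN && v < FLT_MIN;
}

/* Assumed ordering A: the step forms x * x before multiplying by d. */
bool square_first_is_denormal(double x)
{
    volatile double sq = x * x;   /* volatile forces rounding to double */
    return is_denormal(sq);
}

bool square_first_is_denormalf(float x)
{
    volatile float sq = x * x;    /* volatile forces rounding to float */
    return is_denormalf(sq);
}

/* Assumed ordering B: the step forms d * x first. */
bool product_first_is_denormal(double d, double x)
{
    volatile double dx = d * x;
    return is_denormal(dx);
}

bool product_first_is_denormalf(float d, float x)
{
    volatile float dx = d * x;
    return is_denormalf(dx);
}
```

With x = 0x1p-512 (close to rsqrt (DBL_MAX)) ordering A produces a denormal intermediate while ordering B does not; the same holds for x = 0x1p-64f and FLT_MAX in single precision.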
>>>>>
>>>>> The patch applies to commit:
>>>>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
>>>>>
>>>>> Please consider including this patch.
>>>>> Thank you and best regards,
>>>>> Benedikt Huber
>>>>>
>>>>> Benedikt Huber (1):
>>>>> 2015-06-15 Benedikt Huber <benedikt.huber@theobroma-systems.com>
>>>>>
>>>>> gcc/ChangeLog | 9 +++
>>>>> gcc/config/aarch64/aarch64-builtins.c | 60 ++++++++++++++++
>>>>> gcc/config/aarch64/aarch64-protos.h | 2 +
>>>>> gcc/config/aarch64/aarch64-simd.md | 27 ++++++++
>>>>> gcc/config/aarch64/aarch64.c | 63 +++++++++++++++++
>>>>> gcc/config/aarch64/aarch64.md | 3 +
>>>>> gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
>>>>> 7 files changed, 277 insertions(+)
>>>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
>>>>>
>>>>> --
>>>>> 1.9.1
>>>
>
>