This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math


Benedikt,

Are you developing the reciprocal approximation just for 1/x proper or for any division, as in x/y = x * 1/y?

Thank you,

-- 
Evandro Menezes                              Austin, TX


> -----Original Message-----
> From: Benedikt Huber [mailto:benedikt.huber@theobroma-systems.com]
> Sent: Wednesday, June 24, 2015 12:11
> To: Dr. Philipp Tomsich
> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
> 
> Evandro,
> 
> Yes, we also have the 1/x approximation.
> However we do not have the test cases yet, and it also would need some clean
> up.
> I am going to provide a patch for that soon (say next week).
> Also, for this optimization we have *not* yet found a benchmark with
> significant improvements.
> 
> Best Regards,
> Benedikt
> 
> 
> > On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich <philipp.tomsich@theobroma-systems.com> wrote:
> >
> > Evandro,
> >
> > We've seen a 28% speed-up on gromacs in SPECfp for the (scalar) reciprocal
> > sqrt.
> >
> > Also, the "reciprocal divide" patches are floating around in various
> > of our git-trees, but aren't ready for public consumption yet... I'll
> > leave Benedikt to comment on potential timelines for getting that pushed
> > out.
> >
> > Best,
> > Philipp.
> >
> >> On 24 Jun 2015, at 18:42, Evandro Menezes <e.menezes@samsung.com> wrote:
> >>
> >> Benedikt,
> >>
> >> You beat me to it! :-)  Do you have the implementation for dividing
> >> using the Newton series as well?
> >>
> >> I'm not sure that the series is always beneficial for all data types and on all
> >> processors.  It would be useful to allow each AArch64 processor to
> >> enable this or not depending on the data type.  BTW, do you have some
> >> tests showing the speed up?
> >>
> >> Thank you,
> >>
> >> --
> >> Evandro Menezes                              Austin, TX
> >>
> >>> -----Original Message-----
> >>> From: gcc-patches-owner@gcc.gnu.org
> >>> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Benedikt Huber
> >>> Sent: Thursday, June 18, 2015 7:04
> >>> To: gcc-patches@gcc.gnu.org
> >>> Cc: benedikt.huber@theobroma-systems.com;
> >>> philipp.tomsich@theobroma-systems.com
> >>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
> >>> (rsqrt) estimation in -ffast-math
> >>>
> >>> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt
> >>> estimation and a Newton-Raphson step, respectively.
> >>> There are ARMv8 implementations where this is faster than using fdiv
> >>> and fsqrt.
> >>> It runs three steps for double and two steps for float to achieve
> >>> the needed precision.
> >>>
> >>> There is one caveat and open question.
> >>> Since -ffast-math enables flush to zero, intermediate values between
> >>> approximation steps will be flushed to zero if they are denormal.
> >>> E.g., this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
> >>> The test cases pass, but it is unclear to me whether this is
> >>> expected behavior with -ffast-math.
> >>>
> >>> The patch applies to commit:
> >>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
> >>>
> >>> Please consider including this patch.
> >>> Thank you and best regards,
> >>> Benedikt Huber
> >>>
> >>> Benedikt Huber (1):
> >>> 2015-06-15  Benedikt Huber  <benedikt.huber@theobroma-systems.com>
> >>>
> >>> gcc/ChangeLog                            |   9 +++
> >>> gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
> >>> gcc/config/aarch64/aarch64-protos.h      |   2 +
> >>> gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
> >>> gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
> >>> gcc/config/aarch64/aarch64.md            |   3 +
> >>> gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
> >>> 7 files changed, 277 insertions(+)
> >>> create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
> >>>
> >>> --
> >>> 1.9.1
> >> <Mail Attachment.eml>
> >

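The step counts quoted above (two for float, three for double) follow from
how fast a Newton-Raphson iteration for 1/sqrt(d) converges. If the estimate
x has relative error e, one step x * (3 - d*x*x) / 2 leaves a relative error
of roughly 1.5 * e^2, so the number of correct bits about doubles each time.
Below is a minimal sketch in portable C, not the code from the patch. The
names rsqrt_step and rsqrt_refine are hypothetical; rsqrt_step mirrors the
arithmetic of frsqrts, (3 - a*b) / 2, and the initial estimate is passed in
by the caller rather than produced by frsqrte.

```c
#include <assert.h>

/* Hypothetical helper: same arithmetic as the frsqrts instruction,
   (3 - a*b) / 2, written in plain C. */
static double rsqrt_step(double a, double b)
{
    return (3.0 - a * b) / 2.0;
}

/* Hypothetical helper: refine an initial estimate of 1/sqrt(d) with
   `steps` Newton-Raphson iterations.  The estimate is supplied by the
   caller; on AArch64 it would come from frsqrte. */
static double rsqrt_refine(double d, double estimate, int steps)
{
    assert(steps >= 0);
    double x = estimate;
    for (int i = 0; i < steps; i++)
        x = x * rsqrt_step(d, x * x);   /* x <- x * (3 - d*x^2) / 2 */
    return x;
}
```

Assuming an initial estimate good to about 8 bits, two steps give roughly 30
correct bits, which covers the 24-bit significand of float, and a third step
is needed to cover the 53-bit significand of double.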

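The flush-to-zero caveat for rsqrt (DBL_MAX) can be reproduced without
AArch64 hardware. 1/sqrt(DBL_MAX) is close to 2^-512, and its square, 2^-1024,
lies below DBL_MIN (2^-1022), so it is denormal. The sketch below models
flushing in software. It assumes the step squares the estimate before
multiplying by d; the order of operations in the actual patch is not shown
in this thread. The names flush_subnormal and rsqrt_step_once are
hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: models flush-to-zero by replacing any denormal
   value with 0.0. */
static double flush_subnormal(double v)
{
    double mag = v < 0.0 ? -v : v;
    return (mag > 0.0 && mag < DBL_MIN) ? 0.0 : v;
}

/* Hypothetical helper: one Newton-Raphson step for 1/sqrt(d).
   `ftz` selects whether the intermediate x*x is flushed. */
static double rsqrt_step_once(double d, double x, int ftz)
{
    assert(d > 0.0 && x > 0.0);
    double x2 = x * x;             /* assumed order: square the estimate first */
    if (ftz)
        x2 = flush_subnormal(x2);  /* a denormal x*x becomes 0.0 */
    return x * ((3.0 - d * x2) / 2.0);
}
```

With d = DBL_MAX and x = 0x1p-512, the flushed variant sees x*x as zero, so
the correction factor becomes 3/2 and the step returns 1.5 * x instead of
staying near x. Under this assumed ordering, each further step would move the
result away from 1/sqrt(d) rather than toward it.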
