This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

From: "Kumar, Venkataramanan" <Venkataramanan dot Kumar at amd dot com>
To: "pinskia at gmail dot com" <pinskia at gmail dot com>
Cc: "Dr. Philipp Tomsich" <philipp dot tomsich at theobroma-systems dot com>, "Benedikt Huber" <benedikt dot huber at theobroma-systems dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
Date: Mon, 29 Jun 2015 08:17:03 +0000
Subject: RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

Hmm, after reducing the iterations to "1 step for float" and "2 steps for double",
I got VE (miscompares) on the following benchmarks:
416.gamess 
453.povray         
454.calculix   
459.GemsFDTD  

Benedikt, I get an ICE for 444.namd with your patch; I am not sure whether something is wrong in my local tree.

Regards,
Venkat.

> -----Original Message-----
> From: pinskia@gmail.com [mailto:pinskia@gmail.com]
> Sent: Sunday, June 28, 2015 8:35 PM
> To: Kumar, Venkataramanan
> Cc: Dr. Philipp Tomsich; Benedikt Huber; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
>
> > On Jun 25, 2015, at 9:44 AM, Kumar, Venkataramanan
> > <Venkataramanan.Kumar@amd.com> wrote:
> >
> > I got around ~12% gain with -Ofast -mcpu=cortex-a57.
> 
> I get around 11-12% on thunderX with the patch plus the reduced
> iterations change (1/2), compared to without the patch.
> 
> Thanks,
> Andrew
> 
> 
> >
> > Regards,
> > Venkat.
> >
> >> -----Original Message-----
> >> From: gcc-patches-owner@gcc.gnu.org
> >> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Dr. Philipp Tomsich
> >> Sent: Thursday, June 25, 2015 9:13 PM
> >> To: Kumar, Venkataramanan
> >> Cc: Benedikt Huber; pinskia@gmail.com; gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root
> >> (rsqrt) estimation in -ffast-math
> >>
> >> Kumar,
> >>
> >> what is the relative gain that you see on Cortex-A57?
> >>
> >> Thanks,
> >> Philipp.
> >>
> >>>> On 25 Jun 2015, at 17:35, Kumar, Venkataramanan
> >>> <Venkataramanan.Kumar@amd.com> wrote:
> >>>
> >>> Changing to "1 step for float" and "2 steps for double" now gives
> >>> better gains for gromacs on cortex-a57.
> >>>
> >>> Regards,
> >>> Venkat.
> >>>> -----Original Message-----
> >>>> From: gcc-patches-owner@gcc.gnu.org
> >>>> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Benedikt Huber
> >>>> Sent: Thursday, June 25, 2015 4:09 PM
> >>>> To: pinskia@gmail.com
> >>>> Cc: gcc-patches@gcc.gnu.org; philipp.tomsich@theobroma-systems.com
> >>>> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root
> >>>> (rsqrt) estimation in -ffast-math
> >>>>
> >>>> Andrew,
> >>>>
> >>>>> This is NOT a win on thunderX, at least for single precision,
> >>>>> because you have to do the divide and sqrt in the same time as it
> >>>>> takes 5 multiplies (estimate and step are multiplies in the
> >>>>> thunderX pipeline). Doubles is 10 multiplies, which is just the
> >>>>> same as what the patch does (but it is really slightly less than
> >>>>> 10, I rounded up). So in the end this is NOT a win at all for
> >>>>> thunderX unless we do one less step for both single and double.
> >>>>
> >>>> Yes, the expected benefit from rsqrt estimation is implementation
> >>>> specific. If one has a better initial rsqrte or an application that
> >>>> can trade precision for execution time, we could offer a command
> >>>> line option to do only 2 steps for double and 1 step for float,
> >>>> similar to -mrecip-precision for PowerPC.
> >>>> What are your thoughts on that?
> >>>>
> >>>> Best regards,
> >>>> Benedikt
> >

