This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

From: "Dr. Philipp Tomsich" <philipp dot tomsich at theobroma-systems dot com>
To: "Kumar, Venkataramanan" <Venkataramanan dot Kumar at amd dot com>
Cc: Benedikt Huber <benedikt dot huber at theobroma-systems dot com>, "pinskia at gmail dot com" <pinskia at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
Date: Thu, 25 Jun 2015 17:42:52 +0200
Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
Authentication-results: sourceware.org; auth=none
References: <1434629045-24650-1-git-send-email-benedikt dot huber at theobroma-systems dot com> <8B73CF78-11D4-4963-A60A-E1C2A3B219E2 at gmail dot com> <F2FF9755-1DF9-4000-8602-77AB12077240 at theobroma-systems dot com> <7794A52CE4D579448B959EED7DD0A4723DD10430 at satlexdag06 dot amd dot com>

Kumar,

What is the relative gain that you see on Cortex-A57?

Thanks,
Philipp.

> On 25 Jun 2015, at 17:35, Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com> wrote:
> 
> Changing to "1 step for float" and "2 steps for double" now gives better gains for gromacs on Cortex-A57.
> 
> Regards,
> Venkat.
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>> owner@gcc.gnu.org] On Behalf Of Benedikt Huber
>> Sent: Thursday, June 25, 2015 4:09 PM
>> To: pinskia@gmail.com
>> Cc: gcc-patches@gcc.gnu.org; philipp.tomsich@theobroma-systems.com
>> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
>> estimation in -ffast-math
>> 
>> Andrew,
>> 
>>> This is NOT a win on ThunderX, at least for single precision, because the
>>> divide and sqrt take the same time as 5 multiplies (estimate and step are
>>> multiplies in the ThunderX pipeline). Double is 10 multiplies, which is
>>> just the same as what the patch does (it is really slightly less than 10;
>>> I rounded up). So in the end this is NOT a win at all for ThunderX unless
>>> we do one less step for both single and double.
>> 
>> Yes, the expected benefit from rsqrt estimation is implementation-specific. If
>> one has a better initial rsqrte or an application that can trade precision for
>> execution time, we could offer a command-line option to do only 2 steps for
>> double and 1 step for float, similar to -mrecip-precision for PowerPC.
>> What are your thoughts on that?
>> 
>> Best regards,
>> Benedikt

Follow-Ups:
- RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
  - From: Kumar, Venkataramanan

References:
- [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
  - From: Benedikt Huber
- Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
  - From: pinskia
- Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
  - From: Benedikt Huber
- RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
  - From: Kumar, Venkataramanan