This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
- From: "Kumar, Venkataramanan" <Venkataramanan dot Kumar at amd dot com>
- To: Benedikt Huber <benedikt dot huber at theobroma-systems dot com>, "pinskia at gmail dot com" <pinskia at gmail dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "philipp dot tomsich at theobroma-systems dot com" <philipp dot tomsich at theobroma-systems dot com>
- Date: Thu, 25 Jun 2015 15:35:28 +0000
- Subject: RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
- References: <1434629045-24650-1-git-send-email-benedikt dot huber at theobroma-systems dot com> <8B73CF78-11D4-4963-A60A-E1C2A3B219E2 at gmail dot com> <F2FF9755-1DF9-4000-8602-77AB12077240 at theobroma-systems dot com>
Changing to "1 step for float" and "2 steps for double" now gives better gains for gromacs on Cortex-A57.
Regards,
Venkat.
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Benedikt Huber
> Sent: Thursday, June 25, 2015 4:09 PM
> To: pinskia@gmail.com
> Cc: gcc-patches@gcc.gnu.org; philipp.tomsich@theobroma-systems.com
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
>
> Andrew,
>
> > This is NOT a win on thunderX at least for single precision because you have
> > to do the divide and sqrt in the same time as it takes 5 multiplies (estimate
> > and step are multiplies in the thunderX pipeline). Doubles is 10 multiplies
> > which is just the same as what the patch does (but it is really slightly less
> > than 10, I rounded up). So in the end this is NOT a win at all for thunderX
> > unless we do one less step for both single and double.
>
> Yes, the expected benefit from rsqrt estimation is implementation specific. If
> one has a better initial rsqrte or an application that can trade precision for
> execution time, we could offer a command line option to do only 2 steps for
> double and 1 step for float, similar to -mrecip-precision for PowerPC.
> What are your thoughts on that?
>
> Best regards,
> Benedikt
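
To make the step-count trade-off discussed in this thread concrete, here is a minimal sketch in C of a reciprocal square root computed from an initial estimate plus a configurable number of Newton-Raphson refinement steps. It is not the code from the patch. The function names (rsqrt_estimate, rsqrt_refine, rsqrt_residual) are hypothetical, and the bit-level initial guess is only a portable stand-in for the hardware rsqrte estimate, whose accuracy differs between implementations.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware estimate instruction: a bit-level
   initial guess for 1/sqrt(d). Valid for positive, normal floats only. */
static float rsqrt_estimate(float d)
{
    uint32_t bits;
    float x;

    memcpy(&bits, &d, sizeof bits);      /* reinterpret float as integer */
    bits = 0x5f3759dfu - (bits >> 1);    /* well-known magic-constant guess */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Refine the estimate with `steps` Newton-Raphson iterations:
   x' = x * (3 - d * x * x) / 2. Each step costs a few multiplies and
   roughly squares the relative error. */
static float rsqrt_refine(float d, int steps)
{
    float x = rsqrt_estimate(d);

    for (int i = 0; i < steps; i++)
        x = x * (1.5f - 0.5f * d * x * x);
    return x;
}

/* |d * x * x - 1|: zero when x is exactly 1/sqrt(d), and about twice the
   relative error of x otherwise. Computed in double to limit rounding. */
static double rsqrt_residual(float d, float x)
{
    double r = (double)x * (double)x * (double)d - 1.0;

    return r < 0.0 ? -r : r;
}
```

Comparing rsqrt_residual(d, rsqrt_refine(d, 1)) with rsqrt_residual(d, rsqrt_refine(d, 2)) shows what one fewer step gives up in accuracy. Whether the saved multiplies beat the hardware divide and sqrt depends on the core, which is the point made about thunderX above.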