This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [AArch64] Add precision choices for the reciprocal square root approximation

From: Evandro Menezes <e dot menezes at samsung dot com>
To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Cc: James Greenhalgh <James dot Greenhalgh at arm dot com>, Andrew Pinski <pinskia at gmail dot com>, nd <nd at arm dot com>
Date: Fri, 18 Mar 2016 11:19:29 -0500
Subject: Re: [AArch64] Add precision choices for the reciprocal square root approximation
Authentication-results: sourceware.org; auth=none
References: <56EB2BDC dot 30209 at samsung dot com> <AM3PR08MB00883C48B491A1BA92CD0783838C0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com>

On 03/18/16 10:21, Wilco Dijkstra wrote:

Hi Evandro,

For example, though this approximation improves the performance
noticeably for DF on A57, for SF, not so much, if at all.

I'm still skeptical that you ever can get any gain on scalars. I bet the only gain is on
4x vectorized floats.

I created a simple test that loops around an inline asm version of the Newton series using scalar insns and got these results on A57:


   1/sqrt(x):    18290898/s
   Fast:         45896823/s

   1/sqrtf(x):   69618490/s
   Fast:         61865874/s
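
For readers unfamiliar with the Newton series mentioned above, here is a minimal portable sketch of the idea for DF. It is not the inline asm used for the measurements. The helper rsqrt_estimate is hypothetical: it uses a well-known bit-manipulation seed in place of a hardware estimate instruction, and it is only meaningful for positive, normal, finite inputs. The number of steps is the precision knob.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: crude initial estimate of 1/sqrt(x), standing in
   for a hardware estimate instruction.  Assumes x is positive, normal
   and finite; the relative error of the seed is a few percent.  */
static double rsqrt_estimate(double x)
{
    uint64_t bits;
    double y;

    memcpy(&bits, &x, sizeof bits);
    bits = UINT64_C(0x5fe6eb50c7b537a9) - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* Refine the estimate with `steps` Newton iterations for 1/sqrt(x):
       y' = y * (3 - x * y * y) / 2
   Each step roughly squares the relative error, so the step count
   trades speed against precision.  */
double rsqrt_newton(double x, int steps)
{
    double y = rsqrt_estimate(x);

    for (int i = 0; i < steps; i++)
        y = y * (3.0 - x * y * y) * 0.5;
    return y;
}
```

With this seed, two steps give roughly single-precision accuracy and four steps approach double precision. A hardware estimate typically starts from a better seed and so needs fewer steps.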

So what I would like to see is this implemented in a more general way. We should
be able to choose whether to expand depending on the mode - including whether it is
vectorized. For example enable on V4SFmode and maybe V2DFmode, but not
on any scalars.

Then we'd add new CPU tuning settings for division, sqrt and rsqrt (rather than adding lots
of extra tune flags).

If I understood you correctly, would something like coarse tuning flags along with target-specific cost or parameters tables be what you have in mind?

Note the md file should call a function in aarch64.c to decide whether to
expand or not (your division approximation patch makes the decision in the md file which
does not seem a good idea).


I agree.  Will modify it.
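
To make the suggestion concrete, a rough sketch of per-CPU tuning with one mode mask per operation and a single decision function might look like the following. Every name here (fp_mode, approx_op, approx_tuning, example_tuning, use_approx_p) is hypothetical and chosen for illustration; none of it is existing GCC code, and the values in the example table are placeholders rather than measured tunings.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the machine modes under discussion.  */
enum fp_mode { MODE_SF, MODE_DF, MODE_V2SF, MODE_V4SF, MODE_V2DF };

/* Operations that could be expanded into an approximation series.  */
enum approx_op { OP_DIV, OP_SQRT, OP_RSQRT, OP_COUNT };

#define MODE_BIT(m) (1u << (m))

/* Per-CPU tuning: for each operation, a bitmask of the modes in which
   the approximation should be used.  */
struct approx_tuning {
    unsigned modes[OP_COUNT];
};

/* Placeholder table: approximate rsqrt on V4SF and V2DF only, and
   never on scalars, as suggested above.  */
const struct approx_tuning example_tuning = {
    .modes = {
        [OP_DIV]   = 0,
        [OP_SQRT]  = 0,
        [OP_RSQRT] = MODE_BIT(MODE_V4SF) | MODE_BIT(MODE_V2DF),
    },
};

/* The single decision point that an md expander would call instead of
   testing tuning flags itself.  */
bool use_approx_p(const struct approx_tuning *tune,
                  enum approx_op op, enum fp_mode mode)
{
    return (tune->modes[op] & MODE_BIT(mode)) != 0;
}
```

Keeping the decision in one function means a new CPU only needs a new table entry, and the md patterns stay free of tuning logic.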

Thank you,

--
Evandro Menezes

Follow-Ups:
- Re: [AArch64] Add precision choices for the reciprocal square root approximation
  - From: Wilco Dijkstra

References:
- [AArch64] Add precision choices for the reciprocal square root approximation
  - From: Evandro Menezes
- Re: [AArch64] Add precision choices for the reciprocal square root approximation
  - From: Wilco Dijkstra
