This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [AArch64] Add precision choices for the reciprocal square root approximation
- From: Evandro Menezes <e dot menezes at samsung dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: James Greenhalgh <James dot Greenhalgh at arm dot com>, Andrew Pinski <pinskia at gmail dot com>, nd <nd at arm dot com>
- Date: Fri, 18 Mar 2016 11:19:29 -0500
- Subject: Re: [AArch64] Add precision choices for the reciprocal square root approximation
- References: <56EB2BDC dot 30209 at samsung dot com> <AM3PR08MB00883C48B491A1BA92CD0783838C0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com>
On 03/18/16 10:21, Wilco Dijkstra wrote:
> Hi Evandro,
>
>> For example, though this approximation improves the performance
>> noticeably for DF on A57, for SF, not so much, if at all.
>
> I'm still skeptical that you can ever get any gain on scalars. I bet the only gain is on
> 4x vectorized floats.
I created a simple test that loops around an inline asm version of the
Newton series using scalar insns and got these results on A57:
    1/sqrt(x):   18290898/s
    Fast:        45896823/s

    1/sqrtf(x):  69618490/s
    Fast:        61865874/s
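
For readers who want to see what the "Fast" variant computes, here is a rough, portable C sketch of the Newton series for 1/sqrt(x). It is not the inline asm used for the numbers above and not the code from the patch. The names rsqrt_estimate, rsqrt_step and rsqrt_approx are made up for illustration, and the bit-level initial guess is only an assumed stand-in for the hardware estimate instruction (FRSQRTE on AArch64), so the accuracy per step will differ from real hardware.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the hardware estimate instruction (FRSQRTE on AArch64).
   Assumption: a bit-level guess is used here only to keep the sketch
   portable; it is valid for positive, normal inputs only.  */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);   /* roughly halve and negate the exponent */
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton step for f(y) = 1/y^2 - x:  y' = y * (3 - x*y*y) / 2.
   The (3 - a*b) / 2 factor is what FRSQRTS computes on AArch64.  */
static float rsqrt_step(float x, float y)
{
    return y * (3.0f - x * y * y) * 0.5f;
}

/* Approximate 1/sqrt(x) by refining the estimate `steps` times.  */
static float rsqrt_approx(float x, int steps)
{
    float y = rsqrt_estimate(x);

    for (int i = 0; i < steps; i++)
        y = rsqrt_step(x, y);
    return y;
}
```

Each step roughly doubles the number of correct bits, so the dependent chain of multiplies grows with the precision required. That is one plausible reason the scalar SF case has less to gain than DF.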
> So what I would like to see is this implemented in a more general way. We should
> be able to choose whether to expand depending on the mode - including whether it is
> vectorized. For example enable on V4SFmode and maybe V2DFmode, but not
> on any scalars.
> Then we'd add new CPU tuning settings for division, sqrt and rsqrt (rather than adding lots
> of extra tune flags).
If I understood you correctly, would something like coarse tuning flags
along with target-specific cost or parameter tables be what you have in
mind?
> Note the md file should call a function in aarch64.c to decide whether to
> expand or not (your division approximation patch makes the decision in the md file which
> does not seem a good idea).
I agree. Will modify it.
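
To make the idea discussed above concrete, here is a minimal sketch of a per-CPU tuning table holding one mode mask per operation, queried through a single predicate that the md expanders could call. Everything in it is hypothetical: the enum, struct approx_tune, example_tune and use_rsqrt_approx_p are illustrative names only, not existing GCC types or functions, and the enum merely stands in for the real machine modes.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the machine modes; not GCC's real enum.  */
enum approx_mode { MODE_SF, MODE_DF, MODE_V2SF, MODE_V4SF, MODE_V2DF };

#define MODE_BIT(m) (1u << (m))

/* Hypothetical per-CPU tuning entry: one bitmask of modes for each
   operation that may be expanded into an approximation series.  */
struct approx_tune
{
    unsigned division;
    unsigned sqrt;
    unsigned recip_sqrt;
};

/* Example entry: approximate rsqrt for V4SF and V2DF only, and never
   for scalars, as suggested in the quoted text.  */
static const struct approx_tune example_tune = {
    .division   = 0,
    .sqrt       = 0,
    .recip_sqrt = MODE_BIT(MODE_V4SF) | MODE_BIT(MODE_V2DF),
};

/* The single decision point: return true if rsqrt in MODE should be
   expanded into the approximation for this tuning entry.  */
static bool use_rsqrt_approx_p(const struct approx_tune *tune,
                               enum approx_mode mode)
{
    return (tune->recip_sqrt & MODE_BIT(mode)) != 0;
}
```

With this shape, enabling or disabling a mode for a given CPU becomes a one-line table change, and the md file keeps no tuning logic of its own.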
Thank you,
--
Evandro Menezes