This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [AArch64] Add precision choices for the reciprocal square root approximation


On 03/18/16 18:00, Evandro Menezes wrote:
On 03/18/16 17:20, Wilco Dijkstra wrote:
Evandro Menezes <e.menezes@samsung.com> wrote:
On 03/18/16 10:21, Wilco Dijkstra wrote:
Hi Evandro,

For example, though this approximation improves the performance
noticeably for DF on A57, for SF, not so much, if at all.
I'm still skeptical that you can ever get any gain on scalars. I bet the only gain is on
4x vectorized floats.
I created a simple test that loops around an inline asm version of the
Newton series using scalar insns and got these results on A57:
That's pure max throughput rather than answering the question whether
it speeds up code that does real work. A test that loads an array of vectors and
writes back the unit vectors would be a more realistic scenario.

Note our testing showed rsqrt slows down various benchmarks:
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00574.html.

I remember having seen that, but my point is that if A57 enabled this for only DF, it might be an overall improvement.

If I understood you correctly, would something like coarse tuning flags
along with target-specific cost or parameters tables be what you have in
mind?
Yes, the magic tuning flags can be coarse (on/off is good enough). If we can agree that these expansions are really only useful for 4x vectorized code and not much else then all we need is a function that enables it for those modes. Otherwise we would need per-CPU settings that select which expansions are
enabled for which modes (not just single/double).

Just to be clear, the flags refer to the inner mode, whether scalar or vector. I'm not hopeful that it can be said that this is only useful when vectorized though.
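
To make the point about inner modes concrete, here is a rough sketch of coarse on/off flags keyed on the element mode. It is not the patch under review and not GCC's actual tuning structures: the names inner_mode, rsqrt_flags, cpu_tuning, example_df_only and use_rsqrt_approx are all hypothetical, chosen only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical element (inner) modes: single and double precision. */
enum inner_mode { INNER_SF, INNER_DF };

/* Hypothetical coarse on/off flags, one bit per inner mode. */
enum rsqrt_flags {
    RSQRT_NONE = 0,
    RSQRT_SF   = 1u << 0,
    RSQRT_DF   = 1u << 1
};

/* Hypothetical per-CPU tuning entry. */
struct cpu_tuning {
    const char *name;
    unsigned rsqrt_modes;   /* bitwise OR of rsqrt_flags */
};

/* Example entry (an assumption): approximation enabled for DF only. */
static const struct cpu_tuning example_df_only = {
    "example-df-only", RSQRT_DF
};

/* Decide whether to use the approximation. The decision depends only on
   the inner mode, so a scalar and a vector of the same element type are
   treated alike; lanes is accepted but deliberately ignored. */
bool use_rsqrt_approx(const struct cpu_tuning *t, enum inner_mode m,
                      unsigned lanes)
{
    (void) lanes;
    unsigned bit = (m == INNER_SF) ? RSQRT_SF : RSQRT_DF;
    return (t->rsqrt_modes & bit) != 0;
}
```

Restricting the expansion to 4x vectorized code instead would mean testing lanes in use_rsqrt_approx rather than ignoring it.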


Ping^1

--
Evandro Menezes

