[ARM] Implement division using vrecpe, vrecps

Wilco Dijkstra Wilco.Dijkstra@arm.com
Fri Nov 2 13:38:00 GMT 2018


Prathamesh Kulkarni wrote:

> This is a rebased version of patch that adds a pattern to neon.md for
> implementing division with multiplication by reciprocal using
> vrecpe/vrecps with -funsafe-math-optimizations excluding -Os.
> The newly added test-cases are not vectorized on armeb target with
> -O2. I posted the analysis for that here:
> https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01765.html

I don't think doing this unconditionally for every CPU is a good idea. On AArch64
we don't enable this for any core since it's not really faster: newer CPUs have
significantly improved division, and the reciprocal instructions reduce the
throughput of other FMAs. On wrf, the reciprocal square root approximation helps
far more than the reciprocal division approximation, but even that is only faster
on some specific CPUs, so it's not enabled by default.
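
For readers unfamiliar with the transformation under discussion, here is a
minimal scalar sketch in portable C of division via a reciprocal estimate
refined by Newton-Raphson steps. It is not the code from the patch. The
helper names (recip_estimate, recip_step, approx_div) are hypothetical, the
initial estimate is a simple linear fit standing in for what vrecpe provides
in hardware, and the choice of two refinement steps is an assumption made
for illustration. Only the form of the step, 2 - d*x, mirrors what vrecps
computes.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for vrecpe: a crude estimate of 1/d.
   Assumes d is finite and nonzero. */
static float recip_estimate(float d)
{
    int e;
    float m;

    assert(d != 0.0f && isfinite(d));
    m = frexpf(fabsf(d), &e);          /* |d| = m * 2^e, 0.5 <= m < 1 */
    /* Linear fit to 1/m on [0.5, 1); relative error is at most 1/17. */
    float r = 48.0f / 17.0f - (32.0f / 17.0f) * m;
    return copysignf(ldexpf(r, -e), d);
}

/* Same arithmetic as a vrecps step: returns 2 - d*x. */
static float recip_step(float d, float x)
{
    return 2.0f - d * x;
}

/* a / b approximated as a * (1/b), with two Newton-Raphson refinements.
   Each refinement roughly squares the relative error of the estimate. */
static float approx_div(float a, float b)
{
    float x = recip_estimate(b);
    x = x * recip_step(b, x);
    x = x * recip_step(b, x);
    return a * x;
}
```

Even in this sketch each division costs several dependent multiplies, and on
real hardware the refinement steps occupy the same units as FMAs, which is
the throughput concern above. The result is also not correctly rounded,
hence the -funsafe-math-optimizations requirement.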

Wilco

