[PATCH][ARM] Rewrite vc<cond> NEON patterns to use RTL operations rather than UNSPECs
Kyrill Tkachov
kyrylo.tkachov@arm.com
Thu Feb 12 15:58:00 GMT 2015
Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00232.html
btw, sorry if the diff looks hard to parse. Some patterns are deleted
and replaced with similar-looking ones, which makes the diffs look
weird. I've tried a few diff algorithms but this is the best I got.
Kyrill
On 04/02/15 12:12, Kyrill Tkachov wrote:
> Hi all,
>
> This patch improves the vc<cond> patterns in neon.md to use proper RTL
> operations rather than UNSPECs.
> It is done in a similar way to the analogous aarch64 operations i.e.
> vceq is expressed as
> (neg (eq (...) (...)))
> since we want to write all 1s to the result element when 'eq' holds and
> 0s otherwise.
>
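To illustrate the (neg (eq ...)) form quoted above, here is a minimal scalar sketch of a single 32-bit lane. It assumes the comparison yields 1 when it holds and 0 otherwise, so negating it gives all 1s or all 0s. The name lane_vceq is hypothetical and is not part of the patch or of any NEON API.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of (neg (eq a b)).
   Assumption: the comparison evaluates to 1 (true) or 0 (false);
   negating that value gives 0xFFFFFFFF or 0x00000000. */
static uint32_t lane_vceq(int32_t a, int32_t b)
{
    return (uint32_t)-(int32_t)(a == b);
}
```

The real instruction applies the same operation to every element of the vector independently.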
> The catch is that the floating-point comparisons can only be expanded to
> the RTL codes when -funsafe-math-optimizations is given and they must
> continue to use the UNSPECs otherwise.
> For this I've created a define_expand that generates
> the correct RTL depending on -funsafe-math-optimizations and two
> define_insns to match the result: one using the RTL codes and one using
> UNSPECs.
>
> I've also compressed some of the patterns together using iterators for
> the [eq gt ge le lt] cases.
> NOTE: for le and lt before this patch we would never generate
> 'vclt.<type> dm, dn, dp' instructions, only 'vclt.<type> dm, dn, #0'.
> With this patch we can now generate 'vclt.<type> dm, dn, dp' assembly.
> According to the ARM ARM this is just a pseudo-instruction that maps to
> vcgt with the operands swapped around.
> I've confirmed that gas supports this code.
>
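The operand swap described above can be sketched per lane in scalar C. This only models why "less than" needs no separate register-register encoding; lane_vcgt and lane_vclt are hypothetical names for illustration, not intrinsics or functions from the patch.

```c
#include <stdint.h>

/* One signed 32-bit lane of a greater-than compare:
   all 1s when a > b, all 0s otherwise. */
static uint32_t lane_vcgt(int32_t a, int32_t b)
{
    return (uint32_t)-(int32_t)(a > b);
}

/* a < b is the same predicate as b > a, so the less-than form
   can be expressed as greater-than with the operands swapped. */
static uint32_t lane_vclt(int32_t a, int32_t b)
{
    return lane_vcgt(b, a);
}
```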
> The vcage and vcagt patterns are rewritten to use the form:
> (neg
> (<cond>
> (abs (...))
> (abs (...))))
>
> and condensed together using iterators as well.
>
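As a rough scalar model of the (neg (<cond> (abs ...) (abs ...))) form quoted above, the sketch below compares the magnitudes of two float lanes. All function names are hypothetical, and the hand-written absolute value is a simplification that ignores the sign of zero and NaN details, which do not change the comparison result here.

```c
#include <stdint.h>

/* Simplified absolute value for one float lane (assumption: the
   sign of zero does not matter for the comparisons below). */
static float lane_abs(float x)
{
    return x < 0.0f ? -x : x;
}

/* Model of vcage: all 1s when |a| >= |b|, all 0s otherwise. */
static uint32_t lane_vcage(float a, float b)
{
    return (uint32_t)-(int32_t)(lane_abs(a) >= lane_abs(b));
}

/* Model of vcagt: all 1s when |a| > |b|, all 0s otherwise. */
static uint32_t lane_vcagt(float a, float b)
{
    return (uint32_t)-(int32_t)(lane_abs(a) > lane_abs(b));
}
```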
> Bootstrapped and tested on arm-none-linux-gnueabihf, made sure that the
> advanced-simd-intrinsics testsuite is passing
> (it did catch some bugs during development of this patch) and tried out
> other NEON intrinsics codebases.
>
> The test gcc.target/arm/neon/pr51534.c now generates 'vclt.<type> dn,
> dm, #0' instructions where appropriate instead of the previous vmov of
> #0 into a temp and then a 'vcgt.<type> dn, temp, dm'.
> I think that is correct behaviour since the test was trying to make sure
> that we didn't generate a .u<size>-typed comparison with #0, which is
> what the PR was talking about (from what I can gather).
>
> What do people think of this approach?
> I'm proposing this for next stage1, of course.
>
> Thanks,
> Kyrill
>
>
> 2015-02-04 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * config/arm/iterators.md (GTGE, GTUGEU, COMPARISONS): New code
> iterators.
> (cmp_op, cmp_type): New code attributes.
> (NEON_VCMP, NEON_VACMP): New int iterators.
> (cmp_op_unsp): New int attribute.
> * config/arm/neon.md (neon_vc<cmp_op><mode>): New define_expand.
> (neon_vceq<mode>): Delete.
> (neon_vc<cmp_op><mode>_insn): New pattern.
> (neon_vc<cmp_op_unsp><mode>_insn_unspec): Likewise.
> (neon_vcgeu<mode>): Delete.
> (neon_vcle<mode>): Likewise.
> (neon_vclt<mode>): Likewise.
> (neon_vcage<mode>): Likewise.
> (neon_vcagt<mode>): Likewise.
> (neon_vca<cmp_op><mode>): New define_expand.
> (neon_vca<cmp_op><mode>_insn): New pattern.
> (neon_vca<cmp_op_unsp><mode>_insn_unspec): Likewise.
>
> 2015-02-04 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * gcc.target/arm/neon/pr51534.c: Update vcg* scan-assembly patterns
> to look for vcl* where appropriate.