[PATCH][ARM] Rewrite vc<cond> NEON patterns to use RTL operations rather than UNSPECs

Ramana Radhakrishnan ramana.gcc@googlemail.com
Thu Apr 23 15:01:00 GMT 2015


On Wed, Feb 4, 2015 at 12:12 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> Hi all,
>
> This patch improves the vc<cond> patterns in neon.md to use proper RTL
> operations rather than UNSPECs.
> It is done in a similar way to the analogous aarch64 operations i.e. vceq is
> expressed as
> (neg (eq (...) (...)))
> since we want to write all 1s to the result element when 'eq' holds and 0s
> otherwise.
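
To illustrate the (neg (eq ...)) form quoted above, here is a minimal scalar
sketch in plain C. It is not part of the patch and does not use NEON
intrinsics. The names lane_ceq and model_vceq_s32x4 are made up for this
illustration, and 32-bit integer lanes are assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one lane of (neg (eq a b)).
   In C, (a == b) evaluates to 1 or 0.  Negating that gives -1 or 0,
   and -1 converted to uint32_t has every bit set.  */
static uint32_t lane_ceq(int32_t a, int32_t b)
{
    return (uint32_t)-(int32_t)(a == b);
}

/* Hypothetical helper: apply the lane model across a 4-lane vector.  */
static void model_vceq_s32x4(const int32_t a[4], const int32_t b[4],
                             uint32_t out[4])
{
    for (size_t i = 0; i < 4; i++)
        out[i] = lane_ceq(a[i], b[i]);
}
```

Each output lane is either all 1s or all 0s, which is the mask shape the
description above asks for.
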
>
> The catch is that the floating-point comparisons can only be expanded to the
> RTL codes when -funsafe-math-optimizations is given and they must continue
> to use the UNSPECs otherwise.
> For this I've created a define_expand that generates
> the correct RTL depending on -funsafe-math-optimizations and two
> define_insns to match the result: one using the RTL codes and one using
> UNSPECs.
>
> I've also compressed some of the patterns together using iterators for the
> [eq gt ge le lt] cases.
> NOTE: for le and lt before this patch we would never generate 'vclt.<type>
> dm, dn, dp' instructions, only 'vclt.<type> dm, dn, #0'.
> With this patch we can now generate 'vclt.<type> dm, dn, dp' assembly.
> According to the ARM ARM this is just a pseudo-instruction that maps to
> vcgt with the operands swapped around.
> I've confirmed that gas supports this code.
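
The operand swap described above can be sketched per lane in plain C. This
is only an illustration under assumed names (lane_cgt and lane_clt are
hypothetical, and signed 32-bit lanes are assumed). It models the
relationship between the two comparisons, not the assembler's actual
handling.

```c
#include <stdint.h>

/* Scalar model of one lane of "greater than":
   all 1s when a > b, all 0s otherwise.  */
static uint32_t lane_cgt(int32_t a, int32_t b)
{
    return (uint32_t)-(int32_t)(a > b);
}

/* "Less than" expressed as "greater than" with the operands swapped,
   mirroring the pseudo-instruction mapping mentioned above.  */
static uint32_t lane_clt(int32_t a, int32_t b)
{
    return lane_cgt(b, a);
}
```

For integer lanes, a < b and b > a always agree, so the swapped form
gives the same mask for every input pair.
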
>
> The vcage and vcagt patterns are rewritten to use the form:
> (neg
>   (<cond>
>     (abs (...))
>     (abs (...))))
>
> and condensed together using iterators as well.
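
The absolute-compare form quoted above can be sketched the same way. This
is a scalar model in plain C under assumed names (model_abs, lane_cage
and lane_cagt are hypothetical). It uses a hand-written absolute value
and makes no claim about how the hardware treats denormals.

```c
#include <stdint.h>

/* Simplified absolute value for the illustration only.  */
static float model_abs(float x)
{
    return x < 0.0f ? -x : x;
}

/* Scalar model of (neg (ge (abs a) (abs b))).  */
static uint32_t lane_cage(float a, float b)
{
    return (uint32_t)-(int32_t)(model_abs(a) >= model_abs(b));
}

/* Scalar model of (neg (gt (abs a) (abs b))).  */
static uint32_t lane_cagt(float a, float b)
{
    return (uint32_t)-(int32_t)(model_abs(a) > model_abs(b));
}
```

The two functions differ only in the comparison operator, which is why
a single pattern parameterised by an iterator can cover both.
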
>
> Bootstrapped and tested on arm-none-linux-gnueabihf, made sure that the
> advanced-simd-intrinsics testsuite is passing
> (it did catch some bugs during development of this patch) and tried out
> other NEON intrinsics codebases.
>
> The test gcc.target/arm/neon/pr51534.c now generates 'vclt.<type> dn, dm,
> #0' instructions where appropriate instead of the previous vmov of #0 into a
> temp and then a 'vcgt.<type> dn, temp, dm'.
> I think that is correct behaviour since the test was trying to make sure
> that we didn't generate a .u<size>-typed comparison with #0, which is what
> the PR was talking about (from what I can gather).
>
> What do people think of this approach?
> I'm proposing this for next stage1, of course.
>


This is OK - thanks.

Ramana
> Thanks,
> Kyrill
>
>
> 2015-02-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/arm/iterators.md (GTGE, GTUGEU, COMPARISONS): New code
>     iterators.
>     (cmp_op, cmp_type): New code attributes.
>     (NEON_VCMP, NEON_VACMP): New int iterators.
>     (cmp_op_unsp): New int attribute.
>     * config/arm/neon.md (neon_vc<cmp_op><mode>): New define_expand.
>     (neon_vceq<mode>): Delete.
>     (neon_vc<cmp_op><mode>_insn): New pattern.
>     (neon_vc<cmp_op_unsp><mode>_insn_unspec): Likewise.
>     (neon_vcgeu<mode>): Delete.
>     (neon_vcle<mode>): Likewise.
>     (neon_vclt<mode>): Likewise.
>     (neon_vcage<mode>): Likewise.
>     (neon_vcagt<mode>): Likewise.
>     (neon_vca<cmp_op><mode>): New define_expand.
>     (neon_vca<cmp_op><mode>_insn): New pattern.
>     (neon_vca<cmp_op_unsp><mode>_insn_unspec): Likewise.
>
> 2015-02-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/arm/neon/pr51534.c: Update vcg* scan-assembly patterns
>     to look for vcl* where appropriate.


