This is the mail archive of the mailing list for the GCC project.


Re: [PATCH][ARM] Rewrite vc<cond> NEON patterns to use RTL operations rather than UNSPECs

On Wed, Feb 4, 2015 at 12:12 PM, Kyrill Tkachov <> wrote:
> Hi all,
> This patch improves the vc<cond> patterns in to use proper RTL
> operations rather than UNSPECs.
> It is done in a similar way to the analogous aarch64 operations i.e. vceq is
> expressed as
> (neg (eq (...) (...)))
> since we want to write all 1s to the result element when 'eq' holds and 0s
> otherwise.
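
A minimal sketch in plain C of the lane-wise semantics described above, not the actual machine-description pattern. It assumes 32-bit integer lanes; the helper names lane_vceq_s32 and vceq_s32x4 are hypothetical and exist only for illustration. The point is that negating the 0/1 result of the comparison yields 0 or all 1s.

```c
#include <stdint.h>

/* One lane of an integer vceq modelled as (neg (eq a b)).
   In C, (a == b) evaluates to 1 or 0, so negating it gives -1
   (all bits set in two's complement) or 0. */
static uint32_t lane_vceq_s32(int32_t a, int32_t b)
{
    return (uint32_t)-(int32_t)(a == b);
}

/* Apply the lane operation across a 4-lane vector (hypothetical helper). */
static void vceq_s32x4(const int32_t a[4], const int32_t b[4], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = lane_vceq_s32(a[i], b[i]);
}
```

Each element of out is 0xFFFFFFFF where the inputs match and 0 elsewhere, which is the mask shape the comparison instructions produce.
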
> The catch is that the floating-point comparisons can only be expanded to the
> RTL codes when -funsafe-math-optimizations is given and they must continue
> to use the UNSPECs otherwise.
> For this I've created a define_expand that generates
> the correct RTL depending on -funsafe-math-optimizations and two
> define_insns to match the result: one using the RTL codes and one using
> the UNSPECs.
> I've also compressed some of the patterns together using iterators for the
> [eq gt ge le lt] cases.
> NOTE: for le and lt before this patch we would never generate 'vclt.<type>
> dm, dn, dp' instructions, only 'vclt.<type> dm, dn, #0'.
> With this patch we can now generate 'vclt.<type> dm, dn, dp' assembly.
> According to the ARM ARM this is just a pseudo-instruction that maps to
> vcgt with the operands swapped around.
> I've confirmed that gas supports this code.
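
The operand swap mentioned above can be sketched in plain C as follows. This is only an illustration of the equivalence for signed 32-bit lanes; the function names are hypothetical and the code is not what the assembler does internally.

```c
#include <stdint.h>

/* One lane of vcgt: all 1s when a > b, otherwise 0. */
static uint32_t lane_vcgt_s32(int32_t a, int32_t b)
{
    return (uint32_t)-(int32_t)(a > b);
}

/* One lane of the register-register vclt form, expressed as vcgt with
   the two source operands swapped: a < b holds exactly when b > a. */
static uint32_t lane_vclt_s32(int32_t a, int32_t b)
{
    return lane_vcgt_s32(b, a);
}
```

Because the two predicates are equivalent for every pair of inputs, no separate encoding is needed for the three-register less-than form.
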
> The vcage and vcagt patterns are rewritten to use the form:
> (neg
>   (<cond>
>     (abs (...))
>     (abs (...))))
> and condensed together using iterators as well.
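
As a rough illustration of the absolute-compare form above, here is a plain C model of one float lane. It is a sketch under assumptions: the helper names are hypothetical, abs_f32 is a simple stand-in for the RTL abs operation, and NaN behaviour is not modelled.

```c
#include <stdint.h>

/* Stand-in for (abs x); good enough for comparing magnitudes. */
static float abs_f32(float x)
{
    return x < 0.0f ? -x : x;
}

/* One lane of vcage modelled as (neg (ge (abs a) (abs b))). */
static uint32_t lane_vcage_f32(float a, float b)
{
    return (uint32_t)-(int32_t)(abs_f32(a) >= abs_f32(b));
}

/* One lane of vcagt modelled as (neg (gt (abs a) (abs b))). */
static uint32_t lane_vcagt_f32(float a, float b)
{
    return (uint32_t)-(int32_t)(abs_f32(a) > abs_f32(b));
}
```

The two functions differ only in the comparison operator, which is why a single pattern parameterised over the condition can cover both.
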
> Bootstrapped and tested on arm-none-linux-gnueabihf, made sure that the
> advanced-simd-intrinsics testsuite is passing
> (it did catch some bugs during development of this patch) and tried out
> other NEON intrinsics codebases.
> The test now generates 'vclt.<type> dn, dm,
> #0' instructions where appropriate instead of the previous vmov of #0 into a
> temp and then a 'vcgt.<type> dn, temp, dm'.
> I think that is correct behaviour since the test was trying to make sure
> that we didn't generate a .u<size>-typed comparison with #0, which is what
> the PR was talking about (from what I can gather).
> What do people think of this approach?
> I'm proposing this for next stage1, of course.

This is OK - thanks.

> Thanks,
> Kyrill
> 2015-02-04  Kyrylo Tkachov  <>
>     * config/arm/ (GTGE, GTUGEU, COMPARISONS): New code
>     iterators.
>     (cmp_op, cmp_type): New code attributes.
>     (NEON_VCMP, NEON_VACMP): New int iterators.
>     (cmp_op_unsp): New int attribute.
>     * config/arm/ (neon_vc<cmp_op><mode>): New define_expand.
>     (neon_vceq<mode>): Delete.
>     (neon_vc<cmp_op><mode>_insn): New pattern.
>     (neon_vc<cmp_op_unsp><mode>_insn_unspec): Likewise.
>     (neon_vcgeu<mode>): Delete.
>     (neon_vcle<mode>): Likewise.
>     (neon_vclt<mode>): Likewise.
>     (neon_vcage<mode>): Likewise.
>     (neon_vcagt<mode>): Likewise.
>     (neon_vca<cmp_op><mode>): New define_expand.
>     (neon_vca<cmp_op><mode>_insn): New pattern.
>     (neon_vca<cmp_op_unsp><mode>_insn_unspec): Likewise.
> 2015-02-04  Kyrylo Tkachov  <>
>     * Update vcg* scan-assembly patterns
>     to look for vcl* where appropriate.
