This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Alan Lawrence <alan dot lawrence at arm dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 26 Nov 2014 16:35:50 +0000
- Subject: Re: Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR
- Authentication-results: sourceware.org; auth=none
- References: <546A167C dot 5090902 at arm dot com> <5475F813 dot 9090001 at arm dot com>
On Wed, Nov 26, 2014 at 03:56:03PM +0000, Alan Lawrence wrote:
> So in case there's any confusion about the behaviour expected of *the vabs
> intrinsic*, here's a testcase (failing without patch, passing with it)...
>
> --Alan
>
> Alan Lawrence wrote:
> > ...as the former is defined as returning MIN_VALUE for argument MIN_VALUE,
> > whereas the latter is 'undefined', and gcc can optimize "abs(x)>=0" to "true",
> > which is wrong for __builtin_aarch64_abs.
> >
> > There has been much debate here, although not recently - I think the last was
> > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00387.html . However without a
> > definite solution, we should at least stop the folding, as otherwise this is a bug.
Well, I don't see myself getting round to looking at the solutions
proposed in that thread any time soon, and as you say this is a longstanding
bug in our intrinsics implementation. So I'm in favour of that part of
your patch going in, though I have some comments:
> @@ -1317,9 +1322,6 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args,
>
> switch (fcode)
> {
> - BUILTIN_VALLDI (UNOP, abs, 2)
> - return fold_build1 (ABS_EXPR, type, args[0]);
> - break;
> BUILTIN_VALLDI (BINOP, cmge, 0)
> return fold_build2 (GE_EXPR, type, args[0], args[1]);
> break;
Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
intrinsics? There should be no difference between the mid-end definition
and the intrinsic definition of their behaviour.
I also note that the integer forms of these now end up as an "abs" RTL
expression - can we guarantee that preserves the intrinsics behaviour and
no RTL-folder will come along and mis-optimize? Documentation is vague
on this point.
I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at
the aarch64.md:absdi2 pattern I can't see why we need that scratch at
all. It seems we could get away with marking operand 0 early-clobber and
using it as our scratch register. Then we could drop all the extra
infrastructure from this patch.
Thanks,
James
> >
> > The complication here is that the folding meant we never expanded the call to
> > __builtin_aarch64_absdi, and thus never realized that expanding would cause an
> > ICE: gen_absdi2() takes only two parameters (source and dest operand), and has
> > no parameter corresponding to the scratch operand in the insn_data; but the code
> > in aarch64_simd_expand_args, reads the number of arguments from the insn_data,
> > and so tries to get another operand from the call to __builtin_aarch64_absdi,
> > causing the ICE. Hence, we have to introduce a SIMD_ARG_SCRATCH, used in
> > aarch64_simd_expand_args to skip over the argument.
> >
> > (There is an alternative way of doing this, i.e. to update the for-loop in
> > aarch64_simd_expand_builtin by adding yet another version of 'k'. However, I
> > thought there were enough versions already that I really didn't want to add one
> > more...)
> >
> > To go with this, I've renamed qualifier_internal to qualifier_scratch, as it's
> > not clear that any non-scratch internal operands (whatever such might be) would
> > want the same treatment!
> >
> > Cross-tested check-gcc on aarch64-none-elf.
> >
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers):
> > Rename qualifier_internal to qualifier_scratch.
> > (aarch64_types_unop_qualifiers, aarch64_init_simd_builtins): Follow
> > renaming.
> > (builtin_simd_arg): New SIMD_ARG_SCRATCH enum value.
> > (aarch64_simd_expand_args): Skip over SIMD_ARG_SCRATCHes.
> > (aarch64_simd_expand_builtin): Handle qualifier_scratch.
> > (aarch64_fold_builtin): Remove folding of abs.
> diff --git a/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..12fdd813bc3f58a183b4986c5c9c532c2b608699
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
> @@ -0,0 +1,17 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#include <arm_neon.h>
> +
> +extern void abort (void);
> +
> +int
> +main (int argc, char **argv)
> +{
> + int8x8_t a = vabs_s8 (vdup_n_s8 (-128)); /* Should all be -128. */
> + uint8x8_t b = vcltz_s8 (a); /* Should all be true i.e. -1. */
> + if (vget_lane_u8 (b, 1))
> + return 0;
> + abort ();
> +}
> +