This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR


On Wed, Nov 26, 2014 at 03:56:03PM +0000, Alan Lawrence wrote:
> So in case there's any confusion about the behaviour expected of *the vabs 
> intrinsic*, here's a testcase (failing without patch, passing with it)...
> 
> --Alan
> 
> Alan Lawrence wrote:
> > ...as the former is defined as returning MIN_VALUE for argument MIN_VALUE, 
> > whereas the latter is 'undefined', and gcc can optimize "abs(x)>=0" to "true", 
> > which is wrong for __builtin_aarch64_abs.
> > 
> > There has been much debate here, although not recently - I think the last was 
> > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00387.html . However without a 
> > definite solution, we should at least stop the folding, as otherwise this is a bug.


Well, I don't see myself getting round to looking at the solutions
proposed in that thread any time soon, and as you say this is a longstanding
bug in our intrinsics implementation. So I'm in favour of that part of
your patch going in, though I have some comments:

> @@ -1317,9 +1322,6 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args,
>  
>    switch (fcode)
>      {
> -      BUILTIN_VALLDI (UNOP, abs, 2)
> -	return fold_build1 (ABS_EXPR, type, args[0]);
> -	break;
>        BUILTIN_VALLDI (BINOP, cmge, 0)
>  	return fold_build2 (GE_EXPR, type, args[0], args[1]);
>  	break;

Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
intrinsics? There should be no difference between the mid-end definition
and the intrinsic definition of their behaviour.

I also note that the integer forms of these now end up as an "abs" RTL
expression - can we guarantee that preserves the intrinsics behaviour and
no RTL-folder will come along and mis-optimize? Documentation is vague
on this point.

I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at
the aarch64.md:absdi2 pattern I can't see why we need that scratch at
all. It seems we could get away with marking operand 0 early-clobber and
using it as our scratch register. Then we could drop all the extra
infrastructure from this patch.

Thanks,
James


> > 
> > The complication here is that the folding meant we never expanded the call to 
> > __builtin_aarch64_absdi, and thus never realized that expanding would cause an 
> > ICE: gen_absdi2() takes only two parameters (source and dest operand), and has 
> > no parameter corresponding to the scratch operand in the insn_data; but the code 
> > in aarch64_simd_expand_args, reads the number of arguments from the insn_data, 
> > and so tries to get another operand from the call to __builtin_aarch64_absdi, 
> > causing the ICE. Hence, we have to introduce a SIMD_ARG_SCRATCH, used in 
> > aarch64_simd_expand_args to skip over the argument.
> > 
> > (There is an alternative way of doing this, i.e. to update the for-loop in 
> > aarch64_simd_expand_builtin by adding yet another version of 'k'. However, I 
> > thought there were enough versions already that I really didn't want to add one 
> > more...)
> > 
> > To go with this, I've renamed qualifier_internal to qualifier_scratch, as it's 
> > not clear that any non-scratch internal operands (whatever such might be) would 
> > want the same treatment!
> > 
> > Cross-tested check-gcc on aarch64-none-elf.
> > 
> > Ok for trunk?
> > 
> > gcc/ChangeLog:
> > 
> > 	* config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers):
> > 	Rename qualifier_internal to qualifier_scratch.
> > 	(aarch64_types_unop_qualifiers, aarch64_init_simd_builtins): Follow
> > 	renaming.
> > 	(builtin_simd_arg): New SIMD_ARG_SCRATCH enum value.
> > 	(aarch64_simd_expand_args): Skip over SIMD_ARG_SCRATCHes.
> > 	(aarch64_simd_expand_builtin): Handle qualifier_scratch.
> > 	(aarch64_fold_builtin): Remove folding of abs.

> diff --git a/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..12fdd813bc3f58a183b4986c5c9c532c2b608699
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vabs_intrinsic_2.c
> @@ -0,0 +1,17 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#include <arm_neon.h>
> +
> +extern void abort (void);
> +
> +int
> +main (int argc, char **argv)
> +{
> +  int8x8_t a = vabs_s8 (vdup_n_s8 (-128)); /* Should all be -128.  */
> +  uint8x8_t b = vcltz_s8 (a); /* Should all be true i.e. -1. */
> +  if (vget_lane_u8 (b, 1))
> +    return 0;
> +  abort ();
> +}
> +


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]