[PATCH, rs6000] PR87532: Bad Results from vec_extract(unsigned char, foo) dependent upon function inline

Segher Boessenkool segher@kernel.crashing.org
Fri Mar 8 19:04:00 GMT 2019


Hi Kelvin,

On Fri, Mar 08, 2019 at 10:59:35AM -0600, Kelvin Nilsen wrote:
> This problem report, though initially motivated by differences in behavior between constant and non-constant selector arguments, uncovered a number of inconsistencies in the implementation of vec_extract.
> 
> This patch provides several fixes to make handling of constant selector expressions the same as the handling of non-constant selector expressions.  In the process of testing, it was observed that certain existing regression tests were looking for the wrong instructions to be emitted and those tests have been updated.
> 
> This has bootstrapped and tested without regressions on powerpc64le-unknown-linux (both P8 and P9) and on powerpc-linux (P7 big-endian, with both -m32 and -m64 target options).

Thanks for the patch.  I have lots of comments/questions, mostly about the
testcases.  Sorry :-/  The actual compiler code part looks good though.


> --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-10a.c	(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-10a.c	(revision 0)
> @@ -0,0 +1,157 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

Is there any reason to think this testcase will fail on Darwin?  I mean, it
requires VSX, and that should skip the testcase already on Darwin?

> +/* { dg-require-effective-target dfp_hw } */

Why is that?  I don't see any dfp used here?

> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { } } */

You do not use -mcpu=, so why this dg-skip-if?  And please set
-mdejagnu-cpu= instead.

> --- gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c	(revision 269370)
> +++ gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c	(working copy)
> @@ -18,7 +18,7 @@
>  /* { dg-final { scan-assembler-times {\mxxpermdi\M} 3 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {\mvslo\M} 3 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {\mmfvsrd\M} 6 { target lp64 } } } */
> -/* { dg-final { scan-assembler-times {\msradi\M} 3 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times {\msrdi\M} 3 { target lp64 } } } */

Maybe allow both?  So {\msra?di\M}?  Or does sradi make no sense here?

> --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-14a.c	(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-14a.c	(revision 0)
> @@ -0,0 +1,128 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> +/* { dg-require-effective-target dfp_hw } */
> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { } } */
> +/* { dg-options "-maltivec" } */
> +
> +/* This test should run the same on any target that supports altivec/dfp
> +   instructions.  Intentionally not specifying cpu in order to test
> +   all code generation paths.  */

I don't see where dfp comes in?  For server CPUs it is p6 and up, the same
as AltiVec, but that is coincidental.

> --- gcc/testsuite/gcc.target/powerpc/pr87532-mc.c	(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/pr87532-mc.c	(revision 0)
> @@ -0,0 +1,260 @@
> +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */

Check for int128, instead?  Or is there another reason to want lp64?

> +  __asm__ (" # %x0" : "+wa" (a));

"wa" requires VSX.

> --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-14b.c	(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-14b.c	(revision 0)
> @@ -0,0 +1,128 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */

Btw, you can just say { dg-do run }: everything in gcc.target/powerpc is
only run for powerpc*-*-* targets.

> --- gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2b.c	(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2b.c	(revision 0)
> @@ -0,0 +1,16 @@
> +/* { dg-do run { target { powerpc*-*-linux* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> +/* { dg-require-effective-target vsx_hw } */
> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { } } */
> +/* { dg-options "-O2 -mvsx -save-temps -dp -g" } */

I think that is debugging code left over?  -dp -g should be harmless, but
-save-temps is littering ;-)

> --- gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2.h	(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2.h	(revision 0)

> +static vector TYPE
> +deoptimize (vector TYPE a)
> +{
> +  __asm__ (" # %x0" : "+wa" (a));
> +  return a;
> +}

Is this only ever compiled with VSX?  If not, use "v" instead?

> --- gcc/config/rs6000/rs6000-c.c	(revision 269370)
> +++ gcc/config/rs6000/rs6000-c.c	(working copy)
> @@ -6568,9 +6568,15 @@
>  	  /* If the second argument is an integer constant, if the value is in
>  	     the expected range, generate the built-in code if we can.  We need
>  	     64-bit and direct move to extract the small integer vectors.  */
> -	  if (TREE_CODE (arg2) == INTEGER_CST
> -	      && wi::ltu_p (wi::to_wide (arg2), nunits))
> +
> +	  if (TREE_CODE (arg2) == INTEGER_CST)
>  	    {
> +	      if (!wi::ltu_p (wi::to_wide (arg2), nunits))
> +		{
> +		  wide_int selector = wi::to_wide (arg2);
> +		  selector = wi::umod_trunc (selector, nunits);
> +		  arg2 = wide_int_to_tree (TREE_TYPE (arg2), selector);
> +		}

You can just always do this, no need for the condition?  Makes it easier
to read.

> --- gcc/config/rs6000/rs6000.c	(revision 269370)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -6894,7 +6894,6 @@
>  	default:
>  	  break;
>  	case E_V1TImode:
> -	  gcc_assert (INTVAL (elt) == 0 && inner_mode == TImode);

You could leave the inner_mode part?  Or is the assert here pretty
useless anyway?

> @@ -7173,7 +7190,9 @@
>  
>    else if (REG_P (src) || SUBREG_P (src))
>      {
> -      int bit_shift = byte_shift + 3;
> +      int num_elements = GET_MODE_NUNITS (mode);
> +      int bits_in_element = 128 / num_elements;

Use GET_MODE_INNER?  Or will that not help here.

> @@ -14724,8 +14743,18 @@
>    op1 = expand_normal (arg1);
>  
>    /* Call get_element_number to validate arg1 if it is a constant.  */
> +
>    if (TREE_CODE (arg1) == INTEGER_CST)
> -    (void) get_element_number (TREE_TYPE (arg0), arg1);
> +    {
> +      unsigned HOST_WIDE_INT elt;
> +      unsigned HOST_WIDE_INT size = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
> +      unsigned int truncated_selector;
> +      if (!tree_fits_uhwi_p (arg1))
> +	error ("selector must fit in HOST_WIDE_INT");

"error" is a user-visible error, and HOST_WIDE_INT is not something a user
knows.


Segher



More information about the Gcc-patches mailing list