This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

[PATCH] Fix ICE when generating a vector shift by scalar

From: Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>
To: gcc-patches at gcc dot gnu dot org
Date: Mon, 31 Aug 2015 15:28:02 -0500
Subject: [PATCH] Fix ICE when generating a vector shift by scalar

Hi,

The following simple test triggers an ICE when GCC attempts to convert a
vector shift-by-scalar into a vector shift-by-vector.

  typedef unsigned char v16ui __attribute__((vector_size(16)));

  v16ui vslb(v16ui v, unsigned char i)
  {
    return v << i;
  }

When this code is gimplified, the shift amount is widened to an
unsigned int:

  vslb (v16ui v, unsigned char i)
  {
    v16ui D.2300;
    unsigned int D.2301;

    D.2301 = (unsigned int) i;
    D.2300 = v << D.2301;
    return D.2300;
  }
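
As a rough illustration of why narrowing the widened shift amount again
loses nothing, here is a minimal C sketch.  The function names are
hypothetical and only model the conversion seen in the dump above; they
are not GCC internals.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model of the conversion in the dump above:
   D.2301 = (unsigned int) i;  */
unsigned int widen_shift_amount(unsigned char i)
{
    return (unsigned int) i;
}

/* Hypothetical model of narrowing the amount back to the element type.  */
unsigned char narrow_shift_amount(unsigned int amount)
{
    return (unsigned char) amount;
}

/* Usage: every possible unsigned char survives the round trip, so a
   value that was only widened can be narrowed again without loss.  */
void check_round_trip(void)
{
    for (unsigned int i = 0; i <= UCHAR_MAX; i++)
        assert(narrow_shift_amount(widen_shift_amount((unsigned char) i)) == i);
}
```

Because the wide value always originated from an unsigned char here, the
narrowing conversion simply recovers the original shift amount.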

In expand_binop, the shift-by-scalar is converted into a shift-by-vector
using expand_vector_broadcast, which produces the following rtx to be
used to initialize a V16QI vector:

(parallel:V16QI [
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
        (subreg/s/v:SI (reg:DI 155) 0)
    ])

The back end eventually chokes trying to generate a copy of the SImode
expression into a QImode memory slot.

This patch fixes this problem by ensuring that the shift amount is
truncated to the inner mode of the vector when necessary.  I've added a
test case verifying correct PowerPC code generation in this case.
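
The intended behaviour can be sketched in portable C roughly as follows.
This is only a model under stated assumptions: plain arrays stand in for
the V16QI vector, all function names are hypothetical, and masking the
count with 7 is an assumption chosen to mirror the AltiVec vslb
instruction, which uses the low-order 3 bits of each element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* 16 unsigned char elements, as in the test case */

/* Hypothetical stand-in for the truncation step: narrow the widened
   shift amount to the 8-bit element type before replicating it.  */
uint8_t truncate_to_element(unsigned int amount)
{
    return (uint8_t) amount;
}

/* Hypothetical stand-in for the broadcast: copy one element-sized
   value into every lane.  */
void broadcast_u8(uint8_t out[LANES], uint8_t value)
{
    for (size_t k = 0; k < LANES; k++)
        out[k] = value;
}

/* Element-wise shift by a vector of counts.  The "& 7" is an assumption
   modelling vslb, not something mandated by C.  */
void shift_left_by_vector(uint8_t out[LANES], const uint8_t v[LANES],
                          const uint8_t counts[LANES])
{
    for (size_t k = 0; k < LANES; k++)
        out[k] = (uint8_t) (v[k] << (counts[k] & 7));
}

/* Shift by scalar, expressed as truncate, broadcast, shift by vector.  */
void shift_left_by_scalar(uint8_t out[LANES], const uint8_t v[LANES],
                          unsigned int amount)
{
    uint8_t counts[LANES];

    broadcast_u8(counts, truncate_to_element(amount));
    shift_left_by_vector(out, v, counts);
}
```

The point of the sketch is the ordering: the amount is narrowed to the
element width first, so every value placed in the broadcast vector
already fits its one-byte lane.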

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill


[gcc]

2015-08-31  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* optabs.c (expand_binop): Don't create a broadcast vector with a
	source element wider than the inner mode.

[gcc/testsuite]

2015-08-31  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.target/powerpc/vec-shift.c: New test.


Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 227353)
+++ gcc/optabs.c	(working copy)
@@ -1608,6 +1608,13 @@ expand_binop (machine_mode mode, optab binoptab, r
 
       if (otheroptab && optab_handler (otheroptab, mode) != CODE_FOR_nothing)
 	{
+	  /* The scalar may have been extended to be too wide.  Truncate
+	     it back to the proper size to fit in the broadcast vector.  */
+	  machine_mode inner_mode = GET_MODE_INNER (mode);
+	  if (GET_MODE_BITSIZE (inner_mode)
+	      < GET_MODE_BITSIZE (GET_MODE (op1)))
+	    op1 = simplify_gen_unary (TRUNCATE, inner_mode, op1,
+				      GET_MODE (op1));
 	  rtx vop1 = expand_vector_broadcast (mode, op1);
 	  if (vop1)
 	    {
Index: gcc/testsuite/gcc.target/powerpc/vec-shift.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-shift.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-shift.c	(working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
+/* { dg-options "-mcpu=power7 -O2" } */
+
+/* This used to ICE.  During gimplification, "i" is widened to an unsigned
+   int.  We used to fail at expand time as we tried to cram an SImode item
+   into a QImode memory slot.  This has been fixed to properly truncate the
+   shift amount when splatting it into a vector.  */
+
+typedef unsigned char v16ui __attribute__((vector_size(16)));
+
+v16ui vslb(v16ui v, unsigned char i)
+{
+	return v << i;
+}
+
+/* { dg-final { scan-assembler "vspltb" } } */
+/* { dg-final { scan-assembler "vslb" } } */
