This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, rs6000] Handle -maltivec=be on little endian for vec_sums
- From: Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: dje dot gcc at gmail dot com
- Date: Thu, 30 Jan 2014 20:48:25 -0600
- Subject: Re: [PATCH, rs6000] Handle -maltivec=be on little endian for vec_sums
- Authentication-results: sourceware.org; auth=none
- References: <1391136172 dot 14633 dot 11 dot camel at gnopaine>
On Thu, 2014-01-30 at 20:42 -0600, Bill Schmidt wrote:
> Hi,
>
> This patch adds logic for -maltivec=be with a little endian target when
> generating code for the vec_sums builtin, which maps to the vsumsws
> instruction. That instruction adds the four elements of the first input
> vector operand to element 3 of the second input vector operand, placing
> the result in element 3 of the destination vector operand.
>
> For little endian, element 3 is the leftmost (most significant) word in
> the vector register, while the instruction treats element 3 as the
> rightmost (least significant) word. Since VMX has no vector
> shift-immediate or rotate-immediate instruction, we use a splat
> instruction to get LE element 3 (BE element 0) into BE element 3 of a
> scratch register for input to the vsumsws instruction. Similarly, the
> result of the vsumsws instruction is then splatted from BE element 3
> into BE element 0 (LE element 3) where it is expected to be by any
> builtin that consumes that value. The scratch register is reused for
> this purpose.
>
> As with other patches in this series, an altivec_vsumsws_direct pattern
> is added for uses of vsumsws internal to GCC.
>
> Two new test cases are added that demonstrate how the vec_sums builtin
> is expected to behave for BE, LE, and LE with -maltivec=be.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> regressions. Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> gcc:
>
> 2014-01-30 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
>
> * config/rs6000/altivec.md (UNSPEC_VSUMSWS_DIRECT): New unspec.
> (altivec_vsumsws): Add handling for -maltivec=be with a little
> endian target.
> (altivec_vsumsws_direct): New.
> (reduc_splus_<mode>): Call gen_altivec_vsumsws_direct instead of
> gen_altivec_vsumsws.
>
> gcc/testsuite:
>
> 2014-01-30 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
>
> * gcc.dg/vmx/vsums.c: New.
> * gcc.dg/vmx/vsums-be-order.c: New.
>
>
> Index: gcc/testsuite/gcc.dg/vmx/vsums.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vmx/vsums.c (revision 0)
> +++ gcc/testsuite/gcc.dg/vmx/vsums.c (revision 0)
> @@ -0,0 +1,12 @@
> +#include "harness.h"
> +
> +static void test()
> +{
> + vector signed int va = {-7,11,-13,17};
> + vector signed int vb = {0,0,0,128};
> +
> + vector signed int vd = vec_sums (va, vb);
> + signed int r = vec_extract (vd, 3);
> +
> + check (r == 136, "sums");
> +}
> Index: gcc/testsuite/gcc.dg/vmx/vsums-be-order.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vmx/vsums-be-order.c (revision 0)
> +++ gcc/testsuite/gcc.dg/vmx/vsums-be-order.c (revision 0)
> @@ -0,0 +1,19 @@
> +/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
> +
> +#include "harness.h"
> +
> +static void test()
> +{
> + vector signed int va = {-7,11,-13,17};
> +
> +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> + vector signed int vb = {128,0,0,0};
> +#else
> + vector signed int vb = {0,0,0,128};
> +#endif
> +
> + vector signed int vd = vec_sums (va, vb);
> + signed int r = vec_extract (vd, 3);
> +
> + check (r == 136, "sums");
> +}
> Index: gcc/config/rs6000/altivec.md
> ===================================================================
> --- gcc/config/rs6000/altivec.md (revision 207326)
> +++ gcc/config/rs6000/altivec.md (working copy)
> @@ -132,6 +132,7 @@
> UNSPEC_VMRGH_DIRECT
> UNSPEC_VMRGL_DIRECT
> UNSPEC_VSPLT_DIRECT
> + UNSPEC_VSUMSWS_DIRECT
> ])
>
> (define_c_enum "unspecv"
> @@ -1601,6 +1602,27 @@
> (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> (match_operand:V4SI 2 "register_operand" "v")]
> UNSPEC_VSUMSWS))
> + (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))
> + (clobber (match_scratch:V4SI 3 "=v"))]
> + "TARGET_ALTIVEC"
> +{
> + if (BYTES_BIG_ENDIAN || VECTOR_ELT_ORDER_BIG)
> + return "vsumsws %0,%1,%2";
> + else
> + return "vspltw %3,%2,0\n\tvsumsws %3,%1,%3\n\tvspltw %0,%3,3";
> +}
> + [(set_attr "type" "veccomplex")
> + (set (attr "length")
> + (if_then_else
> + (match_test "(BYTES_BIG_ENDIAN || VECTOR_ELT_ORDER_BIG)")
> + (const_string "4")
> + (const_string "12")))])
> +
> +(define_insn "altivec_vsumsws_direct"
> + [(set (match_operand:V4SI 0 "register_operand" "=v")
> + (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> + (match_operand:V4SI 2 "register_operand" "v")]
> + UNSPEC_VSUMSWS_DIRECT))
> (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
> "TARGET_ALTIVEC"
> "vsumsws %0,%1,%2"
> @@ -2337,7 +2359,7 @@
>
> emit_insn (gen_altivec_vspltisw (vzero, const0_rtx));
> emit_insn (gen_altivec_vsum4s<VI_char>s (vtmp1, operands[1], vzero));
> - emit_insn (gen_altivec_vsumsws (dest, vtmp1, vzero));
> + emit_insn (gen_altivec_vsumsws_direct (dest, vtmp1, vzero));
> DONE;
> })
>