This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[commit, spu] Fix vec_perm pattern (Re: [rs6000, spu] Add vec_perm named pattern)


Richard Henderson wrote:

> The generic support for vector permutation will allow for automatic
> lowering to V*QImode, so all we need to add to support for these targets
> is the single V16QI pattern that represents the base permutation insn.
> 
> I'm not touching any of the other ways that the permutation insn 
> could be generated.  After the generic support is added, I'll leave
> it to the port maintainers to determine what they want to keep.  I
> suspect in many cases using the generic __builtin_shuffle plus some
> casting in the target-specific header files would be sufficient,
> eliminating several dozen builtins.


Sorry I didn't get to this earlier, I got side-tracked by a number
of independent regressions on SPU ...

Unfortunately, the semantics of vec_perm do not exactly match those of the
SPU Shuffle Bytes (shufb) instruction.  vec_perm assumes the selector elements
apply modulo 32, but shufb uses values >= 128 for special purposes.
See the ISA:

  Value in Register RC
  (Expressed in Binary)  Result Byte

  10xxxxxx               0x00
  110xxxxx               0xFF
  111xxxxx               0x80
  Otherwise              The byte of the concatenated register addressed by
                         the rightmost 5 bits of register RC


To implement the vec_perm semantics fully, we therefore need to reduce the
selector modulo 32 explicitly before using shufb.
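
For illustration only, here is a rough scalar model in C of the difference
described above.  It is not part of the patch.  The function names
(shufb_byte, shufb_model, vec_perm_model) are made up for this sketch, and
it assumes that byte 0 of the 32-byte concatenation is the first byte of the
first source operand.  The special-case handling follows the ISA table
quoted above; the vec_perm model simply ANDs each selector byte with 31
before reusing the shufb model, which is what the new expander does with
its extra "and" insn.

```c
#include <stdint.h>

/* One result byte of shufb, following the ISA table: selector values
   with the top bit set produce constants instead of selecting a byte. */
uint8_t shufb_byte(const uint8_t src[32], uint8_t sel)
{
    if ((sel & 0xC0) == 0x80)   /* 10xxxxxx */
        return 0x00;
    if ((sel & 0xE0) == 0xC0)   /* 110xxxxx */
        return 0xFF;
    if ((sel & 0xE0) == 0xE0)   /* 111xxxxx */
        return 0x80;
    return src[sel & 0x1F];     /* otherwise: rightmost 5 bits index the bytes */
}

/* Model of a raw shufb on two 16-byte inputs (assumed layout: a then b). */
void shufb_model(uint8_t out[16], const uint8_t a[16],
                 const uint8_t b[16], const uint8_t sel[16])
{
    uint8_t src[32];
    for (int i = 0; i < 16; i++) {
        src[i] = a[i];
        src[16 + i] = b[i];
    }
    for (int i = 0; i < 16; i++)
        out[i] = shufb_byte(src, sel[i]);
}

/* Model of vec_perm: reduce every selector byte modulo 32 first, so the
   special encodings can never be reached, then do the shufb. */
void vec_perm_model(uint8_t out[16], const uint8_t a[16],
                    const uint8_t b[16], const uint8_t sel[16])
{
    uint8_t masked[16];
    for (int i = 0; i < 16; i++)
        masked[i] = sel[i] & 31;
    shufb_model(out, a, b, masked);
}
```

With a selector byte such as 0xC5, shufb_model yields the constant 0xFF,
whereas vec_perm_model yields byte 5 of the first operand, which is the
result the generic vec_perm semantics expect.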

Tested on spu-elf, fixes various vshuf test cases.
Committed to mainline.

Bye,
Ulrich


ChangeLog:

	* config/spu/spu.md ("vec_permv16qi"): Reduce selector modulo 32
	before using the shufb instruction.

Index: gcc/config/spu/spu.md
===================================================================
*** gcc/config/spu/spu.md	(revision 180240)
--- gcc/config/spu/spu.md	(working copy)
*************** selb\t%0,%4,%0,%3"
*** 4395,4410 ****
    "shufb\t%0,%1,%2,%3"
    [(set_attr "type" "shuf")])
  
  (define_expand "vec_permv16qi"
!   [(set (match_operand:V16QI 0 "spu_reg_operand" "")
  	(unspec:V16QI
  	  [(match_operand:V16QI 1 "spu_reg_operand" "")
  	   (match_operand:V16QI 2 "spu_reg_operand" "")
! 	   (match_operand:V16QI 3 "spu_reg_operand" "")]
  	  UNSPEC_SHUFB))]
    ""
    {
!     operands[3] = gen_lowpart (TImode, operands[3]);
    })
  
  (define_insn "nop"
--- 4395,4416 ----
    "shufb\t%0,%1,%2,%3"
    [(set_attr "type" "shuf")])
  
+ ; The semantics of vec_permv16qi are nearly identical to those of the SPU
+ ; shufb instruction, except that we need to reduce the selector modulo 32.
  (define_expand "vec_permv16qi"
!   [(set (match_dup 4) (and:V16QI (match_operand:V16QI 3 "spu_reg_operand" "")
!                                  (match_dup 6)))
!    (set (match_operand:V16QI 0 "spu_reg_operand" "")
  	(unspec:V16QI
  	  [(match_operand:V16QI 1 "spu_reg_operand" "")
  	   (match_operand:V16QI 2 "spu_reg_operand" "")
! 	   (match_dup 5)]
  	  UNSPEC_SHUFB))]
    ""
    {
!     operands[4] = gen_reg_rtx (V16QImode);
!     operands[5] = gen_lowpart (TImode, operands[4]);
!     operands[6] = spu_const (V16QImode, 31);
    })
  
  (define_insn "nop"

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

