This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [ia64, rfa] vector pattern improvements


On 01/06/2011 09:25 AM, Steve Ellcey wrote:
> -  emit_insn (gen_mix1_r (temp, operands[1], operands[2]));
> +  if (TARGET_BIG_ENDIAN)
> +    emit_insn (gen_mix1_r (temp, operands[2], operands[1]));
> +  else
> +    emit_insn (gen_mix1_r (temp, operands[1], operands[2]));

Oh, I understand the problem now.

The root cause is that gcc numbers vector elements in their
memory ordering.  Thus the numbering of the elements as seen
in the register changes based on TARGET_BIG_ENDIAN, and just
about all of the (VEC_SELECT * (PARALLEL)) patterns are wrong
for big endian.
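
To make the numbering issue concrete, here is a small standalone
sketch in plain C (not GCC code).  The helper names load8 and
lane_of_element are made up for illustration.  It models an 8-byte
load and shows that memory element 0 ends up in the least
significant byte lane on little endian but in the most significant
one on big endian.

```c
#include <stdint.h>

/* Hypothetical helper: model an 8-byte load of MEM into a register,
   as a little-endian or big-endian load would do it.  */
static uint64_t
load8 (const unsigned char *mem, int big_endian)
{
  uint64_t reg = 0;
  int i;

  for (i = 0; i < 8; ++i)
    {
      /* Little endian: byte i goes to bits [8i, 8i+7].
         Big endian: byte 0 is the most significant byte.  */
      int shift = big_endian ? 8 * (7 - i) : 8 * i;
      reg |= (uint64_t) mem[i] << shift;
    }
  return reg;
}

/* Hypothetical helper: return the register byte lane
   (0 = least significant) that holds memory-order element ELT
   of an 8-element byte vector.  */
static int
lane_of_element (int elt, int big_endian)
{
  return big_endian ? 7 - elt : elt;
}
```

So a selector index that names "element 0" refers to opposite ends
of the register in the two modes, which is why a pattern written
with only one endianness in mind picks the wrong bytes in the other.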

Something like the patch you propose may be required, yes.

I'm trying to think of a way to fix the other problems without
doubling the number of instruction patterns though.

One possibility is to do

(define_special_predicate "select_mix1_r_parallel"
  (match_code "parallel")
{
  static const int order[2][8] = {
    { 0, 8, 2, 10, 4, 12, 6, 14 }, /* le */
    { 8, 0, 10, 2, 12, 4, 14, 6 }  /* be */
  };
  return match_select_parallel(op, 8, order[TARGET_BIG_ENDIAN]);
})

(define_insn "mix1_r"
  [(set (match_operand:V8QI 0 "gr_register_operand" "=r")
        (vec_select:V8QI
          (vec_concat:V16QI
            (match_operand:V8QI 1 "gr_reg_or_0_operand" "rU")
            (match_operand:V8QI 2 "gr_reg_or_0_operand" "rU"))
          (match_operand 3 "select_mix1_r_parallel" "")))]
  ""
  "mix1.r %0 = %r2, %r1"
  [(set_attr "itanium_class" "mmshf")])

/* Return true if PAR is a PARALLEL of exactly NELT CONST_INTs
   whose values match ORDER[0..NELT-1].  */

bool
match_select_parallel (rtx par, int nelt, const int *order)
{
  int i;

  if (XVECLEN (par, 0) != nelt)
    return false;

  for (i = 0; i < nelt; ++i)
    {
      rtx e = XVECEXP (par, 0, i);
      if (!CONST_INT_P (e))
        return false;
      if (INTVAL (e) != order[i])
        return false;
    }

  return true;
}

Although for this specific case it would probably be better
to simply swap the operands by hand in the output template:

  {
    if (TARGET_BIG_ENDIAN)
      return "mix1.r %0 = %r1, %r2";
    else
      return "mix1.r %0 = %r2, %r1";
  }

but this would not be true of all of the patterns.
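
As a sanity check on the order tables in the predicate above, here
is a standalone model in plain C.  It is only a sketch:
match_select_model is a made-up stand-in that takes an int array
where the real function would walk the CONST_INTs of the PARALLEL,
and be_is_pairwise_swap_of_le is likewise hypothetical.  It shows
that, for mix1_r, the be table is the le table with each adjacent
pair of entries exchanged.

```c
#include <stdbool.h>

/* Order tables copied from the predicate: [0] = le, [1] = be.  */
static const int mix1_r_order[2][8] = {
  { 0, 8, 2, 10, 4, 12, 6, 14 },
  { 8, 0, 10, 2, 12, 4, 14, 6 }
};

/* Hypothetical stand-in for match_select_parallel.  SEL and SEL_LEN
   model the CONST_INT values inside the PARALLEL.  */
static bool
match_select_model (const int *sel, int sel_len, int nelt,
                    const int *order)
{
  int i;

  if (sel_len != nelt)
    return false;

  for (i = 0; i < nelt; ++i)
    if (sel[i] != order[i])
      return false;

  return true;
}

/* Return true if the be table equals the le table with every
   adjacent pair of entries swapped (index i <-> i ^ 1).  */
static bool
be_is_pairwise_swap_of_le (void)
{
  int i;

  for (i = 0; i < 8; ++i)
    if (mix1_r_order[1][i] != mix1_r_order[0][i ^ 1])
      return false;

  return true;
}
```

A selector built for little endian matches only the le table, so
with a single pattern the predicate decides at match time which
numbering is valid for the current target.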

Another possibility is to simply give up on representing these
instructions exactly and instead use UNSPECs.

I think the proper representation is very much preferable,
since then we can do something akin to the i386 port with the
expand_vec_perm scheme, where we search for combinations of
permutations that we can support.

It's somewhat unfortunate that there's no ia64-hpux available
in the compile farm.  I could do the bulk conversion and test
on linux, but I'd still have to rely on you to test hpux.


r~

