[PATCH 2/2, x86] Add palignr support for AVX2.

Richard Henderson rth@redhat.com
Mon May 19 16:21:00 GMT 2014


On 05/05/2014 09:54 AM, Evgeny Stupachenko wrote:
> @@ -42943,6 +42944,10 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
>    if (expand_vec_perm_vpermil (d))
>      return true;
> 
> +  /* Try palignr on one operand.  */
> +  if (d->one_operand_p && expand_vec_perm_palignr (d))
> +    return true;

No, because unless in_order and SSSE3, expand_vec_perm_palignr generates at
least 2 insns, and by contract expand_vec_perm_1 must generate only one.

I think what might help you out is to have the rotate permutation matched
directly, rather than having to convert it to a shift.

Thus I think you'd do well to start this series with a patch that adds a
pattern of the form

(define_insn "*ssse3_palignr<mode>_perm"
  [(set (match_operand:V_128 0 "register_operand" "=x,x")
        (vec_select:V_128
           (match_operand:V_128 1 "register_operand" "0,x")
           (match_operand:V_128 2 "nonimmediate_operand" "xm,xm")
           (match_parallel 3 "palign_operand"
             [(match_operand 4 "const_int_operand" "")])))]
  "TARGET_SSSE3"
{
  /* Scale the first selected element index to a byte offset.  */
  enum machine_mode imode = GET_MODE_INNER (GET_MODE (operands[0]));
  operands[3] = GEN_INT (INTVAL (operands[4]) * GET_MODE_SIZE (imode));

  switch (which_alternative)
    {
    case 0:
      return "palignr\t{%3, %2, %0|%0, %2, %3}";
    case 1:
      return "vpalignr\t{%3, %2, %1, %0|%0, %1, %2, %3}";
    default:
      gcc_unreachable ();
    }
}
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseishft")
   (set_attr "atom_unit" "sishuf")
   (set_attr "prefix_data16" "1,*")
   (set_attr "prefix_extra" "1")
   (set_attr "length_immediate" "1")
   (set_attr "prefix" "orig,vex")])

where the palign_operand predicate verifies that the constants are all in
order.  This is very similar to the way we define the broadcast type patterns.
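
To illustrate the kind of check meant here, below is a minimal standalone
sketch in plain C.  It is not GCC predicate code: sel_is_rotation and
palignr_byte_imm are hypothetical names, and it assumes the one-operand
rotate case, where "in order" means each index is one more than the previous,
wrapping at the element count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's predicate: return true if
   SEL[0..NELT-1] is a rotation of 0..NELT-1, i.e. every index is one
   more than the previous one, wrapping around at NELT.  */
static bool
sel_is_rotation (const unsigned *sel, size_t nelt)
{
  assert (sel != NULL);
  if (nelt == 0 || sel[0] >= nelt)
    return false;
  for (size_t i = 1; i < nelt; i++)
    if (sel[i] != (sel[0] + i) % nelt)
      return false;
  return true;
}

/* Byte immediate for the insn: the first element index scaled by the
   element size, mirroring the scaling done in the output code of the
   pattern above.  */
static unsigned
palignr_byte_imm (unsigned first_index, unsigned elt_size)
{
  return first_index * elt_size;
}
```

For a four-element vector of 4-byte elements, the selector { 1, 2, 3, 0 }
passes the check and scales to a byte immediate of 4, while { 1, 3, 2, 0 } is
rejected.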

You'll need a similar pattern with a different predicate for the avx2 palignr:
there the selector is not a simple increment, and the predicate must also
verify the cross-lane constraint.
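
As a rough sketch of what such a check could look like, again in standalone
C rather than GCC predicate code: the 256-bit vpalignr operates on each
128-bit lane separately with the same immediate, so for a one-operand
permutation each lane must be rotated by the same amount and no index may
cross a lane.  The name sel_is_lane_rotation is hypothetical, and the sketch
assumes a 256-bit vector with two equal lanes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: SEL describes a one-operand permutation of a
   vector with NELT elements split into two equal lanes.  Return true
   if each lane is rotated by the same element count and every index
   stays inside its own lane.  */
static bool
sel_is_lane_rotation (const unsigned *sel, size_t nelt)
{
  assert (sel != NULL);
  if (nelt < 2 || nelt % 2 != 0)
    return false;

  size_t lane = nelt / 2;       /* Elements per 128-bit lane.  */
  if (sel[0] >= lane)
    return false;
  size_t shift = sel[0];        /* Rotate count, taken from lane 0.  */

  for (size_t i = 0; i < nelt; i++)
    {
      size_t base = (i / lane) * lane;  /* First index of this lane.  */
      if (sel[i] != base + (i % lane + shift) % lane)
        return false;
    }
  return true;
}
```

With eight elements, { 1, 2, 3, 0, 5, 6, 7, 4 } is accepted, whereas the full
rotation { 1, 2, 3, 4, 5, 6, 7, 0 } is rejected because indices cross the lane
boundary.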

With that as the first patch of the series, I believe it will significantly
tidy up what else you're attempting to change.



r~


