[PATCH 2/2, x86] Add palignr support for AVX2.

Richard Henderson rth@redhat.com
Tue Apr 29 17:41:00 GMT 2014


On 04/29/2014 10:13 AM, Evgeny Stupachenko wrote:
> +  /* For a rotaion permutation with one operand like: {5 6 7 0 1 2 3 4}
> +     PALIGNR is better than PSHUFB.  Check for a rotation in permutation.  */
> +  for (i = 0; i < nelt; ++i)
> +    if ((((d->perm[(i + 1) & (nelt - 1)] - d->perm[i])) & (nelt - 1)) != 1)
> +      return false;
> +
> +  min = d->perm[0];

Why are you running this loop NELT times instead of NELT-1 like I suggested?
Why is that test expression so complicated?

Obviously d->perm[0] is the anchor and everything else can be computed relative
to that.  I'd expect no more than

  min = d->perm[0];
  for (i = 1; i < nelt; ++i)
    if (d->perm[i] != ((min + i) & (nelt - 1)))
      return false;

Now that I think of it,

> +  /* PALIGNR of 2 128-bits registers takes only 1 instrucion.
> +     Requires SSSE3.  */
> +  if (GET_MODE_SIZE (d->vmode) == 16)
> +    {
> +      if (!TARGET_SSSE3)
> +       return false;
> +    }
> +  /* PALIGNR of 2 256-bits registers on AVX2 costs only 2 instructions:
> +     PERM and PALIGNR.  It is more profitable than 2 PSHUFB and PERM.  */
> +  else if (GET_MODE_SIZE (d->vmode) == 32)
> +    {
> +      if (!TARGET_AVX2)
> +       return false;
> +    }
> +  else
> +    return false;
> +
> +  if (!d->one_operand_p)
> +    return false;

The comments are wrong.  Move the operand_p test to the top,
where it should be, and they're more obviously wrong:

  "must have one operand"
  "palignr of two operands..."

I think your comment

  /* For a rotaion permutation with one operand like: {5 6 7 0 1 2 3 4}
     we want to use PALIGNR.  */

belongs up there instead of those two incorrect comments.



r~



More information about the Gcc-patches mailing list