This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH, x86] Improves x86 permutation expand


On 06/09/2014 03:13 AM, Evgeny Stupachenko wrote:
> +  /* First we apply one operand permutation to the part where
> +     elements stay not in their respective lanes.  */
> +  dcopy = *d;
> +  if (which == 2)
> +    dcopy.op0 = dcopy.op1 = d->op1;
> +  else
> +    dcopy.op0 = dcopy.op1 = d->op0;
> +  dcopy.one_operand_p = true;
> +
> +  for (i = 0; i < nelt; ++i)
> +    {
> +      unsigned e = d->perm[i];
> +      if (which == 2)
> +       dcopy.perm[i] = ((e >= nelt) ? (e - nelt) : e);

This is wrong for which == 1.  For both cases this simplifies to

  dcopy.perm[i] = e & (nelt - 1);
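The equivalence behind this suggestion can be checked with a small standalone sketch. It is not part of the patch: the helper names `fold_cond`, `fold_mask` and `forms_agree` are hypothetical, and it assumes nelt is a power of two and that every index E lies in [0, 2 * nelt).

```c
#include <assert.h>

/* Hypothetical helper: the conditional form quoted from the patch
   (the which == 2 branch).  */
static unsigned
fold_cond (unsigned e, unsigned nelt)
{
  return e >= nelt ? e - nelt : e;
}

/* Hypothetical helper: the masked form suggested in the review.
   Assumes nelt is a power of two.  */
static unsigned
fold_mask (unsigned e, unsigned nelt)
{
  return e & (nelt - 1);
}

/* Return 1 if both forms agree for every index in [0, 2 * nelt).  */
static int
forms_agree (unsigned nelt)
{
  unsigned e;

  /* The mask trick only works for a power-of-two element count.  */
  assert (nelt != 0 && (nelt & (nelt - 1)) == 0);

  for (e = 0; e < 2 * nelt; ++e)
    if (fold_cond (e, nelt) != fold_mask (e, nelt))
      return 0;
  return 1;
}
```

Because the mask does not depend on which operand is being permuted, the same expression covers both which == 1 and which == 2.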

> +
> +  for (i = 0; i < nelt; ++i)
> +    {
> +      unsigned e = d->perm[i];
> +      if (which == 2)
> +       dcopy1.perm[i] = ((e >= nelt) ? (nelt + i) : e);
> +      else
> +       dcopy1.perm[i] = ((e < nelt) ? i : e);
> +    }

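This is known to be a blend, so you know the value of E: wherever the
code above keeps E unchanged, E is already I (which == 2) or
NELT + I (which == 1).  So this simplifies to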

  dcopy1.perm[i] = (e >= nelt ? nelt + i : i);
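A minimal sketch of why the two branches collapse, under the assumption stated in the review that the second step is a blend: elements of the operand that is not permuted already sit in their own position. The helper names `blend_orig`, `blend_simple` and `blend_forms_agree` are hypothetical and do not appear in the patch.

```c
/* Hypothetical helper: the two-branch form quoted from the patch.  */
static unsigned
blend_orig (unsigned e, unsigned i, unsigned nelt, int which)
{
  if (which == 2)
    return e >= nelt ? nelt + i : e;
  else
    return e < nelt ? i : e;
}

/* Hypothetical helper: the single expression suggested in the review.  */
static unsigned
blend_simple (unsigned e, unsigned i, unsigned nelt)
{
  return e >= nelt ? nelt + i : i;
}

/* Return 1 if both forms agree on every input the blend step can see.
   Assumed precondition: for which == 2, first-operand elements are in
   place (e == i); for which == 1, second-operand elements are in place
   (e == nelt + i).  The other operand's elements may come from any
   position k.  */
static int
blend_forms_agree (unsigned nelt)
{
  unsigned i, k;

  for (i = 0; i < nelt; ++i)
    for (k = 0; k < nelt; ++k)
      {
        /* which == 2: op0 in place, op1 taken from position k.  */
        if (blend_orig (i, i, nelt, 2) != blend_simple (i, i, nelt))
          return 0;
        if (blend_orig (nelt + k, i, nelt, 2)
            != blend_simple (nelt + k, i, nelt))
          return 0;

        /* which == 1: op1 in place, op0 taken from position k.  */
        if (blend_orig (nelt + i, i, nelt, 1)
            != blend_simple (nelt + i, i, nelt))
          return 0;
        if (blend_orig (k, i, nelt, 1) != blend_simple (k, i, nelt))
          return 0;
      }
  return 1;
}
```

The check only exercises inputs that satisfy the blend precondition; for arbitrary E the two forms are not equivalent.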


r~

