This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, x86] Improves x86 permutation expand
- From: Richard Henderson <rth at redhat dot com>
- To: Evgeny Stupachenko <evstupac at gmail dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 09 Jun 2014 09:30:02 -0700
- Subject: Re: [PATCH, x86] Improves x86 permutation expand
- Authentication-results: sourceware.org; auth=none
- References: <CAOvf_xwh02FHmgQDe+7Qj0Dec0c8LuqBjP5UaMkc4TwrdP11nA at mail dot gmail dot com> <53909F88 dot 7070009 at redhat dot com> <CAOvf_xwdO2rn+GWiBNF9=9LJBCepfttc5UYEct10aXmT6aCvjQ at mail dot gmail dot com>
On 06/09/2014 03:13 AM, Evgeny Stupachenko wrote:
> + /* First we apply one operand permutation to the part where
> + elements stay not in their respective lanes. */
> + dcopy = *d;
> + if (which == 2)
> + dcopy.op0 = dcopy.op1 = d->op1;
> + else
> + dcopy.op0 = dcopy.op1 = d->op0;
> + dcopy.one_operand_p = true;
> +
> + for (i = 0; i < nelt; ++i)
> + {
> + unsigned e = d->perm[i];
> + if (which == 2)
> + dcopy.perm[i] = ((e >= nelt) ? (e - nelt) : e);
This is wrong for which == 1.  Since e < 2 * nelt and nelt is a power of
two, in both cases this simplifies to

  dcopy.perm[i] = e & (nelt - 1);
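
As a quick sanity check of the simplification above, here is a minimal
standalone sketch, not part of the patch.  The helper names fold_branch,
fold_mask and forms_agree are hypothetical, and it assumes nelt is a
nonzero power of two with every index in [0, 2 * nelt).

```c
#include <assert.h>

/* Branchy form quoted from the patch: fold a two-operand index
   into the single-operand range [0, nelt).  */
static unsigned
fold_branch (unsigned e, unsigned nelt)
{
  return e >= nelt ? e - nelt : e;
}

/* Masked form suggested in the review.  */
static unsigned
fold_mask (unsigned e, unsigned nelt)
{
  return e & (nelt - 1);
}

/* Return 1 if both forms agree for every index in [0, 2 * nelt).  */
static int
forms_agree (unsigned nelt)
{
  unsigned e;

  /* Assumption: nelt is a nonzero power of two.  */
  assert (nelt != 0 && (nelt & (nelt - 1)) == 0);

  for (e = 0; e < 2 * nelt; ++e)
    if (fold_branch (e, nelt) != fold_mask (e, nelt))
      return 0;
  return 1;
}
```

Calling forms_agree (4) or forms_agree (8) exercises every index a
two-operand permutation of that width can contain.  The mask does not
depend on `which', so the same expression serves both cases.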
> +
> + for (i = 0; i < nelt; ++i)
> + {
> + unsigned e = d->perm[i];
> + if (which == 2)
> + dcopy1.perm[i] = ((e >= nelt) ? (nelt + i) : e);
> + else
> + dcopy1.perm[i] = ((e < nelt) ? i : e);
> + }
This is known to be a blend, so you know the value of E: it is either
i or nelt + i.  This simplifies to

  dcopy1.perm[i] = (e >= nelt ? nelt + i : i);
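
The equivalence can be checked with a small standalone sketch, again not
part of the patch.  It assumes that in a blend element i is taken from
position i of one of the two operands, so e is either i or nelt + i.  The
names blend_patch, blend_simple and blend_forms_agree are hypothetical.

```c
#include <assert.h>

/* The two branches as written in the quoted patch.  */
static unsigned
blend_patch (unsigned e, unsigned i, unsigned nelt, int which)
{
  assert (i < nelt);
  if (which == 2)
    return e >= nelt ? nelt + i : e;
  else
    return e < nelt ? i : e;
}

/* The single expression suggested in the review.  */
static unsigned
blend_simple (unsigned e, unsigned i, unsigned nelt)
{
  return e >= nelt ? nelt + i : i;
}

/* Return 1 if both forms agree for every blend index, that is for
   e == i and e == nelt + i, under both values of `which'.  */
static int
blend_forms_agree (unsigned nelt)
{
  unsigned i;
  int which;

  for (which = 1; which <= 2; ++which)
    for (i = 0; i < nelt; ++i)
      {
        if (blend_patch (i, i, nelt, which) != blend_simple (i, i, nelt))
          return 0;
        if (blend_patch (nelt + i, i, nelt, which)
            != blend_simple (nelt + i, i, nelt))
          return 0;
      }
  return 1;
}
```

The check only covers indices that satisfy the blend assumption.  For an
arbitrary e the patch's two branches and the simplified form can differ.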
r~