This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH AVX512] Fix dg.torture tests with avx512
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Ilya Tocar <tocarip dot intel at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Date: Fri, 31 Oct 2014 11:17:07 +0100
- Subject: Re: [PATCH AVX512] Fix dg.torture tests with avx512
- Authentication-results: sourceware.org; auth=none
- References: <20141030145532 dot GA56974 at msticlxl7 dot ims dot intel dot com>
On Thu, Oct 30, 2014 at 3:55 PM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
> Hi,
>
> I've run gcc.dg/torture/* tests with -mavx512bw -mavx512vl -mavx512dq
> flags, and got a bunch of fails (mostly in permutes autogen).
> Patch below fixes them.
> Ok for trunk?
>
> 2014-10-30 Ilya Tocar <ilya.tocar@intel.com>
>
> * config/i386/i386.c (expand_vec_perm_pshufb): Try vpermq/vpermd
> for 512-bit wide modes.
> (expand_vec_perm_1): Use correct versions of patterns.
> * config/i386/sse.md (avx512f_vec_dup_<mode>_1): New.
> (vashr<mode>3<mask_name>): Split V8HImode and V16QImode.
Please name new patterns ..._vec_dup<mode>..., without the underscore
between vec_dup and <mode>.
> ---
> gcc/config/i386/i386.c | 59 ++++++++++++++++++++++++++++++++++++++++++++------
> gcc/config/i386/sse.md | 54 ++++++++++++++++++++++++++++++++++++++-------
> 2 files changed, 98 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 71a4f6a..74ff894 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -45889,6 +45889,42 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
> {
> if (!TARGET_AVX512BW)
> return false;
> +
> + /* If vpermq didn't work, vpshufb won't work either. */
> + if (d->vmode == V8DFmode || d->vmode == V8DImode)
> + return false;
> +
> + vmode = V64QImode;
> + if (d->vmode == V16SImode
> + || d->vmode == V32HImode
> + || d->vmode == V64QImode)
> + {
> + /* First see if vpermq can be used for
> + V16SImode/V32HImode/V64QImode. */
> + if (valid_perm_using_mode_p (V8DImode, d))
> + {
> + for (i = 0; i < 8; i++)
> + perm[i] = (d->perm[i * nelt / 8] * 8 / nelt) & 7;
> + if (d->testing_p)
> + return true;
> + target = gen_reg_rtx (V8DImode);
> + if (expand_vselect (target, gen_lowpart (V8DImode, d->op0),
> + perm, 8, false))
> + {
> + emit_move_insn (d->target,
> + gen_lowpart (d->vmode, target));
> + return true;
> + }
> + return false;
> + }
> +
> + /* Next see if vpermd can be used. */
> + if (valid_perm_using_mode_p (V16SImode, d))
> + vmode = V16SImode;
> + }
> + /* Or if vpermps can be used. */
> + else if (d->vmode == V16SFmode)
> + vmode = V16SImode;
> if (vmode == V64QImode)
> {
> /* vpshufb only works intra lanes, it is not
> @@ -45908,6 +45944,9 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
> if (vmode == V8SImode)
> for (i = 0; i < 8; ++i)
> rperm[i] = GEN_INT ((d->perm[i * nelt / 8] * 8 / nelt) & 7);
> + else if (vmode == V16SImode)
> + for (i = 0; i < 16; ++i)
> + rperm[i] = GEN_INT ((d->perm[i * nelt / 16] * 16 / nelt) & 15);
> else
> {
> eltsz = GET_MODE_SIZE (GET_MODE_INNER (d->vmode));
I'd like to ask Jakub to review the two i386.c hunks above; the other
parts are OK with the rename (as mentioned above).
Uros.
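
For anyone following the index arithmetic in the quoted i386.c hunks, here is a minimal standalone sketch of the idea: a permutation on narrow elements can be done by a wider-element permute (vpermq, vpermd) only if every wide chunk moves as an aligned, contiguous unit, and the wide selector is then derived with the same expression the patch uses, (perm[i * nelt / N] * N / nelt) & (N - 1). The helpers coarse_perm_ok and coarsen_perm are hypothetical names for illustration only. They are not GCC functions, and coarse_perm_ok is only an assumption about the kind of check valid_perm_using_mode_p performs.

```c
#include <stdbool.h>

/* Hypothetical helper (not a GCC function): returns true when a
   permutation of NELT narrow elements can be expressed as a
   permutation of COARSE wider elements, i.e. when every group of
   NELT / COARSE selector entries starts on a chunk boundary and
   selects consecutive source elements.  Assumes COARSE divides NELT.  */
static bool
coarse_perm_ok (const unsigned char *perm, unsigned nelt, unsigned coarse)
{
  unsigned chunk = nelt / coarse;

  for (unsigned i = 0; i < nelt; i += chunk)
    {
      /* The first index of each group must be chunk-aligned.  */
      if (perm[i] % chunk != 0)
        return false;
      /* The rest of the group must follow it consecutively.  */
      for (unsigned j = 1; j < chunk; ++j)
        if (perm[i + j] != perm[i] + j)
          return false;
    }
  return true;
}

/* Hypothetical helper (not a GCC function): derives the COARSE-entry
   selector from the NELT-entry one, using the same expression as the
   quoted patch.  COARSE is assumed to be a power of two so that
   COARSE - 1 works as a mask.  */
static void
coarsen_perm (const unsigned char *perm, unsigned nelt, unsigned coarse,
              unsigned char *out)
{
  for (unsigned i = 0; i < coarse; ++i)
    out[i] = (perm[i * nelt / coarse] * coarse / nelt) & (coarse - 1);
}
```

With nelt = 16 and coarse = 8 (a V16SImode selector viewed as V8DImode), the selector {2,3,0,1,6,7,4,5,10,11,8,9,14,15,12,13} passes the check and coarsens to {1,0,3,2,5,4,7,6}. A selector beginning {1,0,...} is rejected because its first pair does not start on a 64-bit boundary.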