This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix SSE5 pperm/perm* constraints and check_effective_target_sse5
- From: "Richard Guenther" <richard dot guenther at gmail dot com>
- To: "Jakub Jelinek" <jakub at redhat dot com>
- Cc: "Michael Meissner" <michael dot meissner at amd dot com>, "Uros Bizjak" <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org
- Date: Mon, 31 Dec 2007 16:50:55 +0100
- Subject: Re: [PATCH] Fix SSE5 pperm/perm* constraints and check_effective_target_sse5
- References: <20071229210706.GO20451@devserv.devel.redhat.com>
On Dec 29, 2007 10:07 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> gcc.target/i386/sse5-permpX.c testcase fails to assemble, generates
> e.g.
> permpd %xmm1, %xmm0, src1(%rip), %xmm0
> which is invalid, because destination must match either first or third
> source operand (and gas segfaults on it after issuing an error, patch
> for that posted to binutils ml).
>
> The SSE5 docs in Intel syntax say:
>
> PPERM xmm1, xmm1, xmm2, xmm3/mem128    0F 24 23 /r /drex0
> PPERM xmm1, xmm1, xmm3/mem128, xmm2    0F 24 23 /r /drex1
> PPERM xmm1, xmm2, xmm3/mem128, xmm1    0F 24 27 /r /drex0
> PPERM xmm1, xmm3/mem128, xmm2, xmm1    0F 24 27 /r /drex1
>
>   For each byte position of the 16-byte result, uses corresponding
>   control byte in fourth operand to perform logical operation on one of
>   32 bytes from the second and third source operands and writes result
>   in destination (xmm1 register).
>
> PERMPD xmm1, xmm1, xmm2, xmm3/mem128   0F 24 21 /r /drex0
> PERMPD xmm1, xmm1, xmm3/mem128, xmm2   0F 24 21 /r /drex1
> PERMPD xmm1, xmm2, xmm3/mem128, xmm1   0F 24 25 /r /drex0
> PERMPD xmm1, xmm3/mem128, xmm2, xmm1   0F 24 25 /r /drex1
>
>   For each double-precision result, uses corresponding control byte
>   in the fourth operand to perform an operation on one of 4
>   double-precision operands from the second and third source operands
>   and writes result in destination (xmm1 register).
>
> so destination must be the same as src1 or src3. The various fmadd*
> etc. constraints in sse.md honor this, but the third alternative for pperm/perm
> insns does not - it has a dup of destination in src2, reg or memory in src1 and
> reg in src3. The following patch fixes that. I don't have any hw to actually
> test it at runtime, but at least all tests in make check-gcc RUNTESTFLAGS=i386.exp
> now assemble.
>
> The other fix is for the check_effective_target_sse5 tcl test,
> which fails (__v2di isn't compatible with __v8hi) and so all SSE5 runtime tests
> are UNSUPPORTED even when assembler supports SSE5.
>
> Ok for trunk?
Ok.
Thanks,
Richard.
> 2007-12-29 Jakub Jelinek <jakub@redhat.com>
>
> * config/i386/sse.md (sse5_pperm, sse5_pperm_pack_v2di_v4si,
> sse5_pperm_pack_v4si_v8hi, sse5_pperm_pack_v8hi_v16qi,
> sse5_perm<mode>): Fix constraints.
>
> * gcc.target/i386/i386.exp (check_effective_target_sse5): Use __v8hi
> rather than __v2di type.
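
The operand rule described in the quoted message (destination must equal src1 or src3) can be sketched as a small standalone check. This is only an illustration, not code from the patch, GCC, or gas: the `struct operand` model and the function names are hypothetical. The "at most one memory operand among the two untied sources" condition is a reading of the four forms in the quoted table. The self-check assumes the usual AT&T operand reversal, so that `permpd %xmm1, %xmm0, src1(%rip), %xmm0` corresponds to Intel order dest=xmm0, src1=memory, src2=xmm0, src3=xmm1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand model for illustration only; these are not
   GCC's or gas's data structures. */
enum operand_kind { OPERAND_REG, OPERAND_MEM };

struct operand {
    enum operand_kind kind;
    int regno;  /* xmm register number; ignored for memory operands */
};

/* True if both operands are the same xmm register. */
bool same_reg(struct operand a, struct operand b)
{
    return a.kind == OPERAND_REG && b.kind == OPERAND_REG
           && a.regno == b.regno;
}

/* Operands are in Intel order: dest, src1, src2, src3.
   Rule from the quoted table: dest is a register that matches src1 or
   src3. Assumption from reading the four forms: at most one of the two
   remaining source operands may be a memory operand. */
bool pperm_form_is_encodable(struct operand dest, struct operand src1,
                             struct operand src2, struct operand src3)
{
    if (dest.kind != OPERAND_REG)
        return false;

    if (same_reg(dest, src1))
        /* dest tied to src1: memory may be src2 or src3, not both. */
        return !(src2.kind == OPERAND_MEM && src3.kind == OPERAND_MEM);

    if (same_reg(dest, src3))
        /* dest tied to src3: memory may be src1 or src2, not both. */
        return !(src1.kind == OPERAND_MEM && src2.kind == OPERAND_MEM);

    /* dest matches only src2, or no source at all: not encodable. */
    return false;
}

/* Self-check using the failing instruction from the message. */
void check_reported_failure(void)
{
    struct operand xmm0 = { OPERAND_REG, 0 };
    struct operand xmm1 = { OPERAND_REG, 1 };
    struct operand mem  = { OPERAND_MEM, -1 };

    /* dest duplicated in src2 with memory in src1: rejected. */
    assert(!pperm_form_is_encodable(xmm0, mem, xmm0, xmm1));
    /* Same operands with dest tied to src3 instead: accepted. */
    assert(pperm_form_is_encodable(xmm0, mem, xmm1, xmm0));
}
```

Calling `check_reported_failure()` exercises both the rejected operand arrangement and one of the four documented forms.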