This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] Fix SSE5 pperm/perm* constraints and check_effective_target_sse5


On Dec 29, 2007 10:07 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> gcc.target/i386/sse5-permpX.c testcase fails to assemble, generates
> e.g.
> permpd  %xmm1, %xmm0, src1(%rip), %xmm0
> which is invalid, because destination must match either first or third
> source operand (and gas segfaults on it after issuing an error, patch
> for that posted to binutils ml).
>
> The SSE5 docs in Intel syntax say:
>
> PPERM xmm1, xmm1, xmm2, xmm3/mem128   0F 24 23 /r /drex0
> PPERM xmm1, xmm1, xmm3/mem128, xmm2   0F 24 23 /r /drex1
> PPERM xmm1, xmm2, xmm3/mem128, xmm1   0F 24 27 /r /drex0
> PPERM xmm1, xmm3/mem128, xmm2, xmm1   0F 24 27 /r /drex1
>
>   For each byte position of the 16-byte result, uses corresponding
>   control byte in fourth operand to perform logical operation on one of
>   32 bytes from the second and third source operands and writes result
>   in destination (xmm1 register).
>
> PERMPD xmm1, xmm1, xmm2, xmm3/mem128   0F 24 21 /r /drex0
> PERMPD xmm1, xmm1, xmm3/mem128, xmm2   0F 24 21 /r /drex1
> PERMPD xmm1, xmm2, xmm3/mem128, xmm1   0F 24 25 /r /drex0
> PERMPD xmm1, xmm3/mem128, xmm2, xmm1   0F 24 25 /r /drex1
>
>   For each double-precision result, uses corresponding control byte
>   in the fourth operand to perform an operation on one of 4
>   double-precision operands from the second and third source operands
>   and writes result in destination (xmm1 register).
>
> so destination must be the same as src1 or src3.  The various fmadd*
> etc. constraints in sse.md honor this, but the third alternative for pperm/perm
> insns does not - it has a dup of destination in src2, reg or memory in src1 and
> reg in src3.  The following patch fixes that, though I don't have any hw to actually
> test it at runtime, but at least all tests in make check-gcc RUNTESTFLAGS=i386.exp
> now assemble.
>
> The other fix is for the check_effective_target_sse5 tcl test,
> which fails (__v2di isn't compatible with __v8hi) and so all SSE5 runtime tests
> are UNSUPPORTED even when assembler supports SSE5.
>
> Ok for trunk?

Ok.

Thanks,
Richard.

> 2007-12-29  Jakub Jelinek  <jakub@redhat.com>
>
>         * config/i386/sse.md (sse5_pperm, sse5_pperm_pack_v2di_v4si,
>         sse5_pperm_pack_v4si_v8hi, sse5_pperm_pack_v8hi_v16qi,
>         sse5_perm<mode>): Fix constraints.
>
>         * gcc.target/i386/i386.exp (check_effective_target_sse5): Use __v8hi
>         rather than __v2di type.

