This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, middle-end/RTL, i386]: Simplify nested VEC_SELECT RTX
- From: "Richard Guenther" <richard dot guenther at gmail dot com>
- To: "Uros Bizjak" <ubizjak at gmail dot com>
- Cc: "GCC Patches" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 28 Aug 2007 10:49:42 +0200
- Subject: Re: [PATCH, middle-end/RTL, i386]: Simplify nested VEC_SELECT RTX
- References: <5787cf470707130623o6d256b5cn24aade73705235cb@mail.gmail.com>
On 7/13/07, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
> This patch optimizes nested VEC_SELECT RTL expressions (with an optional
> VEC_CONCAT) when the top VEC_SELECT extracts a scalar operand.
>
> Currently, the following testcase
>
> --cut here--
> int fooSI_1(__v4si *val)
> {
>   return __builtin_ia32_vec_ext_v4si(*val, 1);
> }
>
> int fooSI_2(__v4si *val)
> {
>   return __builtin_ia32_vec_ext_v4si(*val, 2);
> }
> --cut here--
>
> compiles with '-O2 -msse (-fomit-frame-pointer)' into:
>
> fooSI_2:
>         subl    $4, %esp
>         movl    8(%esp), %eax
>         movdqa  (%eax), %xmm0
>         punpckhdq       %xmm0, %xmm0
>         movd    %xmm0, (%esp)
>         movl    (%esp), %eax
>         addl    $4, %esp
>         ret
>
> fooSI_1:
>         subl    $4, %esp
>         movl    8(%esp), %eax
>         pshufd  $85, (%eax), %xmm0
>         movd    %xmm0, (%esp)
>         movl    (%esp), %eax
>         addl    $4, %esp
>         ret
>
> The attached patch simplifies this into:
>
> fooSI_2:
>         movl    4(%esp), %eax
>         movl    8(%eax), %eax
>         ret
>
> fooSI_1:
>         movl    4(%esp), %eax
>         movl    4(%eax), %eax
>         ret
>
> Similar effects can be obtained for SFmode operands.
>
> The patch was bootstrapped and regression-tested on i686-pc-linux-gnu.
This is ok with the added comment.
Thanks,
Richard.
> 2007-07-13 Uros Bizjak <ubizjak@gmail.com>
>
> PR target/32661
> * simplify-rtx.c (simplify_binary_operation_1) [VEC_SELECT]:
> Simplify nested VEC_SELECT (with optional VEC_CONCAT operator as
> operand) when top VEC_SELECT extracts scalar element.
> * config/i386/sse.md (*vec_extract_v4si_mem): New.
> (*vec_extract_v4sf_mem): Ditto.
>
> testsuite/ChangeLog:
>
> 2007-07-13 Uros Bizjak <ubizjak@gmail.com>
>
> PR target/32661
> * gcc.target/i386/pr32661.c: New test.
>
> OK for mainline (the patch needs middle-end/RTL approval)?
>
> Uros.
>
>
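For readers following the quoted assembly, the index arithmetic behind the
simplification can be sketched roughly as below. This is only an illustrative
model, not the code in simplify-rtx.c: the names resolve_lane,
lane_byte_offset, pshufd_85_sel and punpckhdq_sel are hypothetical, and the
selectors are assumed to be {1,1,1,1} for "pshufd $85" (immediate 0x55) and
{2,6,3,7} for "punpckhdq %xmm0, %xmm0" (a VEC_SELECT over the VEC_CONCAT of
the value with itself). The outer scalar extraction, done here by movd, reads
lane 0.

```c
#include <assert.h>

/* Hypothetical model: a VEC_SELECT is represented only by its selector,
   i.e. the list of source-lane indices it picks.  */

/* Assumed selector for "pshufd $85": 0x55 = 01 01 01 01b, so every
   result lane reads source lane 1.  */
const int pshufd_85_sel[4] = { 1, 1, 1, 1 };

/* Assumed selector for "punpckhdq x, x": the high dwords of the 8-lane
   concatenation (x, x), interleaved.  */
const int punpckhdq_sel[4] = { 2, 6, 3, 7 };

/* Return the lane of the original vector that is read when lane OUTER
   is extracted from an inner VEC_SELECT with selector INNER_SEL.
   If HAS_CONCAT is nonzero, the inner select operates on
   VEC_CONCAT (x, x), so indices >= NLANES refer to the second copy
   of x and are folded back into the range [0, NLANES).  */
int
resolve_lane (const int *inner_sel, int outer, int nlanes, int has_concat)
{
  int idx = inner_sel[outer];

  if (has_concat)
    idx %= nlanes;
  return idx;
}

/* Byte offset of LANE inside a vector whose elements are ELT_SIZE
   bytes wide.  */
int
lane_byte_offset (int lane, int elt_size)
{
  return lane * elt_size;
}
```

Under these assumptions, resolve_lane (pshufd_85_sel, 0, 4, 0) gives lane 1
and resolve_lane (punpckhdq_sel, 0, 4, 1) gives lane 2. With 4-byte SImode
elements that corresponds to byte offsets 4 and 8, which matches the
"movl 4(%eax), %eax" and "movl 8(%eax), %eax" loads in the simplified output
quoted above.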