This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: PATCH: Use ix86_expand_vector_set if possible
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: "Uros Bizjak" <ubizjak at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 13 May 2008 11:29:26 -0700
- Subject: Re: PATCH: Use ix86_expand_vector_set if possible
- References: <20080513133419.GA19854@lucon.org> <4829D4C6.5010800@gmail.com>
On Tue, May 13, 2008 at 10:49 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> H.J. Lu wrote:
>
> > This patch uses ix86_expand_vector_set if possible when initializing
> > a vector with only one non-zero value. I am testing it on Linux/ia32
> > and Linux/Intel64. OK to install on trunk if all pass?
> >
> >
>
> How does the asm using ix86_expand_vector_set differ from the current one?
>
It doesn't make a difference for 2-element vectors, since vec_concat
is used there. For vectors with more than 2 elements, we may save one
instruction. For example:
[hjl@gnu-6 sse-1]$ cat v4sf-2.c
#include <emmintrin.h>
__m128
foo2 (float x)
{
  return _mm_set_ps (0, 0, x, 0);
}
[hjl@gnu-6 sse-1]$ cat old/v4sf-2.s
.file "v4sf-2.c"
.text
.p2align 4,,15
.globl foo2
.type foo2, @function
foo2:
xorps %xmm1, %xmm1
movss %xmm0, %xmm1
movaps %xmm1, %xmm0
shufps $81, %xmm1, %xmm0
ret
.size foo2, .-foo2
.ident "GCC: (GNU) 4.4.0 20080509 (experimental) [trunk revision 135128]"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-6 sse-1]$ cat new/v4sf-2.s
.file "v4sf-2.c"
.text
.p2align 4,,15
.globl foo2
.type foo2, @function
foo2:
xorps %xmm1, %xmm1
insertps $16, %xmm0, %xmm1
movaps %xmm1, %xmm0
ret
.size foo2, .-foo2
.ident "GCC: (GNU) 4.4.0 20080510 (experimental) [stack-internal revision 2533]"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-6 sse-1]$
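To see why the two sequences above compute the same value, here is a
minimal sketch that models the lane semantics of movss, shufps and
insertps in portable C. The xmm_t type and the model_* and *_foo2 names
are hypothetical, made up for this illustration; they are not GCC or
intrinsics APIs. Lane 0 is the low element, and the upper lanes of the
incoming %xmm0 are filled with junk on purpose, since only lane 0
carries x.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of one XMM register holding 4 floats; f[0] is the low lane. */
typedef struct { float f[4]; } xmm_t;

/* xorps %r, %r: clear all four lanes. */
xmm_t model_zero(void)
{
  xmm_t r = {{0.0f, 0.0f, 0.0f, 0.0f}};
  return r;
}

/* movss src, dst (register form): copy lane 0, keep lanes 1..3 of dst. */
xmm_t model_movss(xmm_t dst, xmm_t src)
{
  dst.f[0] = src.f[0];
  return dst;
}

/* shufps $imm, src, dst: low two lanes come from dst, high two from src. */
xmm_t model_shufps(xmm_t dst, xmm_t src, unsigned imm)
{
  xmm_t r;
  r.f[0] = dst.f[imm & 3];
  r.f[1] = dst.f[(imm >> 2) & 3];
  r.f[2] = src.f[(imm >> 4) & 3];
  r.f[3] = src.f[(imm >> 6) & 3];
  return r;
}

/* insertps $imm, src, dst (register form): bits 7:6 select the source lane,
   bits 5:4 the destination lane, bits 3:0 are a zero mask.  */
xmm_t model_insertps(xmm_t dst, xmm_t src, unsigned imm)
{
  int i;
  dst.f[(imm >> 4) & 3] = src.f[(imm >> 6) & 3];
  for (i = 0; i < 4; i++)
    if (imm & (1u << i))
      dst.f[i] = 0.0f;
  return dst;
}

/* Old code: xorps, movss, movaps, shufps $81.  */
xmm_t old_foo2(float x)
{
  xmm_t xmm0 = {{x, 9.0f, 9.0f, 9.0f}};  /* junk in the upper lanes (assumption) */
  xmm_t xmm1 = model_zero();
  xmm1 = model_movss(xmm1, xmm0);
  xmm0 = xmm1;
  return model_shufps(xmm0, xmm1, 81);
}

/* New code: xorps, insertps $16, movaps.  */
xmm_t new_foo2(float x)
{
  xmm_t xmm0 = {{x, 9.0f, 9.0f, 9.0f}};  /* junk in the upper lanes (assumption) */
  xmm_t xmm1 = model_zero();
  xmm1 = model_insertps(xmm1, xmm0, 16);
  xmm0 = xmm1;
  return xmm0;
}

/* _mm_set_ps (0, 0, x, 0) lists lanes high to low, so x lands in lane 1.  */
void check_foo2(float x)
{
  xmm_t expect = {{0.0f, x, 0.0f, 0.0f}};
  xmm_t a = old_foo2(x);
  xmm_t b = new_foo2(x);
  assert(memcmp(&a, &expect, sizeof expect) == 0);
  assert(memcmp(&b, &expect, sizeof expect) == 0);
}
```

Calling check_foo2 (1.5f) from a small main should pass both asserts under
this model: immediate 81 (binary 01 01 00 01) swaps lanes 0 and 1 and
fills the upper two lanes with zeros, while immediate 16 writes source
lane 0 straight into destination lane 1, so the separate shuffle is no
longer needed.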
H.J.