This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug c/79938] gcc unnecessarily spills xmm register to stack when inserting vector items
- From: "postmaster at raasu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 07 Mar 2017 14:25:49 +0000
- Auto-submitted: auto-generated
- References: <bug-79938-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79938
--- Comment #2 from postmaster at raasu dot org ---
(In reply to Richard Biener from comment #1)
> The situation is slightly better with GCC 7, only two spill/loads are
> remaining.
> Possibly BIT_INSERT_EXPR helps here.
With gcc 6.2.0 and the command line
gcc -msse4.1 -mtune=core2 -O3 -S hadd.c -Wall -Wextra -fno-strict-aliasing -fwrapv -o hadd.s
the resulting assembler output is almost perfect, but -mtune=core2 tunes the
code for Intel processors only.
---
...
pxor %xmm1, %xmm1
movl $1, %edi
movd %eax, %xmm0
pshufb %xmm1, %xmm0
pextrb $1, %xmm0, %edx
pextrb $0, %xmm0, %eax
addl %edx, %eax
pextrb $2, %xmm0, %edx
addl %edx, %eax
pextrb $4, %xmm0, %ecx
pextrb $3, %xmm0, %edx
addl %eax, %edx
pextrb $5, %xmm0, %eax
addl %eax, %ecx
pextrb $6, %xmm0, %eax
addl %eax, %ecx
pextrb $9, %xmm0, %esi
pextrb $7, %xmm0, %eax
addl %eax, %ecx
pextrb $8, %xmm0, %eax
addl %esi, %eax
pextrb $10, %xmm0, %esi
addl %esi, %eax
pextrb $11, %xmm0, %esi
addl %esi, %eax
pextrb $13, %xmm0, %esi
movd %eax, %xmm1
pextrb $12, %xmm0, %eax
addl %esi, %eax
pextrb $14, %xmm0, %esi
addl %eax, %esi
pextrb $15, %xmm0, %eax
movd %edx, %xmm0
addl %esi, %eax
pinsrd $1, %ecx, %xmm0
movl $.LC0, %esi
pinsrd $1, %eax, %xmm1
xorl %eax, %eax
punpcklqdq %xmm1, %xmm0
...
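
The source of hadd.c is not included in this message. Judging only from the listing above, the code appears to sum each group of four bytes of a 16-byte vector and insert the four sums into a vector of four 32-bit integers (the pinsrd / punpcklqdq sequence). A minimal sketch of that pattern using GCC vector extensions follows; the type names v16qu and v4si and the function hadd_groups are hypothetical, not taken from the bug report, and the actual test case may differ.

```c
/* Hypothetical reconstruction; the real hadd.c is not shown in this thread. */
typedef unsigned char v16qu __attribute__((vector_size(16)));
typedef int v4si __attribute__((vector_size(16)));

/* Sum each group of four bytes and insert the sums into a 4 x int vector. */
v4si hadd_groups(v16qu bytes)
{
    v4si sums = {0, 0, 0, 0};
    for (int lane = 0; lane < 4; lane++) {
        int s = 0;
        for (int i = 0; i < 4; i++)
            s += bytes[4 * lane + i];   /* byte extract, zero-extended to int */
        sums[lane] = s;                 /* element insert into the int vector */
    }
    return sums;
}
```

Compiling such a function with the command line quoted above and inspecting hadd.s is one way to check whether the element inserts stay in registers or go through the stack on a given GCC version.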