This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/80819] [6/7/8 regression] Useless store to the stack in _mm_set_epi64x with SSE4 -mno-avx
- From: "jakub at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 28 Nov 2017 13:25:13 +0000
- Subject: [Bug target/80819] [6/7/8 regression] Useless store to the stack in _mm_set_epi64x with SSE4 -mno-avx
- Auto-submitted: auto-generated
- References: <bug-80819-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80819
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Honza, any thoughts on this? Given:
        movq    %rdi, -16(%rsp)
        movq    -16(%rsp), %xmm0
        pinsrq  $1, %rsi, %xmm0
I'd say that if pinsrq $1, %rsi, %xmm0 is not too slow on recent AMD, then
either movq %rdi, %xmm0 should also not be too slow, or
pinsrq $0, %rdi, %xmm0 should be the way to go.
Note that current trunk still emits a dead store with -mtune=intel -O2 -msse4:
        movq    %rsi, -16(%rsp)
        movq    %rdi, %xmm0
        pinsrq  $1, %rsi, %xmm0
and with -mtune=generic -O2 -msse4:
        movq    %rdi, -16(%rsp)
        movq    %rsi, -24(%rsp)
        movq    -16(%rsp), %xmm0
        pinsrq  $1, %rsi, %xmm0
I wonder why DSE doesn't eliminate it.
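
For reference, a minimal sketch of the kind of function that should produce
sequences like the above. This is an assumption on my part, not necessarily the
testcase attached to the PR: the names combine64 and lane are hypothetical, and
the argument order is inferred from %rdi ending up in the low element and %rsi
in the high one.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set_epi64x, _mm_storeu_si128 */

/* Hypothetical reproducer (name and signature are assumptions).
   _mm_set_epi64x takes the HIGH element first, so lo lands in
   element 0 and hi in element 1 of the result. */
__m128i combine64(long long lo, long long hi)
{
    return _mm_set_epi64x(hi, lo);
}

/* Hypothetical helper for checking the result: returns element idx
   of v (0 = low 64 bits, 1 = high 64 bits). */
long long lane(__m128i v, int idx)
{
    long long out[2];
    _mm_storeu_si128((__m128i *)out, v);  /* unaligned store is always safe */
    return out[idx];
}
```

Compiling this with -O2 -msse4 -mno-avx -S, plus -mtune=intel or
-mtune=generic, and looking at the code for combine64 should show whether the
stack stores are still emitted; the exact output will depend on the compiler
revision.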