[Bug target/80819] [6/7/8 regression] Useless store to the stack in _mm_set_epi64x with SSE4 -mno-avx
jakub at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Nov 28 13:25:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80819
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Honza, thoughts on this? Given:
movq %rdi, -16(%rsp)
movq -16(%rsp), %xmm0
pinsrq $1, %rsi, %xmm0
I'd say if pinsrq $1, %rsi, %xmm0 is not too slow on recent AMD, then either
movq %rdi, %xmm0 should also not be too slow, or pinsrq $0, %rdi, %xmm0 should
be the way to go.
Note current trunk still emits a dead store with -mtune=intel -O2 -msse4:
movq %rsi, -16(%rsp)
movq %rdi, %xmm0
pinsrq $1, %rsi, %xmm0
and with -mtune=generic -O2 -msse4:
movq %rdi, -16(%rsp)
movq %rsi, -24(%rsp)
movq -16(%rsp), %xmm0
pinsrq $1, %rsi, %xmm0
I wonder why DSE doesn't eliminate it.
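
For reference, a minimal sketch of the kind of testcase that would give the
sequences above. The original testcase is not quoted in this comment, so the
function name set_pair and the argument order are assumptions inferred from
the register usage (%rdi going to the low half, %rsi to the high half):

```c
#include <emmintrin.h>  /* SSE2 header that declares _mm_set_epi64x */

/* Hypothetical testcase, not copied from the bug report.
   With the SysV x86-64 ABI, lo arrives in %rdi and hi in %rsi.
   _mm_set_epi64x takes the high element first, so lo ends up in
   element 0 and hi in element 1 of the result. */
__m128i
set_pair (long long lo, long long hi)
{
  return _mm_set_epi64x (hi, lo);
}
```

Compiling something like this with -O2 -msse4 -mno-avx -S and the -mtune
values mentioned above should make it possible to check whether the store
to the stack is still emitted.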