[Bug target/80819] [6/7/8 regression] Useless store to the stack in _mm_set_epi64x with SSE4 -mno-avx

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Nov 28 13:25:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80819

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Honza, thoughts on this?  Given:
        movq    %rdi, -16(%rsp)
        movq    -16(%rsp), %xmm0
        pinsrq  $1, %rsi, %xmm0
I'd say that if pinsrq $1, %rsi, %xmm0 is not too slow on recent AMD, then either
movq %rdi, %xmm0 should not be too slow either, or pinsrq $0, %rdi, %xmm0 should
be the way to go.
Note that current trunk still emits a dead store with -mtune=intel -O2 -msse4:
        movq    %rsi, -16(%rsp)
        movq    %rdi, %xmm0
        pinsrq  $1, %rsi, %xmm0
and with -mtune=generic -O2 -msse4:
        movq    %rdi, -16(%rsp)
        movq    %rsi, -24(%rsp)
        movq    -16(%rsp), %xmm0
        pinsrq  $1, %rsi, %xmm0
I wonder why DSE doesn't eliminate it.
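
For reference, a minimal C sketch of the kind of function that should lead to
the sequences above.  This is an assumption reconstructed from the assembly
(first argument in %rdi ends up in the low lane, second argument in %rsi in
the high lane), not the testcase quoted from the PR; the names combine64,
lane64 and check_combine64 are hypothetical.  Compiling with something like
-O2 -msse4 -mno-avx -S and one of the -mtune settings mentioned above should
show whether the stack store is still there.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set_epi64x, _mm_storeu_si128 */

/* Hypothetical reproducer: build a 128-bit vector from two 64-bit
   integers.  _mm_set_epi64x takes the high element first, so a goes
   to lane 0 (low) and b goes to lane 1 (high). */
__m128i combine64(long long a, long long b)
{
    return _mm_set_epi64x(b, a);
}

/* Helper for inspection only: read back one 64-bit lane (0 = low,
   1 = high) through an unaligned store to a local array. */
long long lane64(__m128i v, int i)
{
    long long out[2];
    _mm_storeu_si128((__m128i *)out, v);
    return out[i & 1];
}

/* Sanity check of the lane order assumed above. */
void check_combine64(void)
{
    __m128i v = combine64(1, 2);
    assert(lane64(v, 0) == 1);
    assert(lane64(v, 1) == 2);
}
```

Only combine64 matters for the code generation question; the other two
functions just confirm which argument lands in which lane.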