[Bug target/80820] _mm_set_epi64x shouldn't store/reload for -mtune=haswell, Zen should avoid store/reload, and generic should think about it.

venkataramanan.kumar at amd dot com gcc-bugzilla@gcc.gnu.org
Tue Aug 22 10:01:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80820

Venkataramanan <venkataramanan.kumar at amd dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |venkataramanan.kumar at amd dot co
                   |                            |m

--- Comment #4 from Venkataramanan <venkataramanan.kumar at amd dot com> ---
(In reply to Peter Cordes from comment #0)
> gcc with -mtune=generic likes to bounce through memory when moving data from
> integer registers to xmm for things like _mm_set_epi32.
> 
> There are 3 related tuning issues here:
> 
> * -mtune=haswell -mno-sse4 still uses one store/reload for _mm_set_epi64x.
> 
> * -mtune=znver1 should definitely favour movd/movq instead of store/reload.
>   (Ryzen has 1 m-op movd/movq between vector and integer with 3c latency,
> shorter than store-forwarding.  All the reasons to favour store/reload on
> other AMD uarches are gone.)
> 

Yes, for Ryzen, using direct move instructions (movd/movq) should be better
than going through store-forwarding.
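
For anyone who wants to compare the two code shapes, here is a minimal sketch, assuming an x86-64 target with SSE2. The helper names are hypothetical and only for illustration. The first function leaves the choice of sequence to the compiler; the second spells out the direct-move form (two movq from integer registers plus punpcklqdq). The compiler is still free to rewrite either one, so the generated assembly is what matters, not the source.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: let the compiler choose how to build {hi, lo}.
   Depending on -mtune this may become a store/reload or direct moves. */
static __m128i set_via_intrinsic(int64_t hi, int64_t lo)
{
    return _mm_set_epi64x(hi, lo);
}

/* Hypothetical helper: request the direct-move form explicitly.
   Each _mm_cvtsi64_si128 corresponds to a movq from an integer register
   to an xmm register (x86-64 only); _mm_unpacklo_epi64 corresponds to
   punpcklqdq, which places lo in the low half and hi in the high half. */
static __m128i set_via_direct_moves(int64_t hi, int64_t lo)
{
    __m128i vlo = _mm_cvtsi64_si128(lo);
    __m128i vhi = _mm_cvtsi64_si128(hi);
    return _mm_unpacklo_epi64(vlo, vhi);
}

/* Returns 1 if all 16 bytes of a and b match, 0 otherwise. */
static int vectors_equal(__m128i a, __m128i b)
{
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}
```

Compiling this with something like gcc -O2 -mtune=znver1 -S and reading the output for both functions should show whether the store/reload is still being chosen.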

