[Bug target/80820] _mm_set_epi64x shouldn't store/reload for -mtune=haswell, Zen should avoid store/reload, and generic should think about it.
venkataramanan.kumar at amd dot com
gcc-bugzilla@gcc.gnu.org
Tue Aug 22 10:01:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80820
Venkataramanan <venkataramanan.kumar at amd dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |venkataramanan.kumar at amd dot com
--- Comment #4 from Venkataramanan <venkataramanan.kumar at amd dot com> ---
(In reply to Peter Cordes from comment #0)
> gcc with -mtune=generic likes to bounce through memory when moving data from
> integer registers to xmm for things like _mm_set_epi32.
>
> There are 3 related tuning issues here:
>
> * -mtune=haswell -mno-sse4 still uses one store/reload for _mm_set_epi64x.
>
> * -mtune=znver1 should definitely favour movd/movq instead of store/reload.
> (Ryzen has 1 m-op movd/movq between vector and integer with 3c latency,
> shorter than store-forwarding. All the reasons to favour store/reload on
> other AMD uarches are gone.)
>
Yes. For Ryzen, using direct move instructions (movd/movq) should be better
than a store/reload that relies on store-forwarding.
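
For anyone who wants to look at the code generation under discussion, a minimal test case might look like the sketch below. The function names make_vec and vec_lanes are hypothetical and not taken from the bug report; only _mm_set_epi64x and _mm_storeu_si128 are real SSE2 intrinsics. Compiling it with something like "gcc -O2 -mtune=znver1 -S" and reading the assembly for make_vec should show whether the two integer arguments reach the xmm register via movq or via a store/reload.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_set_epi64x, _mm_storeu_si128 */
#include <stdint.h>

/* Hypothetical test function: build a 128-bit vector from two 64-bit
 * integers that arrive in general-purpose registers. This is the
 * integer-to-xmm transfer whose tuning the bug report is about. */
__m128i make_vec(int64_t hi, int64_t lo)
{
    /* First argument becomes the high lane, second the low lane. */
    return _mm_set_epi64x(hi, lo);
}

/* Hypothetical helper: copy both lanes back to memory so the result
 * can be checked. out[0] receives the low lane, out[1] the high lane. */
void vec_lanes(__m128i v, int64_t out[2])
{
    _mm_storeu_si128((__m128i *)out, v);
}
```

Comparing the output for -mtune=generic, -mtune=haswell -mno-sse4 and -mtune=znver1 covers the three cases listed in comment #0.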