This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/81496] AVX load from adjacent memory location followed by concatenation
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 21 Jul 2017 08:57:40 +0000
- Subject: [Bug target/81496] AVX load from adjacent memory location followed by concatenation
- Auto-submitted: auto-generated
- References: <bug-81496-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81496
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
We should definitely aim for comment #2; everything else will be notoriously
slow because store-to-load forwarding (STLF) does not work. So the "win" from
using a 256-bit move will be marginal at best (code size).
As for first building the xmm registers and then merging them, ICC always uses
a series of inserts into the final ymm. Bulldozer/Zen might benefit from the
xmm variant, though.
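
For illustration only (this is not the test case from the bug report): a
minimal C sketch of the source-level pattern named in the bug title, two
128-bit loads from adjacent memory followed by a concatenation, next to a
single 256-bit load of the same bytes. It assumes the GCC/Clang vector
extensions; the type names v2di/v4di and all function names are hypothetical.
Which instructions the compiler actually emits for either variant depends on
the compiler version and flags such as -mavx.

```c
#include <string.h>

/* Assumed helper types (GCC/Clang vector extensions):
   2 x 64-bit and 4 x 64-bit integer vectors. */
typedef long long v2di __attribute__((vector_size(16)));
typedef long long v4di __attribute__((vector_size(32)));

/* Variant A: two 128-bit loads from adjacent memory, then concatenation. */
static inline void load_halves_then_concat(const long long *p, v4di *out)
{
    v2di lo, hi;
    memcpy(&lo, p, sizeof lo);       /* elements p[0], p[1] */
    memcpy(&hi, p + 2, sizeof hi);   /* elements p[2], p[3] */
    *out = (v4di){ lo[0], lo[1], hi[0], hi[1] };
}

/* Variant B: one 256-bit load covering the same 32 bytes. */
static inline void load_whole(const long long *p, v4di *out)
{
    memcpy(out, p, sizeof *out);
}

/* Returns 1 when both variants produce the same 256-bit value. */
int same_result(const long long *p)
{
    v4di a, b;
    load_halves_then_concat(p, &a);
    load_whole(p, &b);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Both variants compute the same value, so the difference is only in the
generated code; comparing the output of "gcc -O2 -mavx -S" for the two
functions shows what a given compiler version does with each form.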