This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/81496] AVX load from adjacent memory location followed by concatenation


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81496

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
We should definitely aim at comment#2; everything else will be notoriously slow
because store-to-load forwarding (STLF) does not work.  So the "win" from using
a 256-bit move will be marginal at best (code size).

As for first building xmm registers and then merging them, ICC always uses a
series of inserts into the final ymm.  Bulldozer/Zen might benefit from the xmm
variant though.
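
To make the STLF concern concrete, here is a minimal sketch in GNU C using
generic vector extensions.  It is not the testcase from this bug and it does
not reproduce comment#2; the type v4di and the functions build_via_memory,
build_via_inserts and check_variants_agree are hypothetical names chosen for
illustration.  The first variant has the shape that tends to defeat
store-to-load forwarding (four 64-bit stores followed by one 256-bit load
spanning them); the second expresses the same value as a vector constructor,
which a compiler may lower to a series of inserts.  The compiler remains free
to choose either instruction sequence for either function, so inspect the
output of -S to see what is actually generated.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 256-bit vector of four 64-bit lanes (GNU C extension). */
typedef int64_t v4di __attribute__((vector_size(32)));

/* Variant A: write four scalars to a temporary array, then copy all
   32 bytes into the vector.  Taken literally, this is four narrow stores
   followed by one wide load covering all of them. */
static void build_via_memory(v4di *out, int64_t a, int64_t b,
                             int64_t c, int64_t d)
{
    int64_t tmp[4] = { a, b, c, d };
    memcpy(out, tmp, sizeof tmp);
}

/* Variant B: build the vector directly from the four scalars.  This lets
   the compiler keep the values in registers and insert them lane by lane. */
static void build_via_inserts(v4di *out, int64_t a, int64_t b,
                              int64_t c, int64_t d)
{
    v4di v = { a, b, c, d };
    *out = v;
}

/* Both variants must produce the same lanes; only the code shape differs. */
static void check_variants_agree(void)
{
    v4di x, y;
    build_via_memory(&x, 10, 20, 30, 40);
    build_via_inserts(&y, 10, 20, 30, 40);
    assert(memcmp(&x, &y, sizeof x) == 0);
}
```

Both functions write through a pointer rather than returning the vector by
value, so the sketch compiles without -mavx and without ABI warnings.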
