This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/82136] x86: -mavx256-split-unaligned-load should try to fold other shuffles into the load/vinsertf128
- From: "peter at cordes dot ca" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 08 Sep 2017 02:08:08 +0000
- Auto-submitted: auto-generated
- References: <bug-82136-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82136
--- Comment #1 from Peter Cordes <peter at cordes dot ca> ---
Whoops, the compiler-explorer link had aligned=1. This one produces the asm I
showed in the original report: https://godbolt.org/g/WsZ5S9
See bug 82137 for a much more efficient vectorization strategy gcc should use
instead, with just in-lane shuffle + blend and some duplicated work.