[Bug target/97366] [8/9/10/11 Regression] Redundant load with SSE/AVX vector intrinsics

Mon Oct 12 11:04:00 GMT 2020

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97366

--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
AFAICT LRA is just following IRA's decisions, and IRA allocates that pseudo to
memory due to costs.

Not sure where the strange cost is coming from, but it depends on x86 tuning
options: with -mtune=skylake we get the expected code; with -mtune=haswell we
get 128-bit vectors right but an extra load for 256-bit; with -mtune=generic
both cases have extra loads.
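
For anyone wanting to repeat the comparison, here is a minimal sketch of the
kind of function involved. It is a hypothetical stand-in, not the testcase
attached to this bug: the function name and the choice of operations are
assumptions, and whether this exact function shows the extra load has not been
verified. The only property it is meant to illustrate is a single vector load
whose result is used twice, so that the loaded value has to live in a pseudo.

```c
#include <immintrin.h>

/* Hypothetical example, not the bug's testcase: the loaded vector v is
   used twice, so the compiler must either keep it in a register or
   (the undesirable outcome) read it from memory a second time.  */
__m128i add_self_swapped(const __m128i *p)
{
    __m128i v = _mm_loadu_si128(p);

    /* 0x4E swaps the two 64-bit halves: {v2, v3, v0, v1}.  */
    __m128i swapped = _mm_shuffle_epi32(v, 0x4E);

    return _mm_add_epi32(v, swapped);
}
```

Compiling this with -O2 -S and each of -mtune=skylake, -mtune=haswell and
-mtune=generic, then diffing the assembly, is one way to look for a second
load; adding -fdump-rtl-ira should dump what IRA decided for the pseudo.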