[Bug target/89967] Inefficient code generation for vld2q_lane_u8 under aarch64
tnfchris at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Aug 23 15:57:59 GMT 2023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89967
Tamar Christina <tnfchris at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tnfchris at gcc dot gnu.org
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106106
--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
This is caused by SRA scalarizing the structural registers, i.e. it breaks the
uint8x16x2_t apart into two uint8x16_t values. For use with vld2 we need them
as a whole, so we recreate the type again.
This causes one copy through scalarization and another when the type is
constructed again in RTL. Reload is able to remove one copy but not the other.
The fix for #106106 will also fix this.
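
For illustration, a minimal sketch of the kind of code that exercises this path.
It is not the reproducer from the report: the names update_lane and
update_lane_buf and the LANE constant are hypothetical. On AArch64 the register
pair is passed and returned as one uint8x16x2_t, which is the shape SRA would
split and RTL would have to rebuild. On other targets a scalar fallback models
the same lane update so the sketch still compiles.

```c
#include <stdint.h>

/* The lane index must be a compile-time constant for the intrinsic. */
#define LANE 3

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical example: update one lane in each vector of the pair.
   vld2q_lane_u8 loads p[0] into lane LANE of acc.val[0] and p[1] into
   lane LANE of acc.val[1], leaving every other lane unchanged. */
uint8x16x2_t update_lane(const uint8_t *p, uint8x16x2_t acc)
{
    return vld2q_lane_u8(p, acc, LANE);
}

/* Helper that moves the pair through a 32-byte buffer so the result
   can be inspected from plain C. */
void update_lane_buf(const uint8_t *p, uint8_t buf[32])
{
    uint8x16x2_t acc;
    acc.val[0] = vld1q_u8(buf);
    acc.val[1] = vld1q_u8(buf + 16);
    acc = update_lane(p, acc);
    vst1q_u8(buf, acc.val[0]);
    vst1q_u8(buf + 16, acc.val[1]);
}
#else
/* Scalar model of the same lane update, for non-AArch64 hosts only. */
void update_lane_buf(const uint8_t *p, uint8_t buf[32])
{
    buf[LANE] = p[0];
    buf[16 + LANE] = p[1];
}
#endif
```

Compiling update_lane for AArch64 with something like "gcc -O2 -S" and reading
the assembly is one way to check whether extra register moves surround the ld2
instruction on a given GCC version.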