[Bug target/95265] aarch64: suboptimal code generation for common neon intrinsic sequence involving shrn and mull
ktkachov at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Feb 10 12:36:14 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95265
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
      Known to work|                            |11.0
   Target Milestone|---                         |11.0
             Status|NEW                         |RESOLVED
                 CC|                            |ktkachov at gcc dot gnu.org

--- Comment #2 from ktkachov at gcc dot gnu.org ---
This is fixed in GCC 11. It now generates:
func:
        smull   v2.2d, v0.2s, v1.2s
        smull2  v1.2d, v0.4s, v1.4s
        shrn    v0.2s, v2.2d, 12
        shrn2   v0.4s, v1.2d, 12
        ret
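
For readers less familiar with these instructions, here is a rough scalar
sketch in C of what the four-lane sequence above computes, going only by the
instruction semantics: smull/smull2 widen-multiply the low and high pairs of
32-bit lanes into 64-bit products, and shrn/shrn2 shift each product right by
12 and keep the low 32 bits. The function name mull_shrn12_ref is made up for
illustration and does not come from the bug report or its test case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference for the smull/smull2 + shrn/shrn2 sequence.
   Not the testcase from the bug report; an illustration only. */
static void mull_shrn12_ref(const int32_t a[4], const int32_t b[4],
                            int32_t out[4])
{
    assert(a && b && out);

    for (int i = 0; i < 4; i++) {
        /* smull (lanes 0-1) / smull2 (lanes 2-3): signed 32x32 -> 64 multiply */
        int64_t wide = (int64_t)a[i] * (int64_t)b[i];

        /* shrn / shrn2: shift right by 12 and truncate to 32 bits.
           Shifting the unsigned bit pattern is enough here, because the
           bits that differ from an arithmetic shift are discarded by the
           narrowing step. */
        uint64_t shifted = (uint64_t)wide >> 12;

        /* Assumes the usual two's complement wraparound when converting
           uint32_t to int32_t, as on GCC targets. */
        out[i] = (int32_t)(uint32_t)shifted;
    }
}
```

Each output lane holds bits 12..43 of the corresponding 64-bit product, so
the result matches a Q12 fixed-point multiply as long as the shifted product
fits in 32 bits.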