[Bug target/95265] aarch64: suboptimal code generation for common neon intrinsic sequence involving shrn and mull

ktkachov at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Feb 10 12:36:14 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95265

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
      Known to work|                            |11.0
   Target Milestone|---                         |11.0
             Status|NEW                         |RESOLVED
                 CC|                            |ktkachov at gcc dot gnu.org

--- Comment #2 from ktkachov at gcc dot gnu.org ---
This is fixed in GCC 11. The compiler now generates:
func:
        smull   v2.2d, v0.2s, v1.2s
        smull2  v1.2d, v0.4s, v1.4s
        shrn    v0.2s, v2.2d, 12
        shrn2   v0.4s, v1.2d, 12
        ret
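
For readers without the original report to hand, the following is a rough sketch of the kind of source that maps onto the assembly above. It is an assumption, not the test case from the bug: the function name `func` is taken from the assembly label, while the shift amount of 12 is read off the `shrn` operands and the exact intrinsic sequence is guessed from the bug title. The scalar helper `mull_shrn_ref` is a hypothetical reference model added purely for illustration: each lane is a widening signed 32x32->64 multiply followed by a truncating narrow of the product shifted right by 12.

```c
#include <stdint.h>

/* Hypothetical scalar model of what the smull/shrn pairs compute per lane.
   Not part of GCC or the bug report. */
void mull_shrn_ref(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        /* smull/smull2: widening signed multiply; cannot overflow int64_t. */
        int64_t prod = (int64_t)a[i] * (int64_t)b[i];
        /* shrn/shrn2 #12: keep bits 12..43 of the product (truncating narrow).
           Shifting as unsigned avoids implementation-defined behaviour. */
        out[i] = (int32_t)(uint32_t)((uint64_t)prod >> 12);
    }
}

#if defined(__aarch64__)
#include <arm_neon.h>

/* Assumed shape of the intrinsic sequence discussed in the bug. */
int32x4_t func(int32x4_t a, int32x4_t b)
{
    /* Low and high halves multiplied separately, widening to 64 bits. */
    int64x2_t lo = vmull_s32(vget_low_s32(a), vget_low_s32(b));
    int64x2_t hi = vmull_s32(vget_high_s32(a), vget_high_s32(b));
    /* Narrow each half back to 32 bits and recombine. */
    return vcombine_s32(vshrn_n_s64(lo, 12), vshrn_n_s64(hi, 12));
}
#endif
```

Compiling the guarded part with an AArch64 GCC 11 or later at -O2 should, if the sketch matches the reported pattern, yield the four-instruction sequence shown above, with the high-half multiply and narrow folded into `smull2` and `shrn2`.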

