This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/43724] New: GCC produces suboptimal ARM NEON code for zero vector assignment


This affects the vdupq_n_XXX family of intrinsics when called with an argument
of 0.

The code generated is:

        mov     r0, #0
        vdup.32 q8, r0

Instead of the faster

        veor.32 q8, q8, q8

Note that GCC already uses xorps on x86[_64] for SSE when given
_mm_setzero_ps() or _mm_set1_ps(0).


-- 
           Summary: GCC produces suboptimal ARM NEON code for zero vector
                    assignment
           Product: gcc
           Version: 4.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: liranuna at gmail dot com
 GCC build triplet: x86_64-linux-gnu
  GCC host triplet: x86_64-linux-gnu
GCC target triplet: arm-linux-gnueabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43724

