This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/43724] New: GCC produces suboptimal ARM NEON code for zero vector assignment
- From: "liranuna at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 12 Apr 2010 05:39:25 -0000
- Subject: [Bug target/43724] New: GCC produces suboptimal ARM NEON code for zero vector assignment
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
The vdupq_n_XXX family of intrinsics produces suboptimal code when called with
an argument of 0.
The generated code is:
mov r0, #0
vdup.32 q8, r0
Instead of the faster
veor.32 q8, q8, q8
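A minimal reproducer might look like the sketch below. The function name
zero_u32x4 is hypothetical and not part of the original report; the non-NEON
branch is only there so the file also builds on a host without NEON. Compiling
it to assembly with an ARM cross compiler (for example with -O2 -S
-march=armv7-a -mfpu=neon -mfloat-abi=softfp, assuming those options suit your
toolchain) should show whether the zero vector is built with mov + vdup.32 or
with a single instruction.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical reproducer: builds a zero vector with the dup intrinsic,
 * the pattern the report says is compiled to mov + vdup.32,
 * and stores it so the result can be inspected. */
void zero_u32x4(uint32_t out[4])
{
    uint32x4_t zero = vdupq_n_u32(0);
    vst1q_u32(out, zero);
}
#else
/* Fallback for hosts without NEON; same observable result. */
void zero_u32x4(uint32_t out[4])
{
    memset(out, 0, 4 * sizeof out[0]);
}
#endif
```

The store to memory is only there to keep the vector live; the instructions of
interest are the ones that materialize the zero register before the store.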
Note that on x86[_64], GCC does use xorps for SSE when given either
_mm_setzero_ps() or _mm_set1_ps(0).
--
Summary: GCC produces suboptimal ARM NEON code for zero vector
assignment
Product: gcc
Version: 4.4.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: liranuna at gmail dot com
GCC build triplet: x86_64-linux-gnu
GCC host triplet: x86_64-linux-gnu
GCC target triplet: arm-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43724