This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/67577] Trivial float-vectorization foiled by a loop
- From: "pinskia at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 18 Dec 2015 01:20:11 +0000
- Subject: [Bug tree-optimization/67577] Trivial float-vectorization foiled by a loop
- Auto-submitted: auto-generated
- References: <bug-67577-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67577
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |tree-optimization
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For aarch64-linux-gnu on the trunk (GCC 6), we are able to produce the
vectorized code correctly:
        adrp    x1, .LANCHOR0
        add     x0, x1, :lo12:.LANCHOR0
        ldr     q0, [x1, #:lo12:.LANCHOR0]
        ldr     q1, [x0, 16]
        ldr     q4, [x0, 64]
        ldr     q3, [x0, 48]
        ldr     s2, [x0, 32]
        fsub    v4.4s, v4.4s, v1.4s
        fsub    v3.4s, v3.4s, v0.4s
        dup     v2.4s, v2.s[0]
        fmla    v1.4s, v2.4s, v4.4s
        fmla    v0.4s, v2.4s, v3.4s
        str     q1, [x0, 96]
        str     q0, [x0, 80]
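
For readers without the test case at hand, the sequence above reads as a
linear interpolation over eight floats: two 4-lane loads per input, one
scalar broadcast with dup, then fsub followed by fmla. The C below is only a
sketch reconstructed from that assembly, not the test case attached to the
bug. The names a, b, t, out, N and lerp8 are hypothetical, the element count
of 8 is inferred from the two q-register loads per array, and the order of
the globals in memory need not match the offsets shown above.

```c
#include <stddef.h>

/* Hypothetical reconstruction from the assembly; none of these names
   come from the bug report. N = 8 is inferred from 2 x 4 float lanes. */
enum { N = 8 };

float a[N];    /* first input, loaded into v0/v1 in the assembly */
float b[N];    /* second input, loaded into v3/v4 */
float t;       /* scalar factor, loaded into s2 and broadcast with dup */
float out[N];  /* result, written by the two str instructions */

/* out[i] = a[i] + t * (b[i] - a[i])
   fsub computes (b - a); fmla accumulates t * (b - a) onto a. */
void lerp8(void)
{
    for (size_t i = 0; i < N; i++)
        out[i] = a[i] + t * (b[i] - a[i]);
}
```

Compiling a loop of this shape with vectorization enabled (for example at
-O3) and inspecting the -S output is one way to compare a given compiler
against the code shown above. Whether the loop is fully unrolled as it is
here depends on the GCC version and target.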