This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/51499] vectorizer missing simple case
- From: "fb.programming at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 11 Dec 2011 08:33:40 +0000
- Subject: [Bug tree-optimization/51499] vectorizer missing simple case
- Auto-submitted: auto-generated
- References: <bug-51499-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499
--- Comment #2 from fb.programming at gmail dot com 2011-12-11 08:33:40 UTC ---
(In reply to comment #1)
g++-4.6.2 -S -Wall -O3 -ftree-vectorize -ftree-vectorizer-verbose=2 \
-ffast-math -fno-vect-cost-model
gives me exactly the same assembly code as above (which surprises me a
bit, as -funsafe-math-optimizations might just as well have eliminated
the loop completely).
However, I would expect the optimal assembly to be something like:
.L3:
        addq    $1, %rax
        addpd   %xmm0, %xmm3
        cmpq    %rdi, %rax
        addpd   %xmm0, %xmm2
        addpd   %xmm0, %xmm1
        jne     .L3
Here the vector (sum1,sum2) is stored in xmm1, (sum3,sum4) in xmm2,
etc., and (a,a) in xmm0. This speeds the loop up by a factor of 2 and
is completely equivalent to the scalar case, since each sum is still
accumulated on its own and in the same order. So I don't see why
-ffast-math (which implies -funsafe-math-optimizations) should be
necessary in this case, either.
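
To make the equivalence argument concrete, here is a minimal sketch in C++.
It is not the test case from the bug report: the six-accumulator loop shape
is only inferred from the three addpd instructions above, and the names
Sums, accumulate_scalar and accumulate_packed are hypothetical. The packed
version uses the GCC/Clang vector_size extension, so it assumes one of those
compilers. Each lane performs the same sequence of double additions as the
matching scalar sum, so no reassociation is involved.

```cpp
#include <cstddef>

// GCC/Clang vector extension: two doubles packed into one 16-byte value.
// (Assumption: compiled with g++ or clang++; this is not standard C++.)
typedef double v2df __attribute__((vector_size(16)));

// Hypothetical container for the six sums (sum1..sum6 in the post).
struct Sums { double s[6]; };

// Scalar reference: six independent accumulators, each adding a per iteration.
Sums accumulate_scalar(double a, std::size_t n) {
    Sums r = {{0.0, 0.0, 0.0, 0.0, 0.0, 0.0}};
    for (std::size_t i = 0; i < n; ++i) {
        for (int k = 0; k < 6; ++k) {
            r.s[k] += a;
        }
    }
    return r;
}

// Packed variant: (sum1,sum2), (sum3,sum4), (sum5,sum6) each share one
// vector, and (a,a) is added lane-wise, mirroring the three addpd
// instructions in the expected loop body.
Sums accumulate_packed(double a, std::size_t n) {
    v2df va = {a, a};
    v2df p1 = {0.0, 0.0};
    v2df p2 = {0.0, 0.0};
    v2df p3 = {0.0, 0.0};
    for (std::size_t i = 0; i < n; ++i) {
        p1 += va;  // lane 0: sum1 += a, lane 1: sum2 += a
        p2 += va;  // lane 0: sum3 += a, lane 1: sum4 += a
        p3 += va;  // lane 0: sum5 += a, lane 1: sum6 += a
    }
    Sums r = {{p1[0], p1[1], p2[0], p2[1], p3[0], p3[1]}};
    return r;
}
```

On a target where scalar double arithmetic is plain IEEE double precision
(for example x86-64 with SSE2), both functions should return identical
values for the same a and n, without -ffast-math.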