This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/40648] misaligned store vectorizer patch introduced 10% runtime regression on Polyhedron test_fpu
- From: "rguenth at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 4 Jul 2009 12:36:36 -0000
- Subject: [Bug target/40648] misaligned store vectorizer patch introduced 10% runtime regression on Polyhedron test_fpu
- References: <bug-40648-1649@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #3 from rguenth at gcc dot gnu dot org 2009-07-04 12:36 -------
Tuned for Core2, I get this for the innermost loop:
.L19:
        leal    (%eax,%ebx), %edx
        movsd   (%eax,%ecx), %xmm1
        movsd   (%edx), %xmm7
        movhpd  8(%eax,%ecx), %xmm1
        movhpd  8(%edx), %xmm7
        movapd  %xmm1, %xmm0
        incl    %esi
        mulpd   %xmm3, %xmm0
        addl    $16, %eax
        addpd   %xmm7, %xmm0
        cmpl    %edi, %esi
        movlpd  %xmm0, (%edx)
        movhpd  %xmm0, 8(%edx)
        jb      .L19
which is slower than the code generated with vectorization disabled
(is that what happened before the patch?).
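
For readers who do not want to decode the listing, here is a rough C
reading of what one trip through .L19 computes. It is only a sketch
inferred from the assembly, not the test_fpu source (which is Fortran).
The names axpy_pairs, x, y, a and npairs are made up for illustration,
and it assumes %xmm3 holds the same loop-invariant factor in both lanes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reading of loop .L19: y[i] += a * x[i], two doubles per trip.
   All names here are illustrative; none of them come from test_fpu. */
void axpy_pairs(double *y, const double *x, double a, size_t npairs)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < npairs; i++) {
        /* movsd + movhpd: each 16-byte operand is loaded as two 8-byte halves */
        double x_lo = x[2 * i], x_hi = x[2 * i + 1];
        double y_lo = y[2 * i], y_hi = y[2 * i + 1];

        /* mulpd %xmm3, %xmm0 followed by addpd %xmm7, %xmm0 */
        double r_lo = x_lo * a + y_lo;
        double r_hi = x_hi * a + y_hi;

        /* movlpd + movhpd: the result is stored as two 8-byte halves */
        y[2 * i]     = r_lo;
        y[2 * i + 1] = r_hi;
    }
}
```

The point of the sketch is the ratio of work to memory traffic: each
trip does one packed multiply and one packed add, but needs four split
loads and two split stores because neither operand is known to be
16-byte aligned.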
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40648