This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/53346] [4.6/4.7/4.8 Regression] Bad vectorization in the proc cptrf2 of rnflow.f90
- From: "ubizjak at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 18 May 2012 18:24:43 +0000
- Subject: [Bug tree-optimization/53346] [4.6/4.7/4.8 Regression] Bad vectorization in the proc cptrf2 of rnflow.f90
- Auto-submitted: auto-generated
- References: <bug-53346-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53346
--- Comment #16 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-18 18:24:43 UTC ---
Perf confirms these findings. The first loop:
0.02 : 401e10: movslq %edx,%rbx
5.04 : 401e13: movss -0x4(%rdi,%rbx,4),%xmm0
24.97 : 401e19: ucomiss (%r9),%xmm0
14.66 : 401e1d: cmova %ecx,%edx
15.37 : 401e20: sub $0x1,%ecx
0.00 : 401e23: sub $0x4,%r9
0.00 : 401e27: cmp %r10d,%ecx
0.00 : 401e2a: jne 401e10 <cptrf2_+0x230>
the second:
0.00 : 401e60: movslq %ecx,%r10
1.69 : 401e63: movss -0x4(%rdi,%r10,4),%xmm0
7.78 : 401e6a: ucomiss (%r9),%xmm0
4.75 : 401e6e: cmova %r11d,%ecx
4.52 : 401e72: sub $0x1,%r11d
0.00 : 401e76: sub $0x4,%r9
0.05 : 401e7a: cmp %eax,%r11d
0.00 : 401e7d: jne 401e60 <cptrf2_+0x280>
the third:
0.00 : 401ff8: movslq %edx,%r10
0.78 : 401ffb: movss -0x4(%rdi,%r10,4),%xmm0
3.14 : 402002: ucomiss (%r9),%xmm0
2.04 : 402006: cmova %ecx,%edx
1.89 : 402009: sub $0x4,%r9
0.00 : 40200d: sub $0x1,%ecx
0.00 : 402010: jne 401ff8 <cptrf2_+0x418>
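
For readers who do not want to decode the listings, here is a rough C sketch of what each of the three scalar loops appears to compute, based only on a reading of the disassembly above and not on the Fortran source. The function name, parameter names, and the assumption that %r9 points at element i of the same array addressed through %rdi are all hypothetical. Indices are 1-based, as in Fortran, and the loop runs downwards from i to stop + 1.

```c
#include <assert.h>

/* Hypothetical reconstruction of the scalar loop shape seen in the
 * perf output. `a` is indexed 1-based (element k lives at a[k - 1]).
 * Assumption: the pointer in %r9 walks down the same array as %rdi. */
int scan_down_min_index(const float *a, int best, int i, int stop)
{
    assert(best >= 1 && stop >= 0 && i > stop);
    do {
        /* movss -0x4(%rdi,%rbx,4),%xmm0 ; ucomiss (%r9),%xmm0 */
        /* cmova: update only if a(best) is strictly greater than a(i) */
        if (a[best - 1] > a[i - 1])
            best = i;
        --i;                     /* sub $0x1,%ecx */
    } while (i != stop);         /* cmp / jne */
    return best;
}
```

Under this reading, the load of a(best) at the top of each iteration depends on the cmova result of the previous iteration, so the iterations form a serial dependency chain through the index register. A call such as scan_down_min_index(a, n, n, 0) would scan the whole array from the top and return the index of a minimum element, keeping the higher index on ties.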