This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
- From: "bonzini at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 18 Oct 2005 10:07:25 -0000
- Subject: [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
- References: <bug-19672-8689@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #11 from bonzini at gcc dot gnu dot org 2005-10-18 10:07 -------
The code I get from -mbranch-cost=1 is extremely similar to what 3.2.2
produced:
.L7:
        cmpl    %ebx, %edi
        jae     .L2
        cmpl    %ecx, %esi
        jae     .L2
        decl    %ebx
        movsbl  -1(%ecx), %eax
        decl    %ecx
        movsbl  (%ebx), %edx
        subl    %eax, %edx
        movl    %edx, %eax
        je      .L7
.L2:
The front end is correctly producing a TRUTH_AND_EXPR, i.e. short-circuit
evaluation is not required. RTL expansion will then use either jae
(short-circuited evaluation) or seta, depending on the branch cost.
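
To make the two expansions concrete, here is a rough C sketch of a loop with
the same shape as the assembly above. It is reconstructed by hand from the
listing, not taken from the bug's test case, so the function and parameter
names are hypothetical. The first function writes the condition with &&; the
second evaluates both comparisons unconditionally and combines them with &,
which is roughly what the seta-based expansion does. Because neither
comparison has side effects or can trap, the two forms give the same result,
which is my reading of why short-circuiting is not required here.

```c
#include <assert.h>

/* Hypothetical reconstruction: walk two byte ranges backwards and
 * return the first non-zero difference, or 0 if one range runs out. */
int compare_tails_branchy(const signed char *a_start, const signed char *a_end,
                          const signed char *b_start, const signed char *b_end)
{
    int diff = 0;

    assert(a_start <= a_end && b_start <= b_end);

    /* Two separate tests: corresponds to the cmpl/jae pairs above. */
    while (a_start < a_end && b_start < b_end) {
        diff = *--a_end - *--b_end;
        if (diff != 0)
            break;
    }
    return diff;
}

/* Same loop, but both comparisons are always evaluated and combined
 * into one flag, so the loop condition needs only a single branch. */
int compare_tails_flags(const signed char *a_start, const signed char *a_end,
                        const signed char *b_start, const signed char *b_end)
{
    int diff = 0;

    assert(a_start <= a_end && b_start <= b_end);

    /* Bitwise & on two 0/1 values: no short-circuit, safe because the
     * operands have no side effects. */
    while ((a_start < a_end) & (b_start < b_end)) {
        diff = *--a_end - *--b_end;
        if (diff != 0)
            break;
    }
    return diff;
}
```

Compiling something like this with and without -mbranch-cost=1 (for example
with -O2 -S on an x86 target) should show which of the two forms the RTL
expander picks; the exact output will depend on the compiler version and
tuning options.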
However, the tuning here is wrong, because on a Pentium 4 I do get a 40%
speedup from -mbranch-cost=1.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672