This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code

From: "bonzini at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: 18 Oct 2005 10:07:25 -0000
Subject: [Bug target/19672] [3.4/4.0/4.1 Regression] Performance regression in simple loop code
References: <bug-19672-8689@http.gcc.gnu.org/bugzilla/>
Reply-to: gcc-bugzilla at gcc dot gnu dot org


------- Comment #11 from bonzini at gcc dot gnu dot org  2005-10-18 10:07 -------
The code I get from -mbranch-cost=1 is extremely similar to what 3.2.2
produced:

.L7:
        cmpl    %ebx, %edi
        jae     .L2
        cmpl    %ecx, %esi
        jae     .L2
        decl    %ebx
        movsbl  -1(%ecx),%eax
        decl    %ecx
        movsbl  (%ebx),%edx
        subl    %eax, %edx
        movl    %edx, %eax
        je      .L7
.L2:

The front end is correctly producing a TRUTH_AND_EXPR rather than a
TRUTH_ANDIF_EXPR, i.e. short-circuit evaluation is not required.  RTL
expansion will then use either jae (short-circuited evaluation, as in the
code above) or seta, depending on the branch cost.
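
To make the two expansion strategies concrete, here is a rough C sketch.
The loop is only a guess reconstructed from the assembly above, not the
test case from the bug report, and the function and parameter names
(cmp_back_branchy, cmp_back_flags, lo1, p1, lo2, p2) are made up for
illustration.  The first function tests the two bounds with one branch
each, as the jae/jae sequence does; the second evaluates both comparisons
to 0/1 values and combines them before a single branch, which is roughly
what a seta-style expansion computes.  What the compiler actually emits
for either form depends on the GCC version and flags.

```c
#include <stddef.h>

/* Hypothetical loop: walk two buffers backwards while the bytes match.
   Each bound check is its own conditional branch (short-circuited). */
static int cmp_back_branchy(const signed char *lo1, const signed char *p1,
                            const signed char *lo2, const signed char *p2)
{
    int diff = 0;
    while (diff == 0) {
        if (!(lo1 < p1))        /* corresponds to the first jae */
            break;
        if (!(lo2 < p2))        /* corresponds to the second jae */
            break;
        --p1;
        --p2;
        diff = *p1 - *p2;       /* sign-extended byte difference */
    }
    return diff;
}

/* Same loop, but both comparisons are evaluated unconditionally and
   combined with a bitwise AND, leaving one branch on the result. */
static int cmp_back_flags(const signed char *lo1, const signed char *p1,
                          const signed char *lo2, const signed char *p2)
{
    int diff = 0;
    while (diff == 0) {
        int in1 = lo1 < p1;     /* 0 or 1, no branch required */
        int in2 = lo2 < p2;     /* 0 or 1, no branch required */
        if (!(in1 & in2))
            break;
        --p1;
        --p2;
        diff = *p1 - *p2;
    }
    return diff;
}
```

Both functions return the same value for the same inputs.  This is valid
only because neither comparison has side effects, which is the property
that lets the front end choose TRUTH_AND_EXPR in the first place.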

However, the tuning here is wrong: on a Pentium 4 I get a 40% speedup
from -mbranch-cost=1.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19672
