This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/23305] [4.0/4.1/4.2/4.3 Regression] Inlining related regression for gcc-4.x
- From: "jakub at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 22 Nov 2007 16:41:09 -0000
- Subject: [Bug tree-optimization/23305] [4.0/4.1/4.2/4.3 Regression] Inlining related regression for gcc-4.x
- References: <bug-23305-10914@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #9 from jakub at gcc dot gnu dot org 2007-11-22 16:41 -------
On x86_64-linux with -m64 -O2, gcc doesn't hoist the movabsq insns out of the
loops; hoisting them gives some performance back:
time ./pr23305-slow
real 0m4.028s
user 0m4.023s
sys 0m0.003s
time ./pr23305-slow2
real 0m3.436s
user 0m3.434s
sys 0m0.001s
when I hoist it by hand in assembly:
--- pr23305-slow.s 2007-11-22 17:14:09.000000000 +0100
+++ pr23305-slow2.s 2007-11-22 17:31:31.000000000 +0100
@@ -222,16 +222,16 @@ _Z13s000005a_testv:
.LVL2:
.LBB329:
.LBB330:
.loc 1 28697 0
cmpq %rax, %rdx
je .L13
+ movabsq $4613937818241073152, %r8
.p2align 4,,10
.p2align 3
.L14:
- movabsq $4613937818241073152, %r8
movq %r8, (%rax)
addq $8, %rax
cmpq %rax, %rdx
jne .L14
.L13:
.LBE330:
@@ -242,17 +242,17 @@ _Z13s000005a_testv:
.LVL3:
.LBB326:
.LBB327:
.loc 1 28697 0
cmpq %rax, %rdx
je .L15
+ movabsq $4613937818241073152, %rdi
.p2align 4,,10
.p2align 3
.L16:
.LBE327:
- movabsq $4613937818241073152, %rdi
movq %rdi, (%rax)
.LBB328:
addq $8, %rax
cmpq %rax, %rdx
jne .L16
.L15:
but the -O2 -fno-inline-small-functions version is still much faster, roughly
2.2x faster than even the hand-hoisted binary:
time ./pr23305-fast
real 0m1.591s
user 0m1.588s
sys 0m0.001s
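
For readers who don't read x86_64 assembly, here is a rough C++ sketch of what the hand edit in the diff amounts to. It is not the pr23305 test case: the function names are hypothetical and the loop shape is inferred only from the assembly above (store an 8-byte constant, advance the pointer by 8, compare against the end pointer). The immediate 4613937818241073152 is 0x4008000000000000, the IEEE-754 bit pattern of the double 3.0, which bits_of() below checks. Writing the constant before the loop in source does not force any particular code generation; the compiler may still rematerialize it inside the loop, which is the behaviour reported here.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the raw 64-bit pattern of a double.
std::uint64_t bits_of(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);  // well-defined way to inspect the bits
    return u;
}

// Hypothetical stand-in for the inlined fill loop as written in source:
// the constant appears inside the loop body.
void fill_with_three(double *first, double *last) {
    for (; first != last; ++first)
        *first = 3.0;
}

// The same loop with the loop-invariant constant hoisted by hand,
// mirroring the edited assembly: test for an empty range, load the
// constant once, then only store and advance inside the loop.
void fill_with_three_hoisted(double *first, double *last) {
    if (first == last)          // cmpq %rax, %rdx ; je .L13
        return;
    const double three = 3.0;   // movabsq $4613937818241073152, %r8 (once)
    do {
        *first = three;         // movq %r8, (%rax)
        ++first;                // addq $8, %rax
    } while (first != last);    // cmpq %rax, %rdx ; jne .L14
}
```

Both functions store the same values; the difference is only where the constant is materialized, so any timing gap between them depends on what the compiler actually emits.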
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23305