[Bug target/26290] [4.1/4.2 Regression]: code pessimization wrt. GCC 4.0 probably due to TARGET_MEM_REF
rguenth at gcc dot gnu dot org
gcc-bugzilla@gcc.gnu.org
Sun Nov 4 11:45:00 GMT 2007
------- Comment #20 from rguenth at gcc dot gnu dot org 2007-11-04 11:45 -------
With mainline we now get
        .p2align 4,,7
        .p2align 3
.L6:
        addl $1, %eax
        cmpl %eax, %edi
        movl %eax, -20(%ebp)
        jle .L3
        movl %eax, %ecx
        movl %esi, %edx
        .p2align 4,,7
        .p2align 3
.L5:
        movl -4(%esi), %ebx
        movl (%edx), %eax
        cmpl %eax, %ebx
        jle .L4
        movl %eax, -4(%esi)
        movl %ebx, (%edx)
.L4:
        addl $1, %ecx
        addl $4, %edx
        cmpl %ecx, %edi
        jg .L5
.L3:
        movl -20(%ebp), %eax
        addl $4, %esi
        cmpl -16(%ebp), %eax
        jl .L6
which looks good, apart from the issue Andrew pointed out (but that's
PR26726):
lsti_11 = MEM[index: ivtmp.27_14, offset: 0x0fffffffc];
MEM[index: ivtmp.27_14, offset: 0x0fffffffc] = lstj_15;
4.0 is still faster with the original testcase, but the only difference I
can spot is that mainline uses addl $1, %eax while 4.0 uses incl here. Oh,
and 4.0 uses an extra induction variable(!) for the exit test (and less
loop alignment):
.L3:
        incl %eax
        cmpl %eax, 12(%ebp)
        movl %eax, -20(%ebp)
        jle .L4
        movl 12(%ebp), %edi
        movl %esi, %edx
        xorl %ebx, %ebx
        subl %eax, %edi
        .p2align 4,,15
.L6:
        movl -4(%esi), %ecx
        movl (%edx), %eax
        cmpl %eax, %ecx
        jle .L7
        movl %eax, -4(%esi)
        movl %ecx, (%edx)
.L7:
        incl %ebx
        addl $4, %edx
        cmpl %edi, %ebx
        jne .L6
.L4:
        movl -20(%ebp), %eax
        addl $4, %esi
        cmpl -16(%ebp), %eax
        jl .L3
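For readers who prefer C to assembly, here is a rough sketch of the two
inner-loop shapes, read back from the two listings. The original testcase
is not quoted in this comment, so everything here is an assumption made for
illustration: the function names, the int element type, the outer bound of
n - 1, and the exact source form of the loops. It only tries to show the
exit-test difference (j compared against n on mainline, a separate zero-based
counter compared against a precomputed trip count on 4.0).

```c
/* Hypothetical reconstruction; names and bounds are illustrative only. */

/* Mainline shape: the inner loop reuses j for the exit test
   (addl $1, %ecx; cmpl %ecx, %edi; jg .L5). */
void exchange_sort_mainline_shape(int *lst, int n)
{
    for (int i = 0; i < n - 1; i++) {
        int *p = &lst[i + 1];               /* %edx walks lst[j] */
        for (int j = i + 1; j < n; j++, p++) {
            int lsti = lst[i];              /* movl -4(%esi), %ebx */
            int lstj = *p;                  /* movl (%edx), %eax   */
            if (lsti > lstj) {
                lst[i] = lstj;
                *p = lsti;
            }
        }
    }
}

/* 4.0 shape: an extra induction variable k counts from zero up to a
   trip count computed once per outer iteration
   (subl %eax, %edi ... incl %ebx; cmpl %edi, %ebx; jne .L6). */
void exchange_sort_gcc40_shape(int *lst, int n)
{
    for (int i = 0; i < n - 1; i++) {
        int trips = n - (i + 1);            /* always >= 1 here */
        int *p = &lst[i + 1];
        for (int k = 0; k != trips; k++, p++) {
            int lsti = lst[i];
            int lstj = *p;
            if (lsti > lstj) {
                lst[i] = lstj;
                *p = lsti;
            }
        }
    }
}
```

Both functions perform the same compares and swaps in the same order; only
the bookkeeping for the inner exit test differs.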
Using -mtune=core2 on trunk gets back the incl and makes the code faster
than 4.0 (on my Core CPU, that is). So the generic tuning here makes the
difference for trunk.
4.2 is still broken, though. I would say let's close this as fixed.
--
rguenth at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|4.0.4                       |4.0.4 4.3.0
   Last reconfirmed|2006-02-24 15:20:29         |2007-11-04 11:45:07
               date|                            |
            Summary|[4.1/4.2/4.3 Regression]:   |[4.1/4.2 Regression]: code
                   |code pessimization wrt. GCC |pessimization wrt. GCC 4.0
                   |4.0 probably due to         |probably due to
                   |TARGET_MEM_REF              |TARGET_MEM_REF
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26290