Strange optimization results
Dmitry Gorbachev
d.g.gorbachev@gmail.com
Mon Apr 11 05:25:00 GMT 2011
Ian Lance Taylor wrote:
> I don't think you said what the missed optimization was. You also
> didn't mention which version of gcc you are using, or what target you
> are compiling for.
Testcase #2 shows that the loop-invariant __builtin_memcmp call is not
hoisted out of the loop. This is what GCC 4.7.0 generates for target
i686-pc-linux-gnu with -O3:
main:
        pushl %ebp
        movl _ZL2s1+12, %edx
        pushl %edi
        pushl %esi
        xorl %esi, %esi
        cmpl _ZL2s2+12, %edx
        pushl %ebx
        je .L11
.L2:
        popl %ebx
        movl %esi, %eax
        popl %esi
        popl %edi
        popl %ebp
        ret
.L11:
        movl $18, %ebp
        movl $1, %ebx
.L7:
        movl $_ZL2s1, %esi
        movl %edx, %ecx
        cmpl %edx, %edx
        movl $_ZL2s2, %edi
        repz cmpsb
        movl $0, %esi
        setb %cl
        seta %al
        subb %cl, %al
        movl %ebx, %ecx
        movsbl %al, %eax
        movzbl %cl, %ecx
        testl %eax, %eax
        cmovne %esi, %ebx
        testl %eax, %eax
        cmovne %esi, %ecx
        subl $1, %ebp
        jne .L7
        movl %ecx, %esi
        jmp .L2
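
The testcase itself is not repeated in this message. As a rough,
hypothetical sketch of the pattern in question (the type Str, its
layout, the contents of s1 and s2, and the function names are all
assumptions, not the original testcase), the code below compares two
file-scope objects inside an 18-iteration loop, next to a hand-hoisted
version that computes the invariant comparison once. The layout is
chosen only so that the size member sits at offset 12, which is what
the _ZL2s1+12 loads above suggest on i686.

```cpp
#include <cstddef>

// Hypothetical string-like type; NOT the type from the original testcase.
struct Str {
    char data[12];      // assumed buffer size, puts `size` at offset 12
    std::size_t size;   // number of valid bytes in `data`
};

// Equality with the size check in front of the memcmp.
// __builtin_memcmp is the GCC builtin named in the message.
inline bool operator==(const Str& str1, const Str& str2) {
    return str1.size == str2.size
        && __builtin_memcmp(str1.data, str2.data, str1.size) == 0;
}

// Assumed contents; internal linkage, as the _ZL prefix indicates.
static Str s1 = {"abc", 3};
static Str s2 = {"abc", 3};

// The comparison does not depend on `i`, yet it is written inside the loop.
bool compare_in_loop() {
    bool equal = true;
    for (int i = 0; i < 18; ++i)
        equal &= (s1 == s2);
    return equal;
}

// What one would hope the optimizer produces: compare once, outside the loop.
bool compare_hoisted() {
    const bool same = (s1 == s2);
    bool equal = true;
    for (int i = 0; i < 18; ++i)
        equal &= same;
    return equal;
}
```

Both functions return the same value; the only difference is how many
times the comparison is evaluated if the compiler does not hoist it.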
And this is what happens when the "str1.size == str2.size" check is
removed from operator==:
main:
        movl _ZL2s1+12, %ecx
        movl $18, %edx
        movl $1, %eax
        pushl %edi
        movl $_ZL2s2, %edi
        pushl %esi
        movl $_ZL2s1, %esi
        cmpl %ecx, %ecx
        repz cmpsb
        sete %cl
        movzbl %cl, %ecx
.L2:
        andl %ecx, %eax
        subl $1, %edx
        jne .L2
        movzbl %al, %eax
        popl %esi
        popl %edi
        ret
Here the comparison is done once, before the loop, but the loop itself
is still there. Testcase #1 shows the same thing:
main:
        movl _ZL1n, %ecx
        movl $18, %edx
        movl $1, %eax
.L2:
        andl %ecx, %eax
        subl $1, %edx
        jne .L2
        rep
        ret
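
The remaining loop ANDs the same value into the accumulator 18 times.
Since x & n & n equals x & n, a single AND would be enough. A minimal
sketch of that equivalence (the function names and the unsigned type
are assumptions for illustration, not taken from testcase #1):

```cpp
// Mirrors the shape of the loop above: start from 1 and AND in the
// same loop-invariant value on each of the 18 iterations.
unsigned and_in_loop(unsigned value) {
    unsigned result = 1;
    for (int i = 0; i < 18; ++i)
        result &= value;
    return result;
}

// Equivalent closed form: AND is idempotent, so one application suffices.
unsigned and_once(unsigned value) {
    return 1u & value;
}
```

The two functions agree for every input, so the loop could in
principle be removed entirely.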
I tested GCC versions 4.4.4 and later.
Dmitry