Strange optimization results

Dmitry Gorbachev d.g.gorbachev@gmail.com
Mon Apr 11 05:25:00 GMT 2011


Ian Lance Taylor wrote:

> I don't think you said what the missed optimization was.  You also
> didn't mention which version of gcc you are using, or what target you
> are compiling for.

Testcase #2 shows that the loop-invariant __builtin_memcmp call is not
hoisted out of the loop. This is what GCC 4.7.0 generates for target
i686-pc-linux-gnu with -O3:

main:
        pushl   %ebp
        movl    _ZL2s1+12, %edx
        pushl   %edi
        pushl   %esi
        xorl    %esi, %esi
        cmpl    _ZL2s2+12, %edx
        pushl   %ebx
        je      .L11
.L2:
        popl    %ebx
        movl    %esi, %eax
        popl    %esi
        popl    %edi
        popl    %ebp
        ret
.L11:
        movl    $18, %ebp
        movl    $1, %ebx
.L7:
        movl    $_ZL2s1, %esi
        movl    %edx, %ecx
        cmpl    %edx, %edx
        movl    $_ZL2s2, %edi
        repz cmpsb
        movl    $0, %esi
        setb    %cl
        seta    %al
        subb    %cl, %al
        movl    %ebx, %ecx
        movsbl  %al, %eax
        movzbl  %cl, %ecx
        testl   %eax, %eax
        cmovne  %esi, %ebx
        testl   %eax, %eax
        cmovne  %esi, %ecx
        subl    $1, %ebp
        jne     .L7
        movl    %ecx, %esi
        jmp     .L2
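
For readers without the testcase at hand, here is a rough sketch of the kind of code that could produce a listing like the one above. It is not the actual testcase #2: the struct layout (a 12-byte buffer followed by the size, guessed from the _ZL2s1+12 offset), the names Str, all_equal_loop and all_equal_hoisted, and the use of a run-time iteration count are all assumptions made for illustration. The second function shows, by hand, the hoisting one would hope the optimizer performs.

```cpp
#include <cstddef>

// Hypothetical string type: 12 bytes of data followed by the length.
struct Str {
    char data[12];
    std::size_t size;
};

// Equality with the size check in front of the memory comparison.
inline bool operator==(const Str& str1, const Str& str2) {
    return str1.size == str2.size
        && __builtin_memcmp(str1.data, str2.data, str1.size) == 0;
}

// The comparison does not depend on i, so it is loop-invariant.
bool all_equal_loop(const Str& a, const Str& b, int iters) {
    bool r = true;
    for (int i = 0; i < iters; ++i)
        r &= (a == b);
    return r;
}

// Hand-hoisted equivalent: compare once, no loop needed.
bool all_equal_hoisted(const Str& a, const Str& b, int iters) {
    const bool eq = (a == b);
    return iters > 0 ? eq : true;
}
```

Both functions return the same value for any inputs; the difference is only in how much work is repeated per iteration.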

Here is what happens when the "str1.size == str2.size" check is removed
from operator==:

main:
        movl    _ZL2s1+12, %ecx
        movl    $18, %edx
        movl    $1, %eax
        pushl   %edi
        movl    $_ZL2s2, %edi
        pushl   %esi
        movl    $_ZL2s1, %esi
        cmpl    %ecx, %ecx
        repz cmpsb
        sete    %cl
        movzbl  %cl, %ecx
.L2:
        andl    %ecx, %eax
        subl    $1, %edx
        jne     .L2
        movzbl  %al, %eax
        popl    %esi
        popl    %edi
        ret

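The comparison is now done once, before the loop, but the loop itself is
still there, even though it only ANDs a loop-invariant value into the
result. Testcase #1 shows the same thing: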

main:
        movl    _ZL1n, %ecx
        movl    $18, %edx
        movl    $1, %eax
.L2:
        andl    %ecx, %eax
        subl    $1, %edx
        jne     .L2
        rep
        ret
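
As a hedged illustration of what the listing above computes (again, not the actual testcase #1; the function names and the run-time iteration count are assumptions), the loop ANDs the same value into an accumulator on every pass. Since x & n & n equals x & n, any positive number of iterations gives the same result as a single AND, so the whole loop could in principle be folded away:

```cpp
// Repeatedly AND a loop-invariant value into the accumulator,
// mirroring the andl/subl/jne loop in the listing.
int and_loop(int n, int iters) {
    int r = 1;
    for (int i = 0; i < iters; ++i)
        r &= n;
    return r;
}

// Folded equivalent: AND is idempotent, so one application is enough
// whenever the loop runs at least once.
int and_folded(int n, int iters) {
    return iters > 0 ? (1 & n) : 1;
}
```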

I tested GCC 4.4.4 and later versions.

Dmitry


