Bug 43992 - Suboptimal x86 pre/postamble emitted
Status: RESOLVED DUPLICATE of bug 42778
Alias: None
Product: gcc
Classification: Unclassified
Component: other
Version: 4.5.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2010-05-05 14:38 UTC by Hugo van der Sanden
Modified: 2010-05-05 15:44 UTC
CC List: 3 users



Attachments
Example C source code (176 bytes, text/plain)
2010-05-05 14:39 UTC, Hugo van der Sanden

Description Hugo van der Sanden 2010-05-05 14:38:25 UTC
zen% /opt/gcc-4.5.0/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/opt/gcc-4.5.0/bin/gcc
COLLECT_LTO_WRAPPER=/opt/gcc-4.5.0/libexec/gcc/i686-pc-linux-gnu/4.5.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: /src/package/lang/other/gcc-4.5.0/configure --prefix=/opt/gcc-4.5.0 --with-gmp=/opt/gmp-4.2.2 --with-mpfr=/opt/mpfr-2.4.1 --with-mpc=/opt/mpc-0.8.1
Thread model: posix
gcc version 4.5.0 (GCC) 
zen% 

With an appropriate LD_LIBRARY_PATH set, compiling the following C code with "/opt/gcc-4.5.0/bin/gcc -o gcctest.s -S -O3 gcctest.c" produces assembly that includes unnecessary stack manipulation instructions. I suspect the x86-specific preamble/postamble generation is being confused by the tail-call optimization.

zen% cat gcctest.c
typedef int (f2_t)(int pi1, int pi2);
typedef int (f3_t)(f2_t* pf2, int pi1, int pi2);
typedef struct { f2_t* sf2; f3_t* sf3; } cmp_t;
int* gip;
int f2(int pi1, int pi2) { return 0; }
int f3(f2_t* pf2, int pi1, int pi2) {
    return pf2(gip[pi1], gip[pi2]);
}
int main(void) {
    cmp_t ct = { &f2, &f3 };
    return ct.sf3(ct.sf2, 0, 0);
}
zen% 

The assembly emitted for f3() is:
    .p2align 4,,15
.globl f3   
    .type   f3, @function
f3:
    pushl   %ebp
    movl    %esp, %ebp
    pushl   %ebx
    subl    $4, %esp  # this is not needed
    movl    gip, %eax
    movl    16(%ebp), %ebx
    movl    12(%ebp), %ecx
    movl    8(%ebp), %edx
    movl    (%eax,%ebx,4), %ebx
    movl    %ebx, 12(%ebp)
    movl    (%eax,%ecx,4), %eax
    movl    %eax, 8(%ebp)
    addl    $4, %esp  # this is not needed
    popl    %ebx
    popl    %ebp
    jmp     *%edx
    .size   f3, .-f3

I spotted this initially with gcc-4.4.3, and have just verified that the same code is emitted for 4.5.0.

Since similar code without a tail call elides the stack manipulation entirely, this could arguably be classed as a bug. However, the generated code is not incorrect.
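
For comparison, a non-tail-call variant might look like the sketch below. This is an assumption made for illustration, not the code the comparison above was based on: the report does not show that variant or its assembly, and the names f3_tail, f3_notail and add2 are hypothetical. The only point is that doing any work after the indirect call (here, adding 1) means the call is no longer in tail position.

```c
typedef int (f2_t)(int pi1, int pi2);

int *gip;

/* Tail-call form, as in the report: the callee's result is returned
   unchanged, so the compiler may turn the call into a jump. */
int f3_tail(f2_t *pf2, int pi1, int pi2) {
    return pf2(gip[pi1], gip[pi2]);
}

/* Hypothetical non-tail variant: the addition happens after the call
   returns, so the call cannot be replaced by a jump. */
int f3_notail(f2_t *pf2, int pi1, int pi2) {
    return pf2(gip[pi1], gip[pi2]) + 1;
}

/* Hypothetical callee used only to exercise the two functions. */
int add2(int a, int b) {
    return a + b;
}
```

Compiling both functions with the same "-S -O3" command and comparing the two prologues and epilogues should show whether the extra subl/addl pair appears only in the tail-call form.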
Comment 1 Hugo van der Sanden 2010-05-05 14:39:37 UTC
Created attachment 20562
Example C source code
Comment 2 H.J. Lu 2010-05-05 15:44:27 UTC

*** This bug has been marked as a duplicate of 42778 ***