Incorrect optimization of inlined functions?
miguell@teleline.es
Sun Oct 24 04:21:00 GMT 1999
I don't know whether this is really a bug or just an optimization that has
yet to be implemented, but it contradicts the documentation ("An Inline
Function is As Fast As a Macro").
First of all, I'm using RedHat 6.0 on a Pentium MMX. GCC is gcc-19991004,
which I've just downloaded and compiled. I've also tried gcc-2.7.2.3, with
similar results.
------------------------------------------------------------------------
$ gcc -v
Reading specs from /usr/local/lib/gcc-lib/i586-pc-linux-gnu/2.96/specs
gcc version 2.96 19991004 (experimental)
------------------------------------------------------------------------
Now, the problem. In the following program "test.c" I define an inlined
function, f1, which is called by a second function f2. I also define
another function, f3, which is the same as f2 except that the call to f1
has been replaced by its code:
------------------------------------------------------------------------
#include "stdio.h"

static inline char *f1(char *s) {
    return (s && (*s)) ? s : 0;
}

void f2(char *s) {
    if (f1(s))
        *s = 1;
}

void f3(char *s) {
    if ((s && (*s)) ? s : 0)
        *s = 1;
}
------------------------------------------------------------------------
And the output of objdump is:
------------------------------------------------------------------------
$ gcc -W -Wall -O2 -o test.o -c test.c
$ objdump -dr test.o
test.o:     file format elf32-i386

Disassembly of section .text:

00000000 <f2>:
   0:   55                      pushl  %ebp
   1:   b8 00 00 00 00          movl   $0x0,%eax
   6:   89 e5                   movl   %esp,%ebp
   8:   8b 55 08                movl   0x8(%ebp),%edx
   b:   85 d2                   testl  %edx,%edx
   d:   74 0e                   je     1d <f2+0x1d>
   f:   80 3a 00                cmpb   $0x0,(%edx)
  12:   0f 94 c0                sete   %al
  15:   25 ff 00 00 00          andl   $0xff,%eax
  1a:   48                      decl   %eax
  1b:   21 d0                   andl   %edx,%eax
  1d:   85 c0                   testl  %eax,%eax
  1f:   74 03                   je     24 <f2+0x24>
  21:   c6 02 01                movb   $0x1,(%edx)
  24:   89 ec                   movl   %ebp,%esp
  26:   5d                      popl   %ebp
  27:   c3                      ret
  28:   90                      nop
  29:   8d b4 26 00 00 00 00    leal   0x0(%esi,1),%esi

00000030 <f3>:
  30:   55                      pushl  %ebp
  31:   89 e5                   movl   %esp,%ebp
  33:   8b 45 08                movl   0x8(%ebp),%eax
  36:   85 c0                   testl  %eax,%eax
  38:   74 08                   je     42 <f3+0x12>
  3a:   80 38 00                cmpb   $0x0,(%eax)
  3d:   74 03                   je     42 <f3+0x12>
  3f:   c6 00 01                movb   $0x1,(%eax)
  42:   89 ec                   movl   %ebp,%esp
  44:   5d                      popl   %ebp
  45:   c3                      ret
------------------------------------------------------------------------
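As you can see, f3 is noticeably shorter and better optimized than f2: f3
tests and branches directly, while f2 first builds the return value of f1
in %eax (sete/andl/decl/andl) and only then tests it. Shouldn't both
functions generate similar code?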
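
To make the comparison with a macro concrete, and to confirm that the functions are equivalent at the source level, here is a minimal sketch. The macro F1, the function f4, and the helpers run_on and check_equivalence are hypothetical names added for illustration; they are not part of test.c. Since the preprocessor expands F1(s) to the same expression that f3 spells out by hand, I would expect f4 to compile to the same code as f3, though I have not included objdump output for it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro counterpart of f1 (not in the original test.c). */
#define F1(s) (((s) && *(s)) ? (s) : 0)

static inline char *f1(char *s) {
    return (s && (*s)) ? s : 0;
}

/* f2 and f3 are the functions from test.c. */
void f2(char *s) {
    if (f1(s))
        *s = 1;
}

void f3(char *s) {
    if ((s && (*s)) ? s : 0)
        *s = 1;
}

/* Hypothetical macro-based variant, added only for comparison. */
void f4(char *s) {
    if (F1(s))
        *s = 1;
}

/* Call fn on a short string whose first byte is `first`
 * and return the first byte afterwards. */
static char run_on(void (*fn)(char *), char first) {
    char buf[2] = { first, 0 };
    fn(buf);
    return buf[0];
}

/* Assert that f2, f3 and f4 behave identically on the three
 * interesting inputs: NULL, empty string, non-empty string. */
void check_equivalence(void) {
    void (*fns[])(char *) = { f2, f3, f4 };
    size_t i;

    for (i = 0; i < sizeof fns / sizeof fns[0]; i++) {
        fns[i](NULL);                      /* NULL: must not dereference */
        assert(run_on(fns[i], 0) == 0);    /* empty string: left untouched */
        assert(run_on(fns[i], 'x') == 1);  /* non-empty: first byte set to 1 */
    }
}
```

Calling check_equivalence() from a small main should pass silently for all three functions, so any difference between them is in the generated code only, not in behaviour.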
If you need any information I haven't provided, please ask. I'm not
subscribed, so please CC me any answers.
Thank you.
Miguel