Incorrect optimization of inlined functions?

miguell@teleline.es
Sun Oct 24 04:21:00 GMT 1999


I don't know whether this is really a bug or just an optimization that
has not been implemented yet, but it contradicts the documentation ("An
Inline Function is As Fast As a Macro").

First of all, I'm using RedHat 6.0 on a Pentium MMX. GCC is gcc-19991004,
which I've just downloaded and compiled. I've also tried gcc-2.7.2.3, with
similar results.

------------------------------------------------------------------------

$ gcc -v

Reading specs from /usr/local/lib/gcc-lib/i586-pc-linux-gnu/2.96/specs
gcc version 2.96 19991004 (experimental)

------------------------------------------------------------------------

Now, the problem. In the following program "test.c" I define an inlined
function, f1, which is called by a second function f2. I also define
another function, f3, which is the same as f2 except that the call to f1
has been replaced by its code:

------------------------------------------------------------------------

#include <stdio.h>

static inline char *f1(char *s) {
    return (s && (*s)) ? s : 0;
}

void f2(char *s) {
    if (f1(s))
	*s = 1;
}

void f3(char *s) {
    if ((s && (*s)) ? s : 0)
	*s = 1;
}

------------------------------------------------------------------------
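
Before comparing the generated code, it may be worth confirming that f2
and f3 really are interchangeable at the source level. The following is
only a minimal sketch: the helpers after_f2, after_f3 and same_result are
hypothetical names that are not part of test.c, and the one-byte test
buffers are an assumption made purely for illustration.

```c
#include <stddef.h>

/* Copies of the three functions from test.c. */
static inline char *f1(char *s) {
    return (s && (*s)) ? s : 0;
}

void f2(char *s) {
    if (f1(s))
        *s = 1;
}

void f3(char *s) {
    if ((s && (*s)) ? s : 0)
        *s = 1;
}

/* Hypothetical helper: run f2 on a buffer whose first byte is `initial`
   and return that first byte afterwards. */
char after_f2(char initial) {
    char buf[2] = { initial, 0 };
    f2(buf);
    return buf[0];
}

/* Hypothetical helper: the same as after_f2, but for f3. */
char after_f3(char initial) {
    char buf[2] = { initial, 0 };
    f3(buf);
    return buf[0];
}

/* Hypothetical helper: returns 1 when f2 and f3 leave the buffer in the
   same state for the given first byte. */
int same_result(char initial) {
    return after_f2(initial) == after_f3(initial);
}
```

Both functions overwrite the first byte with 1 only when the pointer is
non-null and the first byte is non-zero, so any difference in the object
code is a matter of code quality, not of behaviour.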

And the output of objdump is:

------------------------------------------------------------------------

$ gcc -W -Wall -O2 -o test.o -c test.c
$ objdump -dr test.o

test.o:     file format elf32-i386

Disassembly of section .text:

00000000 <f2>:
   0:	55                   	pushl  %ebp
   1:	b8 00 00 00 00       	movl   $0x0,%eax
   6:	89 e5                	movl   %esp,%ebp
   8:	8b 55 08             	movl   0x8(%ebp),%edx
   b:	85 d2                	testl  %edx,%edx
   d:	74 0e                	je     1d <f2+0x1d>
   f:	80 3a 00             	cmpb   $0x0,(%edx)
  12:	0f 94 c0             	sete   %al
  15:	25 ff 00 00 00       	andl   $0xff,%eax
  1a:	48                   	decl   %eax
  1b:	21 d0                	andl   %edx,%eax
  1d:	85 c0                	testl  %eax,%eax
  1f:	74 03                	je     24 <f2+0x24>
  21:	c6 02 01             	movb   $0x1,(%edx)
  24:	89 ec                	movl   %ebp,%esp
  26:	5d                   	popl   %ebp
  27:	c3                   	ret    
  28:	90                   	nop    
  29:	8d b4 26 00 00 00 00 	leal   0x0(%esi,1),%esi

00000030 <f3>:
  30:	55                   	pushl  %ebp
  31:	89 e5                	movl   %esp,%ebp
  33:	8b 45 08             	movl   0x8(%ebp),%eax
  36:	85 c0                	testl  %eax,%eax
  38:	74 08                	je     42 <f3+0x12>
  3a:	80 38 00             	cmpb   $0x0,(%eax)
  3d:	74 03                	je     42 <f3+0x12>
  3f:	c6 00 01             	movb   $0x1,(%eax)
  42:	89 ec                	movl   %ebp,%esp
  44:	5d                   	popl   %ebp
  45:	c3                   	ret    

------------------------------------------------------------------------

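As you can see, f3 is considerably shorter and better optimized than f2:
f3 tests the pointer and the byte with two direct branches, while f2
first computes the return value of f1 in %eax and only then tests it.
Shouldn't both functions generate similar code?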
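
To make the difference easier to see, the f2 listing can be read back into
C roughly as follows. This is only a hand transcription of the
disassembly, not a statement about what the compiler does internally. The
name f2_as_compiled is hypothetical, and uintptr_t stands in for the
32-bit registers.

```c
#include <stdint.h>

/* Hypothetical name: a line-by-line reading of the f2 disassembly. */
void f2_as_compiled(char *s) {
    uintptr_t eax = 0;               /* movl  $0x0,%eax               */
    uintptr_t edx = (uintptr_t)s;    /* movl  0x8(%ebp),%edx          */

    if (edx != 0) {                  /* testl %edx,%edx ; je 1d       */
        eax = (*s == 0);             /* cmpb  $0x0,(%edx) ; sete %al  */
        eax &= 0xff;                 /* andl  $0xff,%eax              */
        eax -= 1;                    /* decl  %eax: 1 -> 0, 0 -> ~0   */
        eax &= edx;                  /* andl  %edx,%eax: s or 0       */
    }

    if (eax != 0)                    /* testl %eax,%eax ; je 24       */
        *s = 1;                      /* movb  $0x1,(%edx)             */
}
```

Read this way, f2 builds the pointer value returned by f1 (either s or 0)
with a mask and then tests it, whereas f3 branches on the two conditions
directly and never materializes that value.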

If you need any information I haven't provided, please ask. I'm not
subscribed, so please CC me on any answers.

Thank you.

Miguel


