Compiler bug in gcc?

Ronald Landheer-Cieslak blytkerchan@users.sourceforge.net
Mon Sep 1 11:38:00 GMT 2003


Hello all,

I'm currently building a Newlib-based linux-x-freebsd cross-compiler and am
having a wee bit of trouble with what I think might be a compiler/optimiser
bug in gcc.

The function __do_global_ctors_aux gets called from the .init section, via
an _init function which I wrote and which is called from _start. (On
i386-unknown-freebsd4.7 the .init section contents apparently don't get run
automatically, but need to be called from _start, so I wrapped a function
definition around them and call that from _start.)

In gcc, the body of this function is in a macro called
DO_GLOBAL_CTORS_BODY, which looks like this:
do {                                                                    \
  unsigned long nptrs = (unsigned long) __CTOR_LIST__[0];               \
  unsigned i;                                                           \
  if (nptrs == (unsigned long)-1)                                       \
    for (nptrs = 0; __CTOR_LIST__[nptrs + 1] != 0; nptrs++);            \
  for (i = nptrs; i >= 1; i--)                                          \
    __CTOR_LIST__[i] ();                                                \
} while (0)
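
To make the intended behaviour of that macro concrete, here is a small
stand-alone sketch of the same loop. It is only an illustration:
run_ctor_list, ctor_a, ctor_b and ctor_trace are made-up names, and the list
is passed in as a parameter instead of being the linker-provided
__CTOR_LIST__.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*func_ptr)(void);

/* Hypothetical trace buffer so the call order can be inspected. */
static int ctor_trace[8];
static size_t ctor_trace_len;

static void ctor_a(void) { ctor_trace[ctor_trace_len++] = 1; }
static void ctor_b(void) { ctor_trace[ctor_trace_len++] = 2; }

/* Same logic as DO_GLOBAL_CTORS_BODY, but over a caller-supplied list. */
static void run_ctor_list(const func_ptr *list)
{
    unsigned long nptrs;
    unsigned long i;

    assert(list != NULL);

    /* Slot 0 holds either a count or (unsigned long)-1. */
    nptrs = (unsigned long)(uintptr_t)list[0];
    if (nptrs == (unsigned long)-1) {
        /* No count stored: scan for the terminating null entry. */
        for (nptrs = 0; list[nptrs + 1] != 0; nptrs++)
            ;
    }

    /* Call the entries from the last one down to the first. */
    for (i = nptrs; i >= 1; i--)
        list[i]();
}
```

With a list of { -1, ctor_a, ctor_b, 0 } this calls ctor_b and then ctor_a;
a count-prefixed list of { 2, ctor_a, ctor_b } gives the same order.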

gcc, configured with 
$ ../src/configure --prefix=$PREFIX --target=$TARGET --with-newlib \
  --with-gnu-as --with-gnu-ld --disable-shared --disable-threads \
  --enable-languages=c,c++

produces this (as disassembled by gdb):

Dump of assembler code for function __do_global_ctors_aux:
0x8054b98 <__do_global_ctors_aux>:      push   %ebp
0x8054b99 <__do_global_ctors_aux+1>:    mov    %esp,%ebp
0x8054b9b <__do_global_ctors_aux+3>:    push   %ebx
0x8054b9c <__do_global_ctors_aux+4>:    push   %edx
0x8054b9d <__do_global_ctors_aux+5>:    mov    0x8057194,%eax
0x8054ba2 <__do_global_ctors_aux+10>:   cmp    $0xffffffff,%eax
0x8054ba5 <__do_global_ctors_aux+13>:   mov    $0x8057194,%ebx
0x8054baa <__do_global_ctors_aux+18>:   je     0x8054bb8 <__do_global_ctors_aux+32>
0x8054bac <__do_global_ctors_aux+20>:   sub    $0x4,%ebx
0x8054baf <__do_global_ctors_aux+23>:   call   *%eax
0x8054bb1 <__do_global_ctors_aux+25>:   mov    (%ebx),%eax
0x8054bb3 <__do_global_ctors_aux+27>:   cmp    $0xffffffff,%eax
0x8054bb6 <__do_global_ctors_aux+30>:   jne    0x8054bac <__do_global_ctors_aux+20>
0x8054bb8 <__do_global_ctors_aux+32>:   pop    %eax
0x8054bb9 <__do_global_ctors_aux+33>:   pop    %ebx
0x8054bba <__do_global_ctors_aux+34>:   leave
0x8054bbb <__do_global_ctors_aux+35>:   ret
End of assembler dump.

The error is that %ebx is not saved before the call to *%eax, and it is
not guaranteed to be preserved across that call; in fact, it is not.
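
Read back into C, the disassembled loop looks roughly like the sketch below.
This is a reconstruction from the instructions shown above, not gcc's
source; run_ctors_backward, CTOR_MARKER and the two demo constructors are
made-up names. The point is that the cursor p (held in %ebx) is decremented
before the call and dereferenced after it, so it has to survive the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*func_ptr)(void);

/* The 0xffffffff value the loop compares against. */
#define CTOR_MARKER ((func_ptr)(uintptr_t)-1)

/* Hypothetical demo constructors that record the order of calls. */
static int calls[4];
static size_t ncalls;
static void first_ctor(void)  { calls[ncalls++] = 1; }
static void second_ctor(void) { calls[ncalls++] = 2; }

/* 'last' points at the final entry of the list (0x8057194 in the dump). */
static void run_ctors_backward(const func_ptr *last)
{
    const func_ptr *p;  /* plays the role of %ebx */
    func_ptr fn;        /* plays the role of %eax */

    assert(last != NULL);

    p = last;           /* mov $0x8057194,%ebx */
    fn = *p;            /* mov 0x8057194,%eax  */
    while (fn != CTOR_MARKER) {
        p--;            /* sub $0x4,%ebx  */
        fn();           /* call *%eax     */
        fn = *p;        /* mov (%ebx),%eax: needs p intact after the call */
    }
}
```

Given a list of { CTOR_MARKER, first_ctor, second_ctor, 0 } and a pointer to
the second_ctor slot, this calls second_ctor, then first_ctor, then stops at
the marker.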

I have attached the transcript of a gdb session on a simple test case, and 
the simple test case itself, to this E-mail. My questions are obviously:
* am I right in presuming this is a bug?
* if so, should I add it to the bugzilla?
* is there any known work-around for this?

Any pointers, comments, etc. will be very welcome.

rlc

-- 
There's an old proverb that says just about whatever you want it to.
-------------- next part --------------
GDB session log - whatever starts with ... is a comment from me

$ gdb main
GNU gdb 5.3
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd4.7"...
(gdb) break _start
Breakpoint 1 at 0x8048083
(gdb) r
Starting program: /usr/home/ronald/main

Breakpoint 1, 0x08048083 in _start ()
(gdb) disassemble
Dump of assembler code for function _start:
0x8048080 <_start>:     push   %ebp
0x8048081 <_start+1>:   mov    %esp,%ebp
0x8048083 <_start+3>:   call   0x8048074 <_init>
0x8048088 <_start+8>:   call   0x804cf5c <tzset>
0x804808d <_start+13>:  lea    0x4(%ebp),%eax
0x8048090 <_start+16>:  mov    %eax,%ebx
0x8048092 <_start+18>:  mov    (%eax),%eax
0x8048094 <_start+20>:  add    $0x4,%ebx
0x8048097 <_start+23>:  mov    %eax,%ecx
0x8048099 <_start+25>:  imul   $0x4,%ecx,%ecx
0x804809c <_start+28>:  add    %ebx,%ecx
0x804809e <_start+30>:  add    $0x4,%ecx
0x80480a1 <_start+33>:  mov    %ecx,0x8056000
0x80480a7 <_start+39>:  mov    (%ecx),%edx
0x80480a9 <_start+41>:  mov    %edx,0x8056004
0x80480af <_start+47>:  push   %ecx
0x80480b0 <_start+48>:  push   %ebx
0x80480b1 <_start+49>:  push   %eax
0x80480b2 <_start+50>:  call   0x8048164 <main>
0x80480b7 <_start+55>:  push   %eax
0x80480b8 <_start+56>:  call   0x8054bbc <_fini>
0x80480bd <_start+61>:  call   0x804c5c4 <exit>
0x80480c2 <_start+66>:  nop
0x80480c3 <_start+67>:  nop
End of assembler dump.

... as you can see, there isn't much special in the _start that would screw 
    anything up: the first thing it does is save the base pointer and call _init
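
... for reference, the rest of _start only picks argc, argv and envp off
    the initial stack and hands them to main. A rough C rendering of that
    arithmetic follows; decode_initial_stack and struct start_args are
    made-up names, and the stack is modelled as an array of pointer-sized
    slots (4 bytes each on i386), which is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three values _start pushes for main. */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* 'sp' models the stack as found on entry: argc, argv[], NULL, envp[]. */
static struct start_args decode_initial_stack(char **sp)
{
    struct start_args a;

    assert(sp != NULL);

    a.argc = (int)(intptr_t)sp[0];   /* mov (%eax),%eax             */
    a.argv = sp + 1;                 /* add $0x4,%ebx               */
    a.envp = a.argv + a.argc + 1;    /* imul/add/add: skip argv and */
                                     /* its terminating NULL        */
    return a;
}
```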

(gdb) stepi
0x08048074 in _init ()
(gdb) disassemble
Dump of assembler code for function _init:
0x8048074 <_init>:      call   0x8048118 <frame_dummy>
0x8048079 <_init+5>:    call   0x8054b98 <__do_global_ctors_aux>
0x804807e <_init+10>:   ret
End of assembler dump.

... _init calls frame_dummy and __do_global_ctors_aux - both gcc internals. I've 
    already verified, with another STC, that the .init section really doesn't get
    run automatically on i386-unknown-freebsd4.7, so calling it like this is 
    necessary - and is, BTW, the way the native libc does it
    
(gdb) nexti
0x08048079 in _init ()

... we skip over frame_dummy (it is run, of course)

(gdb) stepi
0x08054b98 in __do_global_ctors_aux ()
(gdb) disassemble
Dump of assembler code for function __do_global_ctors_aux:
0x8054b98 <__do_global_ctors_aux>:      push   %ebp
0x8054b99 <__do_global_ctors_aux+1>:    mov    %esp,%ebp
0x8054b9b <__do_global_ctors_aux+3>:    push   %ebx
0x8054b9c <__do_global_ctors_aux+4>:    push   %edx
0x8054b9d <__do_global_ctors_aux+5>:    mov    0x8057194,%eax
0x8054ba2 <__do_global_ctors_aux+10>:   cmp    $0xffffffff,%eax
0x8054ba5 <__do_global_ctors_aux+13>:   mov    $0x8057194,%ebx
0x8054baa <__do_global_ctors_aux+18>:   je     0x8054bb8 <__do_global_ctors_aux+32>
0x8054bac <__do_global_ctors_aux+20>:   sub    $0x4,%ebx
0x8054baf <__do_global_ctors_aux+23>:   call   *%eax
0x8054bb1 <__do_global_ctors_aux+25>:   mov    (%ebx),%eax

... As you can see, %ebx is presumed to contain something sensible here, 
    but that is by no means guaranteed: the constructor that has just been 
    called may have done whatever it liked with %ebx

0x8054bb3 <__do_global_ctors_aux+27>:   cmp    $0xffffffff,%eax
0x8054bb6 <__do_global_ctors_aux+30>:   jne    0x8054bac <__do_global_ctors_aux+20>
0x8054bb8 <__do_global_ctors_aux+32>:   pop    %eax
0x8054bb9 <__do_global_ctors_aux+33>:   pop    %ebx
0x8054bba <__do_global_ctors_aux+34>:   leave
0x8054bbb <__do_global_ctors_aux+35>:   ret
End of assembler dump.

... let's watch all of the relevant registers; %eax and %ebx are especially
    interesting in this context
    
(gdb) disp/x $eax
1: /x $eax = 0x0
(gdb) disp/x $ebx
2: /x $ebx = 0x0
(gdb) disp/x $edx
3: /x $edx = 0x80571c4
(gdb) disp/x $esp
4: /x $esp = 0xbfbffca8
(gdb) disp/x $ebp
5: /x $ebp = 0xbfbffcb0
(gdb) stepi
0x08054b99 in __do_global_ctors_aux ()
...	0x8054b98 <__do_global_ctors_aux>:      push   %ebp
...	0x8054b99 <__do_global_ctors_aux+1>:    mov    %esp,%ebp

5: /x $ebp = 0xbfbffcb0
4: /x $esp = 0xbfbffca4
3: /x $edx = 0x80571c4
2: /x $ebx = 0x0
1: /x $eax = 0x0
(gdb) nexti
0x08054b9b in __do_global_ctors_aux ()
...	0x8054b9b <__do_global_ctors_aux+3>:    push   %ebx

5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffca4
3: /x $edx = 0x80571c4
2: /x $ebx = 0x0
1: /x $eax = 0x0
(gdb)
0x08054b9c in __do_global_ctors_aux ()
...	0x8054b9c <__do_global_ctors_aux+4>:    push   %edx

5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffca0
3: /x $edx = 0x80571c4
2: /x $ebx = 0x0
1: /x $eax = 0x0
(gdb)
0x08054b9d in __do_global_ctors_aux ()
...	0x8054b9d <__do_global_ctors_aux+5>:    mov    0x8057194,%eax

5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0x0
1: /x $eax = 0x0
(gdb)
0x08054ba2 in __do_global_ctors_aux ()
...	0x8054ba2 <__do_global_ctors_aux+10>:   cmp    $0xffffffff,%eax
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0x0
1: /x $eax = 0x80481fa

... The function that is now pointed to by %eax is indeed what we'll want to 
    run - it calls the constructor of our almost-empty class.

(gdb) disas 0x80481fa
Dump of assembler code for function _GLOBAL__I_main:
0x80481fa <_GLOBAL__I_main>:    push   %ebp
0x80481fb <_GLOBAL__I_main+1>:  mov    %esp,%ebp
0x80481fd <_GLOBAL__I_main+3>:  sub    $0x8,%esp
0x8048200 <_GLOBAL__I_main+6>:  sub    $0x8,%esp
0x8048203 <_GLOBAL__I_main+9>:  push   $0xffff
0x8048208 <_GLOBAL__I_main+14>: push   $0x1
0x804820a <_GLOBAL__I_main+16>: call   0x80481b4 <__static_initialization_and_destruction_0>
0x804820f <_GLOBAL__I_main+21>: add    $0x10,%esp
0x8048212 <_GLOBAL__I_main+24>: leave
0x8048213 <_GLOBAL__I_main+25>: ret
End of assembler dump.

(gdb) nexti
0x08054ba5 in __do_global_ctors_aux ()
...	0x8054ba5 <__do_global_ctors_aux+13>:   mov    $0x8057194,%ebx
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0x0
1: /x $eax = 0x80481fa
(gdb)
0x08054baa in __do_global_ctors_aux ()
...	0x8054baa <__do_global_ctors_aux+18>:   je     0x8054bb8 <__do_global_ctors_aux+32>
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0x8057194
1: /x $eax = 0x80481fa
(gdb)
0x08054bac in __do_global_ctors_aux ()
...	0x8054bac <__do_global_ctors_aux+20>:   sub    $0x4,%ebx
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0x8057194
1: /x $eax = 0x80481fa

... The next step will call the function pointed to by %eax. Watch %ebx: the
    gcc-generated code assumes it will not change

(gdb)
0x08054baf in __do_global_ctors_aux ()
...	0x8054baf <__do_global_ctors_aux+23>:   call   *%eax
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0x8057190
1: /x $eax = 0x80481fa
(gdb)
Mine constructor

... The "Mine constructor" is the output of the constructor

0x08054bb1 in __do_global_ctors_aux ()
...	0x8054bb1 <__do_global_ctors_aux+25>:   mov    (%ebx),%eax
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0xbfbffc10
1: /x $eax = 0x80571dc

... And as you can see, %ebx has changed from 0x8057190 to 0xbfbffc10
    The gcc-generated assembly code now reads from that "address" into 
    %eax and expects to find either a function address, or -1 in there.
    Note, by the way, that some platforms may put the number of pointers
    to be expected before the actual pointers. Though 
    i386-unknown-freebsd4.7 doesn't do that, this same code, when fixed
    for the changing %ebx, may still cause problems

(gdb)
0x08054bb3 in __do_global_ctors_aux ()
...	0x8054bb3 <__do_global_ctors_aux+27>:   cmp    $0xffffffff,%eax
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0xbfbffc10
1: /x $eax = 0x0

... Whatever was at 0xbfbffc10 was NULL, which is now put in %eax as "either
    a function address or -1". It is of course neither (as it is NULL/0) so
    we loop back to 0x08054bac

(gdb)
0x08054bb6 in __do_global_ctors_aux ()
...	0x8054bb6 <__do_global_ctors_aux+30>:   jne    0x8054bac <__do_global_ctors_aux+20>
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0xbfbffc10
1: /x $eax = 0x0
(gdb)
0x08054bac in __do_global_ctors_aux ()
...	0x8054bac <__do_global_ctors_aux+20>:   sub    $0x4,%ebx
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0xbfbffc10
1: /x $eax = 0x0
(gdb)

... and we call *%eax (i.e. NULL) as a function

0x08054baf in __do_global_ctors_aux ()
...	0x8054baf <__do_global_ctors_aux+23>:   call   *%eax
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc9c
3: /x $edx = 0x80571c4
2: /x $ebx = 0xbfbffc0c
1: /x $eax = 0x0
(gdb)

... which crashes

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
5: /x $ebp = 0xbfbffca4
4: /x $esp = 0xbfbffc98
3: /x $edx = 0x80571c4
2: /x $ebx = 0xbfbffc0c
1: /x $eax = 0x0
(gdb)

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb)

FWIW, the two symbols that __do_global_ctors_aux should be working on live here:
08057198 d __CTOR_END__
08057190 d __CTOR_LIST__

(gdb) p/x *0x08057190
$2 = 0xffffffff

If %ebx hadn't changed during the call to the constructor, the code would
have worked like a charm.
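
One conceivable way to side-step the dependence on %ebx in this particular
loop, offered purely as an untested assumption and not as something tried on
this toolchain, would be to keep the list cursor in memory rather than in a
register. In C that can be expressed by making the pointer object itself
volatile, so it is re-read from its stack slot after every call. All names
below are made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*func_ptr)(void);

/* The 0xffffffff value that ends the backward walk. */
#define CTOR_MARKER ((func_ptr)(uintptr_t)-1)

/* Hypothetical demo constructors that record the order of calls. */
static int seen[4];
static size_t nseen;
static void ctor_one(void) { seen[nseen++] = 1; }
static void ctor_two(void) { seen[nseen++] = 2; }

/* 'last' points at the final entry of the constructor list. */
static void run_ctors_cursor_in_memory(const func_ptr *last)
{
    /* volatile applies to the pointer object, not to what it points at:
       every use of p is a fresh load from, or store to, memory. */
    const func_ptr *volatile p;

    assert(last != NULL);

    for (p = last; *p != CTOR_MARKER; p--)
        (*p)();
}
```

This would only hide the symptom in this one loop: any other caller that
keeps a value in %ebx across a call would be affected in the same way.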

-------------- next part --------------
#include <unistd.h>

class Mine
{
private :
	int i;
public :
	Mine(void)
	{
		write(1, "Mine constructor\n", 17);

		i = 0;
	}

	~Mine(void)
	{
		write(1, "Mine destructor\n", 16);

		i = 2;
	}

	int function(void)
	{
		write(1, "Mine function\n", 14);

		i = 1;

		return(i);
	}
};

static Mine mine;

int main(void)
{
	write(1, "main function start\n", 20);
	mine.function();
	write(1, "main function end\n", 18);

	return(0);
}


