Bug 68738 - call to overridden function segfaults
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c++
Version: 5.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2015-12-06 18:39 UTC by Rian Quinn
Modified: 2016-01-08 13:19 UTC
Description Rian Quinn 2015-12-06 18:39:40 UTC
Using the TARGET=x86_64-elf cross-compiler (OS development), I get a strange crash with C++. The class definitions are as follows:

class Blah1
{
public:
    Blah1() {}
    virtual ~Blah1() {}

    virtual int foo() { return 0; }
};

class Blah2 : public Blah1
{
public:
    Blah2() {}
    ~Blah2() {}

    int boo() { return 1; }
    int foo() override { return 1; }
};

Blah2 g_blah2;

int do_something()
{
    Blah2 *p_blah2 = &g_blah2;
    int i = p_blah2->foo();      // <----- crash here
}

The compiled assembly for this looks something like:

 c68:	48 89 45 e8          	mov    %rax,-0x18(%rbp)
 c6c:	48 8b 45 e8          	mov    -0x18(%rbp),%rax
 c70:	48 8b 00             	mov    (%rax),%rax
 c73:	48 83 c0 10          	add    $0x10,%rax
 c77:	48 8b 00             	mov    (%rax),%rax
 c7a:	48 8b 55 e8          	mov    -0x18(%rbp),%rdx
 c7e:	48 89 d7             	mov    %rdx,%rdi
 c81:	ff d0                	callq  *%rax

What's strange to me is that it's not attempting to look up the global symbol through the GOT. If I change the code to:

int do_something()
{
    Blah2 &p_blah2 = g_blah2;
    int i = p_blah2.foo();      // <----- works fine
}

And the compiled assembly looks like:

ca3:	e8 88 fe ff ff       	callq  b30 <_ZN5Blah23fooEv@plt>

This has the GOT lookup you would expect. I'm not sure what's going on here, but it seems like a bug in G++.

Thanks,
- Rian
Comment 1 Marek Polacek 2015-12-07 17:18:28 UTC
Doesn't crash on x86_64-linux.
Comment 2 Rian Quinn 2015-12-07 17:32:14 UTC
Yeah... I'm pretty sure it is specific to TARGET=x86_64-elf (i.e. no OS specified). When I ran the same test with Ubuntu's native GCC, it ran fine. objdump showed quite different assembly for the Ubuntu case vs. the cross-compiled case. For the code that does not crash (using the reference instead of the pointer), the code is identical.
Comment 3 Rian Quinn 2015-12-07 17:38:36 UTC
Just for completeness, here is the exact code and objdump output:

class Blah1
{
public:
    Blah1() {}
    virtual ~Blah1() {}

    virtual int foo() { return 0; }
};

class Blah2 : public Blah1
{
public:
    Blah2() {}
    ~Blah2() {}

    int foo() override { return 1; }
};

Blah2 g_blah2;

void
do_something()
{
    Blah2 *bp1 = &g_blah2;
    Blah2 &bp2 = g_blah2;
    bp1->foo();               // Crashes
    bp2.foo();                // Does not crash
}

Using the cross-compiler (TARGET=x86_64-elf) you get the following:

0000000000000cd5 <_Z12do_somethingv>:
 cd5:	55                   	push   %rbp
 cd6:	48 89 e5             	mov    %rsp,%rbp
 cd9:	48 83 ec 10          	sub    $0x10,%rsp
 cdd:	48 8b 05 3c 07 20 00 	mov    0x20073c(%rip),%rax        # 201420 <_DYNAMIC+0x150>
 ce4:	48 89 45 f8          	mov    %rax,-0x8(%rbp)
 ce8:	48 8b 05 31 07 20 00 	mov    0x200731(%rip),%rax        # 201420 <_DYNAMIC+0x150>
 cef:	48 89 45 f0          	mov    %rax,-0x10(%rbp)
 cf3:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
 cf7:	48 8b 00             	mov    (%rax),%rax
 cfa:	48 83 c0 10          	add    $0x10,%rax
 cfe:	48 8b 00             	mov    (%rax),%rax
 d01:	48 8b 55 f8          	mov    -0x8(%rbp),%rdx
 d05:	48 89 d7             	mov    %rdx,%rdi
 d08:	ff d0                	callq  *%rax
 d0a:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
 d0e:	48 89 c7             	mov    %rax,%rdi
 d11:	e8 5a fe ff ff       	callq  b70 <_ZN5Blah23fooEv@plt>
 d16:	90                   	nop
 d17:	c9                   	leaveq
 d18:	c3                   	retq
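
The indirect call in the listing above (load the first word of the object, add 0x10, load again, then `callq *%rax`) is the usual shape of a virtual call under the Itanium C++ ABI that GCC follows on x86_64: the first word of the object is the vtable pointer, and with a virtual destructor occupying the first two slots, `foo` sits at offset 0x10. That word is only written by the constructor, so one plausible reading is that a global whose constructor has not run still holds zero there. The sketch below models this by hand; `Object`, `slot_fn`, `blah2_vtable`, and `call_foo` are hypothetical names for illustration, not anything GCC emits.

```cpp
#include <cstddef>

struct Object;
using slot_fn = int (*)(Object *);

// Hand-rolled model of an object with a vtable pointer as its first word.
struct Object {
    const slot_fn *vptr;
};

// Stand-ins for the two destructor slots and for Blah2::foo.
int dtor_stub(Object *) { return 0; }
int blah2_foo(Object *) { return 1; }

// Slot 2 (byte offset 0x10 with 8-byte pointers) holds foo.
const slot_fn blah2_vtable[] = { dtor_stub, dtor_stub, blah2_foo };

// Mirrors the assembly: load vptr, index slot 2, call through it.
// Returns false instead of calling when vptr was never set.
bool call_foo(Object *obj, int *result)
{
    if (obj->vptr == nullptr) {
        return false;  // the real code has no such check and would fault here
    }
    *result = obj->vptr[2](obj);
    return true;
}
```

With `vptr` set to `blah2_vtable`, `call_foo` yields 1; with a zeroed `Object` it reports failure, which is the point where the compiled code would dereference a null vtable pointer.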

With the native Ubuntu compiler I get:

0000000000400b58 <_Z12do_somethingv>:
  400b58:	55                   	push   %rbp
  400b59:	48 89 e5             	mov    %rsp,%rbp
  400b5c:	48 83 ec 10          	sub    $0x10,%rsp
  400b60:	48 c7 45 f0 50 22 60 	movq   $0x602250,-0x10(%rbp)
  400b67:	00
  400b68:	48 c7 45 f8 50 22 60 	movq   $0x602250,-0x8(%rbp)
  400b6f:	00
  400b70:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
  400b74:	48 8b 00             	mov    (%rax),%rax
  400b77:	48 83 c0 10          	add    $0x10,%rax
  400b7b:	48 8b 00             	mov    (%rax),%rax
  400b7e:	48 8b 55 f0          	mov    -0x10(%rbp),%rdx
  400b82:	48 89 d7             	mov    %rdx,%rdi
  400b85:	ff d0                	callq  *%rax
  400b87:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
  400b8b:	48 89 c7             	mov    %rax,%rdi
  400b8e:	e8 9f 06 00 00       	callq  401232 <_ZN5Blah23fooEv>
  400b93:	90                   	nop
  400b94:	c9                   	leaveq
  400b95:	c3                   	retq
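
In both listings the call through the reference is a direct call to `Blah2::foo` rather than a load through the vtable, which would explain why it does not depend on the object's vtable pointer being set up. The same distinction can be made explicit in source with a qualified call, as in the sketch below. The helper names `call_virtual` and `call_direct` are hypothetical, and this only illustrates the two call forms; it is not a claim about why GCC chose a direct call for the reference.

```cpp
class Blah1
{
public:
    virtual ~Blah1() {}
    virtual int foo() { return 0; }
};

class Blah2 : public Blah1
{
public:
    int foo() override { return 1; }
};

// Virtual dispatch: reads the vtable pointer stored in the object.
int call_virtual(Blah1 *p)
{
    return p->foo();
}

// Qualified call: names Blah1::foo explicitly, so no vtable lookup happens.
int call_direct(Blah1 *p)
{
    return p->Blah1::foo();
}
```

For a `Blah2` object, `call_virtual` reaches `Blah2::foo` through the vtable, while `call_direct` always runs `Blah1::foo` regardless of the dynamic type.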


The flags I am passing to the cross-compiler are:

-fpic -fno-rtti -fno-sized-deallocation -fno-exceptions -fno-use-cxa-atexit -fno-threadsafe-statics

- Rian
Comment 4 Rian Quinn 2015-12-28 13:22:44 UTC
To expand on this issue, any attempt to use the following pattern will result in instability:

some_type *p = &var;
*p or p->     // Crash

A couple of situations that I have seen include:
- Allocating memory in global or local static space. In this case, the crash does not occur every time; I would have to run our code several times before it crashed. What was strange with this example was that the pointers were all fine (i.e. if I printed them out, they were what I expected), but accessing such a pointer using the pattern above would result in a crash... sometimes.

- Getting a pointer to a member variable of a class would result in a crash consistently. For example, create two member variables in a class, p_blah and m_blah, and in the constructor initialize p_blah(&m_blah). The pointer holds the correct memory address, but attempts to access through the pointer segfault.

Also, running all of this code built with the Linux version of the x86_64 compiler works fine (at least on Ubuntu), and everything else seems to be working great. The only issue we ever see comes from the above pattern: once it is used, the code will either crash instantly or crash at random times. For now, we are avoiding this pattern completely.
Comment 5 Rian Quinn 2016-01-08 13:19:40 UTC
It appears to be resolved by simply executing the functions referenced from .ctors / .dtors. Although these sections are documented as being dedicated to globally defined constructors / destructors, on x86_64 they actually point to a set of functions labeled:

_GLOBAL__sub_I_XXX
_GLOBAL__sub_D_XXX

which appear to call:

_Z41__static_initialization_and_destruction_0ii

Once this function is executed, not only are the globally defined constructors / destructors executed, but the crashes identified in this bug report are also resolved. This includes crashes for pointers to globally defined classes as well as pointers to member variables.
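
For anyone hitting the same thing in a freestanding environment, a minimal sketch of the kind of loop a loader might use to run such a table of function pointers follows. The names `init_fn` and `run_init_table` are hypothetical. In a real loader the `begin` / `end` pointers would come from the bounds of the .ctors section (from the ELF section headers or from linker-script symbols), which this report does not show. Walking the table front to back and skipping null entries are also assumptions here; check what your toolchain expects.

```cpp
#include <cstddef>

// Shape of one table entry: a function taking no arguments, such as the
// _GLOBAL__sub_I_XXX functions mentioned above.
using init_fn = void (*)();

// Calls every non-null entry in [begin, end) and returns how many ran.
std::size_t run_init_table(const init_fn *begin, const init_fn *end)
{
    std::size_t called = 0;
    for (const init_fn *fn = begin; fn != end; ++fn) {
        if (*fn != nullptr) {
            (*fn)();
            ++called;
        }
    }
    return called;
}

// Stand-in for a compiler-generated initializer, used only to exercise the loop.
static int g_constructed = 0;
static void fake_global_ctor() { ++g_constructed; }
```

Passing an array such as `{ fake_global_ctor, nullptr, fake_global_ctor }` to `run_init_table` runs the stand-in twice and skips the null entry.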