53225 – static operator new in multiple inheritance carries incorrect type information for the class

Bug 53225 - static operator new in multiple inheritance carries incorrect type information for the class

Summary: static operator new in multiple inheritance carries incorrect type informatio...

Status:	RESOLVED INVALID

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	c++ (show other bugs)
Version:	4.6.1

Importance:	P3 minor
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:	http://stackoverflow.com/questions/10...
Keywords:

Depends on:
Blocks:

Reported:	2012-05-04 03:29 UTC by Thomas W. Lynch
Modified:	2012-05-05 00:48 UTC (History)
CC List:	1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2012-05-04 00:00:00

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Thomas W. Lynch 2012-05-04 03:29:40 UTC

Comment 1 Thomas W. Lynch 2012-05-04 03:35:22 UTC

h$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.6.1/lto-wrapper
Target: i686-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --enable-targets=all --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu --target=i686-linux-gnu
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) 

g++ -g3 -std=c++11 -fPIC -I. -I../../include -c try_offsets.ex.cc

See attached URL for source code and gdb memory map outputs.

In the discussion there was some confusion between an class and an instance; however, the class exists before new is called, though, of course, the instance does not exist until new has finished.  

Operator new has information for the parent, though it was inherited into and called to allocate a child instance.  The code sought to leave information for deletion in explicitly in the instance.

Comment 2 Andrew Pinski 2012-05-04 03:40:51 UTC

I don't think this is valid as the memory which is done after the operator new is considered as unitialized.

Comment 3 Thomas W. Lynch 2012-05-04 05:20:46 UTC

> I don't think this is valid as the memory which is done after the operator new
is considered as unitialized.

The code does not use any uninitialized memory.  It does not read the memory.  However, it accesses the _type_  and the type exists before an instance of that type is made.  

Part of the type information is the layout inside the class. The operator, which has been copied into the child via inheritance,  i.e. the child's version of the operator should be seeing the type of the child, not the type of the parent.  The parent is irrelevant once its type is constructed - i.e. after the inheritance operation - which occurs before any instances are instantiated.

-Thomas

Comment 4 Jonathan Wakely 2012-05-04 08:54:20 UTC

A stackoverflow URL is not a valid bug report, please follow the bug reporting guidelines as requested when you created this report: http://gcc.gnu.org/bugs/ 

There are several code examples on that page and your original post doesn't even compile, provide a self-contained testcase here, not as a URL to another site.

Comment 5 Jonathan Wakely 2012-05-04 09:44:13 UTC

(In reply to comment #3)
> > I don't think this is valid as the memory which is done after the operator new
> is considered as unitialized.
> 
> The code does not use any uninitialized memory.  It does not read the memory. 
> However, it accesses the _type_  and the type exists before an instance of that
> type is made.  

No, it accesses the memory before the constructor begins:

      this_type *new_pt = (this_type *)malloc(total_size);
      new_pt->count = count;

That accesses uninitialized memory before the object's lifetime begins.  And it accesses it through a pointer of type "this_type*" so will use the offset of this_type::count, for whatever type this_type is in that scope (which depends on which of the examples on the stackoverflow page you're claiming is a compiler bug, in some of them this_type is an alias for A, in others it's a template parameter instantiated as type B, that's why you need to provide code here, not vageuly refer to some other site with several code examples.)

> Part of the type information is the layout inside the class. The operator,
> which has been copied into the child via inheritance, 

No, inheritance doesn't mean anything is copied into the child, it means the function is visible in the scope of the child.

> i.e. the child's version
> of the operator should be seeing the type of the child, not the type of the
> parent.

No, the function "A::operator new" is a member of A and when it uses this_type it refers to whichever this_type is in scope.  There is no "B::operator new", there is no child's version ofthe function, it sees the same version i.e. "A::operator new"

>  The parent is irrelevant once its type is constructed - i.e. after the
> inheritance operation - which occurs before any instances are instantiated.

That's not how C++ works. Inheritance is not an operation.

Comment 6 Thomas W. Lynch 2012-05-04 18:12:10 UTC

>> Part of the type information is the layout inside the class. The operator,
>> which has been copied into the child via inheritance, 
>
>No, inheritance doesn't mean anything is copied into the child, it means the
>function is visible in the scope of the child.

Data type provides information for offsets into fields and the functions that work on the data. That is not part of the instance, it is separate and distinct.  It is 'extrinsic' to an instance.

Before an instance is created the _compiler_ does operations on 'data type'.  In the case of C++ nonvirtual inheritance the compiler copies the parent type information (data actually), with modifications into the child *type*.  There are other ways to do inheritance.  This is done at compile time and only affects the type, not the instance.

So for example, if an parent class has a method, say increment(), that affects a value to be found in a field at offset, say, 4.  and then that parent method is inherited into the child, but the inherited field has moved, say now found at offset 8,  the method is modified to access offset 8 - or to more technically accurate its type information is modified, the control flow remains identical. 

So yes, inheritance means a lot of copying and modifications are done to _type_ at _compile_time_.  After all this work is done at compile time, in static typing, the type is embedded in the generated code, the unresolved part of type is embedded as symbols to be linked in the link map, and the rest is thrown away.  For some reason, this embedding for the inherited operator new is incorrect.

There are phases in a static type compilation process, first the macro expansion (templates and statics),  then the calculation of offsets for the type stuff, then then code generation.  Now way down the line after linking and loading we are running.  That is when our operator new gets called.  By that time type issues are supposed to be settled - but as the code example reported shows, for operator new in multiple inheritance with this version of g++ - it has not been.

About the access - the original poster noted that uninitialized memory was being used - uninitialized memory is bad to read, it is not bad to write .. that is how it gets initialized after all.  And in cases of doing heap management it is convention to put the size of the block above the allocation.  This code example does that explicitly.   The standard says that the object exists after new finishes, and this is true even in this example.  But even the standard new operator writes to the memory it allocates before returning -- check it out, just walk in the debugger you can see it.

If you are going to outlaw writing to memory in new, then you will have to outlaw the standard new as well.  That said, this code could be implemented by calculating an offset from the pointer to the malloced block, just as the heap code does ..

I apologize for the 'nonstandard' report per the a prior post, I will update that this evening.

Comment 7 Jonathan Wakely 2012-05-04 18:31:31 UTC

(In reply to comment #6)
> So for example, if an parent class has a method, say increment(), that affects
> a value to be found in a field at offset, say, 4.  and then that parent method
> is inherited into the child, but the inherited field has moved, say now found
> at offset 8,  the method is modified to access offset 8 - or to more
> technically accurate its type information is modified, the control flow remains
> identical. 

No, that's not how it works. If Base::increment() writes to Base::field then it is always at the same offset into the Base object. Whether that Base object is a sub-object of another clas is irrelevant.

In your broken example on stackoverflow the "A::operator new" function always wrote to offset 4 into the A object. When that A object is 4 bytes into a B object then it writes 4+4=8, but "A::operator new" knows that A::count is 4 bytes into an A, i.e. this+4 where this has type A*


> So yes, inheritance means a lot of copying and modifications are done to _type_
> at _compile_time_.  After all this work is done at compile time, in static
> typing, the type is embedded in the generated code, the unresolved part of type
> is embedded as symbols to be linked in the link map, and the rest is thrown
> away.  For some reason, this embedding for the inherited operator new is
> incorrect.

No, your understanding of how G++ works is incorrect.  That might be a valid C++ implementation, but it's not how G++ works, and it's certainly not required to work like that by the standard.

You seem to think that in "A::operator new" the static type of "this" depends on whether the dynamic type of the object being allocated is A or B.  That's incorrect.  The static type of "*this" in "A::operator new" is always A, and that's why your program doesn't work.  When you use "A::operator new" to allocate a B, the member B::count will be 8 bytes into the allocated memory, but you cast the memory to (this_type*) i.e. (A*) and then access A::count, i.e. 4 bytes into the allocated memory.  When you later construct a B at that location the B::count member is not at the location you thought it was.

Comment 8 Thomas W. Lynch 2012-05-04 18:57:40 UTC

>No, that's not how it works. If Base::increment() writes to Base::field then it
>is always at the same offset into the Base object. Whether that Base object is
>a sub-object of another clas is irrelevant.

I apologize, as I don't know how else to put it but in multiple inheritance that statement is just plain wrong.  Fields move.  You can see it in the debug output at the link, the field for 'count' moved by 4 bytes from what it was in the parent to where it is located in the child.

I should note, also, that C++ uses name mangling to accomplish the effect of the type modifications I mentioned before.  (I was a user of the original that made expanded macros into an intermediate directory ;-)  So the manifest problem, if you could look at the macro output ;-) is probably an operator new mangled to be a child, but the internal type information still as a parent.

The typedef for 'this_type' in the exmaple is simply a short hand that appears at the top of each of our templated classes so that there is a shorter name to use in the code.  'this_type' for class A  is 'class A'   for class B it is 'class B' etc.  It does nothing. Look at the compiling example with the debug output there that shows the wrong offset, please don't quote the work arounds proposed as we are talking about the problematic case of course.

Comment 9 Thomas W. Lynch 2012-05-04 19:25:08 UTC

I went to take 'this_type' out of the source. You were correct to focus on that.

As malloc() returns a void *  there must be a cast to access the fields in the instance.  We routinely use 'this_type' in our shop, which is defined to be the same as the class name.  However, if you put the class name in,  e.g. (A *) malloc(..)  then you would definitely get the offset for an A, even after the method is inherited into a B.

The question comes down to, then, if when inheriting a method into a child, if the typedefs in the child apply or the typedefs in the parent apply. It should be the typedefs of the child, as offsets (type) is adjusted in inheritance. It would eliminate a lot of otherwise valid looking code otherwise.

This looks to be a general issue with multiple inheritance having nothing to do with operator new. Let me check that..

Though lets also keep in mind, the compiler makes no noise at all in this situation.

Comment 10 Jonathan Wakely 2012-05-04 19:44:31 UTC

(In reply to comment #9)
> As malloc() returns a void *  there must be a cast to access the fields in the
> instance.  We routinely use 'this_type' in our shop, which is defined to be the
> same as the class name.  However, if you put the class name in,  e.g. (A *)
> malloc(..)  then you would definitely get the offset for an A, even after the
> method is inherited into a B.

Yes. 
 
> The question comes down to, then, if when inheriting a method into a child, if
> the typedefs in the child apply or the typedefs in the parent apply. It should

Yes, that's the question, and the answer is they refer to the typedefs in the base class, where the function is defined.  There is no copying a base class' member function into the derived class (that's not how inheritance works) and no magical adjustment of members when you derive from a class.

In the A member function only names declared in A are visible.

> be the typedefs of the child, as offsets (type) is adjusted in inheritance. It
> would eliminate a lot of otherwise valid looking code otherwise.

Maybe valid looking, but it's not C++. 

> This looks to be a general issue with multiple inheritance having nothing to do
> with operator new. Let me check that..

Please do, but it's nothing to do with multiple inheritance - the same thing happens with single inheritance, typedefs in a base class' member function always refer to the definition in the base class.  With single inheritance you might not notice your code is wrong, because A::count and B::count would be at the same offset, but that doesn't change the fact the base class' member function can only see names declared in the base class.

I'm closing this bug as invalid, the compiler behaves a required by the C++ standard. 
 
> Though lets also keep in mind, the compiler makes no noise at all in this
> situation.

It can't really know what you're trying to do, so can't warn you that your mental model of C++ is not how the language really works.  The answer is not to muck about with uninitialized memory when you aren't an expert in the language and the object model.

Comment 11 Thomas W. Lynch 2012-05-04 19:59:12 UTC

Jonathan, but there is "magical adjustment" as you put it, as the following code works correctly:

#include <cstddef>
#include <stdlib.h>
typedef unsigned int uint;

class C{ // just here to be faithful to the original code
  int y;
};


class A{
public:
  typedef A this_type;

  void* method(uint count){
    void *vp;
    vp = (void *)this;
    this_type *p;
    p = (this_type *)vp;
    p->count = count;   
  };

  uint count;
};

class B : public C, public A{
public:
    typedef B this_type;
    int i;
};

int main(){
  int j;

  A a;
  a.method(5);
  j++;

  B b;
  b.method(5);
  j++;

};


note the output in gdb where the '5' jumps to a new offset:

(gdb) b main
Breakpoint 1 at 0x80484a1: file try_offsets_2.ex.cc, line 36.
(gdb) r
Starting program: /try_offsets_2 
Breakpoint 1, main () at try_offsets_2.ex.cc:36
(gdb) n
(gdb) 
$1 = (A *) 0xbffff758
(gdb) x /10x &a
0xbffff758:	0x00000005	0x003dbff4	0x08048500	0x003dbff4
0xbffff768:	0x00000000	0x0027d113	0x00000001	0xbffff804
0xbffff778:	0xbffff80c	0x0012eff4
(gdb) n
(gdb) x /10x &b
0xbffff74c:	0x003dc324	0x00000005	0x00296c55	0x00000005
0xbffff75c:	0x003dbff5	0x08048500	0x003dbff4	0x00000000
0xbffff76c:	0x0027d113	0x00000001
(gdb)

Comment 12 Thomas W. Lynch 2012-05-04 20:02:25 UTC

Johnathan,  I think you have mistakenly closed this bug, I am asking you to reopen it.

Comment 13 Daniel Krügler 2012-05-04 20:09:55 UTC

(In reply to comment #11)
> Jonathan, but there is "magical adjustment" as you put it, as the following
> code works correctly:

The difference in your modified example is that the conversions are performed *after* the complete object initialization. As Jonathan described in comment 5, your original problem did not work, because you asked the compiler for the adjustment at a point where this is not valid. This is described by the life-time rules in C++.

Comment 14 Thomas W. Lynch 2012-05-04 20:18:02 UTC

I went through the life time issues in detail in prior comments. The C++ rules for life time of an instance do not apply to those of the life time of a class. Or are you saying these are not distinct?  For what reason is static type information known at compile time not available to operator new - which is called at run time?  

Also, if operator new is to be an such an exception that static type information can not be accessed in it, as it can be in every other operator or method - then why doesn't the compiler say anything about it?  Should there not be a warning or some sort that an artificial constraint is being applied in order to satisfy someone's interpretation of object lifetimes according to the standard?

Comment 15 Jonathan Wakely 2012-05-04 20:35:12 UTC

(In reply to comment #14)
> I went through the life time issues in detail in prior comments.

Yes, but you don't seem to understand C++, as evidenced by statements like "The operator, which has been copied into the child via inheritance,  i.e. the child's version of the operator"

> The C++ rules
> for life time of an instance do not apply to those of the life time of a class.

Classes don't have a lifetime - I don't know what you're talking about.

> Or are you saying these are not distinct?  For what reason is static type
> information known at compile time not available to operator new - which is
> called at run time?  

The static type of what?

In "A::operator new" the name "this_type" refers to A::this_type.

> Also, if operator new is to be an such an exception that static type
> information can not be accessed in it, as it can be in every other operator or
> method - then why doesn't the compiler say anything about it?  Should there not
> be a warning or some sort that an artificial constraint is being applied in
> order to satisfy someone's interpretation of object lifetimes according to the
> standard?

operator new isn't an exception, all member functions work like that.

This report is invalid.  Please find somewhere else to learn C++, stackoverflow is a good place for it, but reporting compiler bugs isn't the right way to improve your understanding of the language.

Comment 16 Jonathan Wakely 2012-05-04 20:58:28 UTC

OK, I'm waiting.  Please provide some code, as requested, to show exactly what you're talking about.

Comment 17 Jonathan Wakely 2012-05-04 21:06:34 UTC

Also, see the code I provided at SO:
http://stackoverflow.com/a/10449212/981959
That demonstrates that a base class member function knows nothing about the derived class.

here's another:

#include <iostream>

void g(int i) { std::cout << "int: " << i << '\n'; }
void g(short i) { std::cout << "short: " << i << '\n'; }

struct A
{
  static const int i = 1;
  typedef int int_type;

  void f() {
    int_type j = i;
    g(j);
  }
};

struct B : A
{
  static const short i = 0;
  typedef short int_type;
};

int main()
{
  B b;
  b.f();
}

A::f() calls g(int) and prints 1, not g(short) printing 0, because in A::f() int_type is int and i has the value 1.

That's a normal member function, not operator new.  You could make A::f() static and the result will be exactly the same.

Does that help?

Comment 18 Thomas W. Lynch 2012-05-04 21:16:18 UTC

This code compiles:

#include <cstddef>
#include <stdlib.h>
typedef unsigned int uint;

class C{ // just here to be faithful to the original code
  int y;
};

class A{
public:
  typedef A this_type;

  void* operator new(size_t enfacia_size, uint count){
      size_t total_size 
    = enfacia_size
    + sizeof(int) * count; // the 'tail'
    ;
      this_type *new_pt = (this_type *)malloc(total_size);
      new_pt->count = count;
      return new_pt;
  };
  uint count;
};

class B : public C, public A{
public:
    int i;
};

int main(){
  B *b_pt = new(5) B;  
  uint j=0;
  j++;
};

And this is the debugger output:

(gdb) r
Starting program try_offsets 
Breakpoint 1, main () at try_offsets.cc:32
(gdb) n
(gdb) p &(b_pt->count)
$1 = (uint *) 0x804a00c
(gdb) x/10 b_pt
0x804a008:  5   0   0   0
0x804a018:  0   0   0   0
0x804a028:  0   135129
(gdb) p b_pt
$2 = (B *) 0x804a008
(gdb) 

Compare this to the prior code and debugger output of the same form, and you will notice that operator new, from a point of view of type, behaves differently than other method.  Though it was inherited, its understanding of this_type has not changed, as it does in the case for other methods.

In the least, in a strongly typed system I would expect there to be a warning about inheriting operator new when the type was not going to be updated as it is for other opertors and methods, though I suspect it is a small matter for the compiler to give it the right type info.

From a language design point of view, I could see not having operator new accept references to the object as it blurs the line between allocation and intialization (operator new and construction).  It such a constraint were made it should not be by implied contract with the programming because it is such a subtle issue, there should be an error IMHO.  Though, I don't like seeing non-orthogonal artificial constraints, there is no reason operator new shouldn't see the correct static type.  Let the programmer determine the appropriate conceptual partitioning.  

In any case, what we have now is neither of these.  The compiler lets you do, though it is easy check for, and then it generates code that doesn't work - then when coming back to the compiler guys one engages in a very technical discussion.  Seems a small change would increase productivity and quality code ..

I see our comments crossed, let me look at what you just posted here now

Comment 19 Thomas W. Lynch 2012-05-04 21:27:18 UTC

Johnathan, const static members are compile time beasts.  There are almost macros like #define, but not quite as the compiler will give them storage if you take their address.  I don't think it is a good idea to add another paradigm into this. Nor does your example explain how the field moved and was correctly assigned in the code I gave above with method() - if I were to accept your example as valid, then that code would not work - but it does, there is is.  You can compile it, and get the memory dumps yourself in gdb.  Can you speak to that example with method()?  That would be very useful.

Comment 20 Jonathan Wakely 2012-05-04 21:29:43 UTC

(In reply to comment #18)
> This code compiles:
> 
> #include <cstddef>
> #include <stdlib.h>
> typedef unsigned int uint;
> 
> class C{ // just here to be faithful to the original code
>   int y;
> };
> 
> class A{
> public:
>   typedef A this_type;
> 
>   void* operator new(size_t enfacia_size, uint count){
>       size_t total_size 
>     = enfacia_size
>     + sizeof(int) * count; // the 'tail'
>     ;
>       this_type *new_pt = (this_type *)malloc(total_size);
>       new_pt->count = count;

The bug is here.

>       return new_pt;
>   };
>   uint count;
> };
> 
> class B : public C, public A{
> public:
>     int i;
> };
> 
> int main(){
>   B *b_pt = new(5) B;  
>   uint j=0;
>   j++;
> };
> 
> And this is the debugger output:
> 
> (gdb) r
> Starting program try_offsets 
> Breakpoint 1, main () at try_offsets.cc:32
> (gdb) n
> (gdb) p &(b_pt->count)
> $1 = (uint *) 0x804a00c
> (gdb) x/10 b_pt
> 0x804a008:  5   0   0   0
> 0x804a018:  0   0   0   0
> 0x804a028:  0   135129
> (gdb) p b_pt
> $2 = (B *) 0x804a008
> (gdb) 
> 
> Compare this to the prior code and debugger output of the same form, and you
> will notice that operator new, from a point of view of type, behaves
> differently than other method.  Though it was inherited, its understanding of
> this_type has not changed, as it does in the case for other methods.

No it doesn't!

That never happens for any type or any member function!


#include <iostream>

class C{ // just here to be faithful to the original code
  int y;
};

class A{
public:
  void method(void* p){
    std::cout << "this=" << this << " p=" << p << '\n';
  }
  int count;
};

class B : public C, public A{
};

int main(){

  A a;
  a.method(&a);

  B b;
  b.method(&b);

}

this=0x7fffe08b1640 p=0x7fffe08b1640
this=0x7fffe08b1634 p=0x7fffe08b1630

Note that when calling b.method(&b) the "this" pointer is not the same as &b

The member count is always at the same address within the A object, but the address of that A object might not be the same as the address of the B object.

A B object looks like this in memory:

-----   <------ C*      <---- B*
| C |
|---|   <------ A*
| A |
-----

When you call A::f() the 'this' pointer is adjusted to point to the A sub-object, which is not at the same address as the complete B object that contains it.

 
> In the least, in a strongly typed system I would expect there to be a warning
> about inheriting operator new when the type was not going to be updated as it
> is for other opertors and methods, though I suspect it is a small matter for
> the compiler to give it the right type info.

The type is never "updated" when a function is inherited!  The base class function is just visible and accessible in the derived class.  It's not copied or update, the same, original function is used unaltered.

Please, go and learn C++ somewhere else.  I don't know what language your mental model corresponds to, but it's not C++.

Comment 21 Thomas W. Lynch 2012-05-04 21:40:46 UTC

So you are not going to say anything about the code in comment 11, which you say shouldn't work, but does?  Due to this, and fact you are now demeaning me, I request a different set of eyes.

Comment 22 Jonathan Wakely 2012-05-04 21:51:39 UTC

(In reply to comment #19)
> Johnathan, const static members are compile time beasts.  There are almost
> macros like #define, but not quite as the compiler will give them storage if
> you take their address.

Please don't try to C++ to me.

How do you explain that "int_type" is int in A::f() and not short?

>  I don't think it is a good idea to add another
> paradigm into this.

Ignore the static const, explain the typedef.

> Nor does your example explain how the field moved and was
> correctly assigned in the code I gave above with method() - if I were to accept
> your example as valid, then that code would not work - but it does, there is
> is.  You can compile it, and get the memory dumps yourself in gdb.  Can you
> speak to that example with method()?  That would be very useful.

It didn't "move"

> (gdb) p &(b_pt->count)
> $1 = (uint *) 0x804a00c
> (gdb) x/10 b_pt
> 0x804a008:  5   0   0   0
> 0x804a018:  0   0   0   0
> 0x804a028:  0   135129
> (gdb) p b_pt
> $2 = (B *) 0x804a008

At 0x804a008 there is a B object.

Inside that B object is a C object, also at address 0x804a008, and an A object at address 0x804a00c.  Think of the B as a little box with two object in it.

Inside the A at 0x804a00c is an integer called "count" and that is also at address 0x804a00c.

The memory looks like this:

0x804a008      0x804a00c      
B-----------------------------
C--------------A--------------
y--------------count----------

Inside operator new you get a piece of memory, big enough for a B, then you cast that memory to A* and dereference it to access A::count.  Because you cast the memory to A* the compiler access "count" at that address (see the diagram, the "count" member is at offset zero into an A.


here's your invalid example adjusted to use a normal member instead of operator new, it fails in the same way:

#include <cstddef>
#include <stdlib.h>
#include <assert.h>
typedef unsigned int uint;

class C{ // just here to be faithful to the original code
  int y;
};

class A{
public:
  typedef A this_type;

  static void* create(size_t enfacia_size, uint count){
      size_t total_size 
    = enfacia_size
    + sizeof(int) * count; // the 'tail'
    ;
      this_type *new_pt = (this_type *)calloc(total_size, 1);
      new_pt->count = count;
      return new_pt;
  }
  uint count;
};

class B : public C, public A{
public:
    int i;
};

int main(){
  B *b_pt = (B*) B::create(sizeof(B), 5);
  assert(b_pt->count == 5);
}

There's nothing special about operator new, you just don't understand C++

Comment 23 Jonathan Wakely 2012-05-04 21:56:25 UTC

You don't even have a B::this_type typedef, how could (this_type*) possibly refer to anything except A*?  Just by calling it "this_type" doesn't make it magic that the compiler adjusts to refer to teh dynamic type.

Daniel, can you explain it to him better?

Comment 24 Thomas W. Lynch 2012-05-04 22:00:37 UTC

"don't talk C++ to me" hah I love it!  Sounds like my mother you just need to add the "young man". Though I think you are getting personally engaged here, unless that is joke. I will wait for another reviewer.  As for your example of multiple inheritance above showing A and C stacked, indeed you can see that A moved relative to the base of the object, and the mechanism for finding the field again, by your own example.  Anyway Johnny let me continue this with someone else and hopefully pull it back to the original topic.

Comment 25 Daniel Krügler 2012-05-04 22:04:58 UTC

(In reply to comment #23)
> You don't even have a B::this_type typedef, how could (this_type*) possibly
> refer to anything except A*?  Just by calling it "this_type" doesn't make it
> magic that the compiler adjusts to refer to teh dynamic type.
> 
> Daniel, can you explain it to him better?

I found your explanations just excellent and I have no idea at the moment. Thomas, if you think that we did not really understand your point, please make something more observable from your program mentioned in comment 11, e.g. produce variables and put them out to via printf/cout make clearer what you want to say. Keep in mind that operator new *is* a static function even though it does not need to be declared as static.

Comment 26 Andrew Pinski 2012-05-04 22:06:44 UTC

Please stop reopening bugs.  Also if you really want some C++ help please head over to comp.lang.c++

Comment 27 Jonathan Wakely 2012-05-04 22:09:10 UTC

(In reply to comment #8)
> >No, that's not how it works. If Base::increment() writes to Base::field then it
> >is always at the same offset into the Base object. Whether that Base object is
> >a sub-object of another clas is irrelevant.
> 
> I apologize, as I don't know how else to put it but in multiple inheritance
> that statement is just plain wrong.  Fields move.  You can see it in the debug
> output at the link, the field for 'count' moved by 4 bytes from what it was in
> the parent to where it is located in the child.

Try this:
#include <iostream>
typedef unsigned int uint;

class C{ // just here to be faithful to the original code
  int y;
};

class A{
public:
  uint count;
};

class B : public C, public A{
public:
    int i;
};

int main(){
  B b;
  B* bp = &b;
  A* ap = &b;
  std::cout << "B* " << bp << '\n';
  std::cout << "A* " << ap << '\n';
  std::cout << "B::count " << &bp->count << '\n';
  std::cout << "A::count " << &ap->count << '\n';
}

The address of the same object gives two separate values. See?

B* 0x7fffb5ba8f90
A* 0x7fffb5ba8f94
B::count 0x7fffb5ba8f94
A::count 0x7fffb5ba8f94


But the count member accessed through both is the same, because there's a different offset to access A::count inside an A (offset 0) and to access A::count inside a B (offset 4)

So if you access bp->count and ap->count you are accessing the same member, at the same location, but through a different offset.

So inside your operator new where you are trying to allocate a B, when you access the count member through (A*) you get the wrong location, because you have memory for a B (which contains an A, at offset 4) but are treating it as though it was just an A.

Comment 28 Thomas W. Lynch 2012-05-04 22:31:36 UTC

I see that example, I understand it, and I appreciate your writing it.

Though we are going a long way from the original "minor bug".

Had it not been operator new, but another operator or method it would have correctly assigned the count, as was shown.

The argument made was that operator new was different because "the object did not exist", however, it is the "type" that determines the offsets (or as you have artfully pointed out, the 'this' pointers). The type information is known at compile time. The compiler is capable of endowing operator new like other methods.

It has been pointed out by a couple of people that the C++ standard says that an object only exists after new finishes, and that is the reason. However, an 'object' in that context is an instance, and yes, the instance does not exist until it is allocated -- but the type information exists before new is called. New is not bared from the standard from seeing the type.

So someone else pointed out, that the instance does not exist, so it should not be assigned to. That has probably been the best answer I have heard, however, by convention, allocators do write to an allocated area. In addition, ** the compiler is not flagging access to the object from inside new ** - it allows it - in this case ** it just generates the wrong code ** (and you have artfully explained why). This is a minor bug.

So I continue to say it is wrong to close this. Either the compiler should be giving at least an error for illegal access to the object, according to the standard, or the correct type information should be there. I would interpret the standard, and convention, to favor the latter. Currently the compiler does neither - so there is a minor bug.

This is a faithful summary of the discussion on the topic, where is your disagreement here?

Comment 29 Jonathan Wakely 2012-05-04 23:20:59 UTC

The compiler does what the standard says it should.

If you want to change C++ then this is the wrong place, stop reopening the bug and take your suggestion to comp.std.c++ instead.

Comment 30 Thomas W. Lynch 2012-05-04 23:36:48 UTC

Sorry to be thick headed Jon. Perhaps you could boil it down to the essentials here, are you saying that assignment is illegal in operator new so it is proper that there be no error and one gets the wrong result? Is that your point? Answer me this, and I will drop the report.

Comment 31 Andrew Pinski 2012-05-05 00:04:16 UTC

If you want to change C++ then this is the wrong place, stop reopening the bug
and take your suggestion to comp.std.c++ instead.

Again take your suggestions to comp.std.c++ since GCC follows the C++ standard in this place already.

Comment 32 Jonathan Wakely 2012-05-05 00:14:38 UTC

(In reply to comment #30)
> Sorry to be thick headed Jon. Perhaps you could boil it down to the essentials
> here, are you saying that assignment is illegal in operator new so it is proper
> that there be no error and one gets the wrong result? Is that your point?
> Answer me this, and I will drop the report.

No, I'm saying you are not getting the wrong result.  What you think is correct is not correct.

I'm saying the behaviour you see is not a compiler bug, it's how C++ works.

This bug report is *not* the right place to discuss it, take it to the gcc-help mailing list or *anywhere* else.  "I don't understand C++" is not a valid bug.  

So I'm moving this to gcc-help and I'm closing the report again, if the description Please stop re-opening this report. I will discuss it elsewhere but not here.

See http://gcc.gnu.org/ml/gcc-help/2012-05/msg00029.html for anyone who isn't sick of this yet.

Comment 33 Thomas W. Lynch 2012-05-05 00:48:57 UTC

You say accessing type in operator new is illegal by the standard, but the compiler doesn't give an error though doing so is bloody obvious and I have the strick checking turned on- but that is ok, and I am somehow causing trouble by reporting this.

You say that I got the correct answer when I inherited a method and used it in a child and it gave me the result a parent would have but a child wouldn't.  

You say that I did something wrong and so I deserve the results - then why do we have type checking?  Now you parade me around as though I'm causing trouble for reporting this and acting like I need lessons in C++.  That will teach me to make a report.

I don't know Jon, I just don't get it.