This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: ELF interposition and One Definition Rule


On Aug 26, 2013, at 8:21 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> My understanding of the C++ One Definition Rule is that, in a strict sense, it
> does not allow one to define two functions of the same name and different
> semantics in a valid program.  I also think that all DSOs eventually linked
> together or dlopened are part of the same program.  So theoretically, for C++
> produced code, we may go with AVAIL_AVAILABLE everywhere.

So, I think you're on firm ground wrt the standard.  I think LTO naturally wants to see and make use of semantics, and once you accept that as valid, which it reasonably is, I think you get to see and understand quite a lot about the code.  Replacing anything comes with a heavy constraint: the replacement must be reasonably the same, and the user will die if it is not.  When an allocation function that the LTO optimizer can see returns a 32-byte aligned pointer, it is reasonable to make use of this in the client-side code-gen.  If the user replaces that allocation function with one that is not 32-byte aligned, bad things will happen.
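
A minimal sketch of the alignment point, under stated assumptions: alloc_floats32, is_aligned32 and sum8 are hypothetical names invented for illustration, the allocator is a toy bump allocator over a static pool, and __builtin_assume_aligned (a GCC/Clang builtin) stands in for what an LTO optimizer might infer on its own after looking into the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");

namespace {
alignas(32) float pool[256];   // backing storage, 32-byte aligned
std::size_t used = 0;          // number of floats handed out so far
}

// Hypothetical allocation function: every returned pointer is 32-byte aligned
// because each request is rounded up to a multiple of 8 floats (32 bytes).
float *alloc_floats32(std::size_t count) {
    std::size_t rounded = (count + 7) / 8 * 8;
    if (rounded > 256 - used) return nullptr;
    float *p = pool + used;
    used += rounded;
    return p;
}

// Helper to check the property the client relies on.
bool is_aligned32(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % 32 == 0;
}

// Client code that bakes the alignment in.  If alloc_floats32 were replaced
// (interposed) by a version that is not 32-byte aligned, this promise to the
// compiler would be false and the behavior undefined.
float sum8(const float *p) {
    const float *q =
        static_cast<const float *>(__builtin_assume_aligned(p, 32));
    float s = 0.0f;
    for (int i = 0; i < 8; ++i) s += q[i];
    return s;
}
```

The client is only correct as long as whatever definition of the allocator ends up being called at run time keeps the alignment guarantee, which is the constraint on replacements described above.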

I think what the optimizer can see is open ended, and any use it wants to make of what it sees is fine.  Functions, data, classes, methods, ctors, dtors, templates, everything.

Now, that's the standard perspective.  From the QOI viewpoint, you will have users that want to do various things, and they should explain what they want, and we should document and vend them the good stuff.  I defer to the interposing types on what they want to do and why.  Roughly, they need a way to hide things from the optimizer.  Assembly can do this, but we'd also want to be able to hide (or mark as "please don't peer into") any function, method or variable.  Separate compilation without the -flto flag seems a reasonable way to do this.  I don't know if it is enough…  I think those types of people will scream when they discover they want more control.

> On IRC we got into an agreement that we may disallow interposition for
> virtuals,

Hum…  I'm not one of those people that want to interpose virtuals, but as a tool vendor, it would seem like some would like to be able to interpose virtuals.  I think separate compilation with no -flto should be enough to hide enough details to make the interposition of virtuals possible.  For example, someone has a nice C++ abi that includes a virtual function for open, and one wants to interpose open to trace it to merely help debug a problem.  Doesn't strike me as wrong.

For comdat (template functions), I can't help but think that having a way to mark definitions as "please don't peer into this" would be nice.  One can separate declaration and definition and explicitly instantiate, but doing this might be a pain.  I'd defer, again, to the interposers.
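
A rough sketch of the "separate declaration and definition and explicitly instantiate" approach.  The names twice, call_twice and check_twice, and the file names in the comments, are hypothetical; the three parts are shown in one block so it compiles as is, but the hiding only happens when they really live in separate translation units built without -flto.

```cpp
#include <cassert>

// ---- twice.h (hypothetical header): declaration only, no body ----
template <class T>
T twice(T x);

// Explicit instantiation declaration: clients must not instantiate
// twice<int> themselves; some other translation unit provides it.
extern template int twice<int>(int);

// ---- client.cc (hypothetical): has no definition to peer into ----
int call_twice(int x) {
    return twice(x);
}

// ---- twice.cc (hypothetical): definition plus explicit instantiation ----
template <class T>
T twice(T x) {
    return x + x;
}

// Explicit instantiation definition: emits the one twice<int>.
template int twice<int>(int);

// Small usage check.
void check_twice() {
    assert(call_twice(21) == 42);
}
```

The pain is visible even at this size: every specialization that clients need has to be listed by hand in both the header and the defining file.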

Now, when the cost of allowing interposing is high (dynamic relocs for example), disallowing interposition by default is fine, not arguing that one must always have the cost.  Just seems nice from a theoretic perspective to allow the user to say, yes, we do want to allow interposing on these virtuals.

> Does the following patch seem sane?

Easier to review the change in semantics of a sample bit of code…  I think I understand the effects of the change.

> Of course I would be happier with a stronger rule - for instance allowing
> interposition only on plain functions not on methods.

Hum, I like orthogonal rules that apply generally.  Meaning, I don't like the notion of treating functions and methods (or virtual methods) differently.  For example, a "don't peer into" marker for a template function definition should also be usable to not peer into a normal inline function.

I think I like letting the optimizer do anything, and making the user responsible for not using -flto, or ensuring enough separate compilation, or otherwise marking the boundaries that they don't want it to peer through…  I could also be burned alive by a linux distributor with existing code if I tried this…  :-)  Good luck.


Oh, so keep in mind, if you do something like

#include <cstdio>

template <class T>
class actor {
public:
	actor() {	// the ctor
		static int i = 100;
		printf("%p\n", static_cast<void *>(&i));
	}
};

and don't smash all the ctors together, you (can) wind up with multiple ctor::i objects.  The standard says there is one of them.  The usual way to ensure there is only one of them is to collapse all the ctors together into one; then, trivially, that one can only reference one of the i's that exist.  This is one way (beyond equality) to tell if there are multiples (if the function is replicated).  Trying to think if there were any other ways…  ah yes, here it is:

6 A static local variable in a member function always refers to the same
  object, whether or not the member function is inline.

  A  static local variable in an extern inline function always refers to
  the same object.  A string literal in an extern inline function is the
  same object in different translation units.

So, string literals can also be used to notice the uniqueness of the function (method).  Curious, we didn't do the same for string literals in a member function; not sure why, it feels like an oversight, not on purpose.  I'd have to dig through all the papers to find when it went in, and the paper that brought it in, to see.  I don't recall that we talked about making it different.
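
A small sketch of how a program can observe this, with assumptions labeled: the addr member is a hypothetical addition so the address can be compared instead of printed, and everything lives in a single translation unit.  Within one translation unit the "one object" guarantee holds trivially; the interesting failure (two copies of the ctor, each with its own i) would need the ctor to be emitted in several objects or DSOs without being merged, which one self-contained block cannot show.

```cpp
#include <cassert>

template <class T>
struct actor {
    const void *addr;   // hypothetical member: remembers where i lives

    actor() {
        static int i = 100;   // one object per instantiation of the ctor
        addr = &i;
    }
};

// Compare the static seen by two constructor calls.
template <class A, class B>
bool same_static(const A &a, const B &b) {
    return a.addr == b.addr;
}

// Small usage check.
void check_statics() {
    actor<int> a, b;
    actor<long> c;
    assert(same_static(a, b));    // same instantiation: same i
    assert(!same_static(a, c));   // different instantiation: different i
}
```

If the ctor for actor<int> were replicated and not collapsed, the first comparison is the kind of check that could start failing across module boundaries.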

