This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Minimal GCC/Linux shared lib + EH bug example



----- Original Message -----
From: "Sean Parent" <sparent@adobe.com>


> To: C++ extensions mailing list
>
> on 5/14/02 1:22 PM, David Abrahams at david.abrahams@rcn.com wrote:
>
> >> By "shared" I will assume you mean "dynamically linked"...
> >
> > Could you be more specific about the differences you are mentioning? It
> > seems to me that the current model is identical to static linking when all
> > the objects are linked to one another directly (i.e. no use of dlopen with
> > RTLD_LOCAL).


This is weird. You snipped out the very thing I was asking about above,
leaving it looking as though I was asking about your use of terminology...
and then you answered a question I wasn't asking. Specifically, I was
asking about this passage:

> David's proposal, I believe amounts to requiring that dynamic linking is
> handled much more like static linking with a resolution mechanism for
> duplicate symbols. This proposal would require significantly more
> sophisticated loaders (than current loaders which usually
> do a very simple name binding).

And I was asking you to be more specific about the significant increase in
sophistication you think would be demanded of loaders.

I also went on to claim that Linux dynamic linking already provides
*precisely* the semantics of static linking in the simple case where there
is no use of RTLD_LOCAL. In other words I challenge your assertions above.

> Certainly, "shared" is usually used to denote that the code in a single
> library file is shared by multiple applications. That sharing isn't relevant
> to this thread.

No, we don't care (from a standards point-of-view) whether any code is
actually shared. However, "identity sharing" is absolutely relevant to this
thread: the observable behavior differences that can arise in the context
of shared libraries (if we ignore unloading for the moment) are ALL cases
of duplication of things which are "supposed" to arise as a single copy:
two different addresses (and thus values) for the same static member of an
inline function or function template, two different addresses for the same
function, two different run-time type identities for the same type, etc.

> But other systems support other loading models of libraries
> - you point out RTLD_LOCAL for Linux. Other systems have other models - CFM
> on the Macintosh allows a library to be singly instantiated across multiple
> processes. Palm (and I would suspect many small systems that run single
> address space) have a similar notion. A "shared library" is just an overly
> broad term and the canonical meaning of the code being shared has no
> relevance.

I'm not sure I agree. I think the CFM model where the library is shared
across processes is strikingly similar to the case we're discussing on
Linux if you take Martin's view that each library loaded with RTLD_LOCAL
should be viewed as a separate "program".

> To narrow the scope I wanted to focus on dynamically linked - not explicitly
> loaded (which can have other semantics and runs into lifetime issues). I
> would like to tackle dlopen (and dlclose) at some point but first we need a
> solid notion of "dynamically linked".

If you ignore dlopen and dlclose I don't think there's anything mysterious
about what it means on Unix: from a standards POV, there's nothing to
discuss because it works like static linking.

I don't think Windows can be discussed in the same breath, since it is a
different model and we're not going to be able to force them both into the
same box, if the box is going to be concrete enough to be useful.

> >> This opens the question - is it necessary to force a new model for loading
> >> libraries to get reasonable semantics for C++ dynamic linking?
> >
> > It depends on your definition of "reasonable". If you want to support a
> > model which includes RTLD_LOCAL and doesn't add new restrictions on what
> > may be linked together, then I think the answer is yes.
>
> I would like to start with being able to use _any_ C++ (outside of the
> subset that is C) from within a dynamically linked library. That currently
> isn't possible without a thorough understanding of the runtime environment,
> language, standard libraries, and application so that you can hack some
> workable subset.

Please be specific about the language features you feel you can't use on
Linux in shared libs, and why you think you can't use them... or why
special considerations about the runtime, etc., are required.

> I'm perfectly fine with there being restrictions about what
> can be linked together - in fact I think restrictions are probably desirable
> if they contribute to the encapsulation provided by the library.

Why is it better for the language designer, rather than the designer of
library X, to say "You can't link Y to X"?

> >> If the answer
> >> is "yes" then we will have a very difficult adoption going forward as the
> >> library loader is a relatively fundamental piece of any operating system.
> >
> > It's not clear. I wonder whether any software that relies on the current
> > behavior (as opposed to what I'm proposing) can possibly work right. I
> > think the only legal programs that could be broken by the change I'm
> > proposing would be those that unload libraries, and I'm not even certain of
> > that: it depends on what the semantics of unloading are.
>
> Since the C++ language doesn't currently define any behavior in regards to
> how C++ works in dynamically linked code I don't know that there is any such
> thing as a "legal program". However, there is a lot of code written using
> aspects of C++ that makes heavy use of dynamic linking - Mat can chime in
> but InDesign is a great example.

Of course; I didn't claim otherwise.

> Current code frequently relies on the fact that "global" symbols _are not_
> visible or shared outside the DLL. If they were, the implementation would be
> tightly revision locked with the library it is linked against. Rev locking
> components is something to be avoided.

The way you avoid that kind of visibility on Unix is with dlopen and
RTLD_LOCAL. Otherwise, you're discussing a Windows-model concept. As a
cross-platform developer, I think it's important to be able to have this
kind of hiding, and that's one reason I don't think brushing aside dlopen
is appropriate.
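
A minimal sketch of the hiding mechanism mentioned here, assuming a POSIX system with <dlfcn.h>. The helper name open_library and its isolate parameter are hypothetical and only illustrate the choice between RTLD_LOCAL and RTLD_GLOBAL; they are not part of any proposal in this thread.

```cpp
#include <dlfcn.h>

// Hypothetical helper: open a library and choose whether its symbols are
// made available for resolving references in libraries loaded later.
void* open_library(const char* path, bool isolate) {
    // RTLD_LOCAL keeps the library's symbols out of the global lookup scope;
    // RTLD_GLOBAL adds them to it.
    int flags = RTLD_NOW | (isolate ? RTLD_LOCAL : RTLD_GLOBAL);
    return dlopen(path, flags);  // null handle on failure; dlerror() says why
}
```

A caller would check the returned handle for null before passing it to dlsym or dlclose. On older glibc versions the program may also need to be linked with -ldl.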

> >> Beyond those issue there is the fact that dynamically linked libraries
> >> currently provide a degree of encapsulation - and that encapsulation is a
> >> major reason developers use dynamic linking. Forcing a broad notion of
> >> symbol resolution potentially defeats many of the benefits.
> >
> > I think if you look at my proposal closely, you'll see that it is fairly
> > conservative. It doesn't introduce any new concepts, just a small change to
> > the existing behavior which removes an order-dependency. Symbol sharing
> > across library boundaries would only happen where it would have happened
> > with the current semantics if the library load order were changed.
>
> I'm not familiar enough with how Linux works but your proposal seems to
> change the current dynamic linking semantics to be equivalent to static
> linking.

No, you've misinterpreted it. Furthermore, as I say above, if you don't use
dlopen it's already equivalent to static linking.

> That doesn't get you any of the benefits of dynamic linking other
> than saving bytes on disk.

Not true; it gets you component-based development.

> I'm not sure what you mean about "current
> semantics if the library load order were changed."

Let me review, then: In the case I'm talking about, the executable A opens
two libs B and C with dlopen. B and C each link dynamically to D in the
usual way. B, C, and D all contain calls to the same inline function which
has a static counter:

inline int count()
{
    static int n = 0;
    return n++;
}

D also contains the definition of:
int count2() { return count(); }

B and C each contain this definition:

namespace {
  void check_count()
  {
    int x = count2();
    assert(x + 1 == count());
  }
}

Calling check_count() in B always works, but in C it always asserts. That
behavior depends on the order in which B and C were loaded. The change I'm
proposing makes check_count() work in both B and C.
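
The failure can be modelled in a single translation unit, without any real shared libraries. This is only a sketch: the Counter type and the two-argument check_count are hypothetical stand-ins, where each Counter object represents one instance of the static n inside count() as a given library sees it.

```cpp
// Hypothetical stand-in for one instance of the static `n` inside count().
struct Counter {
    int n = 0;
    int next() { return n++; }
};

// Mirrors check_count() above: `via_d` is the instance that count2() reaches
// inside D, and `local` is the instance that the calling library's own
// count() reaches.
bool check_count(Counter& via_d, Counter& local) {
    int x = via_d.next();          // plays the role of count2()
    return x + 1 == local.next();  // plays the role of the direct count() call
}
```

Passing the same Counter for both arguments models a library that shares its counter with D, and the check succeeds. Passing two different Counter objects models a library that ended up with its own copy, and the check fails.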

> Today, in every loader
> I'm using, conflicting symbols are an error - not a load order issue.

Are you sure? The ones we want to be shared without error are usually hidden
from you: template instantiations, static variables in inline functions and
static data members of class templates, type_info, EH info, etc... Most
implementations use some sort of notion of "weak" symbols to ensure that
these things always get a single identity in the usual cases.

> In
> fact I can usually specify the load order to be anything that I want.

Yes, we're not talking about the usual cases.

> I'm
> also not certain how far you expect your proposal to go - it is targeted at
> some set of symbol sharing but is it isolated to symbols defined by the
> application, or does it include "implicit" symbols defined by the compiler
> and runtime libraries? Exception handling tables, RTTI, overloads to
> operator new and delete all fall outside the notion of user defined symbols.

Not new/delete; those can be replaced. My proposal is explicitly concerned
with those runtime-support symbols, though.
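
For readers unfamiliar with what "replaced" means here, the following is a minimal sketch of a program-wide replacement of the global allocation functions, which the standard permits. The counter g_allocs is a hypothetical addition so the replacement can be observed; whether a dynamically linked library sees such a replacement is exactly the platform-dependent question raised in this thread.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, present only to make the replacement observable.
static std::size_t g_allocs = 0;

// Replacement for the global single-object allocation function.
void* operator new(std::size_t size) {
    ++g_allocs;
    if (void* p = std::malloc(size ? size : 1)) {
        return p;
    }
    throw std::bad_alloc();
}

// Matching replacements for the global deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
```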

> I agree with gaining implementation experience - but I don't think we should
> start with pursuing changes to loaders but should start with changes to the
> language and the compiler implementations.

I guess I just disagree with you there. I don't think the problem on Linux
is really in the compilers. We can make the compiler do something which
works around a few of the problems (e.g. by comparing type_info::name() for
EH) but we can't really solve the problems in any meaningful way without
changing the loader.
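
A sketch of the kind of workaround meant here, under the assumption that distinct types have distinct type_info::name() strings on the implementation in question (the standard does not guarantee this). The function same_type_by_name is a hypothetical illustration, not what any particular compiler's runtime actually does.

```cpp
#include <cstring>
#include <typeinfo>

// Hypothetical helper: treat two type_info objects as the same type if they
// compare equal or, failing that, if their implementation-defined names match.
// The name fallback covers duplicated type_info objects for a single type.
bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    return a == b || std::strcmp(a.name(), b.name()) == 0;
}
```

This only papers over duplicated type identity. It does nothing for duplicated statics or function addresses.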

> I'll try to start with an example (this one bit me trying to integrate some
> code into InDesign - so it's "real world", and a related issue caused me
> problems with Photoshop 7.).
>
>
> -----
> Problem: Behavior of overrides of operator new and delete within an
> application are not defined with regards to dynamic libraries.
>
> Discussion: Given an application that globally overrides operator new and
> delete as allowed by the standard. Said application is also dynamically
> linked to a library.
>
> Under current compiler implementations (VC++ 6 and CodeWarrior 7), the
> overrides are not visible to the library.

Okay, now we're in Windows land. That's a completely different domain and
may require different solutions... but I'm out of time for tonight.

-Dave


