This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Minimal GCC/Linux shared lib + EH bug example


on 5/14/02 1:22 PM, David Abrahams at david.abrahams@rcn.com wrote:

>> By "shared" I will assume you mean "dynamically linked"...
> 
> Could you be more specific about the differences you are mentioning? It
> seems to me that the current model is identical to static linking when all
> the objects are linked to one another directly (i.e. no use of dlopen with
> RTLD_LOCAL).

Certainly, "shared" is usually used to denote that the code in a single
library file is shared by multiple applications. That sharing isn't relevant
to this thread. But other systems support other loading models of libraries
- you point out RTLD_LOCAL for Linux. Other systems have other models - CFM
on the Macintosh allows a library to be singly instantiated across multiple
processes. Palm (and, I suspect, many small systems that run in a single
address space) has a similar notion. A "shared library" is just an overly
broad term and the canonical meaning of the code being shared has no
relevance.

To narrow the scope I wanted to focus on dynamically linked libraries, not
explicitly loaded ones (which can have other semantics and run into lifetime
issues). I would like to tackle dlopen (and dlclose) at some point but first
we need a solid notion of "dynamically linked".

>> This opens the question - is it necessary to force a new model for loading
>> libraries to get reasonable semantics for C++ dynamic linking?
> 
> It depends on your definition of "reasonable". If you want to support a
> model which includes RTLD_LOCAL and doesn't add new restrictions on what
> may be linked together, then I think the answer is yes.

I would like to start with being able to use _any_ C++ (outside of the
subset that is C) from within a dynamically linked library. That currently
isn't possible without a thorough understanding of the runtime environment,
language, standard libraries, and application so that you can hack some
workable subset. I'm perfectly fine with there being restrictions about what
can be linked together - in fact I think restrictions are probably desirable
if they contribute to the encapsulation provided by the library.
 
>> If the answer
>> is "yes" then we will have a very difficult adoption going forward as the
>> library loader is a relatively fundamental piece of any operating system.
> 
> It's not clear. I wonder whether any software that relies on the current
> behavior (as opposed to what I'm proposing) can possibly work right. I
> think the only legal programs that could be broken by the change I'm
> proposing would be those that unload libraries, and I'm not even certain of
> that: it depends on what the semantics of unloading are.

Since the C++ language doesn't currently define any behavior with regard to
how C++ works in dynamically linked code, I don't know that there is any such
thing as a "legal program". However, there is a lot of code written using
aspects of C++ that makes heavy use of dynamic linking - Mat can chime in
but InDesign is a great example.

Current code frequently relies on the fact that "global" symbols _are not_
visible or shared outside the DLL. If they were, the implementation would be
tightly revision locked with the library it is linked against. Rev locking
components is something to be avoided.

>> Beyond those issue there is the fact that dynamically linked libraries
>> currently provide a degree of encapsulation - and that encapsulation is a
>> major reason developers use dynamic linking. Forcing a broad notion of
>> symbol resolution potentially defeats many of the benefits.
> 
> I think if you look at my proposal closely, you'll see that it is fairly
> conservative. It doesn't introduce any new concepts, just a small change to
> the existing behavior which removes an order-dependency. Symbol sharing
> across library boundaries would only happen where it would have happened
> with the current semantics if the library load order were changed.

I'm not familiar enough with how Linux works but your proposal seems to
change the current dynamic linking semantics to be equivalent to static
linking. That doesn't get you any of the benefits of dynamic linking other
than saving bytes on disk. I'm not sure what you mean about "current
semantics if the library load order were changed." Today, in every loader
I'm using, conflicting symbols are an error - not a load order issue. In
fact I can usually specify the load order to be anything that I want. I'm
also not certain how far you expect your proposal to go - it is targeted at
some set of symbol sharing but is it isolated to symbols defined by the
application, or does it include "implicit" symbols defined by the compiler
and runtime libraries? Exception handling tables, RTTI, and overrides of
operator new and delete all fall outside the notion of user defined symbols.
Are these "symbols" somehow exported (not in the template export sense)?

> Actually, I *do* want to mess with our language. At least, I'm among those
> who think the language definition should say something about the semantics
> of shared libraries. However, messing with the language has to happen at
> both ends: you have to gain implementation experience in addition to
> thinking about what the standard should mandate.

I agree with gaining implementation experience - but I don't think we should
start with pursuing changes to loaders but should start with changes to the
language and the compiler implementations.

I'll try to start with an example (this one bit me trying to integrate some
code into InDesign - so it's "real world" - and a related issue caused me
problems with Photoshop 7).


-----
Problem: The behavior of overrides of operator new and delete within an
application is not defined with regard to dynamic libraries.

Discussion: Consider an application that globally overrides operator new and
delete, as allowed by the standard, and that is also dynamically linked to a
library.

Under current compiler implementations (VC++ 6 and CodeWarrior 7), the
overrides are not visible to the library.
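
For concreteness, here is a minimal sketch of the kind of global override
being discussed. The counters and the helper round_trip() are hypothetical
names added for illustration only; nothing here is taken from InDesign or any
particular runtime. Within a single statically linked program every
allocation goes through these functions; the open question is what a
dynamically linked library sees.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters, for illustration only.
static std::size_t g_allocs = 0;
static std::size_t g_frees = 0;

// Application-wide replacement of the global allocation function.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;          // malloc(0) may legally return null
    void* p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    ++g_allocs;
    return p;
}

// Matching replacement of the global deallocation function.
void operator delete(void* p) noexcept {
    if (p) {
        ++g_frees;
        std::free(p);
    }
}

// Sized form (C++14) forwards to the unsized one so the counts stay in step.
void operator delete(void* p, std::size_t) noexcept {
    operator delete(p);
}

// Allocates and frees once through the replaced functions and reports
// whether both counters moved by exactly one.
bool round_trip() {
    const std::size_t allocs_before = g_allocs;
    const std::size_t frees_before = g_frees;
    void* p = ::operator new(32);
    ::operator delete(p);
    return g_allocs == allocs_before + 1 && g_frees == frees_before + 1;
}
```

Whether a library linked dynamically against this program would also be
routed through these functions is exactly what is left undefined.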

Under VC++ 6 this means that the application and the library are executing
out of separate memory allocators. Any memory allocation which can
"straddle" the boundary may fail. Because the std::string library is
supplied by Microsoft pre-built, and the inlines may cause items to straddle
the boundary, this means that std::strings or any objects containing them
cannot straddle the dll boundary (by straddle I mean allocated on one side
and invoked from the other).
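
The "straddle" failure can be modelled inside one program, without any real
DLL. The sketch below is an assumption-laden simulation: Heap and Header are
hypothetical types standing in for the two separate allocators, and each
block is stamped with its owning heap so that a release on the wrong side is
detected instead of corrupting memory, as it would with two real allocators.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

class Heap;

// Hypothetical per-block header recording which heap handed out the block.
struct alignas(std::max_align_t) Header {
    Heap* owner;
};

// Stand-in for one side's allocator (the application's or the library's).
class Heap {
public:
    void* allocate(std::size_t n) {
        void* raw = std::malloc(sizeof(Header) + n);
        if (!raw) throw std::bad_alloc();
        Header* h = new (raw) Header{this};  // stamp the block with its owner
        ++live_;
        return h + 1;                        // user memory follows the header
    }

    // Returns false, and frees nothing, if the block came from another heap.
    bool release(void* p) {
        Header* h = static_cast<Header*>(p) - 1;
        if (h->owner != this) return false;
        --live_;
        std::free(h);
        return true;
    }

    std::size_t live() const { return live_; }

private:
    std::size_t live_ = 0;
};
```

With one Heap playing the library and another playing the application, a
block allocated by the first and released through the second is rejected,
which is the situation an inlined std::string member can create silently.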

With CodeWarrior 7 a workable solution was found by refactoring and
rebuilding the standard runtime libraries. However, even with that
workaround the standard libraries are initialized prior to _any_
initialization happening within the application. The static initializers for
std::locale call operator new and delete (which are cross linked in from the
main application) - which means that operator new and delete are invoked
prior to the exception handling tables being initialized. A careful review
of the behavior of try and catch was required to determine that this
workaround was "safe" so long as an exception is not thrown while calling
operator new, including in the constructor for the objects relied upon by
the implementation of std::locale - or by any other initializer in any
library.
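
One common way to make an allocator tolerate being called before the
application's own initialization has run is to construct its state on first
use. The sketch below shows only that idiom; AllocState, app_alloc, app_free
and EarlyUser are hypothetical names, and EarlyUser merely stands in for a
library initializer that allocates early. It does nothing about the exception
handling tables, so the caveat about not throwing still applies.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator bookkeeping.
struct AllocState {
    bool ready;
    std::size_t calls;
    AllocState() : ready(true), calls(0) {}
};

// Construct-on-first-use: the state is built the first time anyone asks for
// it, regardless of the order in which static initializers run.
AllocState& alloc_state() {
    static AllocState s;
    return s;
}

// Hypothetical allocation entry points that rely on that state.
void* app_alloc(std::size_t n) {
    AllocState& s = alloc_state();
    ++s.calls;
    return std::malloc(n ? n : 1);
}

void app_free(void* p) {
    std::free(p);
}

// Stand-in for a library static initializer that allocates before main().
struct EarlyUser {
    void* block;
    EarlyUser() : block(app_alloc(64)) {}
    ~EarlyUser() { app_free(block); }
};

static EarlyUser g_early;
```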

Solutions:

One possible, though undesirable, solution is to simply say that overrides
of operator new and delete are not allowed from applications using dynamic
linking.

If it is allowed - what are some "reasonable semantics"? I would state the
following:

1. Any runtime initialization should happen prior to any static initializers
being executed from any library in the linkage closure.

2. A single override of operator new and delete should be allowed to exist
anywhere within the linkage closure. Duplicate overrides give undefined
behavior (or a failure could be required).

3. Overrides are visible to all libraries loaded as part of the closure.
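
Points 2 and 3 can be modelled with a small registry. This is only a sketch
of the proposed semantics, not of how any loader works: Closure, install and
the app_* and default_* functions are all hypothetical names. One override
may be installed for the whole closure, a second attempt is reported as a
failure, and every allocation made through the closure sees the installed
override.

```cpp
#include <cstddef>
#include <cstdlib>

using AllocFn = void* (*)(std::size_t);
using FreeFn = void (*)(void*);

// Defaults used when nobody in the closure supplies an override.
void* default_alloc(std::size_t n) { return std::malloc(n ? n : 1); }
void default_free(void* p) { std::free(p); }

// Hypothetical model of a linkage closure's allocation functions.
class Closure {
public:
    // Point 2: at most one override; a duplicate is rejected.
    bool install(AllocFn a, FreeFn f) {
        if (overridden_) return false;
        alloc_ = a;
        free_ = f;
        overridden_ = true;
        return true;
    }

    // Point 3: every member of the closure allocates through the same pair.
    void* allocate(std::size_t n) const { return alloc_(n); }
    void release(void* p) const { free_(p); }

private:
    AllocFn alloc_ = default_alloc;
    FreeFn free_ = default_free;
    bool overridden_ = false;
};

// Hypothetical application override that counts its calls.
static std::size_t g_app_calls = 0;

void* app_alloc(std::size_t n) {
    ++g_app_calls;
    return std::malloc(n ? n : 1);
}

void app_free(void* p) { std::free(p); }
```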

Could something that meets these semantics be implemented given the current
implementation for most loaders? I believe so - (I'm certain I could
implement this for Metrowerks w/ CFM but it may rely on being able to cross
link libraries which I'm not certain is generally available). In this case,
all that may be required is a statement in the standard that these are the
semantics that can be relied upon.

Sean

-- 
Sean Parent
Sr. Computer Scientist II
Advanced Technology Group
Adobe Systems Incorporated
sparent@adobe.com


