This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Minimal GCC/Linux shared lib + EH bug example



----- Original Message -----
From: "Sean Parent" <sparent@adobe.com>

> > And I was asking you to be more specific about the significant increase in
> > sophistication you think would be demanded of loaders.
> >
> > I also went on to claim that Linux dynamic linking already provides
> > *precisely* the semantics of static linking in the simple case where there
> > is no use of RTLD_LOCAL. In other words I challenge your assertions above.
>
> Ahh - I don't have experience with Linux (as I had stated above) I had
> assumed that RTLD_LOCAL vs. RTLD_GLOBAL simply referred to whether or not
> the data section was reinstantiated. I assume that RTLD_LOCAL is only
> available for dlopen and not a mode for a dynamic link.

As far as I know, that's the case.

> I believe my assertion holds for many loaders in common use.

Maybe. There are certainly a few different models out there. In
standardization we'll need to at least look at all of them.

> > No, we don't care (from a standards point-of-view) whether any code is
> > actually shared. However, "identity sharing" is absolutely relevant to this
> > thread: the observable behavior differences that can arise in the context
> > of shared libraries (if we ignore unloading for the moment) are ALL cases
> > of duplication of things which are "supposed" to arise as a single copy:
> > two different addresses (and thus values) for the same static member of an
> > inline function or function template, two different addresses for the same
> > function, two different run-time type identities for the same type, etc.
>
> Problems also arise from two items that are supposed to be joined into a
> single item - such as exception handling tables - this may not be an issue
> with Linux - it is with CodeWarrior on the Mac however. Also replacements
> for operator new and delete lead to two different items of the same name of
> which one is supposed to be correct.

Those are all exactly the sort of things that I'm talking about also.

> You also have issues with
> initialization order - the runtime environment may not be fully initialized
> prior to statics being initialized.

What, specifically, do you mean by "the runtime environment may not be
fully initialized"?

> > I'm not sure I agree. I think the CFM model where the library is shared
> > across processes is strikingly similar to the case we're discussing on
> > Linux if you take Martin's view that each library loaded with RTLD_LOCAL
> > should be viewed as a separate "program".
>
> It's a little bit reversed - it's a property of the library being linked
> against rather than an option on the library being loaded - but it is
> similar.

It sounds like you're saying that the difference is in which entity
initiates the sharing (or lack thereof). Except in the case where sharing
is directional (as in the Windows model), I think we can ignore the question
of where it is initiated.

> > If you ignore dlopen and dlclose I don't think there's anything mysterious
> > about what it means on Unix: from a standards POV, there's nothing to
> > discuss because it works like static linking.
>
> That may be what it means on Linux - I think it's a stretch to generalize
> that to UNIX.

You might be right. It seems to be the same on Solaris. Is ELF a POSIX
standard, anybody? That would mean it could be generalized to POSIX.

> It certainly isn't what it means on Mac or Windows - and I'm
> not convinced it is a desirable definition.

Whether or not it's desirable, it's an important model in wide use.
Whatever is standardized for C++ has to have a place for those.

> > I don't think Windows can be discussed in the same breath, since it is a
> > different model and we're not going to be able to force them both into the
> > same box, if the box is going to be concrete enough to be useful.
>
> Oh boy - if you can't include Windows in the standard you really don't have
> a standard. I hate that (like I hate VPC) but it is reality.

I don't begin to suggest that we shouldn't include Windows in the standard.
What I /am/ saying is that the standard has to accommodate a few different
models, because Windows and Linux are very different.

In particular, Windows sharing has two properties that you don't get on
Linux:

1. Specificity - only symbols which are named explicitly (in the source
code, for practical C++ development) are resolved externally or made
available for external resolution. Some Unices (e.g. AIX) also have this
property.

2. Directionality - a symbol is explicitly marked for export from or import
to a given object file. This is unique to Windows AFAIK, but I'm not
intimate with CFM (I stopped doing Mac development years ago).

I am convinced that any C++ standard for dynamic linking needs to
accommodate at least these two axes of variability by describing what you
can expect when they are/aren't supported by the platform. We might
simplify things a bit by dealing with RTLD_LOCAL as a special case of "bulk
specificity", but I'm jumping ahead here...
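
To make the two axes concrete, here is a minimal sketch of the kind of
export macro cross-platform C++ code tends to use. The names MYLIB_API,
MYLIB_BUILDING, public_entry and private_helper are hypothetical and chosen
only for illustration; the non-Windows branch assumes the ELF behavior
described in this thread, where every external symbol is visible.

```cpp
#include <cassert>

// Hypothetical: pretend this translation unit is part of the library itself.
#define MYLIB_BUILDING 1

#if defined(_WIN32)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_API __declspec(dllexport)  // directionality: export from the library
#  else
#    define MYLIB_API __declspec(dllimport)  // directionality: import into a client
#  endif
#else
#  define MYLIB_API  // assumption: no per-symbol control, everything external is visible
#endif

// Specificity: only declarations tagged MYLIB_API are meant to cross the
// library boundary.
MYLIB_API int public_entry(int x);

// Untagged: in the Windows model this stays private to the library; in the
// ELF model described above it would still be visible to other modules.
int private_helper(int x) { return x * 2; }

MYLIB_API int public_entry(int x) { return private_helper(x) + 1; }
```

A client would include the same header without defining MYLIB_BUILDING, so
that on Windows the declaration switches from export to import.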

> I can't use any of the features portably. I'm not a Linux developer - I
> develop primarily on the Mac but everything I do also will have to run on
> Windows. It would be nice if it could run on Unix for a couple of our
> products - and Palm and PocketPC (really another Windows). Linux isn't
> currently on the list. I'll probably move my primary development off CFM to
> Mach-O at some point, basically I'm waiting for better tool and library
> support for Metrowerks before I do that.

Good. I hope that when we're done, people like you and me will be able to
develop and standardize a reasonably portable programming model which
doesn't prevent us from using any significant portion of the C++ language
in dynamic libraries, and that allows us to take advantage of the
techniques for isolating symbol spaces on various common OSes.

> >> I'm perfectly fine with there being restrictions about what
> >> can be linked together - in fact I think restrictions are probably desirable
> >> if they contribute to the encapsulation provided by the library.
> >
> > Why is it better for the language designer, rather than the designer of
> > library X, to say "You can't link Y to X"?
>
> If the language gives me the control to specify it in my library design -
> great. If the language requires that all my static symbols and runtime
> symbols be exported then I can't use it. I can't afford to build an economy
> where all of the add-ons to my product are revision locked to my product, my
> compiler, my runtime. Those become handcuffs to the adoption of my next
> release. Solid encapsulation is a good thing.

You're not talking about restricting what can be linked together, at least
not the way I understood the phrase. What you mean (AFAICT) is that you
want some way to control symbol visibility across a shared library
boundary. IOW, you want Specificity. I support that. You may not be happy
with it, but AFAICT on Linux, visibility control is an all-or-nothing
proposal at each library boundary.

> > The way you avoid that kind of visibility on Unix is with dlopen and
> > RTLD_LOCAL. Otherwise, you're discussing a windows-model concept. As a
> > cross-platform developer, I think it's important to be able to have this
> > kind of hiding, and that's one reason I don't think brushing aside dlopen
> > is appropriate.
>
> I think we should come back to dlopen - I agree it is very important. But I
> think in order to make it work we first need to settle what it means to
> dynamically link a C++ application. The issues of dlopen only add complexity
> with regards to scoping and lifespan.

I don't think so. dlopen() is a special case of the more-general visibility
controls you get with __declspec on Windows (you essentially get a single
visible entry point and that's all).

> > No, you've misinterpreted it. Furthermore, as I say above,
> > if you don't use dlopen it's already equivalent to static
    ^^^^^^^^^^^^^^^^^^^^^^^
> > linking.
>
> Really? You have all these issues with duplicate symbols that don't get
> merged with static linking?

No, you don't have those issues with static linking. You don't have them
with dynamic linking either when you're not using dlopen().
I don't think you're reading what I'm writing very carefully.

> I guess I'm running with a very different model
> - all my static linking "just works" - and dynamic linking isn't
> even close to an equivalent.

Yes, that's a common situation. Some people have said that we shouldn't
even talk about dynamic linking in the standard if it isn't going to work
just like static linking; I think that's shortsighted, so we need to
accommodate your model as well.

> >> That doesn't get you any of the benefits of dynamic linking other
> >> than saving bytes on disk.
> >
> > Not true; it gets you component-based development.
>
> What does that buy you if everything is revision locked? You can't afford to
> allow separate companies to develop components and you would have to give
> them your sources to make it work. Might as well statically link it and ship an
> updater.

Not so; we have namespaces. Also, I know of single development groups that
like to do CBD within a single organization. Anyway, I'm not arguing that
we shouldn't discuss models for strong isolation. I'm just saying that the
Linux shared linking model is far from useless. Lots of people use it
happily. Also, it seems to me that your insistence that isolation is
important is in direct conflict with your insistence that dlopen() is
unimportant, unless you hope to get Linux to implement a completely
different linking model.

> >> I'm not sure what you mean about "current
> >> semantics if the library load order were changed."
> >
> > Let me review, then: In the case I'm talking about, the executable A opens
> > two libs B and C with dlopen. B and C each link dynamically to D in the
> > usual way. B, C, and D all contain calls to the same inline function which
> > has a static counter:
> >
> > inline int count()
> > {
> >   static int n = 0;
> >   return n++;
> > }
> >
> > D also contains the definition of:
> > int count2() { return count(); }
> >
> > B and C each contain this definition:
> >
> > namespace {
> > void check_count()
> > {
> >   int x = count2();
> >   assert(x + 1 == count());
> > }
> > }
> >
> > Calling check_count() in B always works, but in C it always asserts. That
> > behavior depends on the order in which B and C were loaded. The change I'm
> > proposing makes check_count() work in both B and C.
>
> And it would also work that way with a static library?

No, of course not: A opens B and C with dlopen(), which is what causes this
problem. I have said repeatedly that Linux static and dynamic linking are
semantically equivalent IN THE ABSENCE OF DLOPEN.
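
The effect can be modeled in a single file without any loader at all. The
following is only a sketch under stated assumptions: CounterCopy is a
hypothetical stand-in for one instance of count()'s static counter, and the
assumption (taken from the description above) is that B and D end up sharing
one instance while C binds its direct calls to a second one. It does not
reproduce real dlopen() behavior; it only shows why the assertion fails once
two instances exist.

```cpp
#include <cassert>

// One "copy" of the inline function's static counter. With correct sharing
// there would be exactly one of these in the whole process.
struct CounterCopy {
    int n = 0;
    int count() { return n++; }
};

// Models check_count() inside a dlopen'ed library:
//   own   - the copy the library's direct call to count() binds to
//   via_d - the copy that D's count2() binds to
// Returns true where the original would pass its assert.
bool check_count(CounterCopy& own, CounterCopy& via_d) {
    int x = via_d.count();        // stands in for count2() in D
    return x + 1 == own.count();  // stands in for the direct call to count()
}
```

Passing the same object for both parameters models the B case and always
succeeds; passing two different objects models the C case and always fails,
because the two counters advance independently.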

> With CFM B, C, and D
> would all have unique copies of count and the static so it would always
> assert.

That's similar to Windows.

> Unless you exported that symbol (which you would have to look at the
> link map to find the name) - in which case you couldn't load because you
> would always have a conflict.

Also similar to Windows, except that exporting is simpler.

> "Weak" linking wouldn't help - that would only
> allow you to load if no copies of the static were present.

Does CFM have a "weak" linking model at all? If so, is it different from
ELF weak links?

[Also, how relevant is CFM to a discussion of future dynamic linking
standards? Does it have a long enough future at Apple to make it worth
investigating? Matt?]

> To make this work you would have to make count() not be an inlined function,
> put it into D, and export.

Again, similar to Windows. That's nice because it means the number of
distinct models is converging.

> > Not new/delete; those can be replaced. My proposal is explicitly concerned
> > with those runtime-support symbols, though.
>
> Okay - except they aren't always just "symbols" to be aliased (maybe they
> are in Linux).

Please be specific about what you mean. What, if not "symbols" (and does it
make any difference or is it just terminology)?

> > I guess I just disagree with you there. I don't think the problem on Linux
> > is really in the compilers. We can make the compiler do something which
> > works around a few of the problems (i.e. by comparing typeinfo::name() for
> > EH) but we can't really solve the problems in any meaningful way without
> > changing the loader.
>
> I can see that - it sounds like with Linux a lot already just works - great.
> But what does work isn't defined to work in the standard, and I'm not sure
> it's a reasonable extension to say "because it works on Linux it could be
> made to work anywhere."

Nobody's claiming that it is.

> I'm also still not convinced that the Linux
> direction is the direction the standard should be going in.

The standard is going to go in a direction that accommodates existing
important platforms (including Linux) - there's nothing you or I could do
to change that. Obviously I hope that it is only /close/ to accommodating
Linux as it exists today, because I want the loader behavior fixed.

> > Okay, now we're in Windows land. That's a completely different domain and
> > may require different solutions... but I'm out of time for tonight.
>
> Windows and Mac land - and most of what you are taking for granted just
> doesn't work that way on these platforms.

Please, don't underestimate me. I'm intimately familiar with dynamic
linking on Windows and I'm not taking anything for granted that doesn't
apply there.

> Before we jump in to solve the
> last bits for Linux I think we need to step back and define what the first
> bits are for the standard.

Since several conversational paths are crossing here, I'm going to continue
to press for Linux fixes on the GNU front, though it may well be premature
for the C++ standards thread.

-Dave


