This is the mail archive of the
mailing list for the GCC project.
RFA: Objective-C patch fixing libobjc/10742
- From: Nicola Pero <nicola at brainstorm dot co dot uk>
- To: shebs at apple dot com
- Cc: gcc-patches at gcc dot gnu dot org, Richard Frith-Macdonald <richard at brainstorm dot co dot uk>
- Date: Tue, 13 May 2003 18:58:25 +0100 (BST)
- Subject: RFA: Objective-C patch fixing libobjc/10742
could you please approve the following patch ?
It fixes libobjc/10742.
libobjc/10742 reports a problem during the initialization stage of the
ObjC runtime. The original problem occurred during loading a bundle in a
pretty complex situation - the runtime would crash.
Anyway, what happens is simply that the runtime is building a tree
representation of part of the class hierarchy in order to send +load
methods in the proper order - superclasses first, then down to subclasses.
It's building the tree starting from the classes loaded from the module,
which are not yet "resolved" - so the super_class pointer in the classes
is still a string representing the class name. When the classes are
resolved (which happens the first time a method is sent to the
class/instance), the dispatch tables are inserted, and the super_class
name is replaced with a pointer to the actual superclass.
So, the +load code, quite correctly, builds the tree, and gets the
super_class of the classes (which it needs to build the tree walking
upwards from the classes in the module) assuming the super_class pointer
in the classes is still unresolved, and so it is actually a string - the
Unfortunately, the code does not seem to take into account the fact that
if you walk up the class hierarchy, you might reach classes which were
loaded previously, and which already received a message and have so been
resolved - for them you can't assume that the super_class pointer is still
the class name - for them the super_class pointer *is* the super_class
What happens in that case is then that the +load code will call
objc_lookup_class() passing as argument a pointer not to a string (a class
name), but to an actual class structure.
Usually, the class structure, from the point of view of
objc_lookup_class() (which is interpreting it as a string), is a weird
string. objc_lookup_class() will read bytes until it gets to a \0, and
try to match that "string" to the existing class names. That will fail,
and objc_lookup_class() will return Nil. That usually works fine,
actually, for the +load code to build the tree - it means any class which
has already been resolved is treated as a root class - which is fine as
far as +load is concerned, since building the tree above that class is not
strictly necessary to determine the proper order of sending +load methods
to the classes in the module being loaded.
This is, IMO, why the bug has never been noticed - because in all normal
circumstances the code actually works well.
Of course, there is always a very small risk that, when treating a pointer
to an ObjC class as a string, something catastrophic will happen -
objc_lookup_class() might loop scanning bytes and searching for the string
terminator, but if it doesn't find it, it might loop very long and scan
bytes until it goes outside the memory it's allowed to address, and then
the application will crash.
That's what happened to Richard (as reported in libobjc/10742) - and he
got a plain crash.
He's been very good at tracking where the crash originated - that wasn't
I have not been able to reproduce the crash - and I don't think it can be
easily reproduced consistently - it depends on the sequence of bytes in
the binary class structure in the object file.
But what I have been able to reproduce (and what I've used to make sure
the patch was improving things) is simply adding a logging statement to
objc_lookup_string() to print out all class names being looked up. Now
what happens is that when you load bundles, if you try hard enough with
the class hierarchy, you can reproduce the case that objc_lookup_class()
is called with a non-string argument instead of a class name, and you see
garbage being logged.
The following patch, derived from Richard's, fixes this problem. I tested
it checking that on my special cases, objc_lookup_class() was no longer
ever called with garbage class names, but only with valid class names.
Richard reports that it fixed his problem, so I think we have got it
I added extensive comments, investigated/double-checked exactly what's
happening and all details, tidied it up a bit, but it's in practice his
I'm happy for this to go into CVS HEAD - it's a bug fix, but it affects
applications so rarely that it's not really worth putting it into a
But I'd appreciate if you could approve it in any reasonable timeframe.
RCS file: /cvs/gcc/gcc/libobjc/init.c,v
retrieving revision 1.7
diff -u -r1.7 init.c
--- init.c 2 Jul 2002 19:42:27 -0000 1.7
+++ init.c 13 May 2003 17:28:53 -0000
@@ -99,6 +99,50 @@
should not be destroyed during the execution of the program. */
static cache_ptr __objc_load_methods = NULL;
+/* This function is used when building the class tree used to send
+ ordinately the +load message to all classes needing it. The tree
+ is really needed so that superclasses will get the message before
+ This tree will contain classes which are being loaded (or have just
+ being loaded), and whose super_class pointers have not yet been
+ resolved. This implies that their super_class pointers point to a
+ string with the name of the superclass; when the first message is
+ sent to the class (/an object of that class) the class links will
+ be resolved, which will replace the super_class pointers with
+ pointers to the actual superclasses.
+ Unfortunately, the tree might also contain classes which had been
+ loaded previously, and whose class links have already been
+ This function returns the superclass of a class in both cases, and
+ can be used to build the determine the class relationships while
+ building the tree.
+static Class class_superclass_of_class (Class class)
+ char *super_class_name;
+ /* If the class links have been resolved, use the resolved
+ * links. */
+ if (CLS_ISRESOLV (class))
+ return class->super_class;
+ /* Else, 'class' has not yet been resolved. This means that its
+ * super_class pointer is really the name of the super class (rather
+ * than a pointer to the actual superclass). */
+ super_class_name = (char *)class->super_class;
+ /* Return Nil for a root class. */
+ if (super_class_name == NULL)
+ return Nil;
+ /* Lookup the superclass of non-root classes. */
+ return objc_lookup_class (super_class_name);
/* Creates a tree of classes whose topmost class is directly inherited
from `upper' and the bottom class in this tree is
`bottom_class'. The classes in this tree are super classes of
@@ -127,9 +171,7 @@
tree = objc_calloc (1, sizeof (objc_class_tree));
tree->class = superclass;
tree->subclasses = list_cons (prev, tree->subclasses);
- superclass = (superclass->super_class ?
- objc_lookup_class ((char *)
- : Nil);
+ superclass = class_superclass_of_class (superclass);
prev = tree;
@@ -157,10 +199,7 @@
DEBUG_PRINTF ("1. class %s was previously inserted\n",
- else if ((class->super_class ?
- objc_lookup_class ((char *) class->super_class)
- : Nil)
- == tree->class)
+ else if (class_superclass_of_class (class) == tree->class)
/* If class is a direct subclass of tree->class then add class to
list of subclasses. First check to see if it wasn't already
@@ -370,9 +409,7 @@
if (class == superclass)
- class = (class->super_class ?
- objc_lookup_class ((char *) class->super_class)
- : Nil);
+ class = class_superclass_of_class (class);
Tue May 13 14:56:03 2003 Richard Frith-Macdonald <firstname.lastname@example.org>
Nicola Pero <email@example.com>
* init.c (class_superclass_of_class): New function.
(create_tree_of_subclasses_inherited_from): Use it.