When the first attempt to use a class is made, the +initialize method is supposed to be called automatically and safely. The current code check to see whether +initialize needs to be called by seeing if the dispatch table is installed. If the table is not installed, it installs the dispatch table and calls +initialize using locking to prevent another thread from trying to do the same thing, but there is a problem ... once the dispatch table for the class has been installed, it is possible for another thread to use the class before +initialize completes (ie before the method has had a chance to set up all the class variables). This is a rare race condition, but when it occurs it's extremely hard to recognise and track down.
I think the fix is reasonably straightforward: when this occurs we need to refrain from installing the dispatch table until +initialize has completed, but we must make sure that the initializing thread is able to call methods of the class while it is executing. The obvious way to do this is to set up a copy of the dispatch table and have the method lookup use that copy (but only if locking permits) if there is no installed dispatch table. Upon completion of +initialize, the copy is installed as the normal dispatch table so that all threads can use it. Unfortunately I haven't had time to implement/test such a fix yet.
It turns out that solving this bug, even though it's conceptually simple, is quite a lot of work. I have new code to fix it, but it took me a whole day to develop and involves extensive additions and alterations to sendmsg.c (though I tried to keep the existing code unchanged as much as possible).
I'm running the new code on my system to see if I can find any problems before I supply a patch, but if there's any interest I can provide a patch earlier for people to test.
Created attachment 17020 [details]
Patch for sendmsg.c to fix this bug
I've been using this patch for a while ... it modifies sendmsg.c to prevent the bug by not installing the new dispatch table until after the initialize method has completed (while still allowing methods called from +initialize to be found).
Created attachment 17613 [details]
rewrite of dispatch table installation
I agree with the approach you describe, in that we need a look-a-side buffer for the dispatch table to send messages during +initialize and install the dtable after +initialize returns.
I was not comfortable with the patch because:
/* Assumes that __objc_runtime_mutex is locked down.
* If receiver is not null, it is the object whos supercalss must be
* initialised if the superclass of the one we are installing the
* dispatch table for is not already installed.
__objc_begin_install_dispatch_table_for_class (Class class, Class receiver)
I still have a hard time groking what was intended with the receiver. It all seems very intertwined and I think there is a more straight forward way to implement this.
Also, with this patch get_imp fails on class methods. (get_imp also has the nasty effect of installing the dispatch table without calling +initialize and the same goes for __objc_responds_to).
I'm not to fond of introducing InitializingList as special type. I think should be fine with using the existing hash map tables for this. I don't think we really need to introduce a new type. Do you really think that method dispatch for partially installed dispatch tables is performance critical?
Well... after all this complaining, let's get to something more constructive :-). I've attached a patch (including some test cases which still need to be augmented) that I'd like to propose for a reimplementation originally based on your code. I hope I've added enough comments and asserts to insure the assumptions and prerequisites are met. For the final submission I'll remove some of the asserts.
It combines __objc_install_dispatch_table_for_class and __objc_init_install_dtable into:
/* This function is called by:
objc_msg_lookup, get_imp and __objc_responds_to
(and the dispatch table installation functions themselves)
to install a dispatch table for a class.
If CLS is a class, it installs instance methods.
If CLS is a meta class, it installs class methods.
In either case +initialize is invoked for the corresponding class.
The implementation must insure that the dispatch table is not
installed until +initialize completes. Otherwise it opens a
potential race since the installation of the dispatch table is
used as gate in regular method dispatch and we need to guarantee
that +initialize is the first method invoked an that no other
thread my dispatch messages to the class before +initialize
__objc_install_dtable_for_class (Class cls)
Which implements your suggestion with the following helper functions:
static void __objc_prepare_dtable_for_class (Class cls);
- builds the dispatch table and stores it in a look-a-side buffer
(I used the hash tables instead of a custom type).
static struct sarray *__objc_prepared_dtable_for_class (Class cls);
- access the prepared table: this is used to identify whether the dispatch
table is currently being installed (akin to the __objc_is_initializing_dispatch_table of the proposed patch) and is also
used for subclasses that may be +initialized during the +initialize of the
super class (i.e. class clusters when NSString's +initialize invokes
GSString methods an need to copy NSString's dtable.
static void __objc_install_prepared_dtable_for_class (Class cls);
static IMP __objc_get_prepared_imp (Class cls,SEL sel);
Could you please have a look and let me know what you think? I'm still going to write some more test, checking the class cluster behavior mentioned above and I'll need some tests wrt to categories. So this is not final but it should address the main issue and the get_imp/__objc_responds_to issue.
I just searched this old bug report up, and found that I'd missed seeing (or forgotten about) the latest response :-(
I've been running with my patch for a couple of years without problems, but when I tried David's patch out, practically all my gnustep code simply started crashing on startup ... so there's something wrong there but I haven't had time to see what it might be.
However, to address points in the last comment ...
> I still have a hard time groking what was intended with the receiver. It all
> seems very intertwined and I think there is a more straight forward way to
> implement this.
I can't argue with that ... it *is* complex :-(
> Also, with this patch get_imp fails on class methods. (get_imp also has the
> nasty effect of installing the dispatch table without calling +initialize and
> the same goes for __objc_responds_to).
Very good point ... it hasn't been an issue for me in practice, but that clearly is something which needs to be addressed. However, I'm not sure that's a new bug ... haven't those functions always bypassed +initialize? I'd have to check, but I agree that they really should call +initialize
> I'm not to fond of introducing InitializingList as special type. I think should
> be fine with using the existing hash map tables for this. I don't think we
> really need to introduce a new type. Do you really think that method dispatch
> for partially installed dispatch tables is performance critical?
Certainly there's no performance issue with method dispatch in initialize ... it should be rare enough that we can use a really slow implementation. However, we have to build the dispatch table for the class, so it seemed reasonable to use it within the thread calling +initialize even though we don't want to install it for other threads until +initialize has completed. Perhaps we can store the uninstalled dispatch tables in a hash map more simply than using a new type ... that sounds reasonable.
I haven't looked at my patch for quite some time but I don't remember that there was an issue and did quite some extensive tests... but anyway it surely has bit-rotted since.
If you're approach works and you can address the fact that dispatch tables can be manipulated before +initialized was invoked on the class, then I'm fine with whatever you come up with.
I spent some hours looking at your code and I like it ... it's certainly clearer than mine.
I found three problems which i've fixed on my system:
1. failure to check CLS_ISRESOLV early enough (I added a check at the very start of __objc_install_dtable_for_class,
2. I needed to add a check immediately after installing the dispatch table of the superclass ... in case the +initialize in the superclass had actually initialized the subclass too. In this case we need to return immediately from __objc_install_dtable_for_class rather than trying to install the table again.
3. The +initialize method of class can cause changes to the methods of the class, so the prepared dispatch table can need to be replaced during the executing of +initialize. I added changes for that.
I've been working on an old copy of libobjc ... once I can get your patch plus my modifications workng with svn trunk, I'll post a patch against the current runtime code.
Thanks a lot for picking it up!
I'll keep an eye out for your version and will take it for a spin when it comes available.
Created attachment 23702 [details]
David's patch with a few more fixes and updated for current svn trunk
Made it safe to modify class methods during +initialize (eg as done by class clusters adding behaviors in gnustep).
Made it safe for superclass +initialize to use the subclass.
Simplified message lookup so that get_imp and objc_msg_lookup share common code, reducing the chance of a mistake in this tricky class initialisation process.
Created attachment 23857 [details]
Revised/improved patch against svn trunk
Here's a new version of the patch.
This adds support for the new runtime function class_respondsToSelector()which was previously not considered as it was added to the runtime after the original fic for this bug.
This patch also fixes a completely unrelated possible bug that class_respondsToSelector() and __objc_responds_to() were not necessarily returning OjC BOOL values (ie could in theory return values other than 0 or 1).
I applied the patch to trunk.
Thanks a lot! :-)