objc_msg_lookup contains a race condition. The code is:
result = sarray_get_safe...;
if (result == 0)
if (receiver->class_pointer->dtable == __objc_uninstal...
The problem is that if two processors are hitting that code at roughly the same time, one of them may see result is 0 but find that receiver->class_pointer_dtable is not __objc_uninstalled_dtable, because the thread on the other processor has patched it up to contain the proper dtable entry for that class.
It's a narrow window of vulnerability, but we managed to get unlucky and hit it by moving a threaded program from a uni-processor system to a dual-processor system. I don't think a test program would be particularly useful for this bug.
I believe that particular race could be fixed by putting another call to result = sarray_get_safe in the else portion and then only setting result to __objc_get_forward_imp(op) if result is still 0. Looking at the code, I think there are similar races elsewhere.
aside from minor stylistic cleanups, nothing seems to have changed in this function between gcc
3.2.3 and mainline (20030525), so I highly suspect this bug is still present. It take it that finding a
testcase for this is still not practical? Thanks,
Subject: Re: objc_msg_lookup is not thread safe
>>>>> "Dara" == gcc-bugzilla <firstname.lastname@example.org> writes:
Dara> PLEASE REPLY TO email@example.com ONLY, *NOT*
Dara> ------- Additional Comments From firstname.lastname@example.org
Dara> 2003-05-27 20:50 ------- Hello,
Dara> aside from minor stylistic cleanups, nothing seems to have
Dara> changed in this function between gcc 3.2.3 and mainline
Dara> (20030525), so I highly suspect this bug is still
Dara> present. It take it that finding a testcase for this is
Dara> still not practical? Thanks,
I haven't been able to think of a nice test case. I only saw this bug
in the real world after an app that had been running fine for years
was run on a dual processor machine. The same application is started
probably a hundred times a day and we only saw one glitch out of about
a month of runs. If you have a dual processor system you can use for
testing (I don't), I can create a test case that you can play with.
However, if the test case appears to run fine, that won't prove that
the bug is fixed.
The test case would just be to start up as many threads as you have
processors and have them each refer to a class that wasn't referred to
before the app went multi-threaded. But unless you have a way of
being sure that the two CPUs are going to be hitting the unsafe code in
the runtime at the same time, the test case doesn't prove anything.
I believe a workaround is to reference each class at least once before
going multi-threaded. Our application build process now invokes a
shell script that generates a function, objective_c_runtime_hack that
consists of lines:
[[X alloc] free];
for each Class X. We then call objective_c_runtime_hack () once
before going multi-threaded and we haven't seen the problem since.
However, we wound up switching off one of the processors due to
multi-threaded bugs in OpenSSL, so our limited experience with that
workaround doesn't show much.
Cliff Matthews <email@example.com>
This bug is still present. Finding a testcase is probably not reasonable given the nature of the bug.
Fixed on CVS
Thu Jul 10 10:27:43 2003 Nicola Pero <firstname.lastname@example.org>
* sendmsg.c (get_imp): Fixed rare threading problem.
(__objc_responds_to): Similar fixes.
(objc_msg_lookup): Similar fixes.
(__objc_init_install_dtable): Lock the runtime before checking if the
table is installed.