This is the mail archive of the
java-patches@gcc.gnu.org
mailing list for the Java project.
RE: Remove data race in libgcj interpreter
- From: "Boehm, Hans" <hans dot boehm at hp dot com>
- To: Andrew Haley <aph at redhat dot com>, Java Patch List <java-patches at gcc dot gnu dot org>
- Date: Thu, 4 Sep 2008 18:35:11 +0000
- Subject: RE: Remove data race in libgcj interpreter
- Accept-language: en-US
- Acceptlanguage: en-US
- References: <48AD67B2.4040308@redhat.com> <48C0042C.5080804@redhat.com>
>
> +
> /*************************************************************
> **********
> + Locking in the direct-threaded case is quite subtle. See PR
> + libgcj/8995 for more details.
> +
> + We must ensure that no-one ever reads an inconsistent {insn,value}
> + pair. Rather than trying to lock everything we detect when some
> + other thread has modified the insn we are trying to
> modify and back
> + out.
> +
> + When we read a pair before rewriting it we do this:
> +
> + i1 = insn
> + v = value
> + lock
> + i2 = insn
> + unlock
> + if (i2 != i1) try again
This is probably statistically likely to generate correct code on some machines, but it doesn't look correct to me. I think you're counting on the three loads to be performed in the order written. But the compiler gets to reshuffle the first two. And there is generally no prohibition against moving loads INTO a critical section, so they can in fact also be arbitrarily reordered with the i2 assignment as well. (On IA64 and SPARC, the hardware might conceivably do that, depending on the lock implementation. There is a long story about when this is Posix compatible, but it doesn't really matter here.)
There is also another important issue: Posix and the upcoming C++ standard (among others) give undefined behavior to any program involving a data race. Clearly this program involves data races on the initial insn and value reads. The problem is that there are several good reasons for the Posix and C++ positions:
1) Compilers, including I believe gcc, implicitly assume that there are no asynchronous changes on non-volatile variables. If you invalidate this assumption, weird things may happen. These weird things don't always correspond to reading a different value. For example, the compiler may reread a register copy of the value from the shared variable when it has to spill the register, making it appear that a local variable (which is a copy of the shared variable) suddenly changes without an assignment to it. It's not hard to come up with cases in which this would provoke a crash. Empirically, this sort of thing doesn't happen much, but we were told at some point that gcc may behave this way.
2) I would guess that gcc supports some platforms on which things like pointer loads are not always atomic.
3) In the not-too-distant future, we'd really like to have reliable data-race detectors. This is much easier if code doesn't contain intentional data races.
4) None of us has managed to come up with an entirely satisfactory definition of what programs with data races mean. The JSR133 definition is reasonably close, but still has some issues that have come up recently. And it would probably have more serious performance implications on something like C++.
I haven't thought though this through enough. Does it work to instead keep a shared (possibly per method?) count of the number of running interpreters, together with a shared bit (tested on backward branches) asking them to stop? You only modify the code if no interpreters are running (in that method). Getting this right is still nontrivial, but the races are restricted to well-defined atomic variables, which is fine.
Hans
> +
> + and when we write a pair we do this:
> +
> + lock
> + value = v
> + write barrier
> + insn = i
> + unlock
> +
> + There is a read barrier in NEXT_INSN which ensures that we don't
> + read a rewritten value with a stale insn.
> +
> +