This is the mail archive of the
java-patches@gcc.gnu.org
mailing list for the Java project.
RE: Alignment problem with hashtable locks on PowerPC
- From: "Boehm, Hans" <hans_boehm at hp dot com>
- To: "'Bryce McKinlay'" <bryce at waitaki dot otago dot ac dot nz>, "Boehm, Hans" <hans_boehm at hp dot com>
- Cc: "'Jeff Sturm '" <jsturm at one-point dot com>, "'java-patches at gcc dot gnu dot org '" <java-patches at gcc dot gnu dot org>
- Date: Thu, 15 Nov 2001 11:22:15 -0800
- Subject: RE: Alignment problem with hashtable locks on PowerPC
> -----Original Message-----
> From: Bryce McKinlay [mailto:bryce@waitaki.otago.ac.nz]
> ...
> I've often wondered if it wouldn't be better to teach the compiler about
> compare-and-swap, and inline the lightweight part of
> _Jv_MonitorEnter/Exit. The inlined code would only call out to an
> external function in the unlikely event of a heavy lock or contention.
> Of course, this would probably mean ditching the reduced-memory benefits
> of the hashtable, but I'm not convinced that would be a big loss given
> the 64-bit alignment requirements. I think this could be a big win
> particularly on platforms that have a convenient thread register.
>
You're talking about adding a sync_info word back to the objects, so that
the inlined code wouldn't have to do the hash function computation, etc.? I
think the current code is probably slightly too big to be inlined, even if
the thread id is in a register. (Having that as a non-default compiler
option might be interesting, though.)
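
For illustration only, the inlined fast path under discussion could be
sketched in C++ roughly as follows. This is not gcj's actual
_Jv_MonitorEnter code: the Object layout, the function names, and the
convention that sync_info holds 0 when unlocked and the owner's thread id
otherwise are all assumptions made for the sketch. It also ignores
recursive locking, which real Java monitors must support.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical object header with a per-object lock word
// (an assumption for this sketch, not gcj's real layout).
struct Object {
    std::atomic<std::uintptr_t> sync_info{0};  // 0 = unlocked, else owner id
};

// Counts trips through the out-of-line path so the sketch can be observed.
static int slow_path_calls = 0;

// Out-of-line fallback. A real runtime would handle recursion, inflate to a
// heavy lock, or block the thread; this sketch simply spins until it wins.
void monitor_enter_slow(Object& obj, std::uintptr_t self) {
    ++slow_path_calls;
    std::uintptr_t expected = 0;
    while (!obj.sync_info.compare_exchange_weak(
               expected, self,
               std::memory_order_acquire, std::memory_order_relaxed)) {
        expected = 0;  // a failed CAS overwrote it with the current owner
    }
}

// Fast path, small enough to inline: one compare-and-swap, no hashing.
inline void monitor_enter(Object& obj, std::uintptr_t self) {
    std::uintptr_t expected = 0;
    if (!obj.sync_info.compare_exchange_strong(
            expected, self,
            std::memory_order_acquire, std::memory_order_relaxed)) {
        monitor_enter_slow(obj, self);  // contention: leave the inline path
    }
}

// Release the lock; only the current owner may call this.
inline void monitor_exit(Object& obj, std::uintptr_t self) {
    assert(obj.sync_info.load(std::memory_order_relaxed) == self);
    obj.sync_info.store(0, std::memory_order_release);
}
```

With the thread id already in a register, the uncontended case is a single
compare-and-swap on the object itself, which is the saving over hashing the
object address into a lock table.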
That's an interesting trade-off, and one I'd like to understand better. It
would make gcj's synchronization cost very similar to everyone else's. But I
think it does cost elsewhere.
My impression is that object size affects performance quite a bit. The
8-byte alignment requirement is an annoyance, but I'm not sure that it's
really avoidable in either case. On most platforms, at least doubles need
to be aligned anyway, which means that you either need to have objects
containing doubles allocated differently, or you need 64-bit alignment
everywhere. (Does PowerPC really not require that? Itanium requires 16-byte
alignment for malloc, since some loads and stores require 16-byte alignment.
Objects that start in the last word of a cache line are probably not such a
good idea anyway. If alignment is a real issue, it's also conceivable to me
that one could eliminate the requirement from the locking code, though
that's nontrivial.) The sync_info field is particularly troublesome, since
it's a rarely accessed field, which is likely to have frequently accessed
fields on both sides. Thus it unavoidably takes up cache space (and memory
bandwidth to load it), even in the 99% of cases in which it's completely
unused.
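
To make the size effect concrete, here is a small C++ sketch of what an
extra lock word costs under 8-byte alignment. The layouts are hypothetical:
they model a 32-bit platform with 4-byte header words and are not gcj's
real object format. The struct and function names are invented for the
example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round a raw object size up to the allocator's alignment (a power of two).
inline std::size_t rounded_size(std::size_t raw, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (raw + align - 1) & ~(align - 1);
}

// Hypothetical small object without a lock word: header + one int field.
struct alignas(8) PlainObject {
    std::uint32_t vtable;  // stands in for a 32-bit class/vtable pointer
    std::int32_t value;
};

// The same object with a sync_info word sitting between hot fields.
struct alignas(8) SyncObject {
    std::uint32_t vtable;
    std::uint32_t sync_info;  // rarely touched, but shares the cache line
    std::int32_t value;
};

// 8 bytes of payload stay 8; 12 bytes are padded up to 16.
static_assert(sizeof(PlainObject) == 8, "two 4-byte words fit exactly");
static_assert(sizeof(SyncObject) == 16, "12 bytes round up to 16");
static_assert(offsetof(SyncObject, sync_info) == 4, "lock word follows header");
```

In this particular layout the extra word doubles the footprint of the
smallest objects. An object that already had 12 bytes of header and fields
would round up to 16 either way, so the cost depends on the size mix.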
My intuition would be to favor object size over synchronization cost, since
it's usually possible, though admittedly hard, for the programmer and/or
compiler to eliminate the synchronization when it turns into a real problem.
But measurements would be really nice.
JVMs that use a compacting garbage collector don't really have much of a
choice in this matter, since they need some extra header bits for hashcodes
etc. anyway. Thus usually you can find the space for a few synchronization
bits in the second header word, and there is little point in not using it.
Gcj has a choice. Thus there's an opportunity for the experiment ...
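
As a rough sketch of the header-word sharing described for compacting
collectors, the C++ below packs a hash code and a couple of lock-state bits
into one 32-bit word. The 2-bit/30-bit split and all names are assumptions
chosen for illustration; no particular JVM's header format is implied.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical split of a 32-bit second header word:
// low 2 bits hold lock state, the remaining 30 bits hold the hash code.
constexpr std::uint32_t kLockBits = 2;
constexpr std::uint32_t kLockMask = (1u << kLockBits) - 1;

// Build a header word from a hash (must fit in 30 bits) and a lock state.
inline std::uint32_t make_header(std::uint32_t hash, std::uint32_t lock_state) {
    assert(hash < (1u << (32 - kLockBits)));
    return (hash << kLockBits) | (lock_state & kLockMask);
}

// Extract the stored hash code.
inline std::uint32_t header_hash(std::uint32_t header) {
    return header >> kLockBits;
}

// Extract the lock-state bits.
inline std::uint32_t header_lock(std::uint32_t header) {
    return header & kLockMask;
}

// Return a copy of the header with new lock bits and the hash untouched.
inline std::uint32_t with_lock(std::uint32_t header, std::uint32_t lock_state) {
    return (header & ~kLockMask) | (lock_state & kLockMask);
}
```

Since the word is needed for the hash code regardless, the lock bits add no
space in this arrangement.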
Hans