This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: known synchronized() failures?
- From: Thomas Aeby <aeby at graeff dot com>
- To: Java GCJ Mailing List <java at gcc dot gnu dot org>
- Cc: Bryce McKinlay <mckinlay at redhat dot com>, Rutger Ovidius <r_ovidius at eml dot cc>
- Date: Mon, 02 May 2005 14:37:05 +0200
- Subject: Re: known synchronized() failures?
- References: <1114597683.6316.23.camel@localhost> <426FBFDF.8060407@redhat.com>
Thanks to Bryce and Rutger for the hints,
> - Try again with GCC 4.0, or a 3.4 tree with the 16662 fix applied, and
>   see if the bug still occurs.
> - If you still see the bug, build a libgcj with LOCK_DEBUG enabled,
>   attach to the locked process in gdb, inspect the thread stacks and post
>   what you see here.
In the meantime I have built gcc 4.0.0 and confirmed that the problem
still exists, but now at least I have libraries with symbols. A
typical stack trace of the process consuming all the CPU (which is the
one trying to get a lock) looks like this:
/disk/hdc2/glibc/debian-build/glibc_2.3.2.ds1-20.test2/glibc-2.3.2.ds1/build-tree/glibc-2.3.2/linuxthreads/restart.h:24
/disk/hdc2/glibc/debian-build/glibc_2.3.2.ds1-20.test2/glibc-2.3.2.ds1/build-tree/glibc-2.3.2/linuxthreads/mutex.c:199
../../../gcc-4.0.0/libjava/posix-threads.cc:192
../../../gcc-4.0.0/libjava/java/lang/natObject.cc:929
sfi/director/schedule/ParallelScheduler.java:539
sfi/director/schedule/ParallelScheduler.java:471
sfi/director/util/DirectorThread.java:105
and (at some other moment in time)
/disk/hdc2/glibc/debian-build/glibc_2.3.2.ds1-20.test2/glibc-2.3.2.ds1/build-tree/glibc-2.3.2/linuxthreads/spinlock.c:405
/disk/hdc2/glibc/debian-build/glibc_2.3.2.ds1-20.test2/glibc-2.3.2.ds1/build-tree/glibc-2.3.2/linuxthreads/mutex.c:123
../../../gcc-4.0.0/libjava/posix-threads.cc:112
../../../gcc-4.0.0/libjava/java/lang/natObject.cc:929
sfi/director/schedule/ParallelScheduler.java:539
sfi/director/schedule/ParallelScheduler.java:471
sfi/director/util/DirectorThread.java:105
OK, natObject.cc:929 is in this loop:
    // release lock on he
    LOG(REQ_CONV, (address | REQUEST_CONVERSION | HEAVY), self);
    while ((he -> address & ~FLAGS) == (address & ~FLAGS))
      {
        // Once converted, the lock has to retain heavyweight
        // status, since heavy_count > 0 .
        _Jv_CondWait (&(hl->si.condition), &(hl->si.mutex), 0, 0);
      }
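For readers less familiar with this idiom: the loop above has the shape of a guarded wait, where the waiter re-tests a predicate after every wakeup. The following is only a rough Java analogy of that shape, not libgcj code; the class ConversionGate, its field and its methods are hypothetical names made up for illustration.

```java
// Hypothetical illustration of a guarded wait loop; this is NOT libgcj code.
final class ConversionGate {
    private final Object monitor = new Object();
    private boolean converted = false; // the predicate the waiter re-checks

    // Block until another thread reports that the conversion happened.
    void awaitConversion() throws InterruptedException {
        synchronized (monitor) {
            // Re-test after every wakeup: wait() may return spuriously.
            while (!converted) {
                monitor.wait();
            }
        }
    }

    // Change the predicate and wake every waiter.
    void markConverted() {
        synchronized (monitor) {
            converted = true;
            monitor.notifyAll();
        }
    }

    // Small self-check: returns true if the waiter thread was released.
    static boolean demo() {
        final ConversionGate gate = new ConversionGate();
        Thread waiter = new Thread(new Runnable() {
            public void run() {
                try {
                    gate.awaitConversion();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        waiter.start();
        gate.markConverted();
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(demo() ? "waiter released" : "waiter still blocked");
    }
}
```

In a loop of this shape, a wait call that kept returning while the predicate never changed would make the waiter spin instead of sleep. Whether that is what the traces above show is not established here.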
So the next thing I tried was to find the thread actually holding the
lock. It seems to be this one:
../linuxthreads/sysdeps/unix/sysv/linux/pt-sigsuspend.c:56
/disk/hdc2/glibc/debian-build/glibc_2.3.2.ds1-20.test2/glibc-2.3.2.ds1/build-tree/glibc-2.3.2/linuxthreads/pthread.c:1205
/disk/hdc2/glibc/debian-build/glibc_2.3.2.ds1-20.test2/glibc-2.3.2.ds1/build-tree/glibc-2.3.2/linuxthreads/restart.h:34
/disk/hdc2/glibc/debian-build/glibc_2.3.2.ds1-20.test2/glibc-2.3.2.ds1/build-tree/glibc-2.3.2/linuxthreads/mutex.c:123
./include/java-threads.h:147
sfi/director/schedule/ParallelScheduler.java:263
(line 263 is the end of a synchronized block)
I'm not deep enough into libgcj/glibc/linuxthreads to see immediately
what is going wrong, and it's rather risky to get more info out, since
I can only reproduce this on a production machine, where it provokes
hangs ...
Are the above stack traces enlightening to any of you?
I am sure looking forward to Thread.getStackTrace() getting
implemented :-)
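For reference, on a runtime that does implement the Java 5 thread API, a dump of every live thread's stack could look roughly like the sketch below. It relies only on the standard Thread.getAllStackTraces() and Thread.getState() calls; the class name StackDump and its dump() helper are made up for illustration, and nothing here is assumed to work on the libgcj discussed in this message.

```java
import java.util.Map;

// Hypothetical helper; assumes a runtime that implements the Java 5 API.
final class StackDump {
    // Build a text dump of all live threads and their current stack frames.
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append("\" state=")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Calling StackDump.dump() from a watchdog thread when a lock is held for too long would give file and line information similar to the gdb traces above, without attaching a debugger to the production process.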
Best regards,
Tom
--
----------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland, Tel: (+41)264180040
Internet: aeby@graeff.com PGP public key available
----------------------------------------------------------------------------