deadlock detection

Hans Boehm Hans.Boehm@hp.com
Fri Sep 5 16:18:00 GMT 2003


[Moved from patches list.]

On the pthreads level, I think there is nothing to prevent a thread
from exiting with a lock held, at least for the Linux implementation.
And that would cause the symptoms you observe.  It's certainly possible
to write (probably illegal) Java byte code that would have the same
effect.  It shouldn't be possible to write such code in Java.
Certainly CNI or JNI code could do it.

If your gcj version uses hash synchronization, it's also conceivable
that it has an obscure bug with this result.  I would be mildly surprised at
that, since the code has been around for a while.  But I've seen stranger
things.

It would also be good to know whether this occurs across multiple Linux
versions.  Pthreads libraries have also been known to contain bugs, and
that unfortunately includes some versions of linuxthreads.

On 5 Sep 2003, Jacob Gladish wrote:

> This is somewhat related to my last post... I have an application that
> appears to be deadlocked, and after inspecting the thread stacks, things
> look very strange to me. It appears that several threads are waiting to
> enter a monitor, as shown by these last few frames:
>
> #0  0x2c8e29b6 in __sigsuspend (set=0x767ff77c) at
> ../sysdeps/unix/sysv/linux/sigsuspend.c:45
> #1  0x2c6ac0f1 in __pthread_wait_for_restart_signal (self=0x767ffc00) at
> pthread.c:969
> #2  0x2c6adbf1 in __pthread_alt_lock (lock=<incomplete type>, self=0x0)
> at restart.h:34
> #3  0x2c6aab96 in __pthread_mutex_lock (mutex=0x12c0cf90) at mutex.c:120
> #4  0x2c55e9cc in _Jv_MonitorEnter (obj=0x8e2f198) at
> include/java-threads.h:147
>
> What's strange is that after an exhaustive search, I cannot find any
> thread in the system that holds the monitor that this thread is waiting
> for. What I find suspicious is that when going to frame 4 and printing
> out the contents of the mutex structure, I see the following:
>
>
> (gdb) f 4
> #4  0x2c55e9cc in _Jv_MonitorEnter (obj=0x8e2f198) at
> include/java-threads.h:147
> 147             pthread_mutex_lock (&mu->mutex);
> (gdb) p *mu
> $11 = {
>   mutex = {
>     __m_reserved = 0,
>     __m_count = 0,
>     __m_owner = 0x0,
>     __m_kind = 0,
>     __m_lock = {
>       __status = 1988098124,
>       __spinlock = 0
>     }
>   },
>   owner = 25626,
>   count = 1
> }
>
> There doesn't appear to be a thread with id 25626 running. I'm not a
> pthreads expert, but it looks to me that the VM thinks that a
> non-existent thread owns the mutex. Is it possible that a thread
> acquired the mutex and never released it?
>
> This is gcj3.1 with a good deal of patching applied.
>
>
> And I'll second that vouch. We have an application with a large number
> of threads, and deadlocking the system is very easy when a developer
> is even slightly careless with synchronized blocks. A simple signal to
> report the deadlocked condition would have saved us many hours of
> poring over thread dumps.
>
> Maybe this could be something for me to contribute to the 3.3+ line. Is
> anyone interested in an offline discussion on this?
>
>
> thanks
> -jake
>
>
>
> On Thu, 2003-09-04 at 23:03, Jeff Sturm wrote:
> > On 4 Sep 2003, Tom Tromey wrote:
> > > Jacob> Does anyone know of any feature in the current vm or future
> > > Jacob> plans for the vm to support any type of deadlock detection?
> > > Jacob> This is probably more along the lines of development, but I
> > > Jacob> thought someone may have done something in the form of a patch.
> > >
> > > I haven't heard of anybody doing this.
> >
> > Nor have I, but I'll vouch for its utility.  Deadlock detection in
> > Sun's VM has saved me considerable debugging effort more than once.
> >
> > Jeff
>


