debugging threads vs processes

Boehm, Hans hans_boehm@hp.com
Fri Jul 13 10:52:00 GMT 2001


Here are a few other bits of information about linuxthreads, and a partial
defense of it:

I've had some luck debugging multithreaded applications with the gdb thread
support.  I agree that it's not reliable, and correct applications sometimes
die under gdb.  I'd be all in favor of fixing these problems.  I tend to
avoid single-stepping multithreaded applications under gdb, if there are
really multiple simultaneously active threads.  I often find myself adding
code to the application to detect problems, and then using gdb primarily to
look around.  But I've also had reasonable luck with breakpoints.

As everyone else has already pointed out, Linux threads map directly to
processes.  In addition, the linuxthreads package creates a manager
thread/process.  I believe this is always the first process created after
the main one, and all subsequent processes are children (in the Linux
process hierarchy) of the manager thread.  My impression is that the gdb
thread facilities tend to work better if you attach to the main (original)
process, and let it find the others.
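One way to see the thread-to-process mapping is to have each thread record the pid it is given.  The following is a minimal sketch, not code from linuxthreads; the names count_threads_with_own_pid, record_pid and MAX_THREADS are made up for illustration.  With the mapping described above, each thread should report a pid different from the main process.  A thread package that gives all threads of a process one pid would instead yield 0.

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_THREADS 16   /* arbitrary limit for this sketch */

static pid_t seen_pids[MAX_THREADS];

/* Each thread stores the pid that getpid() reports for it. */
static void *record_pid(void *arg)
{
    pid_t *slot = arg;
    *slot = getpid();
    return NULL;
}

/* Start n threads and count how many report a pid that differs from
   the caller's pid.  Returns -1 on error. */
int count_threads_with_own_pid(int n)
{
    pthread_t tids[MAX_THREADS];
    pid_t main_pid = getpid();
    int started, differing = 0;

    if (n < 0 || n > MAX_THREADS)
        return -1;

    for (started = 0; started < n; started++) {
        if (pthread_create(&tids[started], NULL, record_pid,
                           &seen_pids[started]) != 0)
            break;
    }

    /* Join every thread that was started, even after a failure. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (seen_pids[i] != main_pid)
            differing++;
    }
    return started == n ? differing : -1;
}
```

Calling count_threads_with_own_pid(4) from the main thread and printing the result is enough to tell which kind of thread package is in use.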

I have no experience with the IBM package, nor have I seen performance
measurements.  It may be just what we need.  It may also be a very useful
alternative that works well in some cases.  But I have experience with other
m-to-n thread packages, and I'm not immediately enthusiastic about
abandoning linuxthreads.  The advantages of linuxthreads are:

1) They're relatively simple.  Their performance is mostly explicable.  If
bugs are found they can sometimes be tracked down by mere mortals.  The
debugger's job is easier than if it has to deal with another mapping.  /proc
can be used for looking at thread state.  Other process-level tools work for
threads without additional work.  In a pinch, they're debuggable with a
thread-unaware debugger.

2) Since threads map directly to processes, threads benefit from any
intelligence in the kernel scheduler, e.g. in support of processor affinity.
Threads may actually get to keep their cache context for a while.
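As an illustration of the /proc remark in point 1: when every thread has its own pid, its state can be read from /proc/<pid>/status like that of any other process.  This is a rough sketch assuming the Linux /proc layout with a "State:" line in the status file; read_proc_state and read_self_state are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the value of the "State:" line of /proc/<pid>/status into out.
   Returns 0 on success, -1 if the file or the field is unavailable. */
int read_proc_state(pid_t pid, char *out, size_t outlen)
{
    char path[64], line[256];
    FILE *fp;
    int rc = -1;

    if (out == NULL || outlen == 0)
        return -1;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "State:", 6) == 0) {
            char *p = line + 6;
            while (*p == ' ' || *p == '\t')   /* skip the separator */
                p++;
            p[strcspn(p, "\n")] = '\0';       /* drop trailing newline */
            snprintf(out, outlen, "%s", p);
            rc = 0;
            break;
        }
    }
    fclose(fp);
    return rc;
}

/* Convenience wrapper: state of the calling process (or thread, when
   threads have their own pids). */
int read_self_state(char *out, size_t outlen)
{
    return read_proc_state(getpid(), out, outlen);
}
```

Passing the pid of a blocked thread instead of the caller's own pid would show a sleeping state rather than a running one, which is the kind of look around that needs no thread-aware tool.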

The disadvantages of which I'm aware are:

1) Thread creation for non-blocking threads is more expensive.  I don't know
if this actually matters much, since most of the m-to-n thread
implementations end up creating a kernel thread when a thread blocks in a
system call.  And I suspect that many applications with large numbers of
threads have nearly all of them blocked in the kernel at one point or
another.  (I believe linuxthreads thread creation performance could also be
improved.  I believe it currently remaps the thread stack on every thread
creation.  It should probably keep a cache of recently discarded stacks,
instead of eagerly unmapping them.  I suspect, but don't know for sure, that
this is a significant fraction of thread creation cost.)

2) Thread switching is a bit slower, since it needs to go through the kernel.
I haven't observed this to be a big factor.

3) Thread scheduling becomes more expensive when there are lots of runnable
threads.  But I suspect that's mostly because user level thread schedulers
tend to be fast and dumb, ignoring processor affinity, and the kernel
scheduler tries to do the right thing.  Either way you pay, in cache misses
or scheduling overhead.  Which is better is probably application dependent.
I suspect that pluggable kernel schedulers are a better solution.

4) The pthread standard expected user-level scheduling.  Linuxthreads seems
to have conformance issues near the edges of the spec.  I think some of this
is fixable.  E.g. getpid() could presumably return the pid of the main
process if the standards say it should, by having the linuxthreads library
override it.  It already does that with many other library calls.  I
personally haven't run into much of an issue with this, though some of the
problems are documented.

Clearly linuxthreads could use more effort in improving/fixing it.  So
could gdb.  But in general, I've found it to be a well-designed package.
(H.J. Lu and I did the original Itanium port, so I'm not saying this
entirely out of ignorance.)

Disclaimer:  This is a personal opinion, not HP's position.  There are
others within HP who disagree with my opinion.
 
Hans

> -----Original Message-----
> From: Tom Tromey [ mailto:tromey@redhat.com ]
> Sent: Friday, July 13, 2001 9:54 AM
> To: jpolsonaz@mac.com
> Cc: Jeff Sturm; Per Bothner; java-discuss@sources.redhat.com
> Subject: Re: debugging threads vs processes
> 
> 
> >>>>> ">" == jpolsonaz  <jpolsonaz@mac.com> writes:
> 
> >> Has anybody tried using IBM's new Linux threading library with GCJ?
> 
> Not as far as I know.
> 
> >> My personal experience with the currently released LinuxThreads is
> >> that multi-threaded applications are almost undebuggable.
> >> LinuxThreads currently has a bug that causes all threads to wakeup
> >> from semaphores the first time you connect a debugger to the
> >> process.
> 
> Interesting.  I wasn't aware of that.
> 
> Tom
> 


