dwarf2 signal unwind problem

David S. Miller davem@redhat.com
Thu Apr 25 21:31:00 GMT 2002


I think all Linux dwarf2 signal unwind implementations in libjava get
things subtly wrong.  I noticed this while tracking down libjava
testsuite failures on sparc-linux.

Here is the comment that appears above every MAKE_THROW_FRAME
implementation currently in include/dwarf2-signal.h:

  /* ${CPU} either leaves PC pointing at a faulting instruction or the  \
   following instruction, depending on the signal.  SEGV always does    \
   the former, so we adjust the saved PC to point to the following      \
   instruction; this is what the handler in libgcc expects.  */         \

I don't see how this can be correct.  Let me give an example
to show everyone why I think this way.

Consider an exception region consisting of one load instruction,
output by GCC as follows:

EXCEPTION_BEGIN_LABEL:
	LOAD 	NULL_POINTER, REG
EXCEPTION_END_LABEL:

This will elicit a SIGSEGV signal and cause MAKE_THROW_FRAME to
execute.

If we advance the program counter reported by the signal handler,
it will not match up in any of the exception region tables when
looked up at runtime.
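
To make the mismatch concrete, here is a minimal sketch in C.  It is a
toy model, not libgcc's actual lookup code: the struct eh_region layout
and the find_handler() name are hypothetical, and it assumes the
reported PC is compared directly against half-open [start, end) ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified exception-region entry covering [start, end). */
struct eh_region {
    uintptr_t start;   /* address of EXCEPTION_BEGIN_LABEL */
    uintptr_t end;     /* address of EXCEPTION_END_LABEL (exclusive) */
    uintptr_t handler; /* landing pad for this region */
};

/* Return the handler of the region containing pc, or 0 if none matches. */
uintptr_t find_handler(const struct eh_region *table, size_t n, uintptr_t pc)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pc >= table[i].start && pc < table[i].end)
            return table[i].handler;
    return 0;
}
```

With a region holding a single 4-byte instruction at 0x1000, looking up
0x1000 finds the handler, while looking up the advanced PC 0x1004 finds
nothing under this model.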

Therefore I think every implementation of MAKE_THROW_FRAME in
this file is in error.  They should not skip the PC over the
exception-causing instruction.

This brings me to my second observation, which probably indicates
another similar error in all of the current dwarf2 signal unwind
implementations.

First, when an unwind out of a signal handler occurs, the unwind
code sometimes has to walk the stack from the beginning multiple
times to find the frame it should really unwind to.

Second, the unwind code adjusts the context->ra value on some
platforms to make the return PC point to the right place.
See gcc/unwind-dw2.c:uw_update_context(), specifically this:

  /* Compute the return address now, since the return address column
     can change from frame to frame.  */
  context->ra = __builtin_extract_return_addr
    ((void *) (_Unwind_Ptr) _Unwind_GetGR (context, fs->retaddr_column));

In order for the signal unwind PC to be instruction accurate (and
as my single LOAD exception region above shows, it must be
instruction accurate to work in all cases), things must be set up
so that MAKE_THROW_FRAME compensates for this adjustment that the
unwind code in GCC is going to make.
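
The compensation can be sketched as follows.  This is only an
illustration: RA_BIAS, toy_extract_return_addr() and
toy_make_throw_frame() are hypothetical names, and the bias of 8 bytes
is an assumed value (the real adjustment is target-specific).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bias the unwinder adds when it extracts a return address.
   Hypothetical value; the real adjustment depends on the target. */
#define RA_BIAS 8u

/* Toy stand-in for the adjustment applied when context->ra is computed. */
uintptr_t toy_extract_return_addr(uintptr_t saved)
{
    return saved + RA_BIAS;
}

/* Toy stand-in for MAKE_THROW_FRAME: store the faulting PC pre-biased,
   so that the unwinder's adjustment lands back on the faulting insn. */
uintptr_t toy_make_throw_frame(uintptr_t fault_pc)
{
    assert(fault_pc >= RA_BIAS);
    return fault_pc - RA_BIAS;
}
```

In this model, toy_extract_return_addr(toy_make_throw_frame(pc)) gives
back pc, whereas storing pc unmodified would make the unwinder see
pc + RA_BIAS.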

One might think that we could make the adjustment in
MD_FALLBACK_FRAME_STATE_FOR, but because the unwind tree could be
walked multiple times (and MD_FALLBACK_FRAME_STATE_FOR thus called
multiple times for the same signal frame) this solution would not
work.  It must be adjusted in MAKE_THROW_FRAME for correct operation
in all cases.

I know my initial Sparc implementation got both of these issues wrong,
and I am fixing that up now (expect a patch to java-patches shortly).

But I encourage the implementors of the other MAKE_THROW_FRAME
instances in dwarf2-signal.h to take a good hard look at this
because I think they have at least one of these bugs too!


