This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: EH newbies howto


hi guys.

i have reviewed the entire document, rewriting bits of it, making the
corrections suggested, removing superfluous information, filling in
some gaps, and making it read more like documentation and less like 
an email ;-).

in particular, i think we could do without the debugging tips, but
i'm open to suggestions.

give this a go again, and keep the corrections coming.

thanks to all who responded, both privately and on the list.

aldy


Dwarf2 Exception Handler HOWTO
Authors: Will Cohen and Andrew Macleod, Red Hat Inc.
				   

** Description of EH **

Exception Handling (EH) is provided by languages such as C++ and Java
to indicate something unusual has happened in a called function and
special action needs to be taken to resolve the problem.  C++ and Java
surround code that may produce an exception with a try
block.  Following the try block will be a catch block.  The code in the
catch block is only executed if a throw is encountered within the
scope of the try block.  A throw is used to start the exception
handling process.  If a throw is encountered, the processor goes up
the call chain looking for an appropriate catch.  Once an appropriate
catch is found, the registers and stack are fixed up to resume
execution in the catch block.


** How EH works **

Although the mechanism is simple in concept, the details introduced by
the compiler complicate the mechanism.  Most compiler ports use the
processor registers to store values. The register values are often
saved on the stack in a function's prologue and restored in the
function's epilogue to give the function additional working registers.
When the throw occurs, the processor does not complete the execution
of the functions on the call chain between the function that contains
the catch and the function performing the throw. However, the
processor must restore the register values (including the frame and
stack pointers) even though epilogues are not executed for functions
on the call chain.  In GCC the exception handling process can be
implemented either with a setjmp/longjmp mechanism or a stack unwinder
that uses dwarf2 debugging information to determine how to restore the
registers.  Because the dwarf2 mechanism leads to more compact code
that executes more efficiently, the discussion will be limited to the
dwarf2 mechanism.

Consider the following function call chain:

  a() calls b() calls c()

If c() throws, and the throw is caught by a(), we have to restore all
the registers which c() saved in the prologue, then restore all the
registers which b()'s prologue saved.  Finally, control is transferred
back to the exception handler located in the catch of function a().
This puts all the correct values back in their places so that a() will
execute properly.
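
The call chain above can be reproduced with a short test program.  This
is only a sketch: the FrameGuard type and the event log are assumptions
added for illustration, standing in for the per-frame state that has to
be cleaned up even though the epilogues of b() and c() never run.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records the order of events so the control flow can be inspected.
static std::vector<std::string> trace_log;

// Hypothetical helper: its destructor stands in for the per-frame
// cleanup that happens while the stack is unwound.
struct FrameGuard {
    std::string name;
    explicit FrameGuard(std::string n) : name(std::move(n)) {}
    ~FrameGuard() { trace_log.push_back("cleanup " + name); }
};

void c() {
    FrameGuard guard("c");
    trace_log.push_back("c throws");
    throw std::runtime_error("from c");
}

void b() {
    FrameGuard guard("b");
    c();
    trace_log.push_back("b resumes");  // never reached: b() is abandoned
}

// Runs the chain a() -> b() -> c() and returns the recorded events.
std::vector<std::string> a() {
    trace_log.clear();
    try {
        b();
        trace_log.push_back("a resumes after b");  // never reached
    } catch (const std::runtime_error &) {
        trace_log.push_back("a catches");
    }
    return trace_log;
}
```

Calling a() yields the events "c throws", "cleanup c", "cleanup b",
"a catches", in that order: neither b() nor c() finishes normally, yet
both frames are torn down before the handler in a() runs.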

The epilogues cannot be run for each function because they will do
other unrelated things (for example produce return values), as well as
assume a certain start state themselves.  Thus, a stack unwinding
mechanism is used to restore the registers.  The compiler marks every
function's prologue instructions that adjust the stack pointer, change
the frame pointer, or store a saved register with the
RTX_FRAME_RELATED_P flag.  The information needed to unwind the stack
is then recorded in the dwarf2 debugging information.

The throw can use an object to pass information back to a catch.
Because the portion of the stack being used by the function initiating
the throw will soon be removed, memory for the object being thrown
will be allocated on the heap rather than the stack, so the object
will still exist at the catch.  This is implemented with a call to the
function __cxa_allocate_exception() followed by a call to a
constructor to initialize the object.  After the object is created, the
throw is initiated with __cxa_throw(), and the processor will never
return from this call.
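
The sequence can be written out by hand to see what the compiler emits
for a throw.  The sketch below assumes a target that follows the Itanium
C++ ABI and a C++ runtime that ships <cxxabi.h> (as libstdc++ does); the
function names throw_int_by_hand and throw_and_catch are made up for
this example.  Real code should simply use a throw expression.

```cpp
#include <cassert>
#include <cxxabi.h>
#include <new>
#include <typeinfo>

// Hand-written approximation of "throw value;" for an int.
[[noreturn]] void throw_int_by_hand(int value) {
    // 1. Allocate storage for the exception object on the heap, so it
    //    survives the removal of this function's stack frame.
    void *storage = abi::__cxa_allocate_exception(sizeof(int));

    // 2. Construct the thrown object in that storage.
    new (storage) int(value);

    // 3. Start the unwinding.  This call never returns.  The last
    //    argument is the destructor to run on the object; an int has none.
    abi::__cxa_throw(storage,
                     const_cast<std::type_info *>(&typeid(int)),
                     nullptr);
}

// An ordinary catch receives the hand-thrown object.
int throw_and_catch(int value) {
    try {
        throw_int_by_hand(value);
    } catch (int caught) {
        return caught;
    }
    return -1;  // not reached
}
```

throw_and_catch(42) returns 42, which shows that a normal catch clause
cannot tell the hand-written sequence from a compiler-generated throw.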

The unwinding is performed in two phases.  The first phase searches for
a frame with an associated catch and tracks the values for the
registers.  Much of the first phase is managed by
_Unwind_RaiseException() in unwind.inc.  The second phase installs the
restored values for the registers, adjusts the stack and jumps to the
catch.  This is implemented by special code in the function epilogue
of _Unwind_RaiseException().
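
The split into a search phase and an install phase can be illustrated
with a toy model.  Nothing below is libgcc code: Frame, search_phase,
cleanup_phase and raise_exception are hypothetical names, and the
"stack" is just a vector whose last element is the innermost frame.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one stack frame; not a libgcc data structure.
struct Frame {
    std::string function;
    bool has_matching_catch;
};

// Phase 1: walk from the innermost frame outward looking for a handler.
// The stack is not modified.  Returns the handler's index, or -1.
int search_phase(const std::vector<Frame> &stack) {
    for (std::size_t i = stack.size(); i-- > 0;) {
        if (stack[i].has_matching_catch) return static_cast<int>(i);
    }
    return -1;
}

// Phase 2: discard every frame above the handler frame, then resume there.
std::vector<std::string> cleanup_phase(std::vector<Frame> &stack, int handler) {
    std::vector<std::string> log;
    while (static_cast<int>(stack.size()) - 1 > handler) {
        log.push_back("unwind " + stack.back().function);
        stack.pop_back();
    }
    log.push_back("resume in " + stack.back().function);
    return log;
}

// Runs both phases and reports what happened.
std::vector<std::string> raise_exception(std::vector<Frame> stack) {
    int handler = search_phase(stack);
    if (handler < 0) {
        // No handler anywhere: nothing has been unwound yet.
        return std::vector<std::string>(1, "end of stack");
    }
    return cleanup_phase(stack, handler);
}
```

The point of doing the search first is visible in the failure case: when
no frame has a handler, the model reports "end of stack" without having
discarded anything.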

The actual jump that transfers to the catch usually jumps to a landing
pad rather than to the catch directly.  The landing pad may perform
fixup code within the function due to the optimizations performed
within the function with the catch.  After the landing pad code is
executed, a switch case is executed to determine what action to take.
There may be multiple catches associated with a particular try, each
for a different type of thrown object.  It is possible that none of
the catches in that frame match and the unwinding will need to resume
to find an appropriate catch.  The information on the type of object
being thrown is passed in one of the processor registers.
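
At the source level, the selection among several catches looks like the
following sketch.  The functions thrower, inner and outer are invented
for this example; the selection by type is what the filter value
ultimately drives.

```cpp
#include <cassert>
#include <string>

// Throws an object whose type depends on 'kind'.
void thrower(int kind) {
    if (kind == 0) throw 1;       // int
    if (kind == 1) throw "text";  // const char *
    throw 2.5;                    // double
}

// One try block with two catches: the thrown type selects between them.
std::string inner(int kind) {
    try {
        thrower(kind);
    } catch (int) {
        return "inner: int";
    } catch (const char *) {
        return "inner: const char *";
    }
    return "inner: no exception";  // not reached, thrower() always throws
}

// A double matches neither catch in inner(), so the search for a
// handler continues up the call chain and ends here.
std::string outer(int kind) {
    try {
        return inner(kind);
    } catch (double) {
        return "outer: double";
    }
}
```

outer(0) and outer(1) are handled inside inner(), while outer(2) is
handled only in outer(), the case where no catch in the nearer frame
matches.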

For more information about the operation of the EH mechanism read the
following documents:

  Exception Handling on HP-UX
  http://www.usenix.org/events/osdi2000/wiess2000/full_papers/dinechin/dinechin_html/

  C++ ABI for Itanium: Exception Handling
  http://www.codesourcery.com/cxx-abi/abi-eh.html

  Exception Handling for a C++ on Tahoe


** Requirements for EH **

Exception handling via dwarf2 debugging information requires
several things to work:

-Two registers to pass information into catch (EH_RETURN_DATA_REGNO):
	exc_ptr EH_RETURN_DATA_REGNO(0)
	filter (select appropriate catch)  EH_RETURN_DATA_REGNO(1)
-Two registers in epilogue described by RTL:
	EH_RETURN_STACKADJ_RTX (how much to bump stack pointer)
	EH_RETURN_HANDLER_RTX	(where the jump should return to)
-Unaligned accesses to read dwarf2 information
-Binutils that understand the dwarf2 object file format for the target


** Implementing Dwarf2 EH **

Given a function call chain as was described above, the
RTX_FRAME_RELATED_P flag is set on a handful of significant
instructions which affect registers the unwinder cares about. In
general, these are:

  The stack pointer.

  The frame pointer.

  The return address register, if it is not in memory.

  Any call preserved register whose value is saved and is restored by the 
  epilogue.

The compiler then examines these instructions and generates dwarf2
unwind information in the .eh_frame section.  This information,
encoded in dwarf2, is a set of instructions for a state machine which
describe where values are saved, and how to get at them.  The library
libgcc contains a dwarf2 interpreter which is used at runtime to
actually get these values and restore them to their correct registers.
Once described properly, this all happens automatically.  Most of the
time, ports do not need to modify the runtime at all, just set things
up in their config directory and flag the proper instructions.

In order to understand which instructions are actually significant, it
will help if you understand approximately how the dwarf2 unwind code
works.

In order to restore the registers for function c() (which is executing
a throw), we need to execute the dwarf2 code representing instructions
up to the point where the throw happens.  I.e., interpret the dwarf2
code which will "undo" the register saves which have been performed up
until the point of the throw.  This part is taken care of
automatically as well.  What is important is that we tag all the
correct instructions.  Again, we only care about prologue
instructions.  Any other instructions which the unwinder might care
about are found automatically.  Usually, these are just other stack
adjustments within the body of a function.
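
To make "execute the dwarf2 code up to the point of the throw" concrete,
here is a toy interpreter.  It is a sketch under stated assumptions: Op,
Insn, FrameState and rules_at are hypothetical names, the instructions
are plain structs rather than the real byte encoding, and only a few
operations (modelled on DW_CFA_advance_loc, DW_CFA_def_cfa,
DW_CFA_def_cfa_offset, DW_CFA_def_cfa_reg and DW_CFA_offset) are
handled.  The sample program is transcribed from the start of the x86
readelf dump in the debugging tips section of this document.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Simplified stand-ins for a few DW_CFA_* operations.
enum class Op { AdvanceLoc, DefCfa, DefCfaOffset, DefCfaReg, Offset };

struct Insn {
    Op op;
    int a;  // register number, or location delta for AdvanceLoc
    int b;  // offset (unused by AdvanceLoc and DefCfaReg)
};

// Unwind rules in effect at one location in a function.
struct FrameState {
    int cfa_reg = -1;             // CFA = value of cfa_reg + cfa_offset
    int cfa_offset = 0;
    std::map<int, int> saved_at;  // register -> offset from the CFA
};

// Run the program until it would move past 'pc' (an offset from the
// start of the function) and return the rules in effect at 'pc'.
FrameState rules_at(const std::vector<Insn> &program, int pc) {
    FrameState fs;
    int loc = 0;
    for (const Insn &insn : program) {
        switch (insn.op) {
        case Op::AdvanceLoc:
            if (loc + insn.a > pc) return fs;  // later rules do not apply yet
            loc += insn.a;
            break;
        case Op::DefCfa:       fs.cfa_reg = insn.a; fs.cfa_offset = insn.b; break;
        case Op::DefCfaOffset: fs.cfa_offset = insn.b; break;
        case Op::DefCfaReg:    fs.cfa_reg = insn.a; break;
        case Op::Offset:       fs.saved_at[insn.a] = insn.b; break;
        }
    }
    return fs;
}

// CIE instructions followed by the first FDE instructions of the dump.
std::vector<Insn> example_program() {
    return {
        {Op::DefCfa, 4, 4},        // CFA = r4 + 4
        {Op::Offset, 8, -4},       // r8 (return address column) at CFA-4
        {Op::AdvanceLoc, 1, 0},
        {Op::DefCfaOffset, 0, 8},  // CFA = r4 + 8
        {Op::Offset, 5, -8},       // r5 at CFA-8
        {Op::AdvanceLoc, 2, 0},
        {Op::DefCfaReg, 5, 0},     // CFA = r5 + 8
        {Op::AdvanceLoc, 1, 0},
        {Op::Offset, 3, -12},      // r3 at CFA-12
    };
}
```

On entry (offset 0) only the return address rule is known.  At offset 4
the CFA is computed from r5, and r3 and r5 both have save slots.  A
throw at either location therefore "undoes" a different set of saves.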

The dwarf2 unwinder keeps a value called the Canonical Frame Address
(CFA).  All memory references it makes are relative to this address.
Initially, this value is defined as the value of the stack pointer
upon entry to the function (i.e. before any instructions are actually
executed).  Any memory references in the prologue to save the values
of registers are stored in the dwarf2 info as a POSITIVE offset to the
value of the CFA.  The unwinder looks at STACK_GROWS_DOWNWARD to
determine whether that offset is added to or subtracted from the CFA.

The emitter tracks the value of the CFA by remembering what register
it is based off of, and an offset to this register.  By default this
offset is 0, so the CFA is defined upon entry to the function as SP +
0.  If the target stack frame stores any values at a negative offset
to this, (for example, some targets store the return address in the
previous 4 bytes) we need to define an initial offset for the CFA such
that the offset is always non-negative.  This is accomplished by
defining the following macro in the target's header file.

#define INCOMING_FRAME_SP_OFFSET x

Where x would normally be 0, but in this case we need to specify 4 or
-4 in order for all the offsets to be positive.  The unwinder looks to
see if STACK_GROWS_DOWNWARD is defined to determine which way the
stack goes, and adjusts the sign of all its offsets appropriately.  If
the stack does grow downward, it knows all saves/load offsets will
actually be subtracted from the CFA, or if the stack grows upward, the
offsets will all be added to the CFA.  In order to get the initial
offset value of the CFA correct, you will need to subtract 4 bytes if
the stack grows upward, or add 4 bytes if the stack grows downward.

We have to flag any instruction in the prologue which performs a register
save which is in turn restored in the epilogue. The dwarf2 emitter
needs to be able to examine the instruction and determine at what offset
from the CFA this store is going to happen.  The register number and this
offset are then inserted into the dwarf2 code stream.  The key here is
that the emitter needs to be able to tell from the instruction what the
address is.  Currently, the emitter only handles simple cases, so the
instruction needs to be relatively self-explanatory.

Since the CFA is initially defined as the value of the stack pointer,
it is easy if the instruction saves at a constant offset from the
stack pointer.  For example:

(set (mem:SI 
        (plus:SI (reg:SI SP) (const_int 8)))
     (reg:SI 8)) 

This saves register 8 at CFA + 8.  It's easy to figure that out from
looking at the instruction, as long as we know of any modifications
we've made to SP since the start of the function.  So the above
instruction will need no annotations.

However, if our save instruction instead looks like:

(set (mem:SI
        (plus:SI (reg:SI SP) (reg:SI 6)))
     (reg:SI 8))

In this case, the unwinder has no way of knowing at what offset
register 8 is being saved, so the prologue code should attach a
REG_FRAME_RELATED_EXPR note that describes the action.

This is done by setting the RTX_FRAME_RELATED_P flag on the
instruction, and attaching a REG_FRAME_RELATED_EXPR note to it.  If
this note is present, the unwinder ignores the instruction itself, and
looks at the note to determine the instruction's action.  For example,
if the value of register 6 is 24, we would attach a note to our
initial instruction such that the resulting instruction looks like
this:

  (insn x x x x (set (mem:SI
                       (plus:SI (reg:SI SP) (reg:SI 6)))
                     (reg:SI 8))
		(expr_list:REG_FRAME_RELATED_EXPR
		  (set (mem:SI
		         (plus:SI (reg:SI SP) (const_int 24)))
		       (reg:SI 8))))

It is very important that at any given point in the function, the
unwinder knows how to find the value of the CFA.  Sometimes it is
easy, as this value might be contained in the frame_pointer for the
duration of the function, but other times we might have to calculate
it.  The runtime part takes care of this, provided that we flag the
appropriate instructions.


Relevant Macros:

** RETURN_ADDR_RTX

This macro must be defined for a frame value of 0.  It must be
possible to retrieve the return address pointer in the current
function in order to throw properly.  This means that the prologue and
epilogue must be structured in a way such that the return address is
stored at a known offset or available in a register.  You must define
this macro or the call to __builtin_return_address(0) will assume
that your return address is at SP + sizeof(Pmode).


** INCOMING_RETURN_ADDR_RTX

This is the macro that enables dwarf2 EH.  If it is not defined,
exception handling will be implemented using setjmp and longjmp.

The value of the macro is an RTX expression which is used to
determine where the return address is located upon entry to a
function, before any prologue instructions are executed.  If the
return address is passed in a register, it would look something
like:

#define INCOMING_RETURN_ADDR_RTX  gen_rtx_REG (Pmode, 26)


** DWARF_FRAME_RETURN_COLUMN

This tells the dwarf2 unwind mechanism in which dwarf2 register slot the
return address can be found.  This is only a temporary internal
storage location the unwinder uses to track things, so it doesn't have
to be the correct location, it just has to be one which is not going
to be used for anything else.  The dwarf unwind mechanism keeps its
own internal register mapping list to track where various hardware
registers are saved.  This column number is simply the index in this
internal list of a slot we can use as a placeholder for the return
address value.

If the return address is kept in a dedicated register, you should
define this macro to refer to that register.  For example, the arm
backend defines this as:

#define DWARF_FRAME_RETURN_COLUMN	DWARF_FRAME_REGNUM (LR_REGNUM)

By default, this will be either the PC slot, or the first value past
the end of the hard registers.  Generally, you only have to worry
about this value if your port has a lot of registers. The dwarf2
unwind spec requires that the return address column number be a single
byte value, so it must be less than 256.

The only times you will have to define this will be if:

    - the return address is stored in a hard register OR

    - you have more than 255 physical registers AND

    - the PC register has a register number greater than 255.

When setting this macro, you need to choose some register whose gcc
register number is less than 256, a register which will never be saved
and restored in the prologue.


** EH_RETURN_DATA_REGNO(N) **

The new implementation of the EH uses two registers to pass
information back to catch.  The macro EH_RETURN_DATA_REGNO maps
the values to hard registers.  These registers require stack slots, so
they cannot be scratch registers that are not saved across function
calls.  The macro EH_RETURN_DATA_REGNO will need to be defined and
return appropriate register numbers for the values 0 and 1.  Below is
an example:

#define EH_RETURN_DATA_REGNO(N) \
	((N) == 0 ? GPR_R7 : (N) == 1 ? GPR_R8 : INVALID_REGNUM)
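
On a port where these macros are already defined, the resulting register
numbers can be queried from user code.  This sketch assumes a compiler
that provides __builtin_eh_return_data_regno (GCC does; the runtime's
personality routines use it) and the wrapper function names are made up
for the example.  The numbers returned are the unwinder's register
numbers and differ from target to target.

```cpp
#include <cassert>

// Register number used to pass the exception pointer to the handler,
// i.e. the port's EH_RETURN_DATA_REGNO(0) as seen by the unwinder.
int eh_exception_pointer_regno() {
    return __builtin_eh_return_data_regno(0);
}

// Register number used to pass the filter value that selects the catch,
// i.e. the port's EH_RETURN_DATA_REGNO(1) as seen by the unwinder.
int eh_filter_regno() {
    return __builtin_eh_return_data_regno(1);
}
```

A working port should report two valid, distinct register numbers here.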


** EH_RETURN_HANDLER_RTX **

The EH_RETURN_HANDLER_RTX macro returns RTL which describes
the location used to store the address the processor should jump to
in order to catch the exception.  This is usually a register that is
available from the end of the function's body to the end of the
epilogue. Thus, this cannot be a register used as a temporary by the
epilogue.


** EPILOGUE COMMUNICATION

A key function in the unwinder is _Unwind_RaiseException().  In order
for _Unwind_RaiseException() to work properly, it needs to be able to
transfer control to the appropriate catch handler.  Consequently,
_Unwind_RaiseException() is processed in a special manner.  First, it is
compiled such that every possible preserved register is saved in the
prologue and restored in the epilogue (this is done by virtue of
_Unwind_RaiseException() calling uw_init_context(), which calls
__builtin_unwind_init, which sets the function as having a nonlocal
receiving function, causing gcc to mark all registers as live).

Next, the unwinder uses the EH tables to determine where this throw
should transfer control to.  The dwarf2 unwind interpreter is used to
figure out what values are supposed to be in which registers.  As the
various values of the register are determined during unwinding, they
are saved in a table which tracks the position in the frame of each
value.  We are not unwinding the stack yet, just determining where the
right values for each register are currently located.

When we are ready to unwind the stack, we go through this table, and
if any register does not already have the right value (for example it
was saved in some prologue), we know where it is saved, and we copy it
from that location into the place where _Unwind_RaiseException's
prologue stored it.  So we overwrite the value _Unwind_RaiseException()
saved with the value we think the register should have when we
transfer control to our selected handler.  When we return from
_Unwind_RaiseException(), the epilogue will restore these values.

There are still a few values which need to be fixed up.  First, the
return address of the function that called _Unwind_RaiseException()
must be replaced with the address of the desired handler.  When the
return from _Unwind_RaiseException occurs, it will actually transfer
control to the desired handler instead of returning to where
_Unwind_RaiseException was called. The value of the stack pointer also
needs to be adjusted.

Most of this is handled automatically, but you will have to adjust
the return address and stack pointer in the epilogue.  You will need
to define an 'eh_return' pattern in your md file which will save the
stack adjustment and return address values to the appropriate
temporary registers.  The code to generate the function epilogue will
use the values in these registers to adjust the stack and jump to the
appropriate location.

The trick with the eh_return instruction is that you will need to find
2 registers to use from the end of the function to the end of the
epilogue.  The last thing _Unwind_RaiseException() does is process the
eh_return instruction which will set the stack offset and return
address into the 2 registers specified in the eh_epilogue.  Until
needed at the end of the epilogue, these values cannot be overwritten.
Typically, you pick 2 registers which are not preserved over calls,
nor used as temporaries during epilogue processing.  It is possible to
use the register holding the return function value for the stack
adjustment value.  If the processor uses a register to hold the return
address and you can prevent the epilogue from reloading the register
from the stack, you can store the target address in the return address
register.

You will need two additional registers to communicate information to
the catch.  These registers require stack slots, so they cannot be
scratch registers that are not saved across function calls.  The macro
EH_RETURN_DATA_REGNO will need to be defined and return appropriate
register numbers for the values 0 and 1.


** REGISTERING FRAME INFORMATION

In order for exception handling to work, you need to register the
frame information at runtime.  This is accomplished in the same way
that constructors and destructors are registered.

__register_frame_info() needs to be called in exactly the same way
that a static constructor would be called.  Each object file has a
.eh_frame section which contains the frame information.  If the given
port does not support named sections, then the frame information will
be issued in a text section with the label __FRAME_BEGIN__.

__register_frame_info should be called before any constructors, in
case a constructor throws an exception.

The frame section's address is the first argument passed to
__register_frame_info().  The second argument is the address of a
local frame object.  This is the 'struct object' defined in frame.h.

After all the destructors have been called, __deregister_frame_info()
must be called with the address of the start of the section.

** DEBUGGING TIPS

First, compile a simple test case:

#include <stdio.h>

int main() {
  try {
    throw 1;
  }
  catch (...) {
    printf(" in catch\n");
    return 1;
  }
  printf(" back in main\n");
  return 10;
}

When compiled and run like the following it should generate the output
"in catch":

bash-2.04$ gcc -o throw0.x86 throw0.C
bash-2.04$ ./throw0.x86
 in catch

*1*

Compile it with -S -dA and look at the .s file.  There should be an
.eh_frame section/area beginning with __FRAME_BEGIN__ and it should
be annotated with comments about what each dwarf2 instruction is.

If there is EH data, you should also see a .gcc_except_table
section/area, beginning with the label: __EXCEPTION_TABLE__.

If neither of these is present, this is the first aspect to fix.
Usually these will show up on their own, but if you do not have named
sections you might need to coerce it a bit.  Make sure you have
enabled DWARF2 EH by defining INCOMING_RETURN_ADDR_RTX, or you will
not get these tables. Also look in except.h to make sure that all the
conditions hold so that MUST_USE_SJLJ_EXCEPTIONS is defined to be 0.
If MUST_USE_SJLJ_EXCEPTIONS is set to 1, then the setjmp/longjmp
mechanism will be used.

Also check to make sure the instructions in the prologue are properly
marked so the unwinder can track register values.  This can be checked
by using readelf.  Look for the saves of the appropriate registers in
the Frame Descriptor Entry (FDE).  The following is the output of
readelf for an x86 program. The FDE shows saves of r3 and r5:

bash-2.04$ readelf --debug=frames throw1.x86
The section .eh_frame contains:

00000000 00000014 00000000 CIE
  Version:               1
  Augmentation:          "eh"
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 8

  DW_CFA_def_cfa: r4 ofs 4
  DW_CFA_offset: r8 at cfa-4

00000018 0000002c 0000001c FDE cie=00000000 pc=08048730..080487ce
  DW_CFA_advance_loc: 1 to 08048731
  DW_CFA_def_cfa_offset: 8
  DW_CFA_offset: r5 at cfa-8
  DW_CFA_advance_loc: 2 to 08048733
  DW_CFA_def_cfa_reg: r5
  DW_CFA_advance_loc: 1 to 08048734
  DW_CFA_offset: r3 at cfa-12
  DW_CFA_advance_loc: 11 to 0804873f
  DW_CFA_GNU_args_size: 16
  DW_CFA_advance_loc: 37 to 08048764
  DW_CFA_GNU_args_size: 8
  DW_CFA_advance_loc: 12 to 08048770
  DW_CFA_GNU_args_size: 16
  DW_CFA_advance_loc: 8 to 08048778
  DW_CFA_GNU_args_size: 0
  DW_CFA_advance_loc: 16 to 08048788
  DW_CFA_GNU_args_size: 16
  DW_CFA_advance_loc: 20 to 0804879c
  DW_CFA_GNU_args_size: 0
  DW_CFA_advance_loc: 18 to 080487ae
  DW_CFA_GNU_args_size: 16


If the EH is using the dwarf2 stack unwinding, there should not be calls
to setjmp or longjmp in the assembly language code.


*2*

Run the executable with gdb, and put a breakpoint in
__register_frame_info ().

If the routine does not exist, or it is never called, then the problem
is that the unwind frames are not being registered at startup.
Generally, what I will do here is compile a short test case which
contains a static constructor:

class  A {
public:
  A() { }
  ~A () { }
};

A a;

int main () {
}


compile and run it through gdb, setting a breakpoint in 
A::A().

If that does not work, then constructors and destructors in general are
broken and need to be fixed.  Until this works, EH frames will not
get registered.

Assuming this does work, you can look at the traceback in gdb and see
how static constructors are initialized and work on getting the
eh_frames registered via a similar mechanism.


*3*

If the __register_frame_info() breakpoint gets hit, then the problem
is most likely in the actual unwinding.  Typically, something in the
prologue is incorrect, but it could be your eh_epilogue.  Also check
to make sure that RETURN_ADDR_RTX is defined properly.  In any case,
now you have to debug _Unwind_RaiseException() in unwind.inc.  If you
are lucky, running your test program on a debugger will actually give
you a decent traceback and you can track it back from there to see
what has actually gone wrong.  This is the point at which it is hard
to write down what to look for in a document, but you want to watch
for a few things:

  - Is _Unwind_RaiseException() unwinding the stack correctly?  If
everything is operating correctly, the processor should execute the
uw_install_context() at the end of _Unwind_RaiseException() to restore
the registers to the proper values and jump to the appropriate
exception handling code.  _Unwind_RaiseException() may not find a
frame that has a catch, and unwind the stack until there is no stack
left.  It would return _URC_END_OF_STACK in this case.

  - Verify that the unaligned loads are working properly for the gcc
port.  Check that the tests gcc.c-torture/execute/packed-1.c and
gcc.c-torture/execute/packed-2.c work. The dwarf2 data is not aligned.
Thus, the debugging information is not correctly read if unaligned
data accesses do not work.

  - One possible cause of _Unwind_RaiseException() not correctly
unwinding the stack is incorrect return addresses.  Because the
unwinding mechanism uses the return addresses to determine which FDE
to use to track the stack unwinding, you will want to verify that the
correct return addresses are being used.  Obtain a disassembled version
of the executable with objdump (-d option), so you can map the return
address back to the original code.  In the function uw_frame_state_for()
print out the value for context->ra.  The first time
_Unwind_RaiseException() calls uw_frame_state_for() (from
uw_init_context()) it should produce a return address within
_Unwind_RaiseException().  Each following call to uw_frame_state_for()
should go up the call chain, so initially _Unwind_RaiseException(), then
__cxa_throw() and then whatever function performed the throw.  If the
unwinding gets the return address wrong, it cannot find the correct
FDE to figure out how to get the next frame.
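
If patching print statements into the unwinder is inconvenient, the
chain of return addresses can also be collected from the test program
itself.  The sketch below assumes the <unwind.h> interface shipped with
GCC (_Unwind_Backtrace and _Unwind_GetIP); AddressLog, record_frame and
the two collecting functions are hypothetical names.  It only helps once
the frame information is good enough for a plain walk to succeed, and
the addresses still have to be matched against the objdump output by
hand.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unwind.h>

// Fixed-size log so the callback never has to allocate memory.
struct AddressLog {
    std::array<std::uintptr_t, 64> address{};
    std::size_t count = 0;
};

// Called by the unwinder once per frame, innermost frame first.
static _Unwind_Reason_Code record_frame(struct _Unwind_Context *context,
                                        void *arg) {
    AddressLog *log = static_cast<AddressLog *>(arg);
    if (log->count == log->address.size()) {
        return _URC_END_OF_STACK;  // any code other than NO_REASON stops the walk
    }
    log->address[log->count++] =
        static_cast<std::uintptr_t>(_Unwind_GetIP(context));
    return _URC_NO_REASON;  // keep walking up the call chain
}

// Walks the stack from here outward using the same frame information
// that exception handling relies on.
__attribute__((noinline)) AddressLog collect_return_addresses() {
    AddressLog log;
    _Unwind_Backtrace(record_frame, &log);
    return log;
}

// Stand-in for "the function performing the throw".
__attribute__((noinline)) AddressLog thrower_stand_in() {
    AddressLog log = collect_return_addresses();
    return log;
}
```

Each recorded address should fall inside the function one level further
up the call chain.  If the walk stops early or an address lands outside
any function in the disassembly, the frame information for the previous
frame is the first thing to check.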

  - Another cause of incorrect stack unwinding is not getting the
correct CFA to update the registers. The context should have a slot
that points to the frame pointer register. You should be able to set
breakpoints in the function that does the throw and anything else it
calls.  Print out the stack and frame pointer after the execution of
the prologue to these functions.  Compare these values to the values
of context->cfa for the various iterations of the loop in
_Unwind_RaiseException().

  - If the processor makes it to the uw_install_context() at the end of
_Unwind_RaiseException(), but does not seem to be executing the code in
the catch, step through the machine instructions in the epilogue of
_Unwind_RaiseException().  Examine the values that the stack pointer and
frame pointer are set to.  The frame pointer should be set to the same
value as the frame pointer for the function containing the catch.  Step
through the return, which transfers control to the catch. Does it jump
to a reasonable place?

  - Sometimes there are problems in the C++ specific part of
the port,  typically the rtti (run time type info) stuff that the handler
uses to figure out the type of the throw, etc. If the program is crashing in 
__is_pointer() or __cplus_type_matcher(), this is likely your cause.

*4* 

Once the simple test case works (simple, because the throw is in the same
function as the handler, so no additional frames need to be unwound; only
the mechanism needs to be present), try this slightly more complex one:

#include <stdio.h>


void f ()
{
  printf (" in f()\n");
  throw 1;
}


int main() {
  try {
    printf(" before throw\n");
    f();
  }
  catch (...) {
    printf(" in catch\n");
    return 6;
  }
  printf(" back in main\n");
  return 10;
}


Running this one requires that we unwind through f(), which requires
stack adjustments and exercises most of the unwind mechanism.


*5*

Check to see that the C++ specific part of EH works and that
constructors and destructors are being called in the EH process.


// Testcase for proper handling of
// c++ type, constructors and destructors.

#include <stdio.h>

int c, d;

struct A
{
  int i;
  A () { i = ++c; printf ("A() %d\n", i); }
  A (const A&) { i = ++c; printf ("A(const A&) %d\n", i); }
  ~A() { printf ("~A() %d\n", i); ++d; }
};

void
f()
{
  printf ("Throwing 1...\n");
  throw A();
}


int
main ()
{
  try
    {
      f();
    }
  catch (A)
    {
      printf ("Caught.\n");
    }
  printf ("c == %d, d == %d\n", c, d);
  return c != d;
}


You should get the following output:

Throwing 1...
A() 1
A(const A&) 2
~A() 1
A(const A&) 3
Caught.
~A() 3
~A() 2
c == 3, d == 3

