This document is taken from the version revised by Aldy Hernandez and posted to the GCC mailing list here.

Until this document is fully marked-up, here are clickable versions of the embedded links:

http://www.usenix.org/events/osdi2000/wiess2000/full_papers/dinechin/dinechin_html/

http://www.codesourcery.com/cxx-abi/abi-eh.html

Dwarf2 Exception Handler HOWTO

Authors: Will Cohen and Andwrew Macleod, Red Hat Inc.

Description of EH

Exception Handling (EH) is provided by languages such as C++ and Java to indicate something unusual has happened in a called function and special action needs to be taken to resolve the problem. C++ and Java surround code that may produce an exception with a try block. Following the try block will be a catch block. The code in the catch block is only executed if a throw is encountered within the scope of the try block. A throw is used to start the exception handling process. If a throw is encountered, the processor goes up the call chain looking for an appropriate catch. Once an appropriate catch is found, the registers and stack are fixed up to resume execution in the catch block.

How EH works

Although the mechanism is simple in concept, the details introduced by the compiler complicate the mechanism. Most compiler ports use the processor registers to store values. The register values are often saved on the stack in a function's prologue and restored in the function's epilogue to give the function additional working registers. When the throw occurs, the processor does not complete the execution of the functions on the call chain between the function that contains the catch and the function performing the throw. However, the processor must restore the register values (including the frame and stack pointers) even though epilogues are not executed for functions on the call chain. In GCC the exception handling process can be implemented either with a setjmp/longjmp mechanism or a stack unwinder that uses dwarf2 debugging information to determine how to restore the registers. Because the dwarf2 mechanism leads to more compact code that executes more efficiently, the disccusion will be limited to the dwarf2 mechanism.

Consider the following function call chain:

  a() calls b() calls c() 

If c() throws, and the throw is caught by a(), we have to restore all the registers which c() saved in the prologue, then restore all the registers which b()'s prologue saved. Finally, control is transfered back to the exception handler located in the catch of function a(). This puts all the correct values back in their places so that a() will execute properly.

The epilogues cannot be run for each function because they will do other unrelated things (for example produce return values), as well as assume a certain start state themselves. Thus, a stack unwinding mechanism is used to restore the registers. The compiler marks every functions' prologue instructions that adjust the stack pointer, change the frame pointer, or store a saved register with a RTX_FRAME_RELATED_EXPR note. This information to unwind the stack is recorded in the dwarf2 debugging information.

The throw can use an object to pass information back to a catch. Because the portion of the stack being used by the function initiating the throw will soon be removed, memory for the object being thrown will be allocated on the heap rather than the stack, so the object will still exist at the catch. This is implemented with a call to the function __cxa_allocate_exception() followed by a call to a constructor to initalize the object. After the object is created, the throw is initiated with __cxa_throw(), and the processor will never return from this call.

The unwinding is performed in two steps. The first phase searchs for a frame with an associated catch and tracks the values for the registers. Much of the first phase is managed by _Unwind_RaiseException() in unwind.inc. The second phase installs the restored values for the registers, adjusts the stack and jumps to the catch. This is implemented by special code in the function epilogue of _Unwind_Raise_Exception().

The actual jump that transfers to the catch usually jumps to a landing pad rather than to the catch directly. The landing pad may perform fixup code within the function due to the optimizations performed within the function with the catch. After the landing pad code is executed, a switch case is executed to determine what action to take. There may be multiple catches associated with a particular try, each is for a different type of object thrown. It is possible that none of the catches in that frame match and the unwinding will need to resume to find an appropriate catch. The information on the type of object being thrown is passed in one of the processor registers.

For more information about the operation of the EH mechanism read the following documents:

Requirements for EH

The exception handing via dwarf2 debugging information requires several things to work:

Implementing Dwarf2 EH

Given a function call chain as was described above, RTX_FRAME_RELATED_EXPR notes are specified for a handful of significant instructions which affect registers the unwinder cares about. In general, these are:

The compiler then examines these instructions and generates dwarf2 unwind information in the .eh_frame section. This information, encoded in dwarf2, are instructions for a state machine which describes where values are saved, and how to get at them. The library libgcc contains a dwarf2 interpreter which is used at runtime to actually get these values and restore them to their correct registers. Once described properly, this all happens automatically. Most of the time, ports do not need to modify the runtime at all, just set things up in their config directory and flag the proper instructions.

In order to understand which instructions are actually significant, it will help if you understand approximately how the dwarf2 unwind code works.

In order to restore the registers for function c() (which is executing a throw), we need to execute the dwarf2 code representing instructions up to the point where the throw happens. I.e., interpret the dwarf2 code which will "undo" the register saves which have been performed up until the point of the throw. This part is taken care of automatically as well. What is important is that we tag all the correct instructions. Again, we only care about prologue instructions. Any other ones which the unwinder might care about it will find itself. Usually, these are just other stack bumps within the body of a function.

The dwarf2 unwinder keeps a value called the Canonical Frame Address (CFA). All memory references it makes are relative to this address. Initially, this value is defined as the value of the stack pointer upon entry to the function (i.e. before any instructions are actually executed). Any memory references in the prologue to save the values of register are stored in the dwarf2 info as a POSITIVE offset to the value of the CFA. It looks at STACK_GROW_DOWNWARD to determine whether that offset is added or subtracted from the CFA.

The emitter tracks the value of the CFA by remembering what register it is based off of, and an offset to this register. By default this offset is 0, so the CFA is defined upon entry to the function as SP + 0. If the target stack frame stores any values at a negative offset to this, (for example, some targets store the return address in the previous 4 bytes) we need to define an initial offset for the CFA such that the offset is always non-negative. This is accomplished by defining the value the following macro in the target's header file.

 #define INCOMING_FRAME_SP_OFFSET x 

Where x would normally be 0, but in this case we need to specify 4 or -4 in order for all the offsets to be positive. The unwinder looks to see if STACK_GROWS_DOWNWARD is defined to determine which way the stack goes, and adjusts the sign of all its offsets appropriately. If the stack does grow downward, it knows all saves/load offsets will actually be subtracted from the CFA, or if the stack grows upward, the offsets will all be added to the CFA. In order to get the initial offset value of the CFA correct, you will need to subtract 4 bytes if the stack grows upward, or add 4 bytes if the stack grows downward.

We have to flag any instruction in the prologue which executes a register save which is in turn restored in the epilogue. The dwarf2 emitter needs to be able to examine the instruction and determine at what offset this store is going to happen from the CFA. The register number and this value is then inserted into the dwarf2 code stream. The key here is that the emitter needs to be able to tell from the instruction what the address is. Currently, the emitter only handles simple cases, so the instruction needs to be relatively self explanatory:

Since the CFA is initially defined as the value of the stack pointer, it is easy if the instruction saves off the stack pointer. For example:

(set (mem:SI
        (plus:SI (reg:SI SP) (const_int 8)))
     (reg:SI 8))

This saves register 8 at CFA + 8. It's easy to figure it from looking at the instruction, as long as we know of any modifications we've made to SP since the start of the function. So the above insttruction will need no annotations.

However, if our save instruction instead looks like:

(set (mem:SI
        (plus:SI (reg:SI SP) (reg:SI 6)))
     (reg:SI 8))

In this case, the unwinder has no way of knowing at what offset register 8 is being saved at, so the prologue code should emit an RTX_FRAME_RELATED_EXPR note that describes the action.

This is done by setting the RTX_FRAME_RELATED_P flag on the instruction, and attaching a REG_FRAME_RELATED_EXPR note to it. If this note is present, the unwinder ignores the instruction itself, and looks at the note to determine the instruction's action. For example, if the value of register 6 is 24, we would attach a note to our initial instruction like, such that the resulting instruction looks like this:

  (insn x x x x (set (mem:SI
                       (plus:SI (reg:SI SP) (reg:SI 6)))
                     (reg:SI 8))
                (expr_list:REG_FRAME_RELATED_EXPR
                  (set (mem:SI
                         (plus:SI (reg:SI SP) (const_int 24)))
                       (reg:SI 8))))

It is very important that at any given point in the function, the unwinder knows how to find the value of the CFA. Sometimes it is easy, as this value might be contained in the frame_pointer for the duration of the function, but other times we might have to calculate it. The runtime part takes care of this, inasmuch as we flag the appropriate instructions.

Relevant Macros:

EPILOGUE COMMUNICATION

In key function in the unwinder is _Unwind_RaiseException(). In order for _Unwind_RaiseException() to work properly, it needs to be able to transfer control to the appropriate catch handler. Consequently, _Unwind_RaiseException() is processed in a special manner. First, it is compiled such that every possible preserved register is saved in the prologue and restored in the epilogue (this is done by virtue of __Unwind_Raise_Exception() calling uw_init_context(), which calls __builtin_unwind_init, which sets the function as having a nonlocal receiving function, causing gcc to mark all registers as live).

Next, the unwinder uses the EH tables to determine where this throw should transfer control to. The dwarf2 unwind interpreter is used to figure out what values are supposed to be in which registers. As the various values of the register are determined during unwinding, they are saved in a table which tracks the position in the frame of each value. We are not unwinding the stack yet, just determining where the right values for each register are currently located.

When we are ready to unwind the stack, we go through this table, and if any register does not already have the right value (for example it was saved in some prologue), we know where it is saved, and we copy it from that location into the place where _Unwind_RaiseException's prologue stored it. So we overwrite the value _Unwind_RaiseException() saved with the value we think the register should have when we transfer control to our selected handler. When we return from _Unwind_RaiseException(), the epilogue will restore these values.

There are still a few values which need to be fixed up. First, the return address of the function that called _Unwind_RaiseException() must be replaced with the address of the desired handler. When the return from _Unwind_RaiseException occurs, it will actually transfer control to the desired handler instead of returning to where _Unwind_RaiseException was called. The value of the stack pointer also needs to be adjusted.

Most of this is handled automatically, but you will have to do adjust the return address and stack pointer in the epilogue. You will need to define an eh_return pattern in your md file which will save the stack adjustment and return address values to the appropriate temporary registers. The code to generate the function epilogue will use the values in these registers to adjust the stack and jump to the appropriate location.

The trick with the eh_return instruction is that you will need to find 2 registers to use from the end of the function to the end of the epilogue. The last thing _Unwind_RaiseException() does is process the eh_return instruction which will set the stack offset and return address into the 2 registers specified in the eh_epilogue. Until needed at the end of the epilogue, these values cannot be overwritten. Typically, you pick 2 registers which are not preserved over calls, nor used as temporaries during epilogue processing. It is possible to use the register holding the return function value for the stack adjustment value. If the processor uses a register to hold the return address and you can prevent the epilogue from reloading the register from the stack, you can store the target address in the return address register.

You will need two additional registers to communicate information to the catch. These registers require stack slots, so they cannot be scratch registers that are not saved across function calls. The macro EH_RETURN_DATA_REGNO will need to be defined and return appropriate register numbers for the values 0 and 1.

REGISTERING FRAME INFORMATION

In order for exception handling to work, you need to register the frame information at runtime. This is accomplished in the same way that constructors and destructors are registered.

__register_frame_info() needs to be called in exactly the same way that a static constructor would be called. Each object file has a .eh_frame section which contains the frame information. If the given port does not support named sections, then the frame information will be issued in a text section with the label __FRAME_BEGIN__.

__register_frame_info should be called before any constructors, in case a constructor throws an exception.

The frame section's address is the first argument passed to __register_frame_info(). The second argument is the address of a local frame object. This is the struct object defined in frame.h.

After all the destructors are being called, __deregister_frame_info() must be called with the address of the start of the section.

DEBUGGING TIPS

First, compile a simple test case:

#include <stdio.h>

main() {
  try {
    throw 1;
  }
  catch (...) {
    printf(" in catch\n");
    return 1;
  }
  printf(" back in main\n");
  return 10;
}

When compiled and run like the following it should generate the output "in catch":

bash-2.04$ gcc -o throw0.x86 throw0.C
bash-2.04$ ./throw0.x86
 in catch
  1. Compile it with -S -dA and look at the .s file. There should be an .eh_frame section/area beginning with __FRAME_BEGIN__ and it should be annotated with comments about what each dwarf2 instruction is.

    If there is EH data, you should also see a .gcc_except_table section/area, beginning with the label: __EXCEPTION_TABLE__. If neither of these are present, this is the first aspect to fix. Usually these will show up on their own, but if you do not have named sections you might need to coerce it a bit. Make sure you have

    enabled DWARF2 EH by defining INCOMING_RETURN_ADDR_RTX, or you will not get these tables. Also look in except.h to make sure that all the conditions hold so that MUST_USE_SJLJ_EXCEPTIONS is defined to be 0. If MUST_USE_SJLJ_EXCEPTIONS is set to 1, then the setjmp/longjmp mechanism will be used. Also check to make sure the instructions in the prologue are properly marked so the unwinder can track register values. This can be checked by using readelf. Look for the saves of the appropriate registers in the Frame Descriptor Entry (FDE). The following is the output of

    readelf for an x86 program. The FDE shows saves of r3 and r5:

bash-2.04$ readelf --debug=frames throw1.x86
The section .eh_frame contains:

00000000 00000014 00000000 CIE
  Version:               1
  Augmentation:          "eh"
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 8

  DW_CFA_def_cfa: r4 ofs 4
  DW_CFA_offset: r8 at cfa-4

00000018 0000002c 0000001c FDE cie=00000000 pc=08048730..080487ce
  DW_CFA_advance_loc: 1 to 08048731
  DW_CFA_def_cfa_offset: 8
  DW_CFA_offset: r5 at cfa-8
  DW_CFA_advance_loc: 2 to 08048733
  DW_CFA_def_cfa_reg: r5
  DW_CFA_advance_loc: 1 to 08048734
  DW_CFA_offset: r3 at cfa-12
  DW_CFA_advance_loc: 11 to 0804873f
  DW_CFA_GNU_args_size: 16
  DW_CFA_advance_loc: 37 to 08048764
  DW_CFA_GNU_args_size: 8
  DW_CFA_advance_loc: 12 to 08048770
  DW_CFA_GNU_args_size: 16
  DW_CFA_advance_loc: 8 to 08048778
  DW_CFA_GNU_args_size: 0
  DW_CFA_advance_loc: 16 to 08048788
  DW_CFA_GNU_args_size: 16
  DW_CFA_advance_loc: 20 to 0804879c
  DW_CFA_GNU_args_size: 0
  DW_CFA_advance_loc: 18 to 080487ae
  DW_CFA_GNU_args_size: 16

class  A {
public:
  A() { }
  ~A () { }
};

A a;

main () {
}

#include <stdio.h>


void f ()
{
  printf (" in f()\n");
  throw 1;
}


main() {
  try {
  printf(" before throw\n");
    f();
  }
  catch (...) {
  printf(" in catch\n");
    return 6;
  }
  printf(" back in main\n");
return 10;
}

// Testcase for proper handling of
// c++ type, constructors and destructors.

#include <stdio.h>

int c, d;

struct A
{
  int i;
  A () { i = ++c; printf ("A() %d\n", i); }
  A (const A&) { i = ++c; printf ("A(const A&) %d\n", i); }
  ~A() { printf ("~A() %d\n", i); ++d; }
};

void
f()
{
  printf ("Throwing 1...\n");
  throw A();
}


int
main ()
{
  try
    {
      f();
    }
  catch (A)
    {
      printf ("Caught.\n");
    }
  printf ("c == %d, d == %d\n", c, d);
  return c != d;
}

Throwing 1...
A() 1
A(const A&) 2
~A() 1
A(const A&) 3
Caught.
~A() 3
~A() 2
c == 3, d == 3

None: Dwarf2EHNewbiesHowto (last edited 2010-08-04 10:27:12 by winnie)