DWARF Package File Format

This document describes the file format for DWARF package files, version 2. A DWARF package file is an ELF-format file that collects the contents from the separate DWARF object (.dwo) files produced during the compilation of an application. DWARF object (.dwo) files are produced when compiling with GCC's -gsplit-dwarf option (see DWARF Extensions for Separate Debug Info Files).

DWARF package files can be built with the dwp utility, which is built along with the gold linker in binutils. Its usage is:

Usage: dwp [options] [file...]
  -h, --help               Print this help message
  -e EXE, --exec EXE       Get list of dwo files from EXE (defaults output to EXE.dwp)
  -o FILE, --output FILE   Set output dwp file name
  -v, --verbose            Verbose output
  -V, --version            Print version number

If given a list of .dwo files, it will build a package file with the name given by the -o option. If given an executable or shared library with the -e option, it will read the executable to obtain the list of .dwo files, and build a package file with the extension .dwp appended to the name of the executable file (unless the -o option is also given).

Differences from Version 1

Version 1 was an experimental version that is no longer supported.

In the first version of this format, each compilation unit had a separate set of sections for .debug_info and associated sections. In version 2, the .debug_info sections from each compilation unit are combined into a single .debug_info section in the .dwp file. Likewise, the .debug_types, .debug_abbrev, .debug_line, .debug_loc, .debug_str_offsets, .debug_macinfo, and .debug_macro sections are combined, and the CU and TU index sections now record the offsets into those sections where the contributions for each CU or TU begin.

Version 2 adds fast lookup tables to the package file. These tables are described in New DWARF Fast Lookup Tables.

Version 2 also adds the concept of a “thin” package file, which contains an index of debug information without copying the actual information from the .dwo files.

Design Goals

The design of the DWARF package (.dwp) file is guided by the following goals:

High-Level File Structure

The .dwp file is an ELF-format file, using the same byte order and ELF class (32-bit or 64-bit) as the corresponding application binary. It consists only of a file header, a section table, a number of DWARF debug information sections, two index sections, and (optionally) the new fast lookup table sections.

Each .dwp file will contain no more than one of each of the following sections:

.debug_info.dwo
.debug_types.dwo
.debug_abbrev.dwo
.debug_line.dwo
.debug_loc.dwo
.debug_str_offsets.dwo
.debug_str.dwo
.debug_macinfo.dwo
.debug_macro.dwo

The string table section in .debug_str.dwo contains all the strings referenced from DWARF attributes using the form DW_FORM_str_index. Any attribute in a compilation unit or a type unit using this form will refer to an entry in that unit's contribution to the .debug_str_offsets.dwo section, which in turn will provide the offset of a string in the .debug_str.dwo section.
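The two-step lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the entries in .debug_str_offsets.dwo are taken to be 4-byte offsets with no header in front of them, the data is little-endian, and the function name lookup_string and the base parameter (the start of the unit's contribution to .debug_str_offsets.dwo) are hypothetical.

```python
import struct

def lookup_string(str_index, str_offsets, str_data, base=0, byteorder="<"):
    """Resolve a DW_FORM_str_index value to a string.

    str_offsets: raw bytes of .debug_str_offsets.dwo
    str_data:    raw bytes of .debug_str.dwo
    base:        start of this unit's contribution to .debug_str_offsets.dwo
    Assumes 4-byte entries and no header in the contribution.
    """
    (offset,) = struct.unpack_from(byteorder + "I", str_offsets, base + 4 * str_index)
    end = str_data.index(b"\0", offset)  # strings are NUL-terminated
    return str_data[offset:end].decode("utf-8")

# Toy data: three strings and one unit whose contribution starts at offset 0.
str_data = b"int\0main\0argc\0"
str_offsets = struct.pack("<3I", 0, 4, 9)
print(lookup_string(1, str_offsets, str_data))  # main
```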

For the purposes of recording offsets to DIEs in a compilation unit or in a type unit, we define a virtual debug address space consisting of the .debug_info.dwo section combined with the .debug_types.dwo section. The .debug_info.dwo section begins at offset 0 within this address space, and the .debug_types.dwo section begins immediately following the end of the .debug_info.dwo section. In the fast lookup table sections, references to a CU, TU, or a DIE within a CU or TU, will use offsets within the virtual debug address space.
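As a small sketch of this mapping, the helpers below convert between a (section, offset) pair and an offset in the virtual debug address space. The function names and the info_size parameter (the size in bytes of the .debug_info.dwo section) are assumptions made for illustration.

```python
INFO = ".debug_info.dwo"
TYPES = ".debug_types.dwo"

def to_virtual(section, offset, info_size):
    """Map an offset within a section to the virtual debug address space."""
    if section == INFO:
        return offset
    if section == TYPES:
        # .debug_types.dwo starts right after the end of .debug_info.dwo.
        return info_size + offset
    raise ValueError("section is not part of the virtual debug address space")

def from_virtual(virtual_offset, info_size):
    """Map a virtual offset back to a (section, offset) pair."""
    if virtual_offset < info_size:
        return (INFO, virtual_offset)
    return (TYPES, virtual_offset - info_size)

# With a 0x1000-byte .debug_info.dwo, a TU at offset 0x20 of .debug_types.dwo
# has virtual offset 0x1020.
print(hex(to_virtual(TYPES, 0x20, 0x1000)))
```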

Package files may be “thin” or self-contained. A thin package file contains index information only, and refers to the debug information in the original .dwo files. A self-contained file contains index information as well as all the debugging information. A self-contained file must contain a .debug_info.dwo and .debug_abbrev.dwo section; all others are present only if needed.

The CU Index Section

The first index section is a compilation unit index that maps a compilation unit signature to the offset of a CU within the virtual debug address space, and to a set of offsets into the various other debug information sections. This section is named .debug_cu_index.

Each compilation unit set must contain a contribution from each of the following sections:

.debug_info.dwo
.debug_abbrev.dwo

Each compilation unit set may also contain a contribution from each of the following sections:

.debug_line.dwo
.debug_loc.dwo
.debug_str_offsets.dwo
.debug_macinfo.dwo
.debug_macro.dwo

(Note that a set should not contain both .debug_macinfo.dwo and .debug_macro.dwo. The latter is an extension that is intended to replace the former in a future version of DWARF.)

The TU Index Section

The second index section is a type unit index that maps a type signature to the offset of a TU within the virtual debug address space, and to a set of offsets into the various other debug information sections. This section is named .debug_tu_index.

Each type unit set must contain a contribution from each of the following sections:

.debug_types.dwo
.debug_abbrev.dwo

Each type unit set may also contain a contribution from each of the following sections:

.debug_line.dwo
.debug_str_offsets.dwo

The Fast Lookup Table Sections

The following fast lookup table sections may also be present in the .dwp file:

.debug_names
.debug_typenames
.debug_namespaces

The index entries in these sections refer to DIEs in the .debug_info.dwo or .debug_types.dwo sections by their offset in the virtual debug address space.

The fast lookup table sections are a separate DWARF proposal, currently still under development.

The File Directory Section

A thin package file requires one additional section that provides a directory of the referenced .dwo files.

Format of the CU and TU Index Sections

Both index sections have the same format, and serve to map a 64-bit signature to a set of contributions to the debug sections. Each section begins with a header, followed by a hash table of signatures, a parallel table of indexes, a table of offsets, and a table of sizes. The index sections will be aligned at 8-byte boundaries in the file.

[Figure: DWARF Package File Format, Version 2]

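The index section header contains four unsigned 32-bit values (using the byte order of the application binary):

version: the format version number (2)
L: the number of columns in the table of offsets
N: the number of compilation units or type units in the index
M: the number of slots in the hash table

(We assume that N and M will not exceed 2^32.)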
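Reading the header can be sketched as below. The sketch assumes the four values appear in the order version, L (columns in the table of offsets), N (units), M (hash slots), and that the data is little-endian unless stated otherwise; the function name parse_index_header is hypothetical.

```python
import struct

def parse_index_header(section, byteorder="<"):
    """Return (version, L, N, M) from the first 16 bytes of an index section.

    Assumed field order: version, column count L, unit count N, slot count M.
    """
    version, ncols, nunits, nslots = struct.unpack_from(byteorder + "4I", section, 0)
    if version != 2:
        raise ValueError("unsupported index version: %d" % version)
    return version, ncols, nunits, nslots

# Toy header: version 2, 3 columns, 5 units, 8 hash slots.
header = struct.pack("<4I", 2, 3, 5, 8)
print(parse_index_header(header))  # (2, 3, 5, 8)
```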

The size of the hash table, M, must be 2^k such that 2^k > 3 * N / 2.
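One way to pick a table size that satisfies this constraint is sketched below. Choosing the smallest such power of two is an assumption; the text only requires that M be a power of two greater than 3 * N / 2. The function name is hypothetical.

```python
def hash_table_size(n):
    """Smallest power of two M with M > 3 * n / 2 (tested as 2 * M > 3 * n)."""
    m = 1
    while 2 * m <= 3 * n:
        m *= 2
    return m

for n in (1, 4, 6, 100):
    print(n, hash_table_size(n))
```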

The hash table begins at offset 16 in the section, and consists of an array of M 64-bit slots. Each slot contains a 64-bit signature (using the byte order of the application binary).

The parallel table begins immediately after the hash table (at offset 16 + 8 * M from the beginning of the section), and consists of an array of M 32-bit slots (using the byte order of the application binary), corresponding 1-1 with slots in the hash table. Each entry in the parallel table contains a row index into the table of offsets.

Unused slots in the hash table will have 0 in both the hash table entry and the parallel table entry. While 0 is a valid hash value, the row index in a used slot will always be non-zero.

Given a 64-bit compilation unit signature or a type signature S, an entry in the hash table is located as follows:

  1. Calculate a primary hash H = S & MASK(k), where MASK(k) is a mask with the low-order k bits all set to 1.

  2. Calculate a secondary hash H' = (((S >> 32) & MASK(k)) | 1).

  3. If the hash table entry at index H matches the signature, use that entry. If the hash table entry at index H is unused (all zeroes), terminate the search: the signature is not present in the table.

  4. Let H = (H + H') modulo M. Repeat at Step 3.

Because M > N and H' and M are relatively prime, the search is guaranteed to stop at an unused slot or find the match.
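The search steps above can be sketched as follows. The sketch assumes a header of four 32-bit values in the order version, L, N, M, and treats a slot as unused when its row index is 0. Both function names are hypothetical, and build_toy_index exists only to produce test data (it writes L = 0 and omits the tables of offsets and sizes).

```python
import struct

def find_row(section, signature, byteorder="<"):
    """Return the row index for a 64-bit signature, or None if absent."""
    _version, _ncols, _nunits, nslots = struct.unpack_from(byteorder + "4I", section, 0)
    if nslots == 0:
        return None
    mask = nslots - 1                     # MASK(k), because M == 2**k
    hash_start = 16                       # hash table follows the header
    index_start = 16 + 8 * nslots         # parallel table follows the hash table
    h = signature & mask                  # primary hash H
    h2 = ((signature >> 32) & mask) | 1   # secondary hash H', always odd
    for _ in range(nslots):               # bound the probe for malformed input
        (slot_sig,) = struct.unpack_from(byteorder + "Q", section, hash_start + 8 * h)
        (row,) = struct.unpack_from(byteorder + "I", section, index_start + 4 * h)
        if row == 0:                      # unused slot: signature not present
            return None
        if slot_sig == signature:
            return row
        h = (h + h2) % nslots
    return None

def build_toy_index(signatures, byteorder="<"):
    """Build header + hash table + parallel table for testing (hypothetical)."""
    n, m = len(signatures), 1
    while 2 * m <= 3 * n:
        m *= 2
    sigs, rows = [0] * m, [0] * m
    for row, s in enumerate(signatures, start=1):
        h, h2 = s & (m - 1), ((s >> 32) & (m - 1)) | 1
        while rows[h] != 0:
            h = (h + h2) % m
        sigs[h], rows[h] = s, row
    return (struct.pack(byteorder + "4I", 2, 0, n, m)
            + struct.pack(byteorder + "%dQ" % m, *sigs)
            + struct.pack(byteorder + "%dI" % m, *rows))

toy = build_toy_index([0x1122334455667788, 0x0000000100000008, 0x8])
print(find_row(toy, 0x0000000100000008))  # 2
print(find_row(toy, 0x10))                # None
```

All three toy signatures share the primary hash 0, so the second and third lookups only succeed by stepping through the table with the secondary hash.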

The table of offsets begins immediately following the parallel table (at offset 16 + 12 * M from the beginning of the section). The table is a two-dimensional array of 32-bit words (using the byte order of the application binary), with L columns and N+1 rows, in row-major order. Each row in the array is indexed starting from 0. The first row provides a key to the remaining rows: each column in this row provides an identifier for a debug section, and the offsets in the same column of subsequent rows refer to that section. The section identifiers are:

Identifier            Value  Section
DW_SECT_INFO          1      .debug_info.dwo
DW_SECT_TYPES         2      .debug_types.dwo
DW_SECT_ABBREV        3      .debug_abbrev.dwo
DW_SECT_LINE          4      .debug_line.dwo
DW_SECT_LOC           5      .debug_loc.dwo
DW_SECT_STR_OFFSETS   6      .debug_str_offsets.dwo
DW_SECT_MACINFO       7      .debug_macinfo.dwo
DW_SECT_MACRO         8      .debug_macro.dwo

The offsets provided by the CU and TU index sections are the base offsets for the contributions made by each CU or TU to the corresponding section in the package file. Each CU and TU header contains an abbrev_offset field, used to find the abbreviations table for that CU or TU within its contribution to the .debug_abbrev.dwo section; this field should be interpreted as relative to the base offset given in the index section. Likewise, offsets into .debug_line.dwo from DW_AT_stmt_list attributes should be interpreted as relative to the base offset for .debug_line.dwo, and offsets into other debug sections obtained from DWARF attributes should also be interpreted as relative to the corresponding base offset.
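As a worked example of this rule, the helper below turns a unit-relative offset into an offset within the combined section. The name absolute_offset and the numeric values are hypothetical, and the bounds check against the contribution size is an added sanity check rather than something the format requires.

```python
def absolute_offset(base_offset, relative_offset, contribution_size):
    """Convert an offset relative to a unit's contribution into a section offset."""
    if not 0 <= relative_offset < contribution_size:
        raise ValueError("offset lies outside the unit's contribution")
    return base_offset + relative_offset

# Hypothetical values: the index gives a CU a .debug_abbrev.dwo contribution
# starting at 0x400 with size 0x80, and the CU header has abbrev_offset = 0.
print(hex(absolute_offset(0x400, 0x0, 0x80)))    # 0x400
# A DW_AT_stmt_list value of 0x10 with a .debug_line.dwo base of 0x2000:
print(hex(absolute_offset(0x2000, 0x10, 0x300)))  # 0x2010
```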

The table of sizes begins immediately following the table of offsets, and provides the sizes of the contributions made by each CU or TU to the corresponding section in the package file. Like the table of offsets, it is a two-dimensional array of 32-bit words, with L columns and N rows, in row-major order. Each row in the array is indexed starting from 1 (row 0 is shared by the two tables).
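The layout of the two tables can be sketched as below: given a row index obtained from the parallel table, the function returns the base offset and size of each contribution. It assumes a header of four 32-bit values in the order version, L, N, M; the function name and the toy section are hypothetical.

```python
import struct

DW_SECT_NAMES = {
    1: ".debug_info.dwo", 2: ".debug_types.dwo", 3: ".debug_abbrev.dwo",
    4: ".debug_line.dwo", 5: ".debug_loc.dwo", 6: ".debug_str_offsets.dwo",
    7: ".debug_macinfo.dwo", 8: ".debug_macro.dwo",
}

def read_contributions(section, row, byteorder="<"):
    """Return {section name: (base offset, size)} for a row index (1..N)."""
    _version, ncols, nunits, nslots = struct.unpack_from(byteorder + "4I", section, 0)
    if not 1 <= row <= nunits:
        raise IndexError("row index out of range")
    offsets_start = 16 + 12 * nslots                        # after the parallel table
    sizes_start = offsets_start + 4 * ncols * (nunits + 1)  # after N+1 rows of offsets
    fmt = byteorder + "%dI" % ncols
    ids = struct.unpack_from(fmt, section, offsets_start)   # row 0: section identifiers
    offsets = struct.unpack_from(fmt, section, offsets_start + 4 * ncols * row)
    sizes = struct.unpack_from(fmt, section, sizes_start + 4 * ncols * (row - 1))
    return {DW_SECT_NAMES.get(i, i): (o, s) for i, o, s in zip(ids, offsets, sizes)}

# Toy section: L = 2 columns, N = 2 units, M = 4 slots (tables left empty).
toy_section = (
    struct.pack("<4I", 2, 2, 2, 4)
    + bytes(8 * 4) + bytes(4 * 4)          # hash table and parallel table
    + struct.pack("<6I", 1, 3, 0, 0, 100, 40)  # identifiers, then offsets rows 1-2
    + struct.pack("<4I", 100, 40, 50, 20)      # sizes rows 1-2
)
print(read_contributions(toy_section, 2))
```

Because the table of sizes has no key row of its own, row r of the sizes sits r - 1 rows past the start of that table, while row r of the offsets sits r rows past the start of the table of offsets.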

Format of the File Directory Section

[TBD. For thin package files only.]

Notes

DebugFissionDWP (last edited 2016-03-16 12:01:27 by tschwinge)