Split DWARF and rnglists, gcc vs clang

Simon Marchi simon.marchi@polymtl.ca
Fri Nov 13 14:45:24 GMT 2020


On 2020-11-12 7:11 p.m., Mark Wielaard wrote:
> Hi Simon,
>
> On Thu, Nov 05, 2020 at 11:11:43PM -0500, Simon Marchi wrote:
>> I'm currently squashing some bugs related to .debug_rnglists in GDB, and
>> I happened to notice that clang and gcc do different things when
>> generating rnglists with split DWARF.  I'd like to know if the two
>> behaviors are acceptable, and therefore if we need to make GDB accept
>> both.  Or maybe one of them is not doing things correctly and would need
>> to be fixed.
>>
>> clang generates a .debug_rnglists.dwo section in the .dwo file.  Any
>> DW_FORM_rnglistx attribute in the DWO refers to that section.  That
>> section is not shared with any other object, so DW_AT_rnglists_base is
>> never involved for these attributes.  Note that there might still be a
>> DW_AT_rnglists_base on the DW_TAG_skeleton_unit, in the linked file,
>> used if the skeleton itself has an attribute of form DW_FORM_rnglistx.
>> This rnglist would be found in a .debug_rnglists section in the linked
>> file, shared with the other units of the linked file.
>>
>> gcc generates a single .debug_rnglists in the linked file and no
>> .debug_rnglists.dwo in the DWO files.  So when an attribute has form
>> DW_FORM_rnglistx in a DWO file, I presume we need to do the lookup in
>> the .debug_rnglists section in the linked file, using the
>> DW_AT_rnglists_base attribute found in the corresponding skeleton unit.
>> This looks vaguely similar to how it was done pre-DWARF 5, with
>> DW_AT_GNU_ranges base.
>>
>> So, is gcc wrong here?  I don't see anything in the DWARF 5 spec
>> prohibiting to do it like gcc does, but clang's way of doing it sounds
>> more in-line with the intent of what's described in the DWARF 5 spec.
>> So I wonder if it's maybe an oversight or a misunderstanding between the
>> two compilers.
>
> I think I would have asked the question the other way around :) The
> spec explicitly describes rnglists_base (and loclists_base) as a way
> to reference ranges (loclists) through the index table, so that the
> only relocation you need is in the (skeleton) DIE.

I presume you reference this non-normative text in section 2.17.3?

    This range list representation, the rnglist class, and the related
    DW_AT_rnglists_base attribute are new in DWARF Version 5. Together
    they eliminate most or all of the object language relocations
    previously needed for range lists.

What I understand from this is that the rnglist class and
DW_AT_rnglists_base attribute help reduce the number of relocations in
the non-split case (it removes the need for relocations from
DW_AT_ranges attribute values in .debug_info to .debug_rnglists).  I
don't understand it as saying anything about where to put the rnglist
data in the split-unit case.

> But the rnglists
> (loclists) themselves can still use relocations. A large part of them
> is non-shared addresses, so using indexes (into the .debug_addr
> addr_base) would simply be extra overhead. The relocations will
> disappear once linked, but the index tables won't.
>
> As an alternative, if you like to minimize the amount of debug data in
> the main object file, the spec also describes how to put a whole
> .debug_rnglists.dwo (or .debug_loclists.dwo) in the split dwarf
> file. Then you cannot use all entry encodings and do need to use an
> .debug_addr index to refer to any addresses in that case. So the
> relocations are still there, you just refer to them through an extra
> index indirection.
>
> So I believe both encodings are valid according to the spec. It just
> depends on what you are optimizing for, small main object file size or
> smallest encoding with least number of indirections.

So, if I understand correctly, gcc's way of doing things (putting all
the rnglists in a common .debug_rnglists section) reduces the overall
size of debug info since the rnglists can use the direct addressing
rnglists entries (e.g. DW_RLE_start_end) rather than the indirect ones
(e.g. DW_RLE_startx_endx).  But this come at the expense of a lot of
relocations in the rnglists themselves, since they refer to addresses
directly.

I thought that the main point of split-units was to reduce the number of
relocations processed by the linker and data moved around by the linker,
to reduce link time and provide a better edit-build-debug cycle.  Is
that the case?

Anyway, regardless of the intent, the spec should ideally be clear about
that so we don't have to guess.

> P.S. I am really interested in these interpretations of DWARF, but I
> don't really follow the gdb implementation details very much. Could we
> maybe move discussions like these from the -patches list to the main
> gdb (or gcc) mailinglist?

Sure, I added gdb@ and gcc@.  I also left gdb-patches@ so that it's
possible to follow the discussion there.

Simon


More information about the Gcc mailing list