[PATCH 0/4] Support for the CTF debug format

Jose E. Marchesi jose.marchesi@oracle.com
Fri Jan 22 11:01:33 GMT 2021


Hi people!

Last year we submitted a first patch series introducing support for
the CTF debugging format in GCC [1].  We got a lot of feedback that
prompted us to change the approach used to generate the debug info,
and this patch series is the result of that.

This implementation works, but there are several points that need
discussion and agreement with the upstream community, as they impact
the way debugging options work.  We are also proposing a way to add
additional debugging formats (such as BTF) in the future.  See below
for more details.

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01297.html

About CTF
=========

CTF is a debugging format designed in order to express C types in a
very compact way.  The key is compactness and simplicity.  For more
information see:

- CTF specification
  http://www.esperi.org.uk/~oranix/ctf/ctf-spec.pdf

- Compact C-Type support in the GNU toolchain (talk + slides)
  https://linuxplumbersconf.org/event/4/contributions/396/

- On type de-duplication in CTF (talk + slides)
  https://linuxplumbersconf.org/event/7/contributions/725/

CTF in the GNU Toolchain
========================

During the last year we have been working on adding support for CTF to
several components of the GNU toolchain:

- binutils support is already upstream.  It supports linking objects
  with CTF information with full type de-duplication.

- GDB support is to be sent upstream very shortly.  It makes the
  debugger capable of using the CTF information whenever available.
  This is useful in cases where DWARF has been stripped out but CTF is
  kept.

- GCC support is being discussed and submitted in this series.

From debug hooks to debug formats
=================================

Our first attempt at adding CTF to GCC used the obvious approach of
adding a new set of debug hooks as defined in gcc/debug.h.

During our first interaction with the upstream community we were told
to _not_ use debug hooks, because these are to be obsoleted at some
point.  It was suggested that we instead hook our handlers (which
processed type TREE nodes producing CTF types from them) somewhere
else.  So we did.

However at the time we were also facing the need to support BTF, which
is another type-related debug format needed by the BPF GCC backend.
Hooking here and there doesn't sound like such a good idea when it
comes to supporting several debug formats.

Therefore we thought about how to make GCC support diverse debugging
formats in a better way.  This led to a proposal we tried to discuss
at the GNU Tools Track in LPC2020:

- Update of the BPF support in the GNU Toolchain
  https://linuxplumbersconf.org/event/7/contributions/724/

Basically, the current situation in terms of diversity of debugging
formats in GCC can be summarized as follows:

     tree     --+                  +--> dwarf2out
     rtl      --+                  +--> dbxout
                +--> debug_hooks --+--> vmsdbgout
     backends --+                  +--> xcoffout
     lto      --+                  +--> godump

i.e. each debug format materializes in a set of debug hooks, as in
gcc/debug.h.  The installed hooks are then invoked from many different
areas of the compiler including front-end, middle-end, back-end and
also lto.  Most of the hooks get TREE objects, from which they are
supposed to extract/infer whatever information they need to express.

This approach has several problems, some of which were raised by you
people when we initially submitted the CTF support:

- The handlers depend on the TREE nodes, so if new TREE nodes are
  added to cover new languages, or functionality in existing
  languages, all the debug hooks may need to be updated to reflect it.

- This also happens when the contents of existing TREE node types
  change or get expanded.

- The semantics encoded in TREE nodes usually are not in the best form
  to be used by debug formats.  This implies that the several sets of
  debug hooks need to do very similar transformations, which again
  will have to be adjusted/corrected if the TREE nodes change.

- And more...

In contrast, this is how LLVM supports several debug formats:

                                     +--> DWARF
     IR --> class DebugHandlerBase --+--> CodeView
                                     +--> BTF    

i.e. LLVM gets debugging information as part of the IR, and then has
debug info backends in the form of instances of DebugHandlerBase,
which process that subset of the IR to produce whatever debug output.

To overcome the problems above, we thought about introducing a new set
of debug hooks, resulting in something like this:

                   +--> godump
                   +--> xcoffout
      debug_hooks -+--> vmsdbgout
                   +--> dbxout                        +--> DWARF
                   +--> dwarf2out --> n_debug_hooks --+--> BTF
                                        (walk)        +--> CTF
                                                      ... more ...

See how these "new debug hooks" are intended to be called by the old
DWARF debug hooks.  In this way:

- The internal DWARF representation becomes the canonical (and only)
  IR for debugging information in the compiler.  This is similar to
  what LLVM uses to implement support for DWARF, BTF and the Microsoft
  debug format.

- Debug formats (like CTF, BTF, stabs, etc) are implemented to provide
  a very simple API that traverses the DWARF DIE trees available in
  dwarf2out.

- The semantics expressed in the DWARF DIEs, which have been already
  extracted from the TREE nodes, are free of many internal details and
  more suitable to be easily translated into whatever abstractions the
  debug formats require.
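
To make the last point a bit more concrete, here is a minimal sketch
of the kind of mapping a DIE-based debug format can perform.  It is
not the code in dwarf2ctf.c: the function name mock_ctf_kind_for_tag
and its IS_FLOAT parameter are made up for illustration, and the
DW_TAG_* and CTF_K_* values are copied in by hand (they are believed
to match the DWARF standard and include/ctf.h, but please double-check
them before relying on them).

```c
/* Subset of DWARF tag values, as assumed from the DWARF standard.  */
enum
{
  DW_TAG_array_type = 0x01,
  DW_TAG_enumeration_type = 0x04,
  DW_TAG_pointer_type = 0x0f,
  DW_TAG_structure_type = 0x13,
  DW_TAG_subroutine_type = 0x15,
  DW_TAG_typedef = 0x16,
  DW_TAG_union_type = 0x17,
  DW_TAG_base_type = 0x24,
  DW_TAG_const_type = 0x26,
  DW_TAG_volatile_type = 0x35,
  DW_TAG_restrict_type = 0x37
};

/* Subset of CTF kind values, as assumed from include/ctf.h.  */
enum
{
  CTF_K_UNKNOWN = 0,
  CTF_K_INTEGER = 1,
  CTF_K_FLOAT = 2,
  CTF_K_POINTER = 3,
  CTF_K_ARRAY = 4,
  CTF_K_FUNCTION = 5,
  CTF_K_STRUCT = 6,
  CTF_K_UNION = 7,
  CTF_K_ENUM = 8,
  CTF_K_TYPEDEF = 10,
  CTF_K_VOLATILE = 11,
  CTF_K_CONST = 12,
  CTF_K_RESTRICT = 13
};

/* Hypothetical helper: return the CTF kind for a DIE with tag TAG.
   IS_FLOAT stands in for inspecting the encoding of a base type.
   Tags with no C type meaning yield CTF_K_UNKNOWN.  */
int
mock_ctf_kind_for_tag (int tag, int is_float)
{
  switch (tag)
    {
    case DW_TAG_base_type:
      return is_float ? CTF_K_FLOAT : CTF_K_INTEGER;
    case DW_TAG_pointer_type:     return CTF_K_POINTER;
    case DW_TAG_array_type:       return CTF_K_ARRAY;
    case DW_TAG_subroutine_type:  return CTF_K_FUNCTION;
    case DW_TAG_structure_type:   return CTF_K_STRUCT;
    case DW_TAG_union_type:       return CTF_K_UNION;
    case DW_TAG_enumeration_type: return CTF_K_ENUM;
    case DW_TAG_typedef:          return CTF_K_TYPEDEF;
    case DW_TAG_volatile_type:    return CTF_K_VOLATILE;
    case DW_TAG_const_type:       return CTF_K_CONST;
    case DW_TAG_restrict_type:    return CTF_K_RESTRICT;
    default:                      return CTF_K_UNKNOWN;
    }
}
```

The point of the sketch is only that the DIE tags already encode the
C-level notion of "struct", "typedef", "const", etc, so a format
handler does not have to rediscover it from TREE nodes.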
  
To avoid misunderstandings, we came to refer to these "new debug hooks"
simply as "debug formats".

In this patch series we are using this latter approach in order to
support CTF, and we can say we are happy about using the internal
DWARF DIEs as a source instead of TREE nodes: it led to a more natural
implementation, much easier to understand.  This sort of confirms in
practice that the approach is sound.
   
The debug format API
====================

As you can see in the patch series, we hooked CTF in dwarf2out_early_finish
like this:

     /* Emit CTF debug info.  */
     if (ctf_debug_info_level > CTFINFO_LEVEL_NONE && lang_GNU_C ())
       {
         ctf_debug_init ();
         debug_format_do_cu (comp_unit_die ());
         for (limbo_die_node *node = limbo_die_list; node; node = node->next)
           debug_format_do_cu (node->die);
         ctf_debug_finalize (filename);
       }

In turn, debug_format_do_cu traverses the tree of DIEs passed to it calling
ctf_do_die on them.

This constitutes the debug format API:

   FOO_debug_init ()
     Initialize the debug format FOO.

   FOO_debug_finalize (FILENAME)
     Write out, clean up and finalize the debug format FOO, as needed.

   FOO_do_die (DIE)
     Process the given DIE.
  
Note how the emission of DWARF is interrupted after that point, if no
DWARF was requested by the user.
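
As a rough, self-contained illustration of the shape of this API,
consider the sketch below.  It is not taken from the patch series:
struct mock_die is a simplified stand-in for the real DIE structures
in dwarf2out.c, and grouping the three entry points in a table of
function pointers (struct mock_debug_format) is just one possible way
to express them.  All the mock_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a DWARF DIE.  */
struct mock_die
{
  int tag;                        /* Stand-in for the DIE tag.  */
  const struct mock_die *child;   /* First child, or NULL.  */
  const struct mock_die *sibling; /* Next sibling, or NULL.  */
};

/* The three entry points of a debug format FOO.  */
struct mock_debug_format
{
  void (*init) (void);
  void (*do_die) (const struct mock_die *die);
  void (*finalize) (const char *filename);
};

/* Walk the tree rooted at DIE in pre-order, calling the do_die
   handler of FMT on every DIE found.  */
static void
mock_debug_format_do_cu (const struct mock_debug_format *fmt,
                         const struct mock_die *die)
{
  assert (fmt != NULL && die != NULL);
  fmt->do_die (die);
  for (const struct mock_die *c = die->child; c != NULL; c = c->sibling)
    mock_debug_format_do_cu (fmt, c);
}

/* A trivial "format" that only counts the DIEs it is given.  */
static int mock_dies_seen;

static void
mock_count_init (void)
{
  mock_dies_seen = 0;
}

static void
mock_count_do_die (const struct mock_die *die)
{
  (void) die;
  mock_dies_seen++;
}

static void
mock_count_finalize (const char *filename)
{
  (void) filename;  /* A real format would write its output here.  */
}

static const struct mock_debug_format mock_count_format =
  { mock_count_init, mock_count_do_die, mock_count_finalize };

/* Drive FMT over one compilation unit, in the same init / walk /
   finalize order used in the snippet above.  Returns the number of
   DIEs counted by the counting format.  */
static int
mock_emit (const struct mock_debug_format *fmt,
           const struct mock_die *cu, const char *filename)
{
  fmt->init ();
  mock_debug_format_do_cu (fmt, cu);
  fmt->finalize (filename);
  return mock_dies_seen;
}
```

Calling mock_emit with &mock_count_format and the root of a small DIE
tree visits every DIE exactly once.  A second format would only need
to provide its own three functions.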

dwarf2out - dwarf2ctf
=====================

The functions ctf_debug_init, ctf_do_die and ctf_debug_finalize, which
implement the API described above, are all in gcc/dwarf2ctf.c.

Obviously, these routines need access to the dwarf DIE data
structures, and several functions which are defined in dwarf2out.[ch],
many (most?) of which are private to that file: dw_die_ref, get_AT,
etc.

Therefore, in this implementation we opted to write dwarf2ctf.c in
such a way that it can simply be #included in dwarf2out.c.

A question remains: would it be better to abstract these types and
functions in an API in dwarf2out.h?

Command line options for debug formats
======================================

This implementation adds the following command-line options to select the
emission of CTF:

     -gt[123]

These options mimic the -g[123...] options for DWARF.

This involved adding new entries for debug_info_type:

     CTF_DEBUG            - Write CTF debug info.
     CTF_AND_DWARF2_DEBUG - Write both CTF and DWARF info.
   
Doing this, we just followed the trend initiated by vmsdbgout.c, which
added VMS_DEBUG and VMS_AND_DWARF2_DEBUG.

This approach is not very good, because debug_info_type was designed
to cover different debug hook implementations; debug formats, in
contrast, are a different thing.

This translates to problems and fragile behavior:

- Everywhere write_symbols is checked we have to expand the logic to
  take the CTF values into account.  You can see that is the case in
  this patch series.  This is very fragile and doesn't scale well: we
  are most probably missing some checks.

- The CTF debug format needs a certain DWARF debug level (2) in order to
  work, since otherwise not enough type DIEs get generated.  This will
  probably happen with some other formats as well.

- Therefore, -gt implicitly sets the DWARF debug level to 2.  But if
  the user uses -gt -g1, the CTF information will be incomplete
  because -g1 resets the DWARF debug level to 1.  -gtoggle also
  presents difficulties.
  
- Backends select what debug hooks to use by defining constants like
  DWARF2_DEBUGGING_INFO.  Since the new debug formats are based on the
  DWARF debug hooks, that is the constant backends have to define if
  they want to support DWARF plus the other debug formats.
     
  However, some backends may want to use one of the debug formats by
  default, i.e. for -g.  This is the case of the BPF backend, which
  needs to generate BTF instead of DWARF.  Currently, there is no way
  to specify this.

  We could add a new optional backend hook/constant to select the
  desired default debug format, like:

       #define DWARF2_DEBUGGING_INFO /* Selects the dwarf debug hooks */

       /* Selects the default debug format to emit with -g.  */
       #define CTF_DEBUGGING_FORMAT
       #define BTF_DEBUGGING_FORMAT
       #define DWARF_DEBUGGING_FORMAT /* The default */

  Regardless of what debug format is defined as the default, the other
  formats are also available with -gdwarf, -gctf, -gbtf, etc.
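
The -gt -g1 interaction mentioned in the list above can be modeled
with a few lines of C.  This is only a toy model of the behavior
described in this section, not the code in opts.c: struct mock_opts
and the mock_* functions are hypothetical, and the model assumes that
a plain -gt selects CTF level 2, mirroring what -g does for DWARF.

```c
#include <assert.h>
#include <string.h>

/* Toy model of the relevant option state.  */
struct mock_opts
{
  int dwarf_level;  /* Level used when generating DIEs.  */
  int ctf_level;    /* 0 means no CTF was requested.  */
};

/* Process the options left to right: -gt implicitly raises the
   DWARF level to 2, and a later -g1 resets it to 1.  */
static struct mock_opts
mock_parse (int argc, const char *const *argv)
{
  struct mock_opts o = { 0, 0 };

  assert (argc == 0 || argv != NULL);
  for (int i = 0; i < argc; i++)
    {
      if (strcmp (argv[i], "-gt") == 0)
        {
          o.ctf_level = 2;
          o.dwarf_level = 2;
        }
      else if (strcmp (argv[i], "-g1") == 0)
        o.dwarf_level = 1;
      else if (strcmp (argv[i], "-g") == 0)
        o.dwarf_level = 2;
    }
  return o;
}

/* Return 1 if CTF was requested and enough type DIEs will be
   generated for it to be complete, 0 otherwise.  */
static int
mock_ctf_complete_p (struct mock_opts o)
{
  return o.ctf_level > 0 && o.dwarf_level >= 2;
}
```

With this model, "-gt" alone gives complete CTF, whereas "-gt -g1"
does not, because the DIE generation level and the level requested by
the user for DWARF output share a single variable.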

-gt or -gctf
============

This patch series uses -gt to trigger the generation of CTF debug
data, but if we agree on the approach outlined in the last section for
supporting debug formats in the backends, most likely we will want to
use -gctf instead of -gt.

Work in progress: BTF as a debug format
=======================================

We are already working on adding support for the BTF debug format to
GCC.  This is needed by the BPF backend, which should generate BTF
instead of DWARF.  This is absolutely needed in order to compile BPF
programs that work in the Linux kernel, as explained in the "Update of
the BPF support in the GNU Toolchain" talk mentioned above.

Since BTF is very similar to CTF, we are just adding support for BTF
to the CTF implementation.  In this way, ctfout.[ch] and dwarf2ctf.c
provide two debug formats.

Indu Bhagat (4):
  Add new function lang_GNU_GIMPLE
  CTF debug format
  CTF testsuite
  CTF documentation

 gcc/Makefile.in                               |    3 +
 gcc/common.opt                                |    9 +
 gcc/ctfout.c                                  | 1579 +++++++++++++++++
 gcc/ctfout.h                                  |  322 ++++
 gcc/doc/invoke.texi                           |   16 +
 gcc/dwarf2cfi.c                               |    3 +-
 gcc/dwarf2ctf.c                               |  816 +++++++++
 gcc/dwarf2out.c                               |   32 +-
 gcc/final.c                                   |    5 +-
 gcc/flag-types.h                              |   19 +-
 gcc/gengtype.c                                |    2 +-
 gcc/langhooks.c                               |    9 +
 gcc/langhooks.h                               |    1 +
 gcc/opts.c                                    |   65 +-
 gcc/targhooks.c                               |    3 +-
 gcc/testsuite/gcc.dg/debug/ctf/ctf-1.c        |    6 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-2.c        |   10 +
 .../gcc.dg/debug/ctf/ctf-anonymous-struct-1.c |   23 +
 .../gcc.dg/debug/ctf/ctf-anonymous-union-1.c  |   26 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-1.c  |   31 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-2.c  |   38 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-3.c  |   17 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-4.c  |   13 +
 .../gcc.dg/debug/ctf/ctf-attr-mode-1.c        |   22 +
 .../gcc.dg/debug/ctf/ctf-attr-used-1.c        |   22 +
 .../gcc.dg/debug/ctf/ctf-bitfields-1.c        |   30 +
 .../gcc.dg/debug/ctf/ctf-bitfields-2.c        |   39 +
 .../gcc.dg/debug/ctf/ctf-bitfields-3.c        |   16 +
 .../gcc.dg/debug/ctf/ctf-bitfields-4.c        |   19 +
 .../gcc.dg/debug/ctf/ctf-complex-1.c          |   22 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-1.c        |   65 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-2.c        |   30 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-3.c        |   25 +
 .../gcc.dg/debug/ctf/ctf-cvr-quals-4.c        |   23 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-1.c   |   21 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-2.c   |   27 +
 .../gcc.dg/debug/ctf/ctf-file-scope-1.c       |   25 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-float-1.c  |   16 +
 .../gcc.dg/debug/ctf/ctf-forward-1.c          |   40 +
 .../gcc.dg/debug/ctf/ctf-forward-2.c          |   16 +
 .../gcc.dg/debug/ctf/ctf-func-index-1.c       |   25 +
 .../debug/ctf/ctf-function-pointers-1.c       |   24 +
 .../debug/ctf/ctf-function-pointers-2.c       |   22 +
 .../debug/ctf/ctf-function-pointers-3.c       |   21 +
 .../gcc.dg/debug/ctf/ctf-functions-1.c        |   34 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-int-1.c    |   17 +
 .../gcc.dg/debug/ctf/ctf-objt-index-1.c       |   30 +
 .../gcc.dg/debug/ctf/ctf-pointers-1.c         |   26 +
 .../gcc.dg/debug/ctf/ctf-pointers-2.c         |   25 +
 .../gcc.dg/debug/ctf/ctf-preamble-1.c         |   11 +
 .../gcc.dg/debug/ctf/ctf-skip-types-1.c       |   33 +
 .../gcc.dg/debug/ctf/ctf-skip-types-2.c       |   17 +
 .../gcc.dg/debug/ctf/ctf-skip-types-3.c       |   20 +
 .../gcc.dg/debug/ctf/ctf-skip-types-4.c       |   19 +
 .../gcc.dg/debug/ctf/ctf-skip-types-5.c       |   19 +
 .../gcc.dg/debug/ctf/ctf-skip-types-6.c       |   18 +
 .../gcc.dg/debug/ctf/ctf-str-table-1.c        |   26 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-1.c |   25 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-2.c |   32 +
 .../gcc.dg/debug/ctf/ctf-struct-array-1.c     |   65 +
 .../gcc.dg/debug/ctf/ctf-struct-pointer-1.c   |   21 +
 .../gcc.dg/debug/ctf/ctf-struct-pointer-2.c   |   22 +
 .../gcc.dg/debug/ctf/ctf-typedef-1.c          |   68 +
 .../gcc.dg/debug/ctf/ctf-typedef-2.c          |   20 +
 .../gcc.dg/debug/ctf/ctf-typedef-3.c          |   24 +
 .../gcc.dg/debug/ctf/ctf-typedef-struct-1.c   |   14 +
 .../gcc.dg/debug/ctf/ctf-typedef-struct-2.c   |   17 +
 .../gcc.dg/debug/ctf/ctf-typedef-struct-3.c   |   32 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-union-1.c  |   14 +
 .../gcc.dg/debug/ctf/ctf-variables-1.c        |   25 +
 .../gcc.dg/debug/ctf/ctf-variables-2.c        |   16 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf.exp        |   41 +
 gcc/testsuite/gcc.dg/debug/dwarf2-ctf-1.c     |    7 +
 gcc/toplev.c                                  |   21 +-
 include/ctf.h                                 |  513 ++++++
 libiberty/simple-object.c                     |    3 +
 76 files changed, 4862 insertions(+), 11 deletions(-)
 create mode 100644 gcc/ctfout.c
 create mode 100644 gcc/ctfout.h
 create mode 100644 gcc/dwarf2ctf.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-anonymous-struct-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-anonymous-union-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-attr-mode-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-attr-used-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-complex-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-file-scope-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-float-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-func-index-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-function-pointers-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-function-pointers-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-function-pointers-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-functions-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-int-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-objt-index-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-pointers-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-pointers-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-preamble-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-5.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-6.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-str-table-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-pointer-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-pointer-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-struct-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-struct-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-struct-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-union-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-variables-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-variables-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf.exp
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-ctf-1.c
 create mode 100644 include/ctf.h

-- 
2.25.0.2.g232378479e


