This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] add option to eliminate unused types in dwarf2 output


hi -

One of the complaints about gcc we've had locally has been about
the large size of the debugging symbol information emitted by gcc.
Compared to the other compiler we've been using (KCC), the debug
symbol table emitted by gcc can sometimes be as much as a factor
of 7 or 8 larger than that produced by KCC.  And when we're building
on the order of a GB of object code, that makes a big difference.

The major difference between the debugging information output by KCC
and that output by gcc is that gcc includes information about all
types and functions that are declared in the compilation unit
being processed, while KCC seems to include that information
only if the type is actually used in that compilation unit.
In fact, the code where i saw a large difference in the debug
symbol size between the two compilers was code that used a large
library: it included lots of headers from the library but made
only a few calls into it.

In discussions from a couple years ago, this was defended as the
correct behavior --- after all, you never know if the user of the
debugger may want to cast something to a type which was declared
in the program, but never actually used.  However, it seems rare
that someone using a debugger will be interested in types
which are nowhere used in the program being debugged.  Thus,
in view of the potentially large savings in space possible,
it seems to make sense to, as an option, allow eliminating unused
declarations from the debug symbol output.

That's what i've tried to do with the patch below.
It adds a new option, `-feliminate-unused-dwarf2-types'.
When turned on, we make an additional pass over the DIE tree
before emitting it, to remove type and function declarations
that are not referenced from anywhere.  In addition, this patch
also disables emitting the names of source files which do not
contribute any debugging information.
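
As a rough illustration of the mark-and-prune pass described above,
here is a minimal standalone sketch.  It is not the code from the
patch: `struct node', `add_child', `mark_used', `walk' and `prune'
are hypothetical names, each node carries at most one reference
instead of a list of attributes, and unlinked nodes are not freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DIE.  The real structure
   carries a list of attributes; here a node has at most one reference.  */
struct node
{
  struct node *parent;
  struct node *child;   /* First child.  */
  struct node *sib;     /* Next sibling.  */
  struct node *ref;     /* Node referenced by an attribute, or NULL.  */
  int is_type;          /* Nonzero for type nodes.  */
  int mark;             /* 0 = unused, 1 = used, 2 = used, kids walked.  */
};

static void walk (struct node *);

/* Link CHILD at the head of PARENT's child list.  */
static void
add_child (struct node *parent, struct node *child)
{
  child->parent = parent;
  child->sib = parent->child;
  parent->child = child;
}

/* Mark N as used, along with its ancestors and anything it references.
   If DOKIDS is nonzero, also walk N's children.  */
static void
mark_used (struct node *n, int dokids)
{
  struct node *c;

  if (n->mark == 0)
    {
      n->mark = 1;
      if (n->parent)
        mark_used (n->parent, 0);
      if (n->ref)
        mark_used (n->ref, 1);
    }

  if (dokids && n->mark != 2)
    {
      n->mark = 2;
      for (c = n->child; c; c = c->sib)
        walk (c);
    }
}

/* Walk the tree rooted at N.  Type nodes are not marked here; they
   only become marked when something else references them.  */
static void
walk (struct node *n)
{
  struct node *c;

  if (n->mark || n->is_type)
    return;

  n->mark = 1;
  if (n->ref)
    mark_used (n->ref, 1);
  for (c = n->child; c; c = c->sib)
    walk (c);
}

/* Unlink unmarked children of N, recursively.  Return the number of
   subtrees unlinked.  Nodes are not freed in this sketch.  */
static int
prune (struct node *n)
{
  struct node *c, *next, *prev = NULL;
  int removed = 0;

  assert (n->mark);
  for (c = n->child; c; c = next)
    {
      next = c->sib;
      if (c->mark)
        {
          removed += prune (c);
          prev = c;
        }
      else
        {
          if (prev)
            prev->sib = next;
          else
            n->child = next;
          removed++;
        }
    }
  return removed;
}
```

To try it, build a root node with two type nodes and one variable
node whose `ref' points at the first type, then call `walk' and
`prune' on the root.  Only the unreferenced type should be unlinked.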

Here's the sort of savings i get for two of the sources
in our system:

Without -feliminate-unused-dwarf2-types:
375268 Clique.o
334304 d0om_Dictionary.o

With -feliminate-unused-dwarf2-types:
116464 Clique.o
196196 d0om_Dictionary.o


This patch is against gcc 3.1; it bootstrapped and tested ok
on i686-pc-linux-gnu.  One caveat is that i haven't tested this
on any other platforms.

If it would help, i could remake the patch against 3.2.
I could also separate out the pieces dealing with eliminating
unused source file names.
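
For reference, the source file name part boils down to handing out
`.file' numbers lazily, on first use, rather than when a name is
first looked up.  The following is only a sketch of that idea under
simplifying assumptions: the fixed-size table and the names
`register_file' and `emit_file_once' are hypothetical and do not
appear in the patch.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_FILES 16            /* Hypothetical fixed capacity.  */

static const char *names[MAX_FILES];
static unsigned emitted[MAX_FILES];     /* 0 = not emitted yet.  */
static unsigned in_use = 1;             /* File numbers begin at 1.  */
static unsigned emitcount = 0;

/* Record NAME in the table and return its internal index,
   or 0 if the table is full.  Nothing is written out yet.  */
static unsigned
register_file (const char *name)
{
  if (in_use >= MAX_FILES)
    return 0;
  names[in_use] = name;
  emitted[in_use] = 0;
  return in_use++;
}

/* Write a .file directive for FILENO the first time it is needed,
   and return the number the directive was written with.  */
static unsigned
emit_file_once (FILE *out, unsigned fileno)
{
  assert (fileno > 0 && fileno < in_use);
  if (!emitted[fileno])
    {
      emitted[fileno] = ++emitcount;
      fprintf (out, "\t.file %u \"%s\"\n", emitted[fileno], names[fileno]);
    }
  return emitted[fileno];
}
```

A file that is registered but never passed to `emit_file_once' gets
no `.file' line at all, and the numbers that are written out stay
dense because they are assigned in order of first use.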

Anyway, is there any interest in adding something like this
to gcc?

scott snyder
snyder@fnal.gov



2002-06-16  scott snyder  <snyder@fnal.gov>

	* flags.h: Add flag_eliminate_unused_dwarf2_types.
	* toplev.c: Add flag_eliminate_unused_dwarf2_types.
	(f_options): Add -feliminate-unused-dwarf2-types.
	* dwarf2out.c (struct file_table): Add emitted member.
	(splice_child_die): Fix the parent pointer for the child being
	spliced.
	(lookup_filename): Maintain file_table.emitted array.  Don't
	output .file directive here.
	(maybe_emit_file): (new)
	(init_file_table): Set up file_table.emitted.
	(dwarf2out_source_line): Use maybe_emit_file.
	(dwarf2out_start_source_file): Use maybe_emit_file.
	(dwarf2out_init): Use maybe_emit_file.
	(prune_unused_types_walk_attribs): (new)
	(prune_unused_types_mark): (new)
	(prune_unused_types_walk): (new)
	(prune_unused_types_prune): (new)
	(prune_unused_types): (new)
	(dwarf2out_finish): Call prune_unused_types if
	flag_eliminate_unused_dwarf2_types is set.
	* doc/invoke.texi (Option Summary): Add
	-feliminate-unused-dwarf2-types.
	(Debugging Options): Likewise.


Index: flags.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/flags.h,v
retrieving revision 1.76.4.1
diff -u -p -c -r1.76.4.1 flags.h
*** flags.h	21 Mar 2002 23:12:21 -0000	1.76.4.1
--- flags.h	17 Jun 2002 23:37:14 -0000
*************** extern int flag_gcse_sm;
*** 634,639 ****
--- 634,643 ----
  
  extern int flag_eliminate_dwarf2_dups;
  
+ /* Nonzero means we should do dwarf2 unused type elimination.  */
+ 
+ extern int flag_eliminate_unused_dwarf2_types;
+ 
  /* Non-zero means to collect statistics which might be expensive
     and to print them when we are done.  */
  extern int flag_detailed_statistics;
Index: toplev.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.574.2.13
diff -u -p -c -r1.574.2.13 toplev.c
*** toplev.c	30 Apr 2002 23:04:51 -0000	1.574.2.13
--- toplev.c	17 Jun 2002 23:37:35 -0000
*************** void (*incomplete_decl_finalize_hook) PA
*** 378,383 ****
--- 378,387 ----
  
  int flag_eliminate_dwarf2_dups = 0;
  
+ /* Nonzero if doing dwarf2 unused type elimination.  */
+ 
+ int flag_eliminate_unused_dwarf2_types = 0;
+ 
  /* Nonzero if generating code to do profiling.  */
  
  int profile_flag = 0;
*************** static const lang_independent_options f_
*** 959,964 ****
--- 963,970 ----
  {
    {"eliminate-dwarf2-dups", &flag_eliminate_dwarf2_dups, 1,
     N_("Perform DWARF2 duplicate elimination") },
+   {"eliminate-unused-dwarf2-types", &flag_eliminate_unused_dwarf2_types, 1,
+    N_("Perform DWARF2 unused type elimination") },
    {"float-store", &flag_float_store, 1,
     N_("Do not store floats in registers") },
    {"volatile", &flag_volatile, 1,
Index: dwarf2out.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/dwarf2out.c,v
retrieving revision 1.356.2.9
diff -u -p -c -r1.356.2.9 dwarf2out.c
*** dwarf2out.c	7 May 2002 17:27:30 -0000	1.356.2.9
--- dwarf2out.c	17 Jun 2002 23:37:52 -0000
*************** struct file_table
*** 3298,3303 ****
--- 3298,3304 ----
    unsigned allocated;
    unsigned in_use;
    unsigned last_lookup_index;
+   unsigned *emitted;
  };
  
  /* Size (in elements) of increments by which we may expand the filename
*************** static void output_loc_list		PARAMS ((dw
*** 3674,3679 ****
--- 3675,3687 ----
  static char *gen_internal_sym 		PARAMS ((const char *));
  static void mark_limbo_die_list		PARAMS ((void *));
  
+ static void prune_unused_types_mark     PARAMS ((dw_die_ref, int));
+ static void prune_unused_types_walk     PARAMS ((dw_die_ref));
+ static void prune_unused_types_walk_attribs PARAMS ((dw_die_ref));
+ static void prune_unused_types_prune    PARAMS ((dw_die_ref));
+ static void prune_unused_types          PARAMS ((void));
+ static int maybe_emit_file              PARAMS ((int));
+ 
  /* Section names used to hold DWARF debugging information.  */
  #ifndef DEBUG_INFO_SECTION
  #define DEBUG_INFO_SECTION	".debug_info"
*************** splice_child_die (parent, child)
*** 5059,5064 ****
--- 5067,5073 ----
  	break;
        }
  
+   child->die_parent = parent;
    child->die_sib = parent->die_child;
    parent->die_child = child;
  }
*************** lookup_filename (file_name)
*** 11804,11809 ****
--- 11813,11821 ----
        file_table.allocated = i + FILE_TABLE_INCREMENT;
        file_table.table = (char **)
  	xrealloc (file_table.table, file_table.allocated * sizeof (char *));
+       file_table.emitted = (unsigned *)
+ 	xrealloc (file_table.emitted,
+                   file_table.allocated * sizeof (unsigned));
      }
  
    /* Add the new entry to the end of the filename table.  */
*************** lookup_filename (file_name)
*** 11811,11828 ****
    file_table.in_use = i + 1;
    file_table.last_lookup_index = i;
  
!   if (DWARF2_ASM_LINE_DEBUG_INFO)
!     fprintf (asm_out_file, "\t.file %u \"%s\"\n", i, file_name);
  
    return i;
  }
  
  static void
  init_file_table ()
  {
    /* Allocate the initial hunk of the file_table.  */
    file_table.table = (char **) xcalloc (FILE_TABLE_INCREMENT, sizeof (char *));
    file_table.allocated = FILE_TABLE_INCREMENT;
  
    /* Skip the first entry - file numbers begin at 1.  */
    file_table.in_use = 1;
--- 11823,11860 ----
    file_table.in_use = i + 1;
    file_table.last_lookup_index = i;
  
!   file_table.emitted[i] = 0;
  
    return i;
  }
  
+ static int
+ maybe_emit_file (fileno)
+      int fileno;
+ {
+   static int emitcount = 0;
+   if (DWARF2_ASM_LINE_DEBUG_INFO)
+     {
+       if (!file_table.emitted[fileno])
+         {
+           file_table.emitted[fileno] = ++emitcount;
+           fprintf (asm_out_file, "\t.file %u \"%s\"\n",
+                    file_table.emitted[fileno], file_table.table[fileno]);
+         }
+       return file_table.emitted[fileno];
+     }
+   else
+     return fileno;
+ }
+ 
  static void
  init_file_table ()
  {
    /* Allocate the initial hunk of the file_table.  */
    file_table.table = (char **) xcalloc (FILE_TABLE_INCREMENT, sizeof (char *));
    file_table.allocated = FILE_TABLE_INCREMENT;
+   file_table.emitted = (unsigned *) xcalloc (FILE_TABLE_INCREMENT,
+                                              sizeof (unsigned));
  
    /* Skip the first entry - file numbers begin at 1.  */
    file_table.in_use = 1;
*************** dwarf2out_source_line (line, filename)
*** 11851,11856 ****
--- 11883,11890 ----
  	{
  	  unsigned file_num = lookup_filename (filename);
  
+           file_num = maybe_emit_file (file_num);
+ 
  	  /* Emit the .loc directive understood by GNU as.  */
  	  fprintf (asm_out_file, "\t.loc %d %d 0\n", file_num, line);
  
*************** dwarf2out_start_source_file (lineno, fil
*** 11932,11937 ****
--- 11966,11972 ----
        dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
        dw2_asm_output_data_uleb128 (lineno, "Included from line number %d",
  				   lineno);
+       maybe_emit_file (lookup_filename (filename));
        dw2_asm_output_data_uleb128 (lookup_filename (filename),
  				   "Filename we just started");
      }
*************** dwarf2out_init (main_input_filename)
*** 12005,12010 ****
--- 12040,12046 ----
       be emitting line number data for it first, which avoids having
       to add an initial DW_LNS_set_file.  */
    lookup_filename (main_input_filename);
+   maybe_emit_file (lookup_filename (main_input_filename));
  
    /* Allocate the initial hunk of the decl_die_table.  */
    decl_die_table
*************** output_indirect_string (pfile, h, v)
*** 12122,12127 ****
--- 12158,12356 ----
    return 1;
  }
  
+ 
+ 
+ /* Given DIE that we're marking as used, find any other dies
+    it references as attributes and mark them as used.  */
+ 
+ static void
+ prune_unused_types_walk_attribs (die)
+      dw_die_ref die;
+ {
+   dw_attr_ref a;
+ 
+   for (a = die->die_attr; a != NULL; a = a->dw_attr_next)
+     {
+       if (a->dw_attr_val.val_class == dw_val_class_die_ref)
+         {
+           /* A reference to another DIE.
+              Make sure that it will get emitted.  */
+           prune_unused_types_mark (a->dw_attr_val.v.val_die_ref.die, 1);
+         }
+       else if (a->dw_attr == DW_AT_decl_file)
+         {
+           /* A reference to a file.  Make sure the file name is emitted.  */
+           a->dw_attr_val.v.val_unsigned =
+             maybe_emit_file (a->dw_attr_val.v.val_unsigned);
+         }
+     }
+ }
+ 
+ 
+ /* Mark DIE as being used.  If DOKIDS is true, then walk down
+    to DIE's children.  */
+ 
+ static void
+ prune_unused_types_mark (die, dokids)
+      dw_die_ref die;
+      int dokids;
+ {
+   dw_die_ref c;
+ 
+   if (die->die_mark == 0) {
+     /* We haven't done this node yet.  Mark it as used.  */
+     die->die_mark = 1;
+ 
+     /* We also have to mark its parents as used.
+        (But we don't want to mark our parents' kids due to this.)  */
+     if (die->die_parent)
+       prune_unused_types_mark (die->die_parent, 0);
+ 
+     /* Mark any referenced nodes.  */
+     prune_unused_types_walk_attribs (die);
+   }
+ 
+   if (dokids && die->die_mark != 2)
+     {
+       /* We need to walk the children, but haven't done so yet.
+          Remember that we've walked the kids.  */
+       die->die_mark = 2;
+ 
+       /* Walk them.  */
+       for (c = die->die_child; c; c = c->die_sib)
+         prune_unused_types_walk (c);
+     }
+ }
+ 
+ 
+ /* Walk the tree DIE and mark types that we actually use.  */
+ 
+ static void
+ prune_unused_types_walk (die)
+      dw_die_ref die;
+ {
+   dw_die_ref c;
+ 
+   /* Don't do anything if this node is already marked.  */
+   if (die->die_mark)
+     return;
+ 
+   switch (die->die_tag) {
+   case DW_TAG_const_type:
+   case DW_TAG_packed_type:
+   case DW_TAG_pointer_type:
+   case DW_TAG_reference_type:
+   case DW_TAG_volatile_type:
+   case DW_TAG_typedef:
+   case DW_TAG_array_type:
+   case DW_TAG_structure_type:
+   case DW_TAG_union_type:
+   case DW_TAG_class_type:
+   case DW_TAG_friend:
+   case DW_TAG_variant_part:
+   case DW_TAG_enumeration_type:
+   case DW_TAG_subroutine_type:
+   case DW_TAG_string_type:
+   case DW_TAG_set_type:
+   case DW_TAG_subrange_type:
+   case DW_TAG_ptr_to_member_type:
+   case DW_TAG_file_type:
+     /* It's a type node --- don't mark it.  */
+     return;
+ 
+   case DW_TAG_subprogram:
+     /* Mark functions, unless it's only a declaration.  */
+     if (get_AT_flag (die, DW_AT_declaration)) 
+       return;
+     break;
+ 
+   default:
+     /* Mark everything else.  */
+     break;
+   }
+ 
+   die->die_mark = 1;
+ 
+   /* Now, mark any dies referenced from here.  */
+   prune_unused_types_walk_attribs (die);
+ 
+   /* Mark children.  */
+   for (c = die->die_child; c; c = c->die_sib)
+     prune_unused_types_walk (c);
+ }
+ 
+ 
+ /* Remove from the tree DIE any dies that aren't marked.  */
+ 
+ static void
+ prune_unused_types_prune (die)
+      dw_die_ref die;
+ {
+   dw_die_ref c, p, n;
+   if (!die->die_mark)
+     abort();
+ 
+   p = NULL;
+   for (c = die->die_child; c; c = n)
+     {
+       n = c->die_sib;
+       if (c->die_mark)
+         {
+           prune_unused_types_prune (c);
+           p = c;
+         }
+       else
+         {
+           if (p)
+             p->die_sib = n;
+           else
+             die->die_child = n;
+           free_die (c);
+         }
+     }
+ }
+ 
+ 
+ /* Remove dies representing declarations that we never use.  */
+ 
+ static void
+ prune_unused_types ()
+ {
+   unsigned int i;
+   limbo_die_node *node;
+ 
+   /* Clear all the marks.  */
+   unmark_dies (comp_unit_die);
+   for (node = limbo_die_list; node; node = node->next)
+     unmark_dies (node->die);
+ 
+   /* Set the mark on nodes that are actually used.  */
+   prune_unused_types_walk (comp_unit_die);
+   for (node = limbo_die_list; node; node = node->next)
+     prune_unused_types_walk (node->die);
+ 
+   /* Also set the mark on nodes referenced from the
+      pubname_table or arange_table.  */
+   for (i=0; i < pubname_table_in_use; i++)
+     {
+       prune_unused_types_mark (pubname_table[i].die, 1);
+     }
+   for (i=0; i < arange_table_in_use; i++)
+     {
+       prune_unused_types_mark (arange_table[i], 1);
+     }
+ 
+   /* Get rid of nodes that aren't marked.  */
+   prune_unused_types_prune (comp_unit_die);
+   for (node = limbo_die_list; node; node = node->next)
+     prune_unused_types_prune (node->die);
+ 
+   /* Leave the marks clear.  */
+   unmark_dies (comp_unit_die);
+   for (node = limbo_die_list; node; node = node->next)
+     unmark_dies (node->die);
+ }
+ 
  /* Output stuff that dwarf requires at the end of every file,
     and generate the DWARF-2 debugging info.  */
  
*************** dwarf2out_finish (input_filename)
*** 12201,12206 ****
--- 12430,12438 ----
       They will go into limbo_die_list.  */
    if (flag_eliminate_dwarf2_dups)
      break_out_includes (comp_unit_die);
+ 
+   if (flag_eliminate_unused_dwarf2_types)
+     prune_unused_types ();
  
    /* Traverse the DIE's and add add sibling attributes to those DIE's
       that have children.  */
Index: doc/invoke.texi
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.119.2.11
diff -u -p -c -r1.119.2.11 invoke.texi
*** doc/invoke.texi	25 Apr 2002 22:33:21 -0000	1.119.2.11
--- doc/invoke.texi	17 Jun 2002 23:39:15 -0000
*************** in the following sections.
*** 254,259 ****
--- 254,260 ----
  -p  -pg  -print-file-name=@var{library}  -print-libgcc-file-name @gol
  -print-multi-directory  -print-multi-lib @gol
  -print-prog-name=@var{program}  -print-search-dirs  -Q @gol
+ -feliminate-unused-dwarf2-types @gol
  -save-temps  -time}
  
  @item Optimization Options
*************** anything else.
*** 3229,3234 ****
--- 3230,3248 ----
  @opindex dumpspecs
  Print the compiler's built-in specs---and don't do anything else.  (This
  is used when GCC itself is being built.)  @xref{Spec Files}.
+ 
+ @item -feliminate-unused-dwarf2-types
+ @opindex feliminate-unused-dwarf2-types
+ Normally, when producing DWARF2 output, GCC will emit debugging
+ information for all types and functions declared in a compilation
+ unit, regardless of whether or not they are actually used
+ in that compilation unit.  Sometimes this is useful, such as
+ if, in the debugger, you want to cast a value to a type that is
+ not actually used in your program (but is declared).  More often,
+ however, this results in a significant amount of wasted space.
+ With this option, GCC will avoid producing debug symbol output
+ for types and functions that are nowhere used in the source
+ file being compiled.
  @end table
  
  @node Optimize Options

