This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFC] More compact (100x) -g3 .debug_macinfo (take 3)


On Fri, Jul 15, 2011 at 09:22:42AM -0700, Richard Henderson wrote:
> On 07/15/2011 08:42 AM, Jakub Jelinek wrote:
> 
> > The newly added opcodes:
> > DW_MACINFO_GNU_define_indirect		0xe0
> > 	This opcode has two arguments, one is uleb128 lineno and the
> > 	other is offset size long byte offset into .debug_str.  Except
> > 	for the encoding of the string it is similar to DW_MACINFO_define.
> > DW_MACINFO_GNU_undef_indirect		0xe1
> > 	This opcode has two arguments, one is uleb128 lineno and the
> > 	other is offset size long byte offset into .debug_str.  Except
> > 	for the encoding of the string it is similar to DW_MACINFO_undef.
> > DW_MACINFO_GNU_transparent_include	0xe2
> > 	This opcode has a single argument, a offset size long byte offset into
> > 	.debug_macinfo.  It instructs the debug info consumer that
> > 	this opcode during reading should be replaced with the sequence
> > 	of .debug_macinfo opcodes from the mentioned offset, up to
> > 	a terminating 0 opcode (not including that 0).
> > DW_MACINFO_GNU_define_opcode		0xe3
> > 	This is an opcode for future extensibility through which
> > 	a debugger could skip unknown opcodes.  It has 3 arguments:
> > 	1 byte opcode number, uleb128 count of arguments and
> > 	a count bytes long array, with a DW_FORM_* code how the
> > 	argument is encoded.
> 
> I do like the new opcodes.
> 
> Elsewhere you described transparent_include as also saving state
> about defined opcodes around the include.  Do you want to either
> describe that or drop it?

OK, so how about the following, written as DWARF4 modifications (for a
DWARF5 proposal the GNU_ prefix would of course be gone and the ops would
have different codes):

6.3.1

The valid macinfo types are as follows:
...
DW_MACINFO_GNU_define_indirect		A macro definition.
DW_MACINFO_GNU_undef_indirect		A macro undefinition.
DW_MACINFO_GNU_transparent_include	Include a sequence of entries from given offset.
DW_MACINFO_GNU_define_opcode		Define extension opcode and its arguments.

6.3.1.1

All DW_MACINFO_GNU_define_indirect and DW_MACINFO_GNU_undef_indirect entries
have two operands.  The first operand encodes the line number of the source
line on which the relevant defining or undefining macro directive appeared.
The second operand is an offset into the string table contained in
the .debug_str section of the object file.  In the 32-bit DWARF format, the
representation of the operand value is a 4-byte unsigned offset; in the
64-bit DWARF format, it is an 8-byte unsigned offset.  Apart from the
encoding of the string operand, these entries are equivalent to
DW_MACINFO_define and DW_MACINFO_undef, respectively.
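
To make the operand layout concrete, here is a minimal consumer-side sketch
of decoding one such entry.  It is only an illustration under stated
assumptions: struct indirect_entry and the helper names are hypothetical
(they are not taken from GCC or GDB), the data is assumed to be
little-endian, and there is no bounds checking.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values as proposed in this mail.  */
enum
{
  DW_MACINFO_GNU_define_indirect = 0xe0,
  DW_MACINFO_GNU_undef_indirect = 0xe1
};

/* Hypothetical decoded form of an indirect entry.  */
struct indirect_entry
{
  uint8_t code;         /* One of the two opcodes above.  */
  uint64_t lineno;      /* First operand.  */
  uint64_t str_offset;  /* Second operand: offset into .debug_str.  */
};

/* Read an unsigned LEB128 value at *P and advance *P past it.  */
static uint64_t
read_uleb128 (const uint8_t **p)
{
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;

  do
    {
      byte = *(*p)++;
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return result;
}

/* Read an OFFSET_SIZE byte offset (4 for 32-bit DWARF, 8 for 64-bit DWARF)
   at *P and advance *P past it.  Assumes little-endian data.  */
static uint64_t
read_offset (const uint8_t **p, unsigned offset_size)
{
  uint64_t result = 0;
  unsigned i;

  assert (offset_size == 4 || offset_size == 8);
  for (i = 0; i < offset_size; i++)
    result |= (uint64_t) (*p)[i] << (8 * i);
  *p += offset_size;
  return result;
}

/* Decode the indirect define/undef entry at P into OUT and return a
   pointer just past it.  */
static const uint8_t *
decode_indirect (const uint8_t *p, unsigned offset_size,
                 struct indirect_entry *out)
{
  out->code = *p++;
  assert (out->code == DW_MACINFO_GNU_define_indirect
          || out->code == DW_MACINFO_GNU_undef_indirect);
  out->lineno = read_uleb128 (&p);
  out->str_offset = read_offset (&p, offset_size);
  return p;
}
```

The macro text itself would then be the NUL-terminated string found at
str_offset bytes into .debug_str, in the same "name value" format that
DW_MACINFO_define uses inline.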

6.3.1.5  Transparent inclusion of a sequence of entries

A DW_MACINFO_GNU_transparent_include entry has one operand, an offset into
another part of the .debug_macinfo section.  In the 32-bit DWARF format, the
representation of the operand value is a 4-byte unsigned offset; in the
64-bit DWARF format, it is an 8-byte unsigned offset.  This entry instructs
the consumer to replace this entry with a sequence of macinfo entries found
at the given .debug_macinfo offset, up to, but excluding, the terminating
entry with type code 0.  This entry type is aimed at sharing duplicate
sequences of macinfo entries between macinfo from different compilation
units.  The producer should ensure that only sequences with matching
DWARF format size (either all 32-bit DWARF or all 64-bit DWARF) are
merged together, and that such sequences either contain no
DW_MACINFO_start_file entries, or are included only from macinfo that
references the same part of the .debug_line section.
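
As an illustration of the "replace this entry with the sequence at the
given offset" rule, a consumer could expand includes recursively while
walking a sequence.  The following is a rough sketch, not taken from any
existing debugger: struct macinfo_trace, walk_macinfo and the nesting limit
of 16 are all hypothetical, the data is assumed to be little-endian, and
there is no bounds checking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum
{
  DW_MACINFO_define = 1, DW_MACINFO_undef = 2,
  DW_MACINFO_start_file = 3, DW_MACINFO_end_file = 4,
  DW_MACINFO_GNU_define_indirect = 0xe0,
  DW_MACINFO_GNU_undef_indirect = 0xe1,
  DW_MACINFO_GNU_transparent_include = 0xe2
};

/* Hypothetical log of the entry codes seen, in expansion order.  */
struct macinfo_trace
{
  unsigned count;
  uint8_t codes[32];
};

/* Read an unsigned LEB128 value at *P and advance *P past it.  */
static uint64_t
read_uleb128 (const uint8_t **p)
{
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do
    {
      byte = *(*p)++;
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return result;
}

/* Read an OFFSET_SIZE byte little-endian offset at *P and advance *P.  */
static uint64_t
read_offset (const uint8_t **p, unsigned offset_size)
{
  uint64_t result = 0;
  unsigned i;
  assert (offset_size == 4 || offset_size == 8);
  for (i = 0; i < offset_size; i++)
    result |= (uint64_t) (*p)[i] << (8 * i);
  *p += offset_size;
  return result;
}

/* Log the entries of the sequence at SECTION + START, expanding transparent
   includes in place.  Return 0 on success, -1 on an unknown opcode or on
   nesting deeper than 16 levels (an arbitrary guard against cycles).  */
static int
walk_macinfo (const uint8_t *section, uint64_t start, unsigned offset_size,
              unsigned depth, struct macinfo_trace *trace)
{
  const uint8_t *p = section + start;
  if (depth > 16)
    return -1;
  for (;;)
    {
      uint8_t code = *p++;
      switch (code)
        {
        case 0:
          return 0;     /* The terminator is not part of the expansion.  */
        case DW_MACINFO_define:
        case DW_MACINFO_undef:
          (void) read_uleb128 (&p);             /* lineno */
          p += strlen ((const char *) p) + 1;   /* inline string */
          break;
        case DW_MACINFO_GNU_define_indirect:
        case DW_MACINFO_GNU_undef_indirect:
          (void) read_uleb128 (&p);             /* lineno */
          (void) read_offset (&p, offset_size); /* .debug_str offset */
          break;
        case DW_MACINFO_start_file:
          (void) read_uleb128 (&p);             /* lineno */
          (void) read_uleb128 (&p);             /* file index */
          break;
        case DW_MACINFO_end_file:
          break;
        case DW_MACINFO_GNU_transparent_include:
          if (walk_macinfo (section, read_offset (&p, offset_size),
                            offset_size, depth + 1, trace))
            return -1;
          continue;
        default:
          return -1;
        }
      if (trace->count < 32)
        trace->codes[trace->count++] = code;
    }
}
```

With a primary sequence that starts with an include of a one-define
sequence and then has an undef of its own, the trace would contain the
define first and the undef second, exactly as if the define had been
written inline.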

6.3.1.6  Defining new opcodes and operands

A DW_MACINFO_GNU_define_opcode entry has 2 operands.  The first operand
is a one byte constant with the type code it defines operand types for;
the second operand is a DW_FORM_block encoded array of operand forms.
The second operand starts with an unsigned LEB128 encoded number of operands
and for each of the operands there is one byte, containing a form encoding
how the corresponding operand is encoded.  This entry allows defining
new vendor extension entry types which consumers will be able to skip over
and ignore.  Each opcode so defined is valid for subsequent entries
until the terminating entry with type code 0, including any sequences
included from those entries using DW_MACINFO_GNU_transparent_include.
Opcodes defined using this entry in a chain included through
DW_MACINFO_GNU_transparent_include aren't valid in the parent sequence
after the DW_MACINFO_GNU_transparent_include entry that included them, though.
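
A consumer that wants to skip entries it doesn't understand could keep a
small table of the forms announced so far.  The sketch below is only meant
to show the idea: struct opcode_table, MAX_OPERANDS and the function names
are hypothetical, only a subset of the DWARF 4 DW_FORM_* codes is handled,
and there is no bounds checking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A subset of the DW_FORM_* codes from DWARF 4.  */
enum
{
  DW_FORM_data2 = 0x05, DW_FORM_data4 = 0x06, DW_FORM_data8 = 0x07,
  DW_FORM_string = 0x08, DW_FORM_data1 = 0x0b, DW_FORM_sdata = 0x0d,
  DW_FORM_strp = 0x0e, DW_FORM_udata = 0x0f
};

#define MAX_OPERANDS 8  /* Arbitrary limit for this sketch.  */

/* Hypothetical per-sequence table of the opcodes defined so far.  */
struct opcode_table
{
  uint8_t defined[256];
  uint8_t count[256];
  uint8_t forms[256][MAX_OPERANDS];
};

/* Read an unsigned LEB128 value at *P and advance *P past it.  */
static uint64_t
read_uleb128 (const uint8_t **p)
{
  uint64_t result = 0;
  unsigned shift = 0;
  uint8_t byte;
  do
    {
      byte = *(*p)++;
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return result;
}

/* Record the definition whose operands start at P, just past the
   DW_MACINFO_GNU_define_opcode byte.  Return a pointer past the entry,
   or NULL if it has more operands than this sketch supports.  */
static const uint8_t *
record_define_opcode (struct opcode_table *table, const uint8_t *p)
{
  uint8_t opcode = *p++;
  uint64_t count = read_uleb128 (&p);
  if (count > MAX_OPERANDS)
    return NULL;
  table->defined[opcode] = 1;
  table->count[opcode] = (uint8_t) count;
  memcpy (table->forms[opcode], p, count);
  return p + count;
}

/* Skip one operand encoded as FORM.  Return NULL for forms not handled.  */
static const uint8_t *
skip_operand (const uint8_t *p, uint8_t form, unsigned offset_size)
{
  assert (offset_size == 4 || offset_size == 8);
  switch (form)
    {
    case DW_FORM_data1:
      return p + 1;
    case DW_FORM_data2:
      return p + 2;
    case DW_FORM_data4:
      return p + 4;
    case DW_FORM_data8:
      return p + 8;
    case DW_FORM_strp:
      return p + offset_size;
    case DW_FORM_udata:
    case DW_FORM_sdata:
      (void) read_uleb128 (&p); /* Signed LEB128 has the same length rule.  */
      return p;
    case DW_FORM_string:
      return p + strlen ((const char *) p) + 1;
    default:
      return NULL;
    }
}

/* Skip the operands at P of an entry with type code OPCODE.  Return a
   pointer past them, or NULL if OPCODE is undefined or uses an unhandled
   form.  */
static const uint8_t *
skip_entry (const struct opcode_table *table, uint8_t opcode,
            const uint8_t *p, unsigned offset_size)
{
  unsigned i;
  if (!table->defined[opcode])
    return NULL;
  for (i = 0; p != NULL && i < table->count[opcode]; i++)
    p = skip_operand (p, table->forms[opcode][i], offset_size);
  return p;
}
```

The scoping rule could then be implemented by handing each included
sequence its own copy of the table (a plain struct assignment), so that
definitions from the parent are visible in the included chain while
definitions made inside it are dropped on return to the parent.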

7.22 Macro Information

Add
	DW_MACINFO_lo_user			0xe0
	DW_MACINFO_GNU_define_indirect		0xe0
	DW_MACINFO_GNU_undef_indirect		0xe1
	DW_MACINFO_GNU_transparent_include	0xe2
	DW_MACINFO_GNU_define_opcode		0xe3
	DW_MACINFO_hi_user			0xfe
to the table.

> I'd like to see this broken out into some functions, and avoid
> as much code as possible within ifdefs.  Perhaps
> 
> some_function (...)
> {
> #ifndef OBJECT_FORMAT_ELF
>   return;
> #endif
>   // everything else
> }
> 
> I think it also doesn't help review that there are no comments
> at all, and a preponderance of description-less variable names
> like "ref" and "ref2".

I've tried to cure these issues in the following (so far just
lightly tested) patch:

2011-07-15  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
	(DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
	DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode):
	Add.

	* dwarf2out.c (dwarf2out_define): If the vector is empty and
	lineno is 0, emit a dummy entry first.
	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
	optimize_macinfo_range): New functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
	optimize longer sequences of define/undef ops from headers
	using DW_MACINFO_GNU_transparent_include.
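
For readers who want the size trade-off behind the indirect forms spelled
out, here is a small standalone sketch of the decision the patch makes in
output_macinfo_op.  It is a simplification, not the patch's code: the
strict and str_mergeable parameters are hypothetical stand-ins for
dwarf_strict and for .debug_str having SECTION_MERGE set, and the
DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET check is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes needed to encode VALUE as unsigned LEB128.  */
static size_t
size_of_uleb128 (unsigned long value)
{
  size_t size = 0;

  do
    {
      value >>= 7;
      size++;
    }
  while (value != 0);
  return size;
}

/* Return nonzero if a define/undef of MACRO should use the indirect form,
   i.e. if the inline string (including its NUL) would be longer than an
   offset into .debug_str.  */
static int
use_indirect_form (const char *macro, unsigned offset_size, int strict,
                   int str_mergeable)
{
  size_t len = strlen (macro) + 1;

  assert (offset_size == 4 || offset_size == 8);
  return !strict && str_mergeable && len > offset_size;
}

/* Size in bytes of the resulting .debug_macinfo entry: one opcode byte,
   the uleb128 line number, then either the offset or the inline string.  */
static size_t
macinfo_entry_size (const char *macro, unsigned long lineno,
                    unsigned offset_size, int strict, int str_mergeable)
{
  size_t operand
    = use_indirect_form (macro, offset_size, strict, str_mergeable)
      ? offset_size : strlen (macro) + 1;

  return 1 + size_of_uleb128 (lineno) + operand;
}
```

Note that the string itself still has to live in .debug_str, so the saving
only materializes when identical strings get merged there, which is why
the patch requires a mergeable .debug_str.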

--- include/dwarf2.h.jj	2011-06-23 10:14:06.000000000 +0200
+++ include/dwarf2.h	2011-07-13 11:39:49.000000000 +0200
@@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_undef = 2,
     DW_MACINFO_start_file = 3,
     DW_MACINFO_end_file = 4,
-    DW_MACINFO_vendor_ext = 255
+    DW_MACINFO_lo_user = 0xe0,
+    DW_MACINFO_GNU_define_indirect = 0xe0,
+    DW_MACINFO_GNU_undef_indirect = 0xe1,
+    DW_MACINFO_GNU_transparent_include = 0xe2,
+    DW_MACINFO_GNU_define_opcode = 0xe3,
+    DW_MACINFO_hi_user = 0xfe,
+    DW_MACINFO_vendor_ext = 0xff
   };
 
 /* @@@ For use with GNU frame unwind information.  */
--- gcc/dwarf2out.c.jj	2011-07-15 20:46:32.000000000 +0200
+++ gcc/dwarf2out.c	2011-07-15 22:15:14.000000000 +0200
@@ -20291,6 +20291,15 @@ dwarf2out_define (unsigned int lineno AT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_MACINFO_GNU_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_define;
       e.lineno = lineno;
       e.info = xstrdup (buffer);;
@@ -20309,58 +20318,363 @@ dwarf2out_undef (unsigned int lineno ATT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_MACINFO_GNU_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_undef;
       e.lineno = lineno;
-      e.info = xstrdup (buffer);;
+      e.info = xstrdup (buffer);
       VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
+/* Routines to manipulate hash table of macinfo entries.  */
+
+static hashval_t
+htab_macinfo_hash (const void *of)
+{
+  const macinfo_entry *const entry =
+    (const macinfo_entry *) of;
+
+  return htab_hash_string (entry->info);
+}
+
+static int
+htab_macinfo_eq (const void *of1, const void *of2)
+{
+  const macinfo_entry *const entry1 = (const macinfo_entry *) of1;
+  const macinfo_entry *const entry2 = (const macinfo_entry *) of2;
+
+  return !strcmp (entry1->info, entry2->info);
+}
+
+/* Output a single .debug_macinfo entry.  */
+
+static void
+output_macinfo_op (macinfo_entry *ref)
+{
+  int file_num;
+  size_t len;
+  struct indirect_string_node *node;
+  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+
+  switch (ref->code)
+    {
+    case DW_MACINFO_start_file:
+      file_num = maybe_emit_file (lookup_filename (ref->info));
+      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
+      dw2_asm_output_data_uleb128 (ref->lineno,
+				   "Included from line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+      break;
+    case DW_MACINFO_end_file:
+      dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+      break;
+    case DW_MACINFO_define:
+    case DW_MACINFO_undef:
+      len = strlen (ref->info) + 1;
+      if (!dwarf_strict
+	  && len > DWARF_OFFSET_SIZE
+	  && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+	  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
+	{
+	  ref->code = ref->code == DW_MACINFO_define
+		      ? DW_MACINFO_GNU_define_indirect
+		      : DW_MACINFO_GNU_undef_indirect;
+	  output_macinfo_op (ref);
+	  return;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_define
+			   ? "Define macro" : "Undefine macro");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_nstring (ref->info, -1, "The macro");
+      break;
+    case DW_MACINFO_GNU_define_indirect:
+    case DW_MACINFO_GNU_undef_indirect:
+      node = find_AT_string (ref->info);
+      if (node->form != DW_FORM_strp)
+	{
+	  char label[32];
+	  ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter);
+	  ++dw2_string_counter;
+	  node->label = xstrdup (label);
+	  node->form = DW_FORM_strp;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_GNU_define_indirect
+			   ? "Define macro indirect"
+			   : "Undefine macro indirect");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+			     debug_str_section, "The macro: \"%s\"",
+			     ref->info);
+      break;
+    case DW_MACINFO_GNU_transparent_include:
+      dw2_asm_output_data (1, ref->code, "Transparent include");
+      ASM_GENERATE_INTERNAL_LABEL (label,
+				   DEBUG_MACINFO_SECTION_LABEL, ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL);
+      break;
+    default:
+      fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
+	       ASM_COMMENT_START, (unsigned long) ref->code);
+      break;
+    }
+}
+
+/* Attempt to make a sequence of define/undef macinfo ops shareable with
+   other compilation unit .debug_macinfo sections.  IDX is the first
+   index of a define/undef, return the number of ops that should be
+   emitted in a comdat .debug_macinfo section and emit
+   a DW_MACINFO_GNU_transparent_include entry referencing it.
+   If the define/undef entry should be emitted normally, return 0.  */
+
+static unsigned
+optimize_macinfo_range (unsigned int idx, VEC (macinfo_entry, gc) *files,
+			htab_t *macinfo_htab)
+{
+  macinfo_entry *first, *second, *cur, *inc;
+  char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
+  unsigned char checksum[16];
+  struct md5_ctx ctx;
+  char *grp_name, *tail;
+  const char *base;
+  unsigned int i, count, encoded_filename_len, linebuf_len;
+  void **slot;
+
+  first = VEC_index (macinfo_entry, macinfo_table, idx);
+  second = VEC_index (macinfo_entry, macinfo_table, idx + 1);
+
+  /* Optimize only if there are at least two consecutive define/undef ops,
+     and either all of them are before first DW_MACINFO_start_file
+     with lineno 0 (i.e. predefined macro block), or all of them are
+     in some included header file.  */
+  if (second->code != DW_MACINFO_define && second->code != DW_MACINFO_undef)
+    return 0;
+  if (VEC_empty (macinfo_entry, files))
+    {
+      if (first->lineno != 0 || second->lineno != 0)
+	return 0;
+    }
+  else if (first->lineno == 0)
+    return 0;
+
+  /* Find the last define/undef entry that can be grouped together
+     with first and at the same time compute md5 checksum of their
+     codes, linenumbers and strings.  */
+  md5_init_ctx (&ctx);
+  for (i = idx; VEC_iterate (macinfo_entry, macinfo_table, i, cur); i++)
+    if (cur->code != DW_MACINFO_define && cur->code != DW_MACINFO_undef)
+      break;
+    else if (first->lineno == 0 && cur->lineno != 0)
+      break;
+    else
+      {
+	unsigned char code = cur->code;
+	md5_process_bytes (&code, 1, &ctx);
+	checksum_uleb128 (cur->lineno, &ctx);
+	md5_process_bytes (cur->info, strlen (cur->info) + 1, &ctx);
+      }
+  md5_finish_ctx (&ctx, checksum);
+  count = i - idx;
+
+  /* From the containing include filename (if any) pick up just
+     usable characters from its basename.  */
+  if (first->lineno == 0)
+    base = "";
+  else
+    base = lbasename (VEC_last (macinfo_entry, files)->info);
+  for (encoded_filename_len = 0, i = 0; base[i]; i++)
+    if (ISIDNUM (base[i]) || base[i] == '.')
+      encoded_filename_len++;
+  /* Count . at the end.  */
+  if (encoded_filename_len)
+    encoded_filename_len++;
+
+  sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, first->lineno);
+  linebuf_len = strlen (linebuf);
+
+  /* The group name format is: wmN.[<encoded filename>.]<lineno>.<md5sum>  */
+  grp_name = XNEWVEC (char, 4 + encoded_filename_len + linebuf_len + 1
+		      + 16 * 2 + 1);
+  memcpy (grp_name, DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8.", 4);
+  tail = grp_name + 4;
+  if (encoded_filename_len)
+    {
+      for (i = 0; base[i]; i++)
+	if (ISIDNUM (base[i]) || base[i] == '.')
+	  *tail++ = base[i];
+      *tail++ = '.';
+    }
+  memcpy (tail, linebuf, linebuf_len);
+  tail += linebuf_len;
+  *tail++ = '.';
+  for (i = 0; i < 16; i++)
+    sprintf (tail + i * 2, "%02x", checksum[i] & 0xff);
+
+  /* Construct a macinfo_entry for DW_MACINFO_GNU_transparent_include
+     in the empty vector entry before the first define/undef.  */
+  inc = VEC_index (macinfo_entry, macinfo_table, idx - 1);
+  inc->code = DW_MACINFO_GNU_transparent_include;
+  inc->lineno = 0;
+  inc->info = grp_name;
+  if (*macinfo_htab == NULL)
+    *macinfo_htab = htab_create (10, htab_macinfo_hash, htab_macinfo_eq, NULL);
+  /* Avoid emitting duplicates.  */
+  slot = htab_find_slot (*macinfo_htab, inc, INSERT);
+  if (*slot != NULL)
+    {
+      free (CONST_CAST (char *, inc->info));
+      inc->code = 0;
+      inc->info = NULL;
+      /* If such an entry has been used before, just emit
+	 a DW_MACINFO_GNU_transparent_include op.  */
+      inc = (macinfo_entry *) *slot;
+      output_macinfo_op (inc);
+      /* And clear all macinfo_entry in the range to avoid emitting them
+	 in the second pass.  */
+      for (i = idx;
+	   VEC_iterate (macinfo_entry, macinfo_table, i, cur)
+	   && i < idx + count;
+	   i++)
+	{
+	  cur->code = 0;
+	  free (CONST_CAST (char *, cur->info));
+	  cur->info = NULL;
+	}
+    }
+  else
+    {
+      *slot = inc;
+      inc->lineno = htab_elements (*macinfo_htab);
+      output_macinfo_op (inc);
+    }
+  return count;
+}
+
+/* Output macinfo section(s).  */
+
 static void
 output_macinfo (void)
 {
   unsigned i;
   unsigned long length = VEC_length (macinfo_entry, macinfo_table);
   macinfo_entry *ref;
+  VEC (macinfo_entry, gc) *files = NULL;
+  htab_t macinfo_htab = NULL;
 
   if (! length)
     return;
 
+  /* The first loop emits the primary .debug_macinfo section and
+     clears each macinfo_entry once its op has been emitted.
+     If a longer range of define/undef ops can be optimized using
+     DW_MACINFO_GNU_transparent_include, the
+     DW_MACINFO_GNU_transparent_include op is emitted and kept in
+     the vector before the first define/undef in the range, and the
+     define/undef ops of the range are kept unemitted for the second loop.  */
   for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
     {
       switch (ref->code)
 	{
-	  case DW_MACINFO_start_file:
+	case DW_MACINFO_start_file:
+	  VEC_safe_push (macinfo_entry, gc, files, ref);
+	  break;
+	case DW_MACINFO_end_file:
+	  if (!VEC_empty (macinfo_entry, files))
 	    {
-	      int file_num = maybe_emit_file (lookup_filename (ref->info));
-	      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-	      dw2_asm_output_data_uleb128 
-			(ref->lineno, "Included from line number %lu", 
-			 			(unsigned long)ref->lineno);
-	      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+	      macinfo_entry *file = VEC_last (macinfo_entry, files);
+	      free (CONST_CAST (char *, file->info));
+	      VEC_pop (macinfo_entry, files);
 	    }
-	    break;
-	  case DW_MACINFO_end_file:
-	    dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
-	    break;
-	  case DW_MACINFO_define:
-	    dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  case DW_MACINFO_undef:
-	    dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  default:
-	   fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
-	     ASM_COMMENT_START, (unsigned long)ref->code);
+	  break;
+	case DW_MACINFO_define:
+	case DW_MACINFO_undef:
+	  if (!dwarf_strict
+	      && HAVE_COMDAT_GROUP
+	      && VEC_length (macinfo_entry, files) != 1
+	      && i > 0
+	      && i + 1 < length
+	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
+	    {
+	      unsigned count = optimize_macinfo_range (i, files, &macinfo_htab);
+	      if (count)
+		{
+		  i += count - 1;
+		  continue;
+		}
+	    }
+	  break;
+	case 0:
+	  /* A dummy entry may be inserted at the beginning to be able
+	     to optimize the whole block of predefined macros.  */
+	  if (i == 0)
+	    continue;
+	default:
 	  break;
 	}
+      output_macinfo_op (ref);
+      /* For DW_MACINFO_start_file ref->info has been copied into files
+	 vector.  */
+      if (ref->code != DW_MACINFO_start_file)
+	free (CONST_CAST (char *, ref->info));
+      ref->info = NULL;
+      ref->code = 0;
     }
+
+  if (macinfo_htab == NULL)
+    return;
+
+  htab_delete (macinfo_htab);
+
+  /* If any DW_MACINFO_GNU_transparent_include were used, on those
+     DW_MACINFO_GNU_transparent_include entries terminate the
+     current chain and switch to a new comdat .debug_macinfo
+     section and emit the define/undef entries within it.  */
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    switch (ref->code)
+      {
+      case 0:
+	continue;
+      case DW_MACINFO_GNU_transparent_include:
+	{
+	  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+	  tree comdat_key = get_identifier (ref->info);
+	  /* Terminate the previous .debug_macinfo section.  */
+	  dw2_asm_output_data (1, 0, "End compilation unit");
+	  targetm.asm_out.named_section (DEBUG_MACINFO_SECTION,
+					 SECTION_DEBUG
+					 | SECTION_LINKONCE,
+					 comdat_key);
+	  ASM_GENERATE_INTERNAL_LABEL (label,
+				       DEBUG_MACINFO_SECTION_LABEL,
+				       ref->lineno);
+	  ASM_OUTPUT_LABEL (asm_out_file, label);
+	  ref->code = 0;
+	  free (CONST_CAST (char *, ref->info));
+	  ref->info = NULL;
+	}
+	break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+	output_macinfo_op (ref);
+	ref->code = 0;
+	free (CONST_CAST (char *, ref->info));
+	ref->info = NULL;
+	break;
+      default:
+	gcc_unreachable ();
+      }
 }
 
 /* Set up for Dwarf output at the start of compilation.  */

	Jakub

