This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RFC: ggc-boehm, proof of concept


Hi,

I've finally debugged it to the point of a successful bootstrap with
c,c++,objc (the Boehm GC crash was tracked down with help from Hans
Boehm). This is something of a milestone, so I decided to present the
current code. Tests are underway; so far they look good. Of course,
the code here is not of production quality. This patch, together with
regenerated files, whitespace changes, etc., is committed to the
boehms-gc branch.

What's broken:
- PCH and some corner areas of GC. The number of stubs in ggc-boehm is
rather high: all the ggc_pch_* hooks, ggc_marked_p and ggc_set_mark
currently just call abort().
- Building Boehm's GC for the host machine as part of bootstrap. I've
made some changes to the top-level Makefiles (see the diff), but
apparently they are not enough. I haven't investigated further yet,
so Boehm's GC has to be built by hand for now.
- The old ggc-page and ggc-zone collectors should still work, but they
were touched and not tested afterwards (ggc_realloc_stat had to be
moved from ggc-common.c to the individual collectors).
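
To illustrate why ggc_realloc_stat is now per collector: the generic
version is written in terms of the collector's own alloc, get_size and
free hooks, while ggc-boehm can simply forward to GC_REALLOC. Below is
a minimal standalone sketch of the generic control flow, without the
Valgrind annotations. The toy_* names and the malloc-backed size header
are assumptions made up for this illustration; they are not part of GCC.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical block header recording the usable size of each block.
   The union keeps the payload that follows it suitably aligned.  */
typedef union toy_header {
  size_t size;
  long double align;
} toy_header;

/* Stand-in for ggc_alloc_stat.  */
static void *
toy_alloc (size_t size)
{
  toy_header *h = malloc (sizeof (toy_header) + size);
  assert (h != NULL);
  h->size = size;
  return h + 1;
}

/* Stand-in for ggc_get_size.  */
static size_t
toy_get_size (const void *p)
{
  return ((const toy_header *) p - 1)->size;
}

/* Stand-in for ggc_free.  */
static void
toy_free (void *p)
{
  free ((toy_header *) p - 1);
}

/* Same shape as the generic ggc_realloc_stat: a NULL block means a
   fresh allocation, a shrinking request reuses the block in place,
   and a growing request allocates, copies and frees.  */
static void *
toy_realloc (void *x, size_t size)
{
  void *r;
  size_t old_size;

  if (x == NULL)
    return toy_alloc (size);

  old_size = toy_get_size (x);
  if (size <= old_size)
    return x;

  r = toy_alloc (size);
  memcpy (r, x, old_size);
  toy_free (x);
  return r;
}
```

Note that the shrinking path returns the same pointer and leaves the
recorded size unchanged, which mirrors the behaviour of the code that
was moved into ggc-page.c and ggc-zone.c.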

What works:
configure --with-gc=boehm
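
With this configuration the collector stays disabled during
allocation and runs only from ggc_collect, which skips the collection
unless the heap has grown enough since the last one. A minimal
standalone sketch of that decision follows. The function name and the
plain parameters are assumptions for illustration (the patch reads
GGC_MIN_HEAPSIZE and GGC_MIN_EXPAND through PARAM_VALUE and uses
float, not double).

```c
#include <assert.h>
#include <stddef.h>

/* Return nonzero if a collection should run.
   heap_size          current heap size in bytes
   last_allocated     heap size recorded after the previous collection
   min_heapsize_kb    stand-in for PARAM_VALUE (GGC_MIN_HEAPSIZE)
   min_expand_percent stand-in for PARAM_VALUE (GGC_MIN_EXPAND)
   force_collect      stand-in for ggc_force_collect  */
static int
should_collect (size_t heap_size, size_t last_allocated,
                size_t min_heapsize_kb, int min_expand_percent,
                int force_collect)
{
  size_t min_heap = min_heapsize_kb * 1024;
  double baseline;
  double min_expand;

  assert (min_expand_percent >= 0);

  /* The baseline is never smaller than the configured minimum heap.  */
  baseline = last_allocated > min_heap ? last_allocated : min_heap;
  min_expand = baseline * min_expand_percent / 100;

  return force_collect || heap_size >= baseline + min_expand;
}
```

For example, with an illustrative minimum heap of 4096 kB and a 30%
expansion threshold, a 5 MB heap is below the threshold while a 6 MB
heap is above it.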

Assuming the testsuite does not regress, I will now gather actual
performance data for the collector. I'm looking for ideas about what
would be representative tests. So far I'm thinking about GCC bootstrap
time, plus memory and run-time profiling with some large C file (any
good candidate in the GCC sources?), and corresponding tests with some
C++ code; I guess some template-heavy Boost application would be a
good choice.

--
TIA for any comments,
Laurynas


Index: gcc/ggc-boehm.c
===================================================================
--- gcc/ggc-boehm.c	(revision 0)
+++ gcc/ggc-boehm.c	(revision 114910)
@@ -0,0 +1,207 @@
+#include "config.h"
+#include "system.h"
+#include "options.h"
+#include "params.h"
+#include "timevar.h"
+#include "ggc.h"
+
+#define GC_DEBUG
+#include <gc.h>
+
+static size_t get_used_heap_size(void);
+static void register_gty_roots(void);
+
+static size_t last_allocated = 0;
+static ggc_stringpool_roots stringpool_roots;
+
+void
+init_ggc (void)
+{
+  GC_init();
+  GC_disable(); /* Do not collect on allocation */
+  register_gty_roots();
+
+  stringpool_roots.start = NULL;
+  stringpool_roots.one_after_finish = NULL;
+}
+
+void *
+ggc_alloc_stat (size_t size MEM_STAT_DECL)
+{
+  void * result = GC_MALLOC(size);
+  return result;
+}
+
+void *
+ggc_realloc_stat (void *x, size_t size MEM_STAT_DECL)
+{
+  void * result = GC_REALLOC(x, size);
+  return result;
+}
+
+void
+ggc_collect (void)
+{
+  /* Avoid frequent unnecessary work by skipping collection if the
+     total allocations haven't expanded much since the last
+     collection.  */
+  float allocated_last_gc =
+    MAX (last_allocated, (size_t)PARAM_VALUE (GGC_MIN_HEAPSIZE) * 1024);
+
+  float min_expand = allocated_last_gc * PARAM_VALUE (GGC_MIN_EXPAND) / 100;
+
+  if (GC_get_heap_size() < allocated_last_gc + min_expand
+      && !ggc_force_collect)
+    return;
+
+  timevar_push (TV_GC);
+  if (!quiet_flag)
+    fprintf (stderr, " {GC %luk -> ",
+             (unsigned long) get_used_heap_size() / 1024);
+
+  if (!stringpool_roots.start)
+    stringpool_roots = ggc_register_stringpool_roots();
+  else if (ggc_stringpool_moved_p(stringpool_roots))
+    {
+      ggc_unregister_stringpool_roots(stringpool_roots);
+      stringpool_roots = ggc_register_stringpool_roots();
+    }
+
+  GC_enable();
+  GC_gcollect();
+  GC_disable();
+
+  if (!quiet_flag)
+    fprintf (stderr, "%luk}", (unsigned long) get_used_heap_size() / 1024);
+  last_allocated = GC_get_heap_size();
+  timevar_pop (TV_GC);
+}
+
+void
+ggc_free (void * block)
+{
+  GC_FREE(block); /* For some blocks might be unprofitable? */
+}
+
+size_t
+ggc_get_size (const void * block)
+{
+  return GC_size((void *)block); /* Note that GC_size may return a bit larger
+                                    value than originally requested */
+}
+
+int
+ggc_marked_p (const void * d ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+char *
+ggc_pch_alloc_object (struct ggc_pch_data * d ATTRIBUTE_UNUSED,
+                      void * p ATTRIBUTE_UNUSED, size_t s ATTRIBUTE_UNUSED,
+                      bool b ATTRIBUTE_UNUSED,
+                      enum gt_types_enum t ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+void
+ggc_pch_count_object (struct ggc_pch_data * d ATTRIBUTE_UNUSED,
+                      void * p ATTRIBUTE_UNUSED, size_t s ATTRIBUTE_UNUSED,
+                      bool b ATTRIBUTE_UNUSED,
+                      enum gt_types_enum t ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+void
+ggc_pch_finish (struct ggc_pch_data * d ATTRIBUTE_UNUSED,
+                FILE * f ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+void
+ggc_pch_read (FILE * f ATTRIBUTE_UNUSED, void * p ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+void
+ggc_pch_this_base (struct ggc_pch_data * d ATTRIBUTE_UNUSED,
+                   void * p ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+void
+ggc_pch_prepare_write (struct ggc_pch_data * d ATTRIBUTE_UNUSED,
+                       FILE * f ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+size_t
+ggc_pch_total_size (struct ggc_pch_data * d ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+void
+ggc_pch_write_object (struct ggc_pch_data * d ATTRIBUTE_UNUSED,
+                      FILE * f ATTRIBUTE_UNUSED, void * p1 ATTRIBUTE_UNUSED,
+                      void * p2 ATTRIBUTE_UNUSED, size_t s ATTRIBUTE_UNUSED,
+                      bool b ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+void
+ggc_print_statistics (void)
+{
+  struct ggc_statistics stats;
+  memset (&stats, 0, sizeof(stats));
+
+  last_allocated = 0;
+
+  ggc_print_common_statistics (stderr, &stats);
+
+  fprintf (stderr,
+           "\nMemory still allocated at the end of the compilation process\n");
+  fprintf (stderr,
+           "Total heap size: %lu\n", (unsigned long)GC_get_heap_size());
+  fprintf (stderr,
+           "Free bytes in the heap: %lu\n", (unsigned long)GC_get_free_bytes());
+}
+
+int
+ggc_set_mark (const void * block ATTRIBUTE_UNUSED)
+{
+  abort();
+}
+
+struct ggc_pch_data *
+init_ggc_pch (void)
+{
+  abort();
+}
+
+
+size_t
+get_used_heap_size(void)
+{
+  return GC_get_heap_size() - GC_get_free_bytes();
+}
+
+void
+register_gty_roots(void)
+{
+  const struct ggc_root_tab *const *rt;
+  const struct ggc_root_tab *rti;
+
+  for (rt = gt_ggc_rtab; *rt; rt++)
+    for (rti = *rt; rti->base != NULL; rti++)
+      GC_add_roots((char *)rti->base, (char *)rti->base + rti->stride);
+
+  /* TODO: it might be required to process gt_ggc_cache_rtab here */
+}
Index: gcc/c-lex.c
===================================================================
--- gcc/c-lex.c	(revision 114687)
+++ gcc/c-lex.c	(working copy)
@@ -453,6 +453,8 @@

    case CPP_STRING:
    case CPP_WSTRING:
+      gcc_assert (tok->val.str.len != 0);
+
      if (!c_lex_return_raw_strings)
	{
	  type = lex_string (tok, value, false);
@@ -460,7 +462,7 @@
	}
      *value = build_string (tok->val.str.len, (char *) tok->val.str.text);
      break;
-
+
    case CPP_PRAGMA:
      *value = build_int_cst (NULL, tok->val.pragma);
      break;
@@ -724,6 +726,8 @@
  cpp_string str = tok->val.str;
  cpp_string *strs = &str;

+  gcc_assert (str.len != 0);
+
  if (tok->type == CPP_WSTRING)
    wide = true;

Index: gcc/ggc.h
===================================================================
--- gcc/ggc.h	(revision 114687)
+++ gcc/ggc.h	(working copy)
@@ -314,4 +314,14 @@

#endif

+/* Stringpool root information */
+typedef struct ggc_stringpool_roots {
+  void *start;
+  void *one_after_finish;
+} ggc_stringpool_roots;
+
+extern ggc_stringpool_roots ggc_register_stringpool_roots (void);
+extern void ggc_unregister_stringpool_roots (ggc_stringpool_roots roots);
+extern int ggc_stringpool_moved_p (ggc_stringpool_roots roots);
+
#endif
Index: gcc/ggc-common.c
===================================================================
--- gcc/ggc-common.c	(revision 114687)
+++ gcc/ggc-common.c	(working copy)
@@ -140,52 +140,6 @@
  return buf;
}

-/* Resize a block of memory, possibly re-allocating it.  */
-void *
-ggc_realloc_stat (void *x, size_t size MEM_STAT_DECL)
-{
-  void *r;
-  size_t old_size;
-
-  if (x == NULL)
-    return ggc_alloc_stat (size PASS_MEM_STAT);
-
-  old_size = ggc_get_size (x);
-
-  if (size <= old_size)
-    {
-      /* Mark the unwanted memory as unaccessible.  We also need to make
-	 the "new" size accessible, since ggc_get_size returns the size of
-	 the pool, not the size of the individually allocated object, the
-	 size which was previously made accessible.  Unfortunately, we
-	 don't know that previously allocated size.  Without that
-	 knowledge we have to lose some initialization-tracking for the
-	 old parts of the object.  An alternative is to mark the whole
-	 old_size as reachable, but that would lose tracking of writes
-	 after the end of the object (by small offsets).  Discard the
-	 handle to avoid handle leak.  */
-      VALGRIND_DISCARD (VALGRIND_MAKE_NOACCESS ((char *) x + size,
-						old_size - size));
-      VALGRIND_DISCARD (VALGRIND_MAKE_READABLE (x, size));
-      return x;
-    }
-
-  r = ggc_alloc_stat (size PASS_MEM_STAT);
-
-  /* Since ggc_get_size returns the size of the pool, not the size of the
-     individually allocated object, we'd access parts of the old object
-     that were marked invalid with the memcpy below.  We lose a bit of the
-     initialization-tracking since some of it may be uninitialized.  */
-  VALGRIND_DISCARD (VALGRIND_MAKE_READABLE (x, old_size));
-
-  memcpy (r, x, old_size);
-
-  /* The old object is not supposed to be used anymore.  */
-  ggc_free (x);
-
-  return r;
-}
-
/* Like ggc_alloc_cleared, but performs a multiplication.  */
void *
ggc_calloc (size_t s1, size_t s2)
Index: gcc/stringpool.c
===================================================================
--- gcc/stringpool.c	(revision 114687)
+++ gcc/stringpool.c	(working copy)
@@ -57,12 +57,11 @@

static hashnode alloc_node (hash_table *);
static int mark_ident (struct cpp_reader *, hashnode, const void *);
-static void ggc_register_stringpool_roots (void);

static void *
stringpool_ggc_alloc (size_t x)
{
-  return ggc_alloc (x);
+  return ggc_alloc (x); /* TODO: specialized alloc for no pointers inside? */
}

/* Initialize the string pool.  */
@@ -183,15 +182,34 @@
  ht_forall (ident_hash, mark_ident, NULL);
}

-/* Register the stringpool entries as GGC roots. TODO: why these are handled
-   specially? */
-void
+/* Register the stringpool entries as GGC roots.  In contrast to all other
+   roots, that are static, stringpool may increase and move around in memory.
+   So it's handled specially. */
+ggc_stringpool_roots
ggc_register_stringpool_roots (void)
{
-  GC_add_roots(ident_hash->entries,
-	       ident_hash->entries + sizeof(hashnode) * (1 << ORDER));
+  ggc_stringpool_roots result;
+  result.start = ident_hash->entries;
+  result.one_after_finish = ident_hash->entries + ident_hash->nslots;
+
+  GC_add_roots (result.start, result.one_after_finish);
+
+  return result;
}

+void
+ggc_unregister_stringpool_roots (ggc_stringpool_roots roots)
+{
+  GC_remove_roots (roots.start, roots.one_after_finish);
+}
+
+int
+ggc_stringpool_moved_p (ggc_stringpool_roots roots)
+{
+  return (roots.start != ident_hash->entries)
+    || (roots.one_after_finish != ident_hash->entries + ident_hash->nslots);
+}
+
/* Strings are _not_ GCed, but this routine exists so that a separate
   roots table isn't needed for the few global variables that refer
   to strings.  */
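
The root-tracking logic in the stringpool.c hunk above can be read in
isolation as "snapshot the table bounds, and re-register if either
bound has changed". Below is a minimal standalone sketch of that
check. The toy_* types and functions are assumptions for illustration
and model only the entries and nslots fields; the actual
GC_add_roots/GC_remove_roots calls are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the identifier hash table: only the two
   fields the root-tracking code looks at.  */
typedef struct toy_table {
  void **entries;
  size_t nslots;
} toy_table;

/* Same layout as ggc_stringpool_roots in the patch.  */
typedef struct toy_roots {
  void *start;
  void *one_after_finish;
} toy_roots;

/* Record the current bounds of the table.  The real code also hands
   this range to the collector as a root region.  */
static toy_roots
toy_snapshot_roots (const toy_table *t)
{
  toy_roots r;

  assert (t->entries != NULL);
  r.start = t->entries;
  r.one_after_finish = t->entries + t->nslots;
  return r;
}

/* Nonzero if the table has been reallocated or resized since the
   snapshot R was taken, so the old root range is stale.  */
static int
toy_roots_moved_p (toy_roots r, const toy_table *t)
{
  return r.start != (void *) t->entries
         || r.one_after_finish != (void *) (t->entries + t->nslots);
}
```

In ggc_collect the equivalent check runs right before each
collection, so a table that grew or moved between collections gets
its old range removed and the new one registered.
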
Index: Makefile.def
===================================================================
--- Makefile.def	(revision 114687)
+++ Makefile.def	(working copy)
@@ -26,7 +26,6 @@

build_modules= { module= libiberty; };
build_modules= { module= bison; };
-build_modules= { module= boehm-gc; };
build_modules= { module= byacc; };
build_modules= { module= flex; };
build_modules= { module= m4; };
@@ -42,6 +41,7 @@
host_modules= { module= opcodes; lib_path=.libs; bootstrap=true; };
host_modules= { module= binutils; bootstrap=true; };
host_modules= { module= bison; no_check_cross= true; };
+host_modules= { module= boehm-gc; bootstrap=true; };
host_modules= { module= byacc; no_check_cross= true; };
host_modules= { module= bzip2; };
host_modules= { module= dejagnu; };
@@ -271,6 +271,7 @@
dependencies = { module=configure-gcc; on=all-gas; };
dependencies = { module=configure-gcc; on=all-ld; };
dependencies = { module=all-gcc; on=all-libiberty; hard=true; };
+dependencies = { module=all-gcc; on=all-boehm-gc; hard=true; };
dependencies = { module=all-gcc; on=all-intl; };
dependencies = { module=all-gcc; on=all-build-texinfo; };
dependencies = { module=all-gcc; on=all-build-bison; };

Index: libcpp/charset.c
===================================================================
--- libcpp/charset.c	(revision 114687)
+++ libcpp/charset.c	(working copy)
@@ -23,6 +23,8 @@
#include "cpplib.h"
#include "internal.h"

+#include <assert.h>
+
/* Character set handling for C-family languages.

   Terminological note: In what follows, "charset" or "character set"
@@ -1317,6 +1319,8 @@

  for (i = 0; i < count; i++)
    {
+      assert (from[i].len != 0);
+
      p = from[i].text;
      if (*p == 'L') p++;
      p++; /* Skip leading quote.  */
Index: gcc/ggc-page.c
===================================================================
--- gcc/ggc-page.c	(revision 114687)
+++ gcc/ggc-page.c	(working copy)
@@ -633,6 +633,52 @@
  base[L1][L2] = entry;
}

+/* Resize a block of memory, possibly re-allocating it.  */
+void *
+ggc_realloc_stat (void *x, size_t size MEM_STAT_DECL)
+{
+  void *r;
+  size_t old_size;
+
+  if (x == NULL)
+    return ggc_alloc_stat (size PASS_MEM_STAT);
+
+  old_size = ggc_get_size (x);
+
+  if (size <= old_size)
+    {
+      /* Mark the unwanted memory as unaccessible.  We also need to make
+	 the "new" size accessible, since ggc_get_size returns the size of
+	 the pool, not the size of the individually allocated object, the
+	 size which was previously made accessible.  Unfortunately, we
+	 don't know that previously allocated size.  Without that
+	 knowledge we have to lose some initialization-tracking for the
+	 old parts of the object.  An alternative is to mark the whole
+	 old_size as reachable, but that would lose tracking of writes
+	 after the end of the object (by small offsets).  Discard the
+	 handle to avoid handle leak.  */
+      VALGRIND_DISCARD (VALGRIND_MAKE_NOACCESS ((char *) x + size,
+						old_size - size));
+      VALGRIND_DISCARD (VALGRIND_MAKE_READABLE (x, size));
+      return x;
+    }
+
+  r = ggc_alloc_stat (size PASS_MEM_STAT);
+
+  /* Since ggc_get_size returns the size of the pool, not the size of the
+     individually allocated object, we'd access parts of the old object
+     that were marked invalid with the memcpy below.  We lose a bit of the
+     initialization-tracking since some of it may be uninitialized.  */
+  VALGRIND_DISCARD (VALGRIND_MAKE_READABLE (x, old_size));
+
+  memcpy (r, x, old_size);
+
+  /* The old object is not supposed to be used anymore.  */
+  ggc_free (x);
+
+  return r;
+}
+
/* Prints the page-entry for object size ORDER, for debugging.  */

void
Index: gcc/ggc-zone.c
===================================================================
--- gcc/ggc-zone.c	(revision 114687)
+++ gcc/ggc-zone.c	(working copy)
@@ -929,6 +929,53 @@
  free (entry);
}

+/* Resize a block of memory, possibly re-allocating it.  */
+void *
+ggc_realloc_stat (void *x, size_t size MEM_STAT_DECL)
+{
+  void *r;
+  size_t old_size;
+
+  if (x == NULL)
+    return ggc_alloc_stat (size PASS_MEM_STAT);
+
+  old_size = ggc_get_size (x);
+
+  if (size <= old_size)
+    {
+      /* Mark the unwanted memory as unaccessible.  We also need to make
+	 the "new" size accessible, since ggc_get_size returns the size of
+	 the pool, not the size of the individually allocated object, the
+	 size which was previously made accessible.  Unfortunately, we
+	 don't know that previously allocated size.  Without that
+	 knowledge we have to lose some initialization-tracking for the
+	 old parts of the object.  An alternative is to mark the whole
+	 old_size as reachable, but that would lose tracking of writes
+	 after the end of the object (by small offsets).  Discard the
+	 handle to avoid handle leak.  */
+      VALGRIND_DISCARD (VALGRIND_MAKE_NOACCESS ((char *) x + size,
+						old_size - size));
+      VALGRIND_DISCARD (VALGRIND_MAKE_READABLE (x, size));
+      return x;
+    }
+
+  r = ggc_alloc_stat (size PASS_MEM_STAT);
+
+  /* Since ggc_get_size returns the size of the pool, not the size of the
+     individually allocated object, we'd access parts of the old object
+     that were marked invalid with the memcpy below.  We lose a bit of the
+     initialization-tracking since some of it may be uninitialized.  */
+  VALGRIND_DISCARD (VALGRIND_MAKE_READABLE (x, old_size));
+
+  memcpy (r, x, old_size);
+
+  /* The old object is not supposed to be used anymore.  */
+  ggc_free (x);
+
+  return r;
+}
+
+
/* Release the free page cache to the system.  */

static void

