
malloc-based ggc-page


Here's a patch that makes it possible to use ggc-page on platforms
that lack mmap-anywhere and valloc.  It uses plain malloc.  It tries
to keep the top (bottom?) of the heap aligned after (and before)
every allocation, and it's smart enough to skip free blocks of the
heap that are large enough for a page but not properly aligned.  If
unrelated memory allocations misalign the heap, it recovers
reasonably well on all platforms I've tested.  However, if, after a
few attempts, it finds it can't make progress towards obtaining a
properly aligned page, it gives up and allocates a block larger than
requested, so that, after some padding to reach a page boundary, the
block is still large enough for the requested data bytes.
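
A minimal sketch of the idea follows.  It is not the patch itself:
the helper name try_aligned_page, the fixed PAGESIZE, and the use of
plain malloc instead of xmalloc are all assumptions of the sketch;
the real code also accounts for malloc's per-block overhead and for a
possible constant offset in the addresses malloc returns.

#include <stdlib.h>

#define PAGESIZE 4096		/* assumed page size, for the sketch only */

/* Hypothetical helper: probe malloc until it hands back a
   page-aligned block, or give up after MAX_ATTEMPTS tries.  */
static void *
try_aligned_page (size_t size, int max_attempts)
{
  char *to_free = 0, *page = 0, *pad;
  size_t mis, padsize;

  while (max_attempts-- > 0)
    {
      page = malloc (size);
      if (page == 0)
	break;
      mis = (size_t) page & (PAGESIZE - 1);
      if (mis == 0)
	break;			/* got a page-aligned block */
      free (page);
      page = 0;

      /* Fill the hole we just freed with a pad block sized so that
	 the next allocation starts at the following page boundary,
	 and hold on to the pad until the search is over.  */
      padsize = PAGESIZE - mis;
      if (padsize < sizeof (char *))
	padsize += PAGESIZE;	/* room for the list link below */
      pad = malloc (padsize);
      if (pad != 0)
	{
	  *(char **) pad = to_free;
	  to_free = pad;
	}
    }

  /* Release the pads, so that other mallocs in the compiler can
     reuse them.  */
  while (to_free != 0)
    {
      char *next = *(char **) to_free;
      free (to_free);
      to_free = next;
    }

  return page;			/* 0 means: use the fallback mechanism */
}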

In order to be able to free() such blocks, it was necessary to
arrange for every such ``page'' to contain a pointer to the address
returned by malloc.  This pointer is placed just *before* the page
boundary.  That's possible because we arrange for malloc to return an
address not exactly at a page boundary, but just before it, so that
the pointer fits.
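
Concretely, the free() bookkeeping amounts to the pair of operations
sketched below, under the same caveats (the function names are
illustrative only); in the patch itself the store happens in
alloc_anon and the free in release_pages.

#include <stdlib.h>

/* PAGE is the page-aligned address handed out to the collector;
   BLOCK is what malloc actually returned, a little before PAGE.  */
static void
remember_block (char *block, char *page)
{
  ((char **) page)[-1] = block;	/* stash malloc's pointer before the page */
}

static void
release_page (char *page)
{
  free (((char **) page)[-1]);	/* recover the original block and free it */
}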

I can't say that the page-finding code is extremely efficient,
particularly because it free()s every piece of memory it allocates
while looking for an aligned page.  This is done so that other
malloc()s elsewhere in the compiler can reuse those blocks, instead
of misaligning the top of the heap.  In the future, if we rule malloc
out, we may change this.  But then, if that ever happens, it's
unlikely that we'll have many unaligned blocks left either, so it may
not make much of a difference.

The fallback mechanism might also be considerably improved.
Currently, it looks for the smallest block of memory that contains a
region starting at a page boundary with as many bytes as requested
after it.  This could be made faster by performing a binary search on
the additional size, between 0 and G.pagesize, and it could also
attempt to arrange that the block of memory end at a page boundary.
However, the fallback mechanism is used so rarely, and these
refinements would introduce so many complications, that I decided not
to implement them.  For example, assume free() does indeed release
memory (which the C Standard does not guarantee) and that malloc
reuses it efficiently in terms of space, not time, i.e., it returns
the first address in the heap that can hold the requested number of
bytes.  A binary search might then get confused if there are two free
blocks, the first larger than the second but not as close to a page
boundary: it might become necessary to keep the larger block
allocated just to find the appropriate block size for the second one.
And then, when we pad the second block to try to align the top of the
heap, we may end up with a different block entirely, and be unable to
get the original block back.
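
For reference, the current fallback search behaves roughly like the
sketch below.  As before, the names and the fixed page size are
assumptions of the sketch, and the back-pointer bookkeeping shown
earlier is omitted.

#include <stdlib.h>

#define PAGESIZE 4096		/* assumed page size, for the sketch only */

/* Hypothetical sketch of the fallback search: grow the request until
   the block malloc returns contains a page-aligned region of at
   least SIZE bytes.  */
static void *
search_fallback (size_t size)
{
  char *block, *page;
  size_t mis, total = size;

  for (;;)
    {
      block = malloc (total);
      if (block == 0)
	return 0;
      mis = (size_t) block & (PAGESIZE - 1);
      if (mis == 0 || total >= PAGESIZE - mis + size)
	{
	  page = mis ? block + (PAGESIZE - mis) : block;
	  return page;		/* BLOCK must still be remembered */
	}
      free (block);
      total += sizeof (char *);	/* try a slightly larger block */
    }
}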

The fallback mechanism could also skip the search entirely, and just
malloc the requested size plus a whole page.  This would certainly
work, but it wouldn't change the alignment of the heap, whereas the
current search often helps recover heap alignment in the next step.
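
That simpler alternative would look roughly like this (again a sketch
under the same assumptions, not part of the patch):

#include <stdlib.h>

#define PAGESIZE 4096		/* assumed page size, for the sketch only */

/* Hypothetical no-search fallback: over-allocate by a whole page
   (plus room for the back-pointer) and round up to the next page
   boundary.  */
static void *
simple_fallback (size_t size)
{
  char *block, *page;

  block = malloc (size + PAGESIZE + sizeof (char *));
  if (block == 0)
    return 0;
  page = (char *) (((size_t) block + sizeof (char *) + PAGESIZE - 1)
		   & ~(size_t) (PAGESIZE - 1));
  ((char **) page)[-1] = block;	/* so that it can be freed later */
  return page;
}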

I have tested this mechanism by disabling mmap and valloc support
(setting ac_cv_func_mmap_anywhere and ac_cv_func_valloc to no in the
environment before running configure) and #defining MALLOC_ACCOUNT to
16 in ggc-page.c.  In stage3 of an -O2 bootstrap, the fallback count
was often 0 and, when it was not, it was always two orders of
magnitude smaller than the aligned count (on GNU/Linux x86 and alpha;
I don't have such exact numbers for other platforms).


Enough rambling.  Here's the patch.  Ok to install?

Index: gcc/ChangeLog
from  Alexandre Oliva  <oliva@lsd.ic.unicamp.br>
	
	* ggc-page.c (alloc_anon, release_pages, init_ggc): New
	malloc-based page allocation mechanism.
	(struct globals): New items malloc_offset and malloc_overhead.
	(OBJECTS_PER_PAGE): Use malloc_offset and malloc_overhead.
	(MISALIGNMENT): New macro.
	(init_ggc, clear_marks): Use it.
	* configure.in: Select ggc-page by default on all targets.
	* configure: Rebuilt.
	
Index: gcc/ggc-page.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/ggc-page.c,v
retrieving revision 1.20
diff -u -p -r1.20 ggc-page.c
--- gcc/ggc-page.c	2000/01/17 15:28:04	1.20
+++ gcc/ggc-page.c	2000/01/18 09:56:18
@@ -92,6 +92,19 @@
      3: Object allocations as well.
      4: Object marks as well.   */
 #define GGC_DEBUG_LEVEL (0)
+
+/* If defined to a positive number and malloc-based page allocation is
+   being used, counters are maintained for each successful allocation
+   using the mechanism that tries to convince malloc to return a
+   page-aligned address (aligned) and the fallback mechanism
+   (fallback).  Their values are printed every MALLOC_ACCOUNT
+   allocations. */
+#ifndef MALLOC_ACCOUNT
+# define MALLOC_ACCOUNT 0 /* 16 */
+#endif
+#if HAVE_MMAP_ANYWHERE || HAVE_VALLOC
+# undef MALLOC_ACCOUNT
+#endif
 
 #ifndef HOST_BITS_PER_PTR
 #define HOST_BITS_PER_PTR  HOST_BITS_PER_LONG
@@ -231,13 +244,28 @@ static struct globals
   /* A file descriptor open to /dev/zero for reading.  */
 #if defined (HAVE_MMAP_ANYWHERE) && !defined(MAP_ANONYMOUS)
   int dev_zero_fd;
-#endif
+#endif /* HAVE_MMAP_ANYWHERE */
 
+  /* This is usually 0, but AIX 4.1.5's malloc, for example, always
+     returns addresses p=16n+8.  This variable is supposed to be used
+     as an offset from the value returned by malloc to the beginning
+     of the page.  */
+  size_t malloc_offset;
+
+  /* This is how much space is used up by malloc internal structures.
+     It's typically 8 or 16 bytes. */
+  size_t malloc_overhead;
+
   /* A cache of free system pages.  */
   page_entry *free_pages;
 
   /* The file descriptor for debugging output.  */
   FILE *debug_file;
+
+#if MALLOC_ACCOUNT
+  /* Account number of pages allocated with each mechanism.  */
+  size_t aligned, fallback;
+#endif
 } G;
 
 
@@ -248,7 +276,9 @@ static struct globals
 /* The number of objects per allocation page, for objects of size
    2^ORDER.  */
 #define OBJECTS_PER_PAGE(Order) \
-  ((Order) >= G.lg_pagesize ? 1 : G.pagesize / ((size_t)1 << (Order)))
+  ((Order) >= G.lg_pagesize ? 1 \
+   : ((G.pagesize - G.malloc_overhead - G.malloc_offset) \
+      / ((size_t)1 << (Order))))
 
 /* The size in bytes required to maintain a bitmap for the objects
    on a page-entry.  */
@@ -265,6 +295,10 @@ static struct globals
    test from triggering too often when the heap is small.  */
 #define GGC_MIN_LAST_ALLOCATED (4 * 1024 * 1024)
 
+/* Test the misalignment of an address.  */
+#define MISALIGNMENT(Address) \
+  (((size_t)(Address)) & (G.pagesize - 1))
+
 
 static int ggc_allocated_p PARAMS ((const void *));
 static page_entry *lookup_page_table_entry PARAMS ((const void *));
@@ -429,6 +463,172 @@ alloc_anon (pref, size)
       fputs ("Virtual memory exhausted!\n", stderr);
       exit(1);
     }
+#else
+  char *other, *to_free = 0, *prev_page = 0;
+  size_t misalignment, origsize, newsize;
+  /* If we don't make progress with a few attempts to get an aligned
+     page, we'll revert to a fallback mechanism.  A failure is
+     accounted for when we allocate a block smaller than the requested
+     size, so as to try to get the heap aligned, but the block ends up
+     being placed after the page we had allocated before. */
+  static size_t max_failures = 3;
+  size_t failures_remaining = max_failures;
+
+  size += G.malloc_offset;
+  origsize = size;
+
+  /* Try to allocate whole pages, so as to leave the memory
+     properly aligned for the next allocation. */
+  misalignment = MISALIGNMENT (size + G.malloc_overhead);
+  if (misalignment)
+    size += G.pagesize - misalignment;
+
+  while ((page = 0, failures_remaining != 0)
+	 && ((page = xmalloc (size)),
+	     (misalignment = MISALIGNMENT (page+G.malloc_offset)),
+	     misalignment != 0))
+    {
+      /* Calculate the size the allocated block should be so that it
+	 ends just before a page boundary, i.e., the next allocation
+	 would get us a page-aligned block. */
+      newsize = G.pagesize - misalignment - G.malloc_overhead;
+      if (G.pagesize <= misalignment + G.malloc_overhead)
+	newsize += G.pagesize;
+
+      /* If we get the same page twice, or some page before it, keep
+	 it allocated until the end, so that it doesn't keep pestering
+	 us within this loop.  Note that the case with page ==
+	 prev_page, that is indeed a situation of no progress, is not
+	 accounted as a failure here, because the only way for it to
+	 show up is after an accounted failure.  In any other case,
+	 prev_page would have been set to NULL. */
+      if (page <= prev_page)
+	{
+	  /* If we get exactly the same page, it may be worth trying
+	     to align the block just after it, and hope it won't be
+	     properly recycled.  This is quite likely, because, having
+	     gotten here, we'd have already tried to resize the
+	     previous block and it didn't work.  So let's try to
+	     resize the next block.  This helps on IRIX 5.2. */
+	  if (page == prev_page)
+	    {
+	      other = xmalloc (newsize);
+	      free (other);
+	    }
+	  
+	  /* Prepare to add the page to to_free. */
+	  other = page;
+	  newsize = size;
+	}
+      else
+	{
+	  free (page);
+
+	  /* If we got an exact one-page offset from the previous
+	     allocation, something has gone wrong with our attempted
+	     adjustment.  Shake it a little bit.  It helps at least on
+	     Solaris. */
+	  if (to_free && page == to_free + G.pagesize)
+	    {
+	      other = xmalloc (size + newsize + G.pagesize);
+	      free (other);
+	    }
+
+	  /* Try to resize the just-allocated-and-freed unaligned page
+	     block so that the next allocation will hopefully fall in
+	     a page boundary. */
+	  other = xmalloc (newsize);
+
+	  /* If it appears before that block, it's fine to get the
+	     same block again.  Eventually, we'll have allocated all
+	     newsize blocks before page. */ 
+	  if (other <= page)
+	    {
+	      prev_page = 0;
+	    }
+	  else
+	    {
+	      /* If we get a block after page, it won't help, so just
+		 release it immediately, and arrange for the unaligned
+		 page to be allocated next time we get to it. */
+	      free (other);
+	      other = 0;
+	      prev_page = page;
+
+	      /* Count this as one failure.  */
+	      --failures_remaining;
+	    }
+	}
+
+      /* Add OTHER to the linked list of blocks to be freed when we
+	 get an aligned page. */
+      if (other)
+	{
+	  *(char**)other = to_free;
+	  to_free = other;
+	}
+    }
+
+  /* Release all blocks held during this allocation attempt. */
+  while (to_free)
+    {
+      other = *(char**)to_free;
+      free (to_free);
+      to_free = other;
+    }
+
+  /* If we got an aligned page, skip the offset, which will get us to
+     the page-aligned address. */
+  if (failures_remaining != 0)
+    {
+      /* There are some unaligned page-sized blocks in the heap that
+	 are making it harder to find aligned pages.  Remember to try
+	 harder next time. */
+      if (failures_remaining < 3)
+	max_failures += 3 - failures_remaining;
+
+#if MALLOC_ACCOUNT
+      ++G.aligned;
+#endif
+      other = page;
+      page += G.malloc_offset;
+    }
+  else
+    {
+#if MALLOC_ACCOUNT
+      ++G.fallback;
+#endif
+      /* Otherwise, try to get a memory block that contains enough
+	 bytes for the alignment adjustment, the offset and the
+	 requested size. */
+      size = origsize;
+      while ((page = xmalloc (size)),
+	     (misalignment = MISALIGNMENT (page + G.malloc_offset)),
+	     misalignment &&
+	     (G.pagesize - misalignment + origsize > size))
+	{
+	  free (page);
+	  size += G.malloc_offset;
+	}
+
+      other = page;
+      if (misalignment)
+	page += G.pagesize - misalignment;
+      page += G.malloc_offset;
+    }
+
+#if MALLOC_ACCOUNT
+  if ((G.aligned + G.fallback) % MALLOC_ACCOUNT == 0)
+    {
+      fprintf(G.debug_file, "aligned: %lu, fallback: %lu\n",
+	      (unsigned long)G.aligned, (unsigned long)G.fallback);
+    }
+#endif
+
+  /* Set the pointer just before the page boundary to the beginning of
+     the block, so that it can be freed afterwards. */
+  ((char**)page)[-1] = other;
+  return page;
 #endif /* HAVE_VALLOC */
 #endif /* HAVE_MMAP_ANYWHERE */
 
@@ -578,6 +778,19 @@ release_pages ()
       G.bytes_mapped -= p->bytes;
       free (p);
     }
+#else /* xmalloc */
+  page_entry *p, *next;
+
+  for (p = G.free_pages; p ; p = next)
+    {
+      next = p->next;
+      /* Since xmalloc won't always give us page-aligned addresses, we
+	 store a pointer to the beginning of the block just before the
+	 page boundary. */
+      free (((char**)p->page)[-1]);
+      G.bytes_mapped -= p->bytes;
+      free (p);
+    }
 #endif /* HAVE_VALLOC */
 #endif /* HAVE_MMAP_ANYWHERE */
 
@@ -808,6 +1021,9 @@ init_ggc ()
 
   G.allocated_last_gc = GGC_MIN_LAST_ALLOCATED;
 
+  G.malloc_offset = 0;
+  G.malloc_overhead = 0;
+    
 #ifdef HAVE_MMAP_ANYWHERE
   /* StunOS has an amazing off-by-one error for the first mmap allocation
      after fiddling with RLIMIT_STACK.  The result, as hard as it is to
@@ -815,19 +1031,102 @@ init_ggc ()
      hork badly if we tried to use it.  */
   {
     char *p = alloc_anon (NULL, G.pagesize);
-    if ((size_t)p & (G.pagesize - 1))
+    if (MISALIGNMENT (p))
       {
 	/* How losing.  Discard this one and try another.  If we still
 	   can't get something useful, give up.  */
 
 	p = alloc_anon (NULL, G.pagesize);
-	if ((size_t)p & (G.pagesize - 1))
+	if (MISALIGNMENT (p))
 	  abort ();
       }
     munmap (p, G.pagesize);
   }
+#else
+  {
+    size_t misalignment;
+    char *p[3];
+    
+    p[0] = xmalloc (G.pagesize);
+    p[1] = xmalloc (G.pagesize);
+    
+    G.malloc_overhead = p[1] - p[0] - G.pagesize;
+
+    misalignment = MISALIGNMENT (p[0]);
+
+    free (p[1]);
+    free (p[0]);
+
+#ifndef HAVE_VALLOC
+#if TEST_MALLOC_ALIGNMENT
+    p[2] = p[0];
+#endif
+    p[0] = xmalloc (G.pagesize - misalignment - G.malloc_overhead);
+#if TEST_MALLOC_ALIGNMENT
+    if (p[0] != p[2])
+      abort();
+#endif
+    
+    p[2] = xmalloc (G.pagesize - G.malloc_overhead);
+    if (p[2] == p[1])
+      {
+#if TEST_MALLOC_ALIGNMENT
+	/* xmalloc() doesn't appear to merge free blocks, so we may
+	   have to waste some unaligned allocs.  :-( */
+	misalignment = MISALIGNMENT (p[1]);
+	p[2] = xmalloc (G.pagesize - misalignment - G.malloc_overhead);
+	p[1] = xmalloc (G.pagesize - G.malloc_overhead);
 #endif
+      }
+    else if (MISALIGNMENT (p[2] + G.malloc_overhead / 2) == 0)
+      {
+	/* AIX 4.1.5's malloc will always return addresses of the form
+	   16n+8.  We're going to arrange that 16n+8+G.offset points
+	   to the beginning of a page. */
+	G.malloc_offset = G.malloc_overhead /= 2;
+
+#if TEST_MALLOC_ALIGNMENT
+	free (p[2]);
+	free (p[0]);
+
+	p[2] = xmalloc (G.pagesize - misalignment - G.malloc_overhead);
+	if (p[0] != p[2])
+	  abort();
+
+	p[1] = xmalloc (G.pagesize - G.malloc_overhead);
+#endif
+      }
+    else
+      p[1] = p[2];
 
+#if TEST_MALLOC_ALIGNMENT
+    if (MISALIGNMENT (p[1] + G.malloc_offset))
+      abort();
+#endif
+
+    /* We must store a pointer just before the beginning of each page.
+       So, we must arrange that there is enough space for it, and that
+       any extra overhead due to alignment requirements is compensated
+       for.  We must not release blocks of memory previously
+       allocated, because their sizes may modify the results. */
+    if (G.malloc_offset < sizeof(char*)) {
+      char *p[2];
+      G.malloc_offset = sizeof(char*);
+      p[0] = xmalloc (G.pagesize - G.malloc_offset);
+      p[1] = xmalloc (G.pagesize - G.malloc_offset);
+      G.malloc_offset = p[1] - p[0] - G.pagesize + G.malloc_overhead;
+      free (p[1]);
+      free (p[0]);
+    }
+
+    if (p[2] != p[1])
+      free (p[2]);
+    free (p[1]);
+    free (p[0]);
+#endif /* HAVE_VALLOC */
+  }
+#endif /* HAVE_MMAP_ANYWHERE */
+
   empty_string = ggc_alloc_string ("", 0);
   ggc_add_string_root (&empty_string, 1);
 }
@@ -934,9 +1233,9 @@ clear_marks ()
 	{
 #ifdef ENABLE_CHECKING
 	  /* The data should be page-aligned.  */
-	  if ((size_t) p->page & (G.pagesize - 1))
+	  if (MISALIGNMENT (p->page))
 	    abort ();
 #endif
 
Index: gcc/configure.in
===================================================================
RCS file: /cvs/gcc/egcs/gcc/configure.in,v
retrieving revision 1.325
diff -u -p -r1.325 configure.in
--- gcc/configure.in	2000/01/16 18:16:55	1.325
+++ gcc/configure.in	2000/01/18 09:56:18
@@ -4577,13 +4577,7 @@ AC_ARG_WITH(gc,
   *)
     AC_MSG_ERROR([$withval is an invalid option to --with-gc])
     ;;
-esac],
-[if test $ac_cv_func_mmap_anywhere = yes \
-    || test $ac_cv_func_valloc = yes; then
-  GGC=ggc-page
-else
-  GGC=ggc-simple
-fi])
+esac], [GGC=ggc-page])
 AC_SUBST(GGC)
 echo "Using $GGC for garbage collection."
 

-- 
Alexandre Oliva http://www.ic.unicamp.br/~oliva IC-Unicamp, Bra[sz]il
oliva@{lsd.ic.unicamp.br,guarana.{org,com}} aoliva@{acm,computer}.org
oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org}
** I may forward mail about projects to mailing lists; please use them
