This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Bootstrap failure due to GC / PCH memory corruption


Hello,

I'm seeing a bootstrap failure on s390x (which is apparently also
occurring on some other platforms) due to a compiler crash.

The crash turned out to be indirectly caused by a wild store
performed by the garbage collector code.  This in turn is
caused by an apparent mismatch of assumptions made by the GC
code proper (in particular, the OFFSET_TO_BIT mechanism) and
the PCH reader.

The OFFSET_TO_BIT macro in ggc-page.c computes the index of an
object inside a 'page', avoiding an explicit division by the
object size for performance reasons.  Instead, it multiplies by
a pre-computed inverse.  The code that prepares this inverse
table explicitly assumes that if the object size exceeds the
physical page size of the platform, only one object will exist
per GC 'page'.  See compute_inverse at ggc-page.c:1226.
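
As an illustration of the multiply-by-inverse idea, here is a
minimal sketch.  It is not the ggc-page.c code: the names
make_inverse and offset_to_index are hypothetical, and it assumes
32-bit unsigned arithmetic and offsets that are exact multiples of
the object size.  The size is split into an odd factor and a power
of two; the odd factor is inverted modulo 2^32 and the power of two
becomes a shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one inverse table entry. */
struct inverse {
    uint32_t mult;   /* inverse of the odd part of the size, mod 2^32 */
    unsigned shift;  /* number of trailing zero bits in the size */
};

/* Split size into odd * 2^shift and invert the odd part modulo 2^32. */
static struct inverse make_inverse(uint32_t size)
{
    struct inverse r = { 1u, 0u };
    uint32_t inv;

    assert(size != 0);
    while ((size & 1u) == 0) {
        size >>= 1;
        r.shift++;
    }

    /* Newton iteration: an odd number is its own inverse modulo 8,
       and every step doubles the number of correct low bits. */
    inv = size;
    while ((uint32_t)(inv * size) != 1u)
        inv = inv * (2u - inv * size);

    r.mult = inv;
    return r;
}

/* Exact division of offset by the object size, without a divide.
   Only valid when offset is a multiple of the object size. */
static uint32_t offset_to_index(uint32_t offset, struct inverse inv)
{
    return (uint32_t)(offset * inv.mult) >> inv.shift;
}
```

For an object size of 24 (3 * 2^3) this yields a multiplier of
0xAAAAAAAB and a shift of 3, so an offset of 72 maps to index 3.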

This assumption holds true for all objects allocated by the
GC proper.  However, reading in a pre-compiled header file
results in adding its contents to the GC page pool using
synthesized 'page table entries' that describe them.  These
ptes, however, violate the above assumption.  In fact, for
every 'order' of object size, one single pte spans all the
objects of that order, even for objects whose size exceeds
the machine page size.  See ggc_pch_read at ggc-page.c:2045.

When the GC collect phase subsequently tries to mark an object
from one of those pages, the OFFSET_TO_BIT macro can return an
incorrect bit index, so the mark is stored at the wrong
location, clobbering memory and causing the subsequent crash.
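
The failure mode can be sketched as follows.  This is only an
illustration under stated assumptions: the struct layout, the
helper names, and in particular the placeholder values (mult 1,
shift 0) used for the 'one object per page' case are hypothetical
and may not match what compute_inverse actually stores.  The point
is that such an entry is only correct for offset zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one inverse table entry. */
struct inverse {
    size_t mult;
    unsigned shift;
};

/* Same shape as the OFFSET_TO_BIT computation: multiply, then shift. */
static size_t offset_to_bit(size_t offset, struct inverse inv)
{
    return (offset * inv.mult) >> inv.shift;
}

/* Reference result using a plain division. */
static size_t true_index(size_t offset, size_t object_size)
{
    assert(object_size != 0);
    return offset / object_size;
}
```

With an assumed placeholder entry of { 1, 0 } and an object size of
8192, the third object in a multi-object entry sits at offset 16384.
true_index gives 2, but offset_to_bit gives 16384, far outside a
mark bitmap sized for a handful of objects.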

I'm not sure how best to fix this problem: either by creating
multiple ptes per order in ggc_pch_read so that the assumption
of one multi-page object per pte remains true, or else by
adapting compute_inverse to work in all cases without relying
on that assumption.  (The 'easy fix' of just removing the if
check in compute_inverse doesn't work because the search for
the inverse won't terminate for large object sizes.  I haven't
investigated in detail why this is so.)
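
The first option could look roughly like the sketch below.  It is
a simplified illustration, not a patch for ggc_pch_read: struct
entry and split_run are hypothetical, and a real page table entry
carries more state (order, mark bits, list links) than shown here.
The idea is to emit one entry per object whenever the object size
exceeds the machine page size, and a single entry otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a page table entry. */
struct entry {
    char *start;   /* first byte covered by this entry */
    size_t bytes;  /* number of bytes covered */
};

/* Describe a run of n_objects equally sized objects starting at base.
   Objects larger than a machine page each get their own entry, so each
   entry holds exactly one object at offset zero.  Returns the number
   of entries written to out. */
static size_t split_run(char *base, size_t n_objects, size_t object_size,
                        size_t page_size, struct entry *out, size_t out_cap)
{
    size_t i;

    if (n_objects == 0)
        return 0;

    if (object_size <= page_size) {
        /* Small objects: one entry spanning the whole run is fine. */
        assert(out_cap >= 1);
        out[0].start = base;
        out[0].bytes = n_objects * object_size;
        return 1;
    }

    /* Large objects: one entry per object. */
    assert(out_cap >= n_objects);
    for (i = 0; i < n_objects; i++) {
        out[i].start = base + i * object_size;
        out[i].bytes = object_size;
    }
    return n_objects;
}
```

With this split, every object in a large-size entry has offset zero
within its entry, which is the only case the existing inverse table
is assumed to handle for such sizes.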

The following 'quick fix' completely removes the OFFSET_TO_BIT
optimization and just puts back the division.  This allows
bootstrap to succeed on s390x ...

I'd appreciate any advice on how to fix the problem properly.

Bye,
Ulrich

Index: gcc/ggc-page.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ggc-page.c,v
retrieving revision 1.69
diff -c -p -r1.69 ggc-page.c
*** gcc/ggc-page.c      17 Jun 2003 14:09:54 -0000      1.69
--- gcc/ggc-page.c      25 Jun 2003 23:03:28 -0000
*************** Software Foundation, 59 Temple Place - S
*** 163,169 ****
  #define DIV_MULT(ORDER) inverse_table[ORDER].mult
  #define DIV_SHIFT(ORDER) inverse_table[ORDER].shift
  #define OFFSET_TO_BIT(OFFSET, ORDER) \
!   (((OFFSET) * DIV_MULT (ORDER)) >> DIV_SHIFT (ORDER))

  /* The number of extra orders, not corresponding to power-of-two sized
     objects.  */
--- 163,169 ----
  #define DIV_MULT(ORDER) inverse_table[ORDER].mult
  #define DIV_SHIFT(ORDER) inverse_table[ORDER].shift
  #define OFFSET_TO_BIT(OFFSET, ORDER) \
!   /*(((OFFSET) * DIV_MULT (ORDER)) >> DIV_SHIFT (ORDER))*/ ((OFFSET) / OBJECT_SIZE (ORDER))

  /* The number of extra orders, not corresponding to power-of-two sized
     objects.  */

-- 
  Dr. Ulrich Weigand
  weigand@informatik.uni-erlangen.de

