bug in ggc-page?

Mon Dec 16 08:25:00 GMT 2002

(I'm not quite sure who originally added the extra_order_size_table, 
and the inverse_table, annotate tells me it's probably one of you three)

In implementing the copying collectors, one of which is based on 
ggc-page, i've run across what appears to be a bug in ggc-page. I'm a 
bit surprised it hasn't been noticed before.

I use the marking bits to track unmovable objects.
In setting the marks on certain ggc allocated things (ggc_allocated_p 
tells me they are ggc allocated, so i'm positive i'm not trying to mark 
non-ggc allocated objects) , we get crashes.

But they only occur if the page is for "order <x>" objects, where x > 
31.

The crash occurs in ggc_set_mark, at

" if (entry->in_use_p[word] & mask)"

(gdb) p word
$10 = 2396745
(gdb) p mask
$11 = 2048
(gdb) p bit
$12 = 76695851

This is, of course, odd.
So i look further at how this is calculated a few lines up:

1073      bit = OFFSET_TO_BIT (((const char *) p) - entry->page, 
entry->order);
1074      word = bit / HOST_BITS_PER_LONG;
1075      mask = (unsigned long) 1 << (bit % HOST_BITS_PER_LONG);

(gdb) p p - entry->page
$13 = 776
(gdb) p entry->bytes
4096
Okeydokey,
Both are reasonable.
(gdb) p entry->order
$14 = 32 ' '
Hmmmm. Looks okay to me.
OFFSET_TO_BIT calculates using the inverse table for the entry's order.
So let's look at that:
(gdb) p inverse_table[32]
$15 = {
   mult = 3067833783,
   shift = 4
}
Uh oh.

Let's look at the whole thing:
gdb) p inverse_table[0]@32+4
$17 = {{
     mult = 1,
     shift = 0
   }, {
     mult = 1,
     shift = 1
   }, {
     mult = 1,
     shift = 2
   }, {
     mult = 1,
     shift = 3
   }, {
     mult = 1,
     shift = 4
   }, {
     mult = 1,
     shift = 5
   }, {
     mult = 1,
     shift = 6
   }, {
     mult = 1,
---Type <return> to continue, or q <return> to quit---
     shift = 7
   }, {
     mult = 1,
     shift = 8
   }, {
     mult = 1,
     shift = 9
   }, {
     mult = 1,
     shift = 10
   }, {
     mult = 1,
     shift = 11
   }, {
     mult = 1,
     shift = 0
   } <repeats 20 times>, {
     mult = 3067833783,
     shift = 4
   }, {
     mult = 3435973837,
     shift = 2
   }, {
     mult = 2863311531,
     shift = 2
   }, {
     mult = 3123612579,
     shift = 2
   }}
(gdb)
(gdb)

Looks like the extra orders have very odd values for mult.
But that's actually not unexpected if one thinks about it.
We do all calculation in unsigned int's, rather than 
HOST_WIDE_UNSIGNED_INT, so the orders > 31 are going to overflow.
Which they do.

Just to be sure it's not corruption, let's recompute the table for the 
extra size orders:

(gdb) p compute_inverse (31)
$24 = void
(gdb) p compute_inverse (32)
$26 = void
(gdb) p compute_inverse (33)
$27 = void
(gdb) p compute_inverse (34)
$28 = void
(gdb)
(gdb) p inverse_table [32]
$23 = {
   mult = 3067833783,
   shift = 4
}

Nope, it's not corrupted.

If i'm right, it means commenting out the extra_order_size_table orders 
should fix the problem.
Which it does.

Comments, thoughts?