This is the mail archive of the java-discuss@sources.redhat.com mailing list for the Java project.



RE: Precise GC (was Re: cannot build libjava/gnu/gcj/xlib/natClip.cc)


> From: Cedric Berger [mailto:cedric@wireless-networks.com]
> Jeff Sturm wrote:
> >
> > Cedric Berger wrote:
> > > In embedded systems, predictability is the top requirement. And I
> > > don't see how to get predictable results without a precise GC.
> >
> > In that environment, is the real concern pointer misidentification,
> > heap fragmentation, or both?
>
> Well, both are bad, but pointer misidentification is clearly the worst:
>
> Fragmentation happens with C code too, but you can work around it by
> 1) preallocating/recycling most of the big objects, and
> 2) keeping, for example, a 20% margin on the heap, and testing
>    extensively.
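
To make strategy (1) concrete, here is a minimal sketch of a preallocated,
recycled pool in C.  The pool size and object layout are invented for
illustration; the point is only that the big objects are carved out once
and never returned to the general heap:

    #include <stddef.h>

    /* Hypothetical fixed pool of "big" objects, allocated once at
       startup so they never contribute to general-heap fragmentation. */
    #define POOL_SIZE 16

    struct big_object {
        char payload[4096];
        struct big_object *next_free;   /* free-list link */
    };

    static struct big_object pool[POOL_SIZE];
    static struct big_object *free_list;

    void pool_init(void) {
        for (size_t i = 0; i < POOL_SIZE; i++) {
            pool[i].next_free = free_list;
            free_list = &pool[i];
        }
    }

    struct big_object *pool_alloc(void) {
        struct big_object *obj = free_list;
        if (obj != NULL)
            free_list = obj->next_free;
        return obj;                     /* NULL when the pool is exhausted */
    }

    void pool_free(struct big_object *obj) {
        obj->next_free = free_list;     /* recycle, never free() */
        free_list = obj;
    }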
It seems to me that with anything like a 20% margin, you're clearly relying
on statistics.  Failures will probably be very rare, but not impossible.
For most reasonable allocation algorithms that don't move objects, the
worst-case fragmentation is roughly a factor of log(<max object size>/<min
object size>).  Asymptotically, this is provably the best you can do,
unless you effectively predivide the memory among different size classes.
Even if you get the really big objects out of the picture, that means it's
at least a factor of 5 or so (e.g. with allocations ranging from 16 bytes
to 512 bytes, log2(512/16) = 5).  Typical fragmentation seems to be on the
order of 5-50%, depending on the allocator.  Thus there's a big margin
between the typical and the worst case.
> Pointer misidentification, on the other hand, is completely unpredictable:
> you can test your device for hours in your lab, and it will work all the
> time. But it will suddenly fail when a customer decides to, say, use
> 123.234.245.102 as the device IP address, and that IP address (stored as
> an int somewhere) matches the memory address of a big object tree that
> must be GC'd.
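
That failure mode is easy to sketch against the collector's C interface.
Here is a minimal, hypothetical example (the object size and the
"disguised" integer are invented; the retention only actually occurs when
the integer's value lands inside the collector's heap):

    #include <gc.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        GC_INIT();
        void *big = GC_MALLOC(1 << 20);   /* a 1 MB collectable object */

        /* Stash the address in a plain integer -- the analogue of an IP
           address that happens to equal a heap address -- then drop the
           only real pointer. */
        uintptr_t disguised = (uintptr_t)big;
        big = NULL;

        GC_gcollect();
        /* A precise collector would know 'disguised' is an integer and
           could reclaim the megabyte; a conservative one scanning the
           stack must assume it may be a pointer and retains the object. */
        printf("heap size after collection: %lu\n",
               (unsigned long)GC_get_heap_size());

        return (int)(disguised & 1);      /* keep 'disguised' live */
    }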
Assuming you avoid certain brittle data structures (see
http://www.hpl.hp.com/personal/Hans_Boehm/gc/bounds.html ), a false
reference costs you, in the worst case, one copy of your largest data
structure, i.e. up to a factor of 2.  Since you can get multiple false
references, this is potentially worse than fragmentation, and it certainly
adds to the uncertainty.  But it's not clear to me that it's fundamentally
different.  The paper by Hirzel and Diwan in the ISMM 2000 proceedings
also measures misidentification overheads comparable to fragmentation
overheads.
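
As a hypothetical illustration of that kind of brittleness (this sketch is
mine, not taken from the page): in a linked-list queue, every dequeued node
still chains forward toward the live list, so a single false reference to
one long-dead node can retain every node dequeued after it.  Clearing the
link on removal bounds the damage to a single node:

    struct node  { struct node *next; int payload; };
    struct queue { struct node *head, *tail; };

    static void enqueue(struct queue *q, struct node *n) {
        n->next = NULL;
        if (q->tail != NULL)
            q->tail->next = n;
        else
            q->head = n;
        q->tail = n;
    }

    static int dequeue(struct queue *q) {   /* caller ensures non-empty */
        struct node *n = q->head;
        int v = n->payload;
        q->head = n->next;
        if (q->head == NULL)
            q->tail = NULL;
        n->next = NULL;   /* Without this, a misidentified "pointer" to one
                             old node retains the whole chain of everything
                             dequeued after it; with it, at most one node. */
        return v;
    }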

I'm not arguing that all embedded systems should use conservative garbage
collectors.  But it has always seemed to me that once you move from
completely static allocation to dynamic allocation of variable-sized
objects, you usually have some unpredictability, and everything else is a
matter of degree.

Hans
