This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

RE: dynamic library cost (was RE: libtool, java woes)


I don't believe that there is a significant direct time benefit to accurate
stack scanning.  Conservative scanning could be made faster (if it mattered,
it usually doesn't, and it probably already is faster), and will certainly
result in smaller executables and less memory traffic during scanning.

As far as indirect effects are concerned, the story is a bit different.  The
best empirical comparison between type accurate and conservative collection
that I know of is

Hirzel, Diwan, "On the Type Accuracy of Garbage Collection", ISMM 2000

(See http://www.cs.colorado.edu/~diwan/ISMM-Hirzel.ps .
There is a new, related paper at
http://www.cs.colorado.edu/~diwan/ecoop01-gc-liveness.pdf )

They did a fair amount of work to measure differences in data seen as live
by the GC with various degrees of conservativism for programs they expected
to be a challenge for conservative collectors.  They didn't find any growing
leaks, but they did find significant differences (up to about 40%) in the
amount of retained memory, and the stack was often a significant part of the
cause.  As expected, this is highly architecture dependent, and the
difference is insignificant with curent applications on 64 bit plaforms.

I think the conclusion depends somewhat in what you are trying to achieve:

1) Meaningful hard space bounds.  This is hard no matter what.  You need:
 -  A definition of what constitutes "live data".  Most languages don't
define this, since it can be difficult to force optimizations to preserve
it.  Appel has a definition for ML, and I believe there is one for Haskell.
Java carefully doesn't define it.  I don't think Ada defines it either.
 -  Probably a compacting collector, since you otherwise lose a factor of
roughly log(<largest object size>/<smallest object size>) in worst case
space use.
 -  Type accurate scans of the heap and static data.
 -  Probably an accurate stack scan.  (For most applications, you otherwise
lose a factor of <number of misidentified pointers>, and in a few cases the
loss is potentially unbounded.  See
http://www.hpl.hp.com/personal/Hans_Boehm/gc/bounds.html .)  Many, though
not all, approaches here require execution overhead for programs that don't
allocate, e.g. to potentially stop execution at "safe points", or to limit
derived pointers at those "safe points".

2) Minimal surprise in space usage at minimal cost.  I'm not sure what the
right solution is.  Perhaps profile-based type info for frequently executed
call sites?  In the absence of C/C++ code, perhaps full type information, so
that you can fully compact on occasion?

3) Minimal expected running time.  The answer seems to be completely
dependent on the application and architecture.
 
Hans

> -----Original Message-----
> From: dewar@gnat.com [mailto:dewar@gnat.com]
> Sent: Friday, April 13, 2001 8:10 AM
> To: dewar@gnat.com; olson@mmsi.com
> Cc: gcc@gcc.gnu.org; hans_boehm@hp.com; java@gcc.gnu.org;
> jsturm@one-point.com; tromey@redhat.com
> Subject: Re: dynamic library cost (was RE: libtool, java woes)
> 
> 
> <<Seems like the compiler would have to emit a ton of 
> debugging information
> to specify the contents of each word of the stack frame at each PC
> location
> within the program.  Wading through all this mess in a GC 
> would likely be
> more costly than just conservatively scanning the stacks.
> >>
> 
> Well there is also the concern that in a language like C, 
> type accurate
> scanning is not possible (because of free unions), so you still end up
> with conservative guesses in any case, and there is some 
> question as to
> whether there is enough gain. Yes, it is definitely a gain to go to
> a 100% type accurate solution (as would be possible with a 
> language like
> Java or Ada), since then you can count on no memory leaks, but if you
> only go part way, it is not clear that it is worth it.
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]