Ephemeral garbage [Was: GCC 3.2.1 -> GCC 3.3 compile speed regression]

Timothy J. Wood tjw@omnigroup.com
Sat Feb 1 08:09:00 GMT 2003


On Thursday, January 30, 2003, at 06:45  PM, Ziemowit Laski wrote:
[...]
> What we see from the GC numbers is that the 3.3 collections are able 
> to reclaim a greater percentage of allocated memory at each turn.  
> This could be a combination of two factors:
>   - Data in the 3.3 compiler have better temporal locality (!!)
>   - The 3.2.1 collector was buggy/incomplete and could not reliably 
> reclaim some stuff.

   Another possibility that occurs to me is that 3.3 might be allocating 
more ephemeral blocks.  I guess you could say that this is better 
temporal locality, but maybe it's just a few sloppy temporary object 
allocations.

   If you sum the memory reclaimed during collection (i.e., the total 
size of temporary objects that didn't actually end up being needed for 
the whole life of the compiler), you get 17820k for your 3.2.1 run and 
38408k for the 3.3 run.

    (BTW, this is how I got the numbers from your data :)

	pbpaste | sed 's/[^-0-9]//g' | bc | awk '{sum=sum+$1} END{print sum}'

   So, to me it looks like 3.3 creates 2.15x as much garbage that needs 
collecting.
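
   For anyone without pbpaste handy, here is a rough C++ sketch of the 
same arithmetic.  It assumes the records look like "GC 187k -> 182k" 
(the format -Q prints); the function name sum_reclaimed_kb is made up 
for this illustration and is not part of GCC.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Sum (before - after) over every "GC <before>k -> <after>k" record in text.
// Returns the total reclaimed memory in kilobytes.
long sum_reclaimed_kb(const std::string& text) {
    static const std::regex record(R"(GC (\d+)k -> (\d+)k)");
    long total = 0;
    for (std::sregex_iterator it(text.begin(), text.end(), record), end;
         it != end; ++it) {
        assert(it->size() == 3);  // whole match plus the two captured numbers
        total += std::stol((*it)[1].str()) - std::stol((*it)[2].str());
    }
    return total;
}
```

   Unlike the sed version, this only looks at digits inside a GC record, 
so digits elsewhere on the line (in a function name, say) do not 
disturb the sum.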

   IMHO, the garbage collector should be used for objects with lifetimes 
that are difficult to determine.  Local temporary stuff with easily 
computed lifetimes should be on an obstack or something similar and not 
get allocated from the GC.  I imagine this is harder than just saying 
that, but it may help to reduce the load on the garbage collector by 
creating less garbage.
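
   To illustrate the kind of thing I mean, here is a minimal bump 
allocator with mark/release semantics.  This is only a sketch: 
ScratchArena is a hypothetical stand-in, not the GNU obstack API and 
not anything in GCC, and it assumes the temporaries need no 
destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical scratch area for temporaries whose lifetime is obvious.
// Memory is handed out by bumping an offset and released in bulk.
class ScratchArena {
public:
    explicit ScratchArena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Returns size bytes aligned to align, or nullptr if the arena is full.
    // align must be a power of two no larger than alignof(std::max_align_t).
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        assert(align <= alignof(std::max_align_t));
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return buffer_.get() + start;
    }

    // Remember the current high-water mark.
    std::size_t mark() const { return used_; }

    // Drop everything allocated since mark() returned m. No destructors run.
    void release_to(std::size_t m) {
        assert(m <= used_);
        used_ = m;
    }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};
```

   The caller takes a mark() before some local piece of work, allocates 
temporaries freely, and calls release_to() when done, so none of that 
memory ever shows up as garbage for the collector to find.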

   Just to add some new data to the discussion, I took Zem's test file 
and ran it through the head of the 3.3 branch after making a change to 
ggc-page.c to collect after every ggc_pop_context call.  As you might 
expect, this makes the compiler rather slow :)   But, it lets you see 
something about how much trash is generated for various bits of work 
(with -Q on, say).

   This file contains an example of the output when run with the 
structureparser.gcc-3_3-branch.ii file:

	http://www.omnigroup.com/~bungi/gc.txt.gz

   As one example, right at the top of Zem's file there is:

inline int qRound( double d )
{
     return d >= 0.0 ? int(d + 0.5) : int( d - ((int)d-1) + 0.5 ) + ((int)d-1);
}

   I made several renamed duplicates of this right after the original 
(just to help stabilize the numbers) and got:

int qRound(double) {GC 187k -> 182k}
int xRound(double) {GC 188k -> 184k}
int x1Round(double) {GC 190k -> 185k}
int x2Round(double) {GC 191k -> 186k}
int x3Round(double) {GC 192k -> 188k}
int x4Round(double) {GC 194k -> 189k}
int x5Round(double) {GC 195k -> 190k}
int x6Round(double) {GC 196k -> 191k}
int x7Round(double) {GC 197k -> 193k}

   Each one of these inlines ended up creating about 5k of garbage: not 
useful data, but 5k (more than a full page on Mac OS X!) of crud that 
will never be used again.  Multiply this by several thousand times 
after including the STL headers and you have an ugly picture :)

   It looks like the actual *saved* data for each of these inlines was 
between 1k and 2k (closer to 1k).  Not a good ratio, I think.
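
   To put a number on that ratio, here is a small C++ sketch that works 
from the before/after pairs listed above.  The names GcSample, 
GcSummary and summarize are invented for this illustration, and it 
assumes "saved" data can be estimated from how much the post-collection 
heap grows from one function to the next.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One "{GC <before>k -> <after>k}" record.
struct GcSample { long before_kb; long after_kb; };

// Averages per compiled function, in kilobytes.
struct GcSummary { double avg_garbage_kb; double avg_retained_kb; };

GcSummary summarize(const std::vector<GcSample>& samples) {
    assert(samples.size() >= 2);  // need at least two records to see growth

    // Garbage: what each collection reclaimed.
    long garbage = 0;
    for (const GcSample& s : samples) garbage += s.before_kb - s.after_kb;

    // Retained: growth of the surviving heap between first and last record.
    long retained = samples.back().after_kb - samples.front().after_kb;

    GcSummary out;
    out.avg_garbage_kb = static_cast<double>(garbage) / samples.size();
    out.avg_retained_kb = static_cast<double>(retained) / (samples.size() - 1);
    return out;
}
```

   Fed the nine pairs above, this gives roughly 4.7k of garbage against 
roughly 1.4k retained per function, or about 3.4 to 1.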

   I need to do some work on ggc-page.c before I can run the real test I 
wanted to run (basically I want to detect exactly which blocks became 
free during a collection so that I can hook this into OmniObjectMeter 
and try to see which allocation sites have the most short-lived blocks).
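
   The bookkeeping I have in mind might look something like the sketch 
below.  It is purely hypothetical: SiteTracker and its methods are 
invented names, nothing here touches ggc-page.c or OmniObjectMeter, and 
it assumes the collector can report which blocks survived.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical tracker: remembers which site allocated each block and,
// after a collection, tallies the bytes that died per site.
class SiteTracker {
public:
    void on_alloc(const void* block, std::size_t size, const std::string& site) {
        assert(block != nullptr);
        live_[block] = Block{size, site};
    }

    // survivors is the set of blocks still live after a collection.
    // Every other tracked block is counted as freed and then forgotten.
    void on_collect(const std::set<const void*>& survivors) {
        for (auto it = live_.begin(); it != live_.end();) {
            if (survivors.count(it->first) == 0) {
                freed_bytes_[it->second.site] += it->second.size;
                it = live_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Total bytes from this site that have been found dead so far.
    std::size_t freed_bytes(const std::string& site) const {
        auto it = freed_bytes_.find(site);
        return it == freed_bytes_.end() ? 0 : it->second;
    }

private:
    struct Block { std::size_t size; std::string site; };
    std::map<const void*, Block> live_;
    std::map<std::string, std::size_t> freed_bytes_;
};
```

   Sorting the sites by freed_bytes() after a run would then point at 
the places producing the most ephemeral garbage.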

-tim


