This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Ephemeral garbage [Was: GCC 3.2.1 -> GCC 3.3 compile speed regression]


On Thursday, January 30, 2003, at 06:45  PM, Ziemowit Laski wrote:
[...]
What we see from the GC numbers is that the 3.3 collections are able to reclaim a greater percentage of allocated memory at each turn. This could be a combination of two factors:
- Data in the 3.3 compiler have better temporal locality (!!)
- The 3.2.1 collector was buggy/incomplete and could not reliably reclaim some stuff.
Another possibility that occurs to me is that 3.3 might be allocating more ephemeral blocks. I guess you could say that this is better temporal locality, but maybe it's just a few sloppy temporary object allocations.

If you sum the memory reclaimed during collection (i.e., the total size of temporary objects that didn't actually end up being needed for the whole life of the compiler), you get 17820k for your 3.2.1 run and 38408k for the 3.3 run.

(BTW, I was using: pbpaste | sed 's/[^-0-9]//g' | bc | awk '{sum=sum+$1} END{print sum}' to get the numbers from your data :)

So, to me it looks like 3.3 creates about 2.15x as much garbage that needs collecting (38408k / 17820k).
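The pipeline above strips everything except digits and minus signs, so a line such as `{GC 187k -> 182k}` becomes `187-182` for bc to evaluate, and awk adds up the results. The same bookkeeping can be sketched in C++. This is only an illustration: it assumes the report lines carry the `{GC <before>k -> <after>k}` form, and the names `sum_reclaimed_k` and `garbage_ratio` are hypothetical helpers, not anything in GCC.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// Sum (before - after) over every line that contains "{GC <before>k -> <after>k}".
// Lines without a GC report are skipped. Hypothetical helper, not part of GCC.
long sum_reclaimed_k(const std::string& report)
{
    std::istringstream in(report);
    std::string line;
    long total = 0;
    while (std::getline(in, line)) {
        std::string::size_type pos = line.find("{GC ");
        if (pos == std::string::npos)
            continue;
        long before = 0, after = 0;
        // sscanf returns 2 once both numbers have been read.
        if (std::sscanf(line.c_str() + pos, "{GC %ldk -> %ldk}", &before, &after) == 2)
            total += before - after;
    }
    return total;
}

// Ratio of reclaimed memory between two runs (new / old).
double garbage_ratio(long reclaimed_old_k, long reclaimed_new_k)
{
    assert(reclaimed_old_k > 0);
    return static_cast<double>(reclaimed_new_k) / static_cast<double>(reclaimed_old_k);
}
```

Feeding the two totals quoted above into `garbage_ratio(17820, 38408)` gives a value a little above 2.15.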

IMHO, the garbage collector should be used for objects with lifetimes that are difficult to determine. Local temporary stuff with easily computed lifetimes should be on an obstack or something similar and not get allocated from the GC. I imagine this is harder than just saying that, but it may help to reduce the load on the garbage collector by creating less garbage.
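To make the "obstack or something similar" idea concrete, here is a minimal sketch of a region (arena) allocator in C++. It is deliberately NOT the GNU obstack API and not code from GCC; the class name `Arena` and its members are made up for illustration. The point is only that temporaries with a known lifetime can be bump-allocated and then thrown away in one step, without the collector ever having to look at them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Toy region allocator: objects are carved out of large chunks and are all
// released together. Illustrative only; not the GNU obstack interface.
class Arena {
public:
    explicit Arena(std::size_t chunk_size = 4096) : chunk_size_(chunk_size)
    {
        assert(chunk_size_ > 0);
    }
    ~Arena() { release(); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Bump-allocate `size` bytes aligned to `align` (a power of two no larger
    // than the alignment malloc guarantees).
    void* alloc(std::size_t size, std::size_t align = alignof(std::max_align_t))
    {
        assert(align != 0 && (align & (align - 1)) == 0);
        assert(align <= alignof(std::max_align_t));
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (chunks_.empty() || start + size > capacity_) {
            // Current chunk is full: start a new one large enough for the request.
            capacity_ = size > chunk_size_ ? size : chunk_size_;
            void* chunk = std::malloc(capacity_);
            if (chunk == nullptr)
                std::abort();
            chunks_.push_back(chunk);
            start = 0;
        }
        used_ = start + size;
        return static_cast<char*>(chunks_.back()) + start;
    }

    // Free every object allocated so far in one step.
    void release()
    {
        for (void* chunk : chunks_)
            std::free(chunk);
        chunks_.clear();
        used_ = 0;
        capacity_ = 0;
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t chunk_size_;
    std::size_t used_ = 0;      // bytes consumed in the newest chunk
    std::size_t capacity_ = 0;  // size of the newest chunk
    std::vector<void*> chunks_;
};
```

A pass that builds scratch data would allocate from an `Arena`, and call `release()` (or let the destructor run) when the pass finishes. Nothing allocated this way shows up as garbage at the next collection.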

Just to add some new data to the discussion, I took Zem's test file and ran it through the head of the 3.3 branch after making a change to ggc-page.c to collect after every ggc_pop_context call. As you might expect, this makes the compiler rather slow :) But, it lets you see something about how much trash is generated for various bits of work (with -Q on, say).

This file contains an example of the output when run with the structureparser.gcc-3_3-branch.ii file:

http://www.omnigroup.com/~bungi/gc.txt.gz

As one example, right at the top of Zem's file there is:

inline int qRound( double d )
{
    return d >= 0.0 ? int(d + 0.5) : int( d - ((int)d-1) + 0.5 ) + ((int)d-1);
}

I made several renamed duplicates of this right after the original (just to help stabilize the numbers) and got:

int qRound(double) {GC 187k -> 182k}
int xRound(double) {GC 188k -> 184k}
int x1Round(double) {GC 190k -> 185k}
int x2Round(double) {GC 191k -> 186k}
int x3Round(double) {GC 192k -> 188k}
int x4Round(double) {GC 194k -> 189k}
int x5Round(double) {GC 195k -> 190k}
int x6Round(double) {GC 196k -> 191k}
int x7Round(double) {GC 197k -> 193k}

Each one of these inlines ended up creating about 5k of garbage -- not actually useful data, but 5k (more than a full page on Mac OS X!) of crud that will never be used again. Multiply this by several thousand after including the STL headers and you have an ugly picture :)

It looks like the actual *saved* data for each of these inlines was between 1k and 2k (closer to 1k). Not a good ratio, I think.
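The arithmetic behind those two estimates can be checked with a short C++ sketch using the nine GC lines listed above. The struct and function names are hypothetical. Garbage per function is taken as before minus after for each collection, and retained data per function as the growth of the after-collection size from one function to the next. The inputs are whole-k figures, so the results are approximate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One "{GC <before>k -> <after>k}" report.
struct GcSample { long before_k; long after_k; };

struct GcSummary {
    double avg_garbage_k;   // average memory reclaimed per collection
    double avg_retained_k;  // average growth of the surviving heap per function
};

// The nine qRound-style samples quoted in the message.
std::vector<GcSample> round_samples()
{
    return {{187, 182}, {188, 184}, {190, 185}, {191, 186}, {192, 188},
            {194, 189}, {195, 190}, {196, 191}, {197, 193}};
}

GcSummary summarize(const std::vector<GcSample>& samples)
{
    assert(samples.size() >= 2);
    long garbage = 0;
    for (const GcSample& s : samples)
        garbage += s.before_k - s.after_k;
    // Surviving heap growth between the first and last collection.
    long retained = samples.back().after_k - samples.front().after_k;
    return { static_cast<double>(garbage) / samples.size(),
             static_cast<double>(retained) / (samples.size() - 1) };
}
```

With these samples, `summarize(round_samples())` reports an average of roughly 4.7k reclaimed per function against about 1.4k retained, so the garbage outweighs the kept data by a bit more than three to one.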

I need to do some work on ggc-page.c before I can run the real test I wanted to run (basically I want to detect exactly which blocks became free during a collection so that I can hook this into OmniObjectMeter and try to see which allocation sites have the most short-lived blocks).
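The kind of accounting described here could look roughly like the following C++ sketch. Everything in it is an assumption made for illustration: `SiteTracker`, its methods, and the string site labels are hypothetical, and it models neither ggc-page.c nor OmniObjectMeter. It only shows the idea of remembering where each block was allocated and, after a collection, charging every block that did not survive to its allocation site.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical bookkeeping: which allocation sites produce blocks that die
// at the next collection.
class SiteTracker {
public:
    // Record that `block` was allocated at `site`.
    void on_alloc(const void* block, const std::string& site)
    {
        assert(block != nullptr);
        live_[block] = site;
    }

    // Call after a collection with the set of blocks that survived it.
    // Returns the number of tracked blocks that were freed.
    std::size_t on_collect(const std::set<const void*>& survivors)
    {
        std::size_t freed = 0;
        for (auto it = live_.begin(); it != live_.end(); ) {
            if (survivors.count(it->first) == 0) {
                ++freed_by_site_[it->second];  // charge the death to its site
                it = live_.erase(it);
                ++freed;
            } else {
                ++it;
            }
        }
        return freed;
    }

    // Number of blocks from `site` that have been freed so far.
    std::size_t freed_at(const std::string& site) const
    {
        auto it = freed_by_site_.find(site);
        return it == freed_by_site_.end() ? 0 : it->second;
    }

private:
    std::map<const void*, std::string> live_;
    std::map<std::string, std::size_t> freed_by_site_;
};
```

Sorting the per-site counts after a run would point at the call sites that generate the most short-lived blocks, which are the natural candidates for moving off the collector.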

-tim

