This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Ephemeral garbage [Was: GCC 3.2.1 -> GCC 3.3 compile speed regression]
- From: "Timothy J. Wood" <tjw at omnigroup dot com>
- To: Ziemowit Laski <zlaski at apple dot com>
- Cc: gcc at gcc dot gnu dot org, Geoffrey Keating <geoffk at apple dot com>
- Date: Sat, 1 Feb 2003 00:09:18 -0800
- Subject: Ephemeral garbage [Was: GCC 3.2.1 -> GCC 3.3 compile speed regression]
On Thursday, January 30, 2003, at 06:45 PM, Ziemowit Laski wrote:
[...]
What we see from the GC numbers is that the 3.3 collections are able
to reclaim a greater percentage of allocated memory at each turn.
This could be a combination of two factors:
- Data in the 3.3 compiler have better temporal locality (!!)
- The 3.2.1 collector was buggy/incomplete and could not reliably
reclaim some stuff.
Another possibility that occurs to me is that 3.3 might be allocating
more ephemeral blocks. I guess you could say that this is better
temporal locality, but maybe it's just a few sloppy temporary object
allocations.
If you sum the memory reclaimed during collection (i.e., the total
size of temporary objects that didn't actually end up being needed for
the whole life of the compiler), you get 17820k for your 3.2.1 run and
38408k for the 3.3 run.
(BTW, I was using: pbpaste | sed 's/[^-0-9]//g' | bc | awk
'{sum=sum+$1} END{print sum}' to get the numbers from your data :)
So, to me it looks like 3.3 creates 2.15x as much garbage needing
collection.
IMHO, the garbage collector should be reserved for objects whose
lifetimes are difficult to determine. Local temporary stuff with easily
computed lifetimes should live on an obstack or something similar and not
get allocated from the GC. I imagine this is harder than just saying
that, but creating less garbage in the first place would reduce the load
on the garbage collector.
Just to add some new data to the discussion, I took Zem's test file
and ran it through the head of the 3.3 branch after making a change to
ggc-page.c to collect after every ggc_pop_context call. As you might
expect, this makes the compiler rather slow :) But, it lets you see
something about how much trash is generated for various bits of work
(with -Q on, say).
This file contains an example of the output when run with the
structureparser.gcc-3_3-branch.ii file:
http://www.omnigroup.com/~bungi/gc.txt.gz
As one example, right at the top of Zem's file there is:
inline int qRound( double d )
{
    return d >= 0.0 ? int(d + 0.5) : int( d - ((int)d-1) + 0.5 ) + ((int)d-1);
}
I made several renamed duplicates of this right after the original
(just to help stabilize the numbers) and got:
int qRound(double) {GC 187k -> 182k}
int xRound(double) {GC 188k -> 184k}
int x1Round(double) {GC 190k -> 185k}
int x2Round(double) {GC 191k -> 186k}
int x3Round(double) {GC 192k -> 188k}
int x4Round(double) {GC 194k -> 189k}
int x5Round(double) {GC 195k -> 190k}
int x6Round(double) {GC 196k -> 191k}
int x7Round(double) {GC 197k -> 193k}
Each one of these inlines ended up creating about 5k of garbage --
not actually useful data -- but 5k (more than a full page on Mac OS X!)
of crud that will never be used again. Multiply this by several
thousand after including the STL headers and you have an ugly
picture :)
It looks like the actual *saved* data for each of these inlines was
between 1k and 2k (closer to 1k). Not a good ratio, I think.
I need to do some work on ggc-page.c before I can run the real test I
wanted to run (basically I want to detect exactly which blocks became
free during a collection so that I can hook this into OmniObjectMeter
and try to see which allocation sites have the most short lived blocks).
-tim