This is the mail archive of the
mailing list for the GCC project.
Re: Faster compilation speed
- From: Linus Torvalds <torvalds at transmeta dot com>
- To: Robert Dewar <dewar at gnat dot com>
- Cc: dberlin at dberlin dot org, <gcc at gcc dot gnu dot org>, <kevin at atkinson dot dhs dot org>
- Date: Sat, 10 Aug 2002 09:45:58 -0700 (PDT)
- Subject: Re: Faster compilation speed
On Sat, 10 Aug 2002, Robert Dewar wrote:
> If garbage collection is taking a significant amount of time (is this really
> the case),
Note that my whole argument was that the problem with GC is not the
_collection_, but the side effects of GC.
You can always make the collection take basically zero time by just not
doing it very often. Problem solved. In fact, this is the very problem
that most of the papers and implementations seem to have focused on, yet I
don't think that particular problem is very interesting at all.
So the real problem with GC is two-fold: the lack of spatial and in
particular temporal locality, because lazy de-allocation (which you need
to keep the collection overhead acceptably low) "smears" out the locality
over time (and over space). The bigger the GC cycle, the bigger the
Many GC schemes try to help the spatial locality by doing compaction, but
that incurs extra cache overhead, and it still totally misses the temporal
locality you get from re-using allocations quickly.
Another way of saying the same thing in a gcc-centric manner: this is
equivalent to stack slot re-use. Clearly stack slot re-use is a good
thing, but it requires exact liveness analysis. Similarly, allocation
re-use is a good thing, but it requires exact (non-lazy) collection.
And the thing is, the "smearing" of locality you get by lazy collection
kills your caches and your TLB footprint, but it does NOT show up as "gc
overhead". It shows up as overhead everywhere else, which is probably why
so many GC proponents just ignore it.
The other argument I have against GC is the mindset it fosters. See my
point about how you can _trivially_ do copy-on-write with a manual (or
automatic but _exposed_) refcounting scheme. Doing the same with lazy
collection is an excercise in futility - so you don't do it.
Just to make a point: look at copy_rtx_if_shared(), which tries to do
this (yeah, I have an older tree, maybe this is fixed these days. I
seriously doubt it).
The code is CRAP. Total and utter sh*t. The damn thing should just test a
reference count and be done with it. Instead, it has this heuristic that
knows about some rtx's that might be shared, and knows which never can be.
And that _cap_ comes directly from the fact that the code uses a lazy GC
scheme instead of a more intelligent memory manager.
THAT is my beef with GC. I don't care at _all_ about the collection cost.
The real problems are elsewhere, and GC proponents never even acknowledge
(And please - the person who wrote copy_rtx_if_shared() - don't take my
complaint personally. I'm not trying to rag on the poor soul who had to
implement that silly switch statement etc. It's not your fault. It's the
fault of a much more fundamental mistake, and copy_rtx_if_shared is an
innocent bystander, a victim of circumstance and a stupid decision to use