This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: gcc 3.3 garbage collector defaults
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Andi Kleen <ak at suse dot de>
- Cc: Richard dot Earnshaw at arm dot com, Matt Austern <austern at apple dot com>, Zack Weinberg <zack at codesourcery dot com>, Ziemowit Laski <zlaski at apple dot com>, Neil Booth <neil at daikokuya dot co dot uk>, Benjamin Kosnik <bkoz at redhat dot com>, gcc at gcc dot gnu dot org, libstdc++ at gcc dot gnu dot org
- Date: Wed, 29 Jan 2003 16:24:35 +0000
- Subject: Re: gcc 3.3 garbage collector defaults
- Organization: ARM Ltd.
- Reply-to: Richard dot Earnshaw at arm dot com
>
> I'm not sure I completely understand your reasoning.
>
> On Wed, Jan 29, 2003 at 11:44:04AM +0000, Richard Earnshaw wrote:
> > Note that this has much higher transient use of memory, particularly if we
> > allocate blocks of memory for trivial purposes, only to discard the result
> > very quickly; this is often the case with RTL, where we allocate some RTL
> > in order to try something, only to discover that it is not valid and throw
> > it away. Further, since B is now dead memory all the effort to bring it
> into the cache has been wasted and we must now bring new memory into the
> cache.
>
> you mean bringing it into the cache for gc purposes?
No, I mean that when we initially allocate B we bring it into the cache.
If we immediately discard it and go on to allocate C, then instead of
unwinding (and thus reusing the already-cached memory of B for C) we use
a new chunk of memory that isn't in the cache. The memory that is already
in the cache (B) is simply ignored until the processor finally evicts it.
> Except if you have lots of small objects smaller than a cache line
> and your allocator doesn't fully initialize them, the working set in
> the cache should be the same.
Not if they die very quickly. If we are recycling memory that dies
quickly, then it is highly likely to be in the cache, and therefore
allocating the new object over the old will reuse those already-cached
lines.
> If it does memset and your memset is clever enough to use
> a write combining write on i386 or dcbz on ppc or similar it will
> not even have any fetch traffic on initializing
> (the gc does memset a new object when allocating, right?)
Well, yes, it is initialized, but not normally to all zeros: for an RTL
object the code of the RTL will always be written into the object, thus
bringing it into the cache.
> > On a sufficiently large compilation (sufficient for the GC to kick in
> > several times) the *total* memory used by the compiler will probably be
> > about the same for both approaches, but the transient memory usage
> > patterns will remain very different, since the GC approach is terrible for
> > its short-term reuse of dead memory.
>
> [assuming your objects fit without too much waste into cache lines]
>
> Caches do not really care about how much memory you use overall,
> just how much memory you touch in a given time.
No, caches care about how much memory you touch over a *very short* period
of time. If we make the processor fetch some memory into the cache, then
we want to ensure we take maximum benefit from that effort. We don't want
to just drop it on the floor and get some new memory.
> When you allocate a lot more memory overall then the caches will be
> stressed, because the operating system clearing newly allocated
> pages needs to put them into cache too. This overhead should be fairly
> limited, if you don't allocate too excessively.
Again, it's not about long-term effects, it's about very short-term ones.
R.