PATCH: new GC implementation

Andi Kleen ak@muc.de
Wed Sep 22 12:50:00 GMT 1999


On Wed, Sep 22, 1999 at 08:05:36PM +0200, Mark Mitchell wrote:
> >>>>> "Andi" == Andi Kleen <ak@muc.de> writes:
> 
>     Andi> The kernel does not do anything like that. mmaping /dev/zero
>     Andi> and MAP_ANONYMOUS even call the same function and end up
>     Andi> mmaping the zero page readonly into your memory. Note that
>     Andi> Linux does no cache colouring for pages, this means memory
>     Andi> allocation benchmarks are generally unpredictable because of
>     Andi> the cache effects. I guess you were fooled by the profilers
>     Andi> because of that.
> 
> It's possible.  But, we did multiple runs on an otherwise unloaded
> system, precisely to try to avoid cache issues.  Time spent in mmap
> was noticeably higher (about half a second on a 15-second total
> execution time).  The difference in the time spent in mmap was exactly
> the speedup we saw for the total execution time.  I don't know how to
> explain that phenomenon, other than to think that the kernel does
> *something* different in these two cases.  (Clearly, it must do a few
> different things; it's got to resolve the file-descriptor its given to
> figure out it's the /dev/zero fd.  I'm not suggesting that's using
> lots of time; just that there must be some different control flow at
> some point.)

Yes, there is actually. Sorry, I was wrong. MAP_ANONYMOUS does not manipulate
the page tables directly; it only creates a vma (a high-level memory
object) for the mapped range. The actual pte is only updated on access.
/dev/zero sets up all the ptes directly, which can be quite a slow operation
on x86, especially on SMP (the CPU always flushes page table changes to
the memory bus).
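
For reference, the two ways of asking for zero-filled memory look roughly
like this from user space. This is only a minimal sketch: the helper names
(map_anon, map_devzero, touch_pages) are made up for illustration, and it
shows the two call paths rather than measuring anything. Whether the ptes
are filled in at mmap time or on first access depends on the kernel version.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: anonymous mapping, no file descriptor involved. */
static void *map_anon(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: the same request routed through /dev/zero. */
static void *map_devzero(size_t len)
{
    assert(len > 0);
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the fd is closed */
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Touch one byte per page so every page is actually accessed.
 * Returns how many pages had a nonzero first byte before being written.
 */
static size_t touch_pages(void *base, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = base;
    size_t nonzero = 0;

    for (size_t off = 0; off < len; off += page) {
        if (p[off] != 0)
            nonzero++;
        p[off] = 1;
    }
    return nonzero;
}
```

Timing the mmap call separately from the touch_pages loop for each variant
should show where the cost lands in each case.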

This is actually a bug; it should really use the same method as MAP_ANONYMOUS.
Thank you for finding it.

-Andi
-- 
This is like TV. I don't like TV.