PATCH: new GC implementation
Andi Kleen
ak@muc.de
Wed Sep 22 12:50:00 GMT 1999
On Wed, Sep 22, 1999 at 08:05:36PM +0200, Mark Mitchell wrote:
> >>>>> "Andi" == Andi Kleen <ak@muc.de> writes:
>
> Andi> The kernel does not do anything like that. mmaping /dev/zero
> Andi> and MAP_ANONYMOUS even call the same function and end up
> Andi> mmaping the zero page readonly into your memory. Note that
> Andi> Linux does no cache colouring for pages, this means memory
> Andi> allocation benchmarks are generally unpredictable because of
> Andi> the cache effects. I guess you were fooled by the profilers
> Andi> because of that.
>
> It's possible. But, we did multiple runs on an otherwise unloaded
> system, precisely to try to avoid cache issues. Time spent in mmap
> was noticeably higher (about half a second on a 15-second total
> execution time). The difference in the time spent in mmap was exactly
> the speedup we saw for the total execution time. I don't know how to
> explain that phenomenon, other than to think that the kernel does
> *something* different in these two cases. (Clearly, it must do a few
> different things; it's got to resolve the file descriptor it's given to
> figure out it's the /dev/zero fd. I'm not suggesting that's using
> lots of time; just that there must be some different control flow at
> some point.)
Yes, there is, actually. Sorry, I was wrong. MAP_ANONYMOUS does not manipulate
the page tables directly; it only creates a vma (a high-level memory
object) for the mapped range. The actual pte is only updated on access.
/dev/zero directly sets up all the ptes, which can be quite a slow operation
on x86, especially on SMP (the CPU always flushes page table changes to
the memory bus).
This is actually a bug; it should really use the same method as MAP_ANONYMOUS.
Thank you for finding it.
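
For anyone who wants to compare the two paths from user space, here is a
rough sketch of the two mmap calls being discussed. The helper names
(map_anonymous, map_dev_zero, touch_pages) are made up for illustration and
are not from the GC patch. It only shows how each mapping is requested and
then touched page by page; whether the kernel fills in the ptes at mmap time
or on first access depends on the kernel version, so treat it as a starting
point for measurement, not as a statement of what any given kernel does.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map len bytes of private zero-filled memory
 * with no backing file descriptor. Returns NULL on failure. */
void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: map len bytes of private zero-filled memory
 * backed by /dev/zero. Returns NULL on failure. */
void *map_dev_zero(size_t len)
{
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping outlives the descriptor */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: write one byte in every page of the mapping so
 * that each page is really accessed. Returns the number of pages touched. */
size_t touch_pages(void *base, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = base;
    size_t count = 0;

    for (size_t off = 0; off < len; off += page) {
        assert(p[off] == 0);     /* both kinds of mapping start zero-filled */
        p[off] = 1;
        count++;
    }
    return count;
}
```

Timing the mmap call and the touch_pages loop separately (for example with
clock_gettime around each) for both helpers should show where the cost
lands in each case.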
-Andi
--
This is like TV. I don't like TV.