This is the mail archive of the
mailing list for the GCC project.
Re: Faster compilation speed
On Sat, Aug 10, 2002, at 06:20:51PM -0700, Linus Torvalds wrote:
> >The numbers I get on a p4 with cachegrind are *much* worse in all cases.
> >The miss rates are all >2%, which is a far cry from 0.1% and 0.0%.
> One thing to look out for when looking at cache miss numbers is what
> they actually _mean_.
> That is particularly true when it comes to the percentages. Are the
> percentages relative to #instructions, or #memops, or #line fetches (the
> latter ends up being interesting especially for I$).
These are percentages relative to the number of accesses. L2 percentages are
also relative to the original number of accesses, not to the number of L1
misses.
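
To make the choice of denominator concrete, here is a small sketch. The function name and the counts are made up for illustration (they are not cachegrind output); it only shows how the same L2 miss count gives very different percentages depending on what it is divided by.

```python
def miss_rates(accesses, l1_misses, l2_misses):
    """Return (L1 rate, L2 rate vs. all accesses, L2 rate vs. L1 misses)."""
    l1_rate = l1_misses / accesses
    # L2 misses divided by the original number of accesses.
    l2_vs_accesses = l2_misses / accesses
    # L2 misses divided by the accesses that actually reached L2.
    l2_vs_l1_misses = l2_misses / l1_misses if l1_misses else 0.0
    return l1_rate, l2_vs_accesses, l2_vs_l1_misses


# Hypothetical counts, not measurements.
l1, l2_global, l2_local = miss_rates(1_000_000, 20_000, 5_000)
print(f"L1 miss rate:               {l1:.2%}")
print(f"L2 miss rate (vs accesses): {l2_global:.2%}")
print(f"L2 miss rate (vs L1 miss):  {l2_local:.2%}")
```

With these made-up counts the L2 figure is 0.5% of all accesses but 25% of the accesses that reach L2, so it matters which one a tool reports.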
> The _best_ number to get (and in the end, the only one that really
> matters) is "cycles spent waiting on cache" and "cycles spent doing
> useful work", but I don't think valgrind gives you that. The P4
> counters should do it, though.
Indeed, cachegrind won't tell you when there was a miss but the hardware was
smart enough to do something useful while it waits for the cache.
Despite this limitation, shouldn't
((number_of_L1_misses * N) + (number_of_L2_misses * M)) * cycle_len
[where N is roughly 10 and M roughly 200, or updated figures] be a ballpark
figure of the time lost waiting for RAM to catch up?
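
A minimal sketch of that estimate, assuming the penalties are applied to both miss counts before scaling by the cycle length. The function name, the miss counts, and the 2 GHz clock are hypothetical; N=10 and M=200 are the rough figures above. Since the hardware can overlap some misses with useful work, the result is better read as an upper bound than as a measurement.

```python
def stall_seconds(l1_misses, l2_misses, n_cycles=10, m_cycles=200,
                  cycle_len=0.5e-9):
    """Ballpark time lost to cache misses, in seconds.

    n_cycles:  assumed penalty per L1 miss, in cycles (roughly 10).
    m_cycles:  assumed penalty per L2 miss, in cycles (roughly 200).
    cycle_len: seconds per cycle (0.5e-9 for a hypothetical 2 GHz clock).
    """
    stall_cycles = l1_misses * n_cycles + l2_misses * m_cycles
    return stall_cycles * cycle_len


# Hypothetical miss counts, not taken from any real run.
lost = stall_seconds(l1_misses=20_000, l2_misses=5_000)
print(f"estimated stall time: {lost * 1e3:.3f} ms")
```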
> If you want to use the HW counters under Linux, get "oprofile" from
> sourceforge.net. (I don't think it does P4 events yet, though)
The site says it doesn't yet.