This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Faster compilation speed
- From: Linus Torvalds <torvalds at transmeta dot com>
- To: dberlin at dberlin dot org, gcc at gcc dot gnu dot org
- Cc:
- Date: Sat, 10 Aug 2002 18:20:51 -0700
- Subject: Re: Faster compilation speed
- Newsgroups: linux.egcs
- Organization:
- References: <20020810212553.GA22959@chepelov.org>
In article <Pine.LNX.4.44.0208102031550.8641-100000@dberlin.org> you write:
>
>The numbers I get on a p4 with cachegrind are *much* worse in all cases.
>
>The miss rates are all >2%, which is a far cry from 0.1% and 0.0%.
One thing to look out for when looking at cache miss numbers is what
they actually _mean_.
That is particularly true when it comes to the percentages. Are the
percentages relative to #instructions, or #memops, or #line fetches (the
latter ends up being interesting especially for I$).
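To make the denominator point concrete, here is a minimal sketch with made-up counts (none of these figures come from the cachegrind runs being discussed; all names are hypothetical). The same number of D$ misses gives quite different percentages depending on what you divide by:

```python
# Hypothetical counts, for illustration only.
instructions = 1_000_000_000   # instructions executed
mem_ops = 400_000_000          # loads + stores
d_misses = 8_000_000           # D$ misses

def miss_rate(misses, denominator):
    """Return the miss rate as a percentage of the chosen denominator."""
    return 100.0 * misses / denominator

per_instruction = miss_rate(d_misses, instructions)
per_mem_op = miss_rate(d_misses, mem_ops)

print(f"per instruction: {per_instruction:.2f}%")
print(f"per memory op:   {per_mem_op:.2f}%")
```

With these invented numbers the per-memop figure comes out 2.5 times larger than the per-instruction one, even though nothing about the run itself has changed.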
The "percentage per instruction" number is to some degree a nonsensical
number (since many instructions do not do any D$ accesses at all), but
it has the advantage that it makes the I$ and D$ misses comparable, and
it also allows you to make a quick estimation of how much time was
actually spent on cache misses.
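The quick estimation could look roughly like the sketch below. It assumes a single flat miss penalty, no overlap between misses and useful work, and a base CPI of 1.0; all three are simplifying assumptions of mine, not measured values, and the function name is hypothetical.

```python
def stall_fraction(i_miss_pct, d_miss_pct, miss_penalty_cycles, base_cpi=1.0):
    """Estimate the fraction of cycles spent waiting on cache misses.

    i_miss_pct, d_miss_pct: miss rates as percentages per instruction.
    miss_penalty_cycles: assumed flat cost of one miss, in cycles.
    base_cpi: assumed cycles per instruction with a perfect cache.
    """
    # Per-instruction rates for I$ and D$ can simply be added together.
    stall_per_insn = (i_miss_pct + d_miss_pct) / 100.0 * miss_penalty_cycles
    return stall_per_insn / (base_cpi + stall_per_insn)

# Example with invented inputs: 0.1% I$ and 2.0% D$ misses per instruction,
# and an assumed 100-cycle miss penalty.
print(f"{stall_fraction(0.1, 2.0, 100):.0%} of cycles stalled (rough estimate)")
```

This only works because both rates share the per-instruction denominator. A real out-of-order CPU overlaps misses with other work, so treat the result as a rough upper bound at best.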
The _best_ number to get (and in the end, the only one that really
matters) is "cycles spent waiting on cache" and "cycles spent doing
useful work", but I don't think valgrind gives you that. The P4
counters should do it, though.
If you want to use the HW counters under Linux, get "oprofile" from
sourceforge.net. (I don't think it does P4 events yet, though)
Linus