This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: some profiling numbers
On Wednesday, June 25, 2003, at 4:52 PM, graydon hoare wrote:
Daniel Berlin wrote:
You can't make statements about whether changing the storage model will
or won't change performance, or about the cache friendliness of most
functions, when 99% of all the structure accesses are happening in
macros that are not accounted for in any of the charts, except as part
of the functions that call them.
You need to know how macro use has changed, how the cache miss rate
has changed in the macros, etc., in order to properly make general
statements about gcc overall.
your initial criticism is true -- I do not account for the code size
changes in terms of macro calls, nor do I account for the relative
frequency or elapsed time in macro calls. this is because macros are
erased (I think) by the time I see them.
Right
unless dwarf is now preserving macro information and our dwarf reading
is missing it,
Well, it can output info about macros (you need -g3), but only to the
extent that you can do something like p AMACRO(5) in gdb, and have it
give you the right answer.
It can't stop the preprocessor from expanding them in the source,
obviously.
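To make the point about macros concrete, here is a minimal sketch in C. It is not GCC source: struct node, NODE_CODE, sum_codes and demo are hypothetical names invented for illustration. The accessor macro is expanded by the preprocessor, so the object code contains only a plain field load inside sum_codes, and no symbol named NODE_CODE exists for a profiler to report.

```c
/* macro_demo.c: hypothetical illustration, not taken from GCC. */
#include <assert.h>
#include <stddef.h>

struct node {
    int code;
    struct node *next;
};

/* Accessor macro in the style of a structure accessor (hypothetical name).
   The preprocessor replaces every use with the expression on the right. */
#define NODE_CODE(n) ((n)->code)

/* After preprocessing, the loop body is simply `total += ((n)->code);`.
   The load it performs is charged to sum_codes, not to NODE_CODE. */
int sum_codes(const struct node *n)
{
    int total = 0;
    for (; n != NULL; n = n->next)
        total += NODE_CODE(n);
    return total;
}

/* Builds a three-element list and sums it through the macro. */
int demo(void)
{
    struct node c = { 3, NULL };
    struct node b = { 2, &c };
    struct node a = { 1, &b };
    int total = sum_codes(&a);
    assert(total == 6);
    return total;
}
```

Compiling with something like `gcc -g3 -c macro_demo.c` and running `nm macro_demo.o` should list sum_codes and demo but nothing for NODE_CODE. Once this is linked into a program and stopped inside sum_codes in gdb, `macro expand NODE_CODE(n)` or `p NODE_CODE(n)` can use the -g3 macro information, but that is a debugger convenience; it does not create a separate entity for the profiler to attribute samples to.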
those symbols don't exist to the profiler. I should put a strong
qualification about this in place though, you're right.
your second point I think is not so much true. there are 2 sets of
graphs for each cache unit, with different y axes: number of symbols
and number of samples. each categorization reflects -- in a slightly
different way -- changes to code expanded inline in a function as well
as new functions added. neither "ignores" macros; macro code makes
load and store requests and consumes clock cycles just like any other
code. the tentative conclusion that the cache units are performing
reasonably well under garbage collection is simply extrapolation from
the fact that neither graph of cycle/miss ratios has a glaring jump to
the left that coincides with the introduction of GC.
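One way to picture why macro code is not "ignored" is to sketch how a sampling profiler might attribute a sampled program counter to a symbol. This is only an assumption about a typical design, not a description of the profiler used for these graphs; struct sym, attribute_sample and the addresses in the usage note are hypothetical. Each sample is charged to the function whose address range contains it, so cycles and misses from macro-expanded code land on the enclosing function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol-table entry: start address and function name. */
struct sym {
    uintptr_t start;
    const char *name;
};

/* Return the symbol covering pc, or NULL if pc precedes every symbol.
   Assumes `table` is sorted by start address and that each symbol
   extends up to the start of the next one. */
const struct sym *attribute_sample(const struct sym *table, size_t n,
                                   uintptr_t pc)
{
    size_t lo = 0, hi = n;

    /* Binary search for the number of entries with start <= pc. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].start <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == 0 ? NULL : &table[lo - 1];
}
```

With a made-up table such as { 0x1000, "sum_codes" }, { 0x1080, "other_fn" }, a sample at 0x1040 is charged to sum_codes whether the instruction there came from hand-written code or from a macro expansion. That is the sense in which the per-function graphs include macro cost, while still being unable to break it out separately.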
Granted.
Also, what type of machine was this done on?
I don't remember noticing it written anywhere.
The evidence David Edelsohn gave was for an AIX machine; I forget which
POWER processor model it was using.
I've also gotten numbers similar to Dave's on a G4 running OS X.