This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: GNU C++ 4.0.1/4.1.0 cache misses on MICO sources.


On Tue, 17 May 2005, Mike Stump wrote:

> On May 17, 2005, at 3:16 PM, Karel Gardas wrote:
>> 1) the most expensive seems to be comptypes -- at least from data L2
>>   refill point of view (~17%)
>>
>> 2) comptypes is also the most CPU intensive operation since most
>>   of the time is spent there
>
> I think comptypes can be sped up by canonicalizing types better, and also adding a conservative hash and checking it first.
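
To make the suggestion above concrete, here is a minimal sketch of the two ideas: interning types so that equal types share one node, and keeping a cheap precomputed hash that is checked before any deep structural comparison. Everything in it (`TypeNode`, `shape_hash`, `same_type`, `canonical`) is a hypothetical illustration in Python, not GCC's tree representation or its actual `comptypes` logic.

```python
class TypeNode:
    """Hypothetical stand-in for a compiler type node (not GCC's tree)."""

    def __init__(self, kind, children=()):
        self.kind = kind                      # e.g. "int", "pointer", "function"
        self.children = tuple(children)
        # Cheap summary computed once. Structurally equal types always get the
        # same value, so differing values prove the types differ.
        self.shape_hash = hash((kind, tuple(c.shape_hash for c in self.children)))


def deep_equal(a, b):
    """Slow path: full recursive structural comparison."""
    if a.kind != b.kind or len(a.children) != len(b.children):
        return False
    return all(deep_equal(x, y) for x, y in zip(a.children, b.children))


def same_type(a, b):
    """Compare two types, trying the cheap checks first."""
    if a is b:                          # same (possibly canonical) node
        return True
    if a.shape_hash != b.shape_hash:    # conservative reject, no traversal
        return False
    return deep_equal(a, b)             # hash match may be a collision


_canonical = {}                         # intern table: one node per distinct type


def canonical(kind, children=()):
    """Return the unique node for this type; children must be canonical too."""
    key = (kind, tuple(id(c) for c in children))
    node = _canonical.get(key)
    if node is None:
        node = _canonical[key] = TypeNode(kind, children)
    return node


if __name__ == "__main__":
    p1 = TypeNode("pointer", [TypeNode("int")])
    p2 = TypeNode("pointer", [TypeNode("int")])
    print(same_type(p1, p2))                          # equal structure
    print(same_type(p1, TypeNode("int")))             # rejected by the hash
    print(canonical("int") is canonical("int"))       # identity after interning
```

With canonical nodes most comparisons end at the identity test, and the hash check rejects most unequal pairs without touching the child nodes, which is where the data cache misses would otherwise come from.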

Perhaps. Anyway, this is a box with 1GB of RAM. Now, just for fun, I've used:


0) compiler params used were:
   -I../include  --param ggc-min-expand=30 --param ggc-min-heapsize=4096
   -Wall -D_REENTRANT -D_GNU_SOURCE   -DPIC -fPIC  -c

and the picture, at least for 4.1.0, is completely different (see below). This means that on a machine with little memory gcc misses the L2 cache much more often, at about 529 CLK per miss. The top source of cache misses now seems to be the GC, with comptypes second.

Cheers,
Karel


CPU: AMD64 processors, speed 1802.33 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
Counted DATA_CACHE_MISSES events (Data cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted ICACHE_MISSES events (Instruction cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted DATA_CACHE_REFILLS_FROM_SYSTEM events (Data cache refills from system) with a unit mask of 0x1f (All cache states) count 1000
CPU_CLK_UNHALT...|DATA_CACHE_MIS...|ICACHE_MISSES:...|DATA_CACHE_REF...|
  samples|      %|  samples|      %|  samples|      %|  samples|      %|
------------------------------------------------------------------------
  5795921 100.000  3695597 100.000  2946594 100.000  1095111 100.000 cc1plus
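
The "about 529 CLK per miss" figure can be reproduced from the cc1plus totals above, assuming it was obtained by dividing total unhalted cycles by total data cache refills from system, with each sample scaled by its event count (100000 for cycles, 1000 for refills). That derivation is an assumption, and the result is an average ratio over the whole run, not a measured miss latency. The function name below is made up for illustration.

```python
# Totals for cc1plus taken from the summary above.
CYCLE_SAMPLES = 5795921        # CPU_CLK_UNHALTED samples
CYCLES_PER_SAMPLE = 100000     # "count 100000"
REFILL_SAMPLES = 1095111       # DATA_CACHE_REFILLS_FROM_SYSTEM samples
REFILLS_PER_SAMPLE = 1000      # "count 1000"


def cycles_per_event(cycle_samples, cycle_period, event_samples, event_period):
    """Average number of cycles elapsed per sampled event (hypothetical helper)."""
    total_cycles = cycle_samples * cycle_period
    total_events = event_samples * event_period
    return total_cycles / total_events


ratio = cycles_per_event(CYCLE_SAMPLES, CYCLES_PER_SAMPLE,
                         REFILL_SAMPLES, REFILLS_PER_SAMPLE)

if __name__ == "__main__":
    print(f"{ratio:.0f} cycles per data cache refill from system")
```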

CPU: AMD64 processors, speed 1802.33 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
Counted DATA_CACHE_MISSES events (Data cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted ICACHE_MISSES events (Instruction cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted DATA_CACHE_REFILLS_FROM_SYSTEM events (Data cache refills from system) with a unit mask of 0x1f (All cache states) count 1000
samples  %        samples  %        samples  %        samples  %        symbol name
442873    7.6411  277095    7.4980  406       0.0138  210537   19.2252  gt_ggc_mx_lang_tree_node
357714    6.1718  297393    8.0472  341       0.0116  92100     8.4101  ggc_set_mark
208484    3.5971  364311    9.8580  48844     1.6576  88551     8.0860  comptypes
176284    3.0415  96291     2.6056  66753     2.2654  27903     2.5480  ggc_alloc_stat
158048    2.7269  188948    5.1128  26549     0.9010  13119     1.1980  lookup_fnfields_1
120791    2.0841  17681     0.4784  12771     0.4334  1178      0.1076  dfs_walk_all
101900    1.7581  8530      0.2308  4541      0.1541  1293      0.1181  record_reg_classes
97854     1.6883  28305     0.7659  9740      0.3306  5843      0.5336  walk_tree
80856     1.3951  6314      0.1709  33168     1.1256  990       0.0904  find_reloads
79626     1.3738  4311      0.1167  743       0.0252  640       0.0584  _cpp_lex_direct
75468     1.3021  64101     1.7345  22       7.5e-04  20321     1.8556  cp_tree_node_structure
60301     1.0404  7343      0.1987  6487      0.2202  2986      0.2727  splay_tree_splay_helper
57714     0.9958  41027     1.1102  4436      0.1505  16364     1.4943  ht_lookup_with_hash
56687     0.9780  7502      0.2030  313       0.0106  422       0.0385  _cpp_clean_line
51682     0.8917  71809     1.9431  1513      0.0513  21801     1.9908  compparms
51528     0.8890  65441     1.7708  10699     0.3631  4356      0.3978  lookup_field_1
51470     0.8880  41211     1.1151  20647     0.7007  17549     1.6025  tsubst
50100     0.8644  43384     1.1739  19750     0.6703  18065     1.6496  htab_find_slot_with_hash
49868     0.8604  91428     2.4740  2472      0.0839  41355     3.7763  push_to_top_level
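
One way to check the "GC first, comptypes second" reading against the table above is to sum the DATA_CACHE_REFILLS_FROM_SYSTEM samples of the collector symbols and compare that with comptypes. This is only a sketch: it assumes that the `ggc_` and `gt_ggc_` prefixes identify garbage collector code, and it copies just a few rows from the table.

```python
# (symbol -> samples) from the DATA_CACHE_REFILLS_FROM_SYSTEM column above.
REFILLS = {
    "gt_ggc_mx_lang_tree_node": 210537,
    "ggc_set_mark": 92100,
    "comptypes": 88551,
    "ggc_alloc_stat": 27903,
    "push_to_top_level": 41355,
    "compparms": 21801,
}
TOTAL_REFILLS = 1095111          # cc1plus total from the summary

# Assumption: symbols with these prefixes belong to the garbage collector.
GC_PREFIXES = ("ggc_", "gt_ggc_")

gc_samples = sum(n for sym, n in REFILLS.items() if sym.startswith(GC_PREFIXES))
gc_share = gc_samples / TOTAL_REFILLS
comptypes_share = REFILLS["comptypes"] / TOTAL_REFILLS

if __name__ == "__main__":
    print(f"GC symbols: {gc_share:.1%} of refills from system")
    print(f"comptypes:  {comptypes_share:.1%} of refills from system")
```

Under that grouping the three collector symbols listed account for roughly 30% of the refills from system, against roughly 8% for comptypes.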


--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

