This is the mail archive of the mailing list for the GCC project.

CHUD tool [Was: Faster compilation speed: cache behavior]

On Tuesday, August 20, 2002, at 02:25  PM, Matt Austern wrote:

> FYI, here are the results of a fairly crude test that I did
> using one of the Apple performance tools.

I've written a little library that can be used to help gather this sort of information with CHUD (which is what Matt is using).

This project has a dylib target whose module load/unload routines invoke the CHUD remote client API to start CHUD sampling as soon as the app starts and to shut it down when the app is about to exit. It isn't perfect: the disconnect call seems to hang if the server has stopped listening to you because its sample buffer filled up. Hopefully you'll find it of use.

First, put Shikari into remote listening mode (shift-cmd-r, under the main app menu), then do something like:

OAKeepAllocationStatistics=1 DYLD_INSERT_LIBRARIES=/Users/Shared/bungi/Build/libCHUDChassis.dylib ./cc1 -quiet reload1.i

(The 'OAKeepAllocationStatistics' goo is a hack to cause CoreFoundation to do a symbol lookup that provokes the module load routine in the library -- a wrapper script would be easy to write here).
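A minimal sketch of such a wrapper, written as a shell function. The name chud_run and the CHUD_CHASSIS_LIB override variable are made up for illustration; only the two environment variables and the default library path come from the command line above.

```shell
# Hypothetical helper: run a command with the CHUD sampling library injected.
# CHUD_CHASSIS_LIB is an assumed override for the library path; the library
# itself does not read it.
chud_run() {
    if [ "$#" -eq 0 ]; then
        echo "usage: chud_run command [args...]" >&2
        return 64
    fi
    # OAKeepAllocationStatistics provokes the CoreFoundation symbol lookup
    # that triggers the module load routine in the inserted library.
    env OAKeepAllocationStatistics=1 \
        DYLD_INSERT_LIBRARIES="${CHUD_CHASSIS_LIB:-/Users/Shared/bungi/Build/libCHUDChassis.dylib}" \
        "$@"
}
```

With that defined, the run above becomes "chud_run ./cc1 -quiet reload1.i". Shikari still has to be in remote listening mode first.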

I'm not sure what settings Matt was using, so the numbers below may not be directly comparable to his. I set Shikari to the preset "Data Cache Misses (7450)", which increments a counter every 1000 dL3 cache misses.

(For those who haven't used CHUD, the toolkit measures performance across the entire system, hence the entries for the system library and mach_kernel -- the latter presumably being zero-fill page faults). Not sure what the __floatdidf is...

I'm running this on a dual 800 MHz Quicksilver with 1.5 GB of RAM, using a build from the head of the mainline as of a couple of hours ago.

./cc1 -quiet reload1.i:
15.8% gt_ggc_mx_lang_tree_node cc1
12.7% memset libSystem.B.dylib
6.9% __floatdidf cc1
5.9% poison_pages cc1
4.6% ggc_alloc cc1
3.7% bzero libSystem.B.dylib
3.5% .L_phys_zero_loop mach_kernel
3.3% ggc_mark_rtx_children_1 cc1
3.3% gt_ggc_mx_emit_status cc1
3.2% ggc_mark_rtx_children cc1
2.7% ggc_set_mark cc1
1.3% gt_ggc_mx_function cc1
1.2% bitmap_initialize cc1
0.9% build_insn_chain cc1
0.8% bitmap_operation cc1
0.8% gt_ggc_mx_varasm_status cc1
0.7% reload cc1
0.7% verify_flow_info cc1
0.6% find_reg_note cc1
0.6% ggc_collect cc1
0.6% scan_one_insn cc1
0.6% purge_hard_subreg_sets cc1
0.6% yyparse cc1

./cc1 -O2 -quiet reload1.i:
10.4% memset libSystem.B.dylib
9.9% gt_ggc_mx_lang_tree_node cc1
7.3% .L_phys_zero_loop mach_kernel
7.0% __floatdidf cc1
6.9% ggc_alloc cc1
6.5% poison_pages cc1
5.8% bzero libSystem.B.dylib
2.4% ggc_pop_context cc1
1.6% ggc_set_mark cc1
1.2% ggc_mark_rtx_children_1 cc1
1.0% ggc_mark_rtx_children cc1
1.0% bitmap_operation cc1
0.9% vm_page_lookup mach_kernel
0.9% find_reg_note cc1
0.9% allocate_reg_life_data cc1
0.8% init_alias_analysis cc1
0.6% vm_map_enter mach_kernel
0.6% verify_flow_info cc1
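
Since the samples are system-wide, it can help to total them per module. A rough sketch, assuming a listing like the ones above has been saved to a text file with one "percent symbol module" entry per line; the function name tally_by_module is made up for illustration.

```shell
# Sum the percentage column per module from lines shaped like
# "15.8% gt_ggc_mx_lang_tree_node cc1". Other lines are ignored.
tally_by_module() {
    awk '$1 ~ /^[0-9.]+%$/ && NF == 3 {
             sub(/%$/, "", $1)      # strip the trailing percent sign
             total[$3] += $1        # accumulate by module name
         }
         END {
             for (m in total) printf "%.1f%% %s\n", total[m], m
         }' "$@" | sort -rn
}
```

For the first listing, for example, the two libSystem.B.dylib entries (memset and bzero) add up to 16.4%. Only the symbols that made it into the listing are counted, so the totals will not reach 100%.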

(I wasn't able to get cc1plus to compile my reload1.i for some reason -- I'm sure someone at Apple can fiddle with that :)

