This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Performance testing of c-decl rewrite
- From: Zack Weinberg <zack at codesourcery dot com>
- To: gcc at gcc dot gnu dot org, Geoff Keating <geoffk at geoffk dot org>
- Date: Sun, 04 Apr 2004 00:09:42 -0800
- Subject: Performance testing of c-decl rewrite
I built GCC snapshots immediately before and after the patch and
compiled X Windows (being a large collection of mostly-C source code
that I happened to have lying around) with both, using oprofile to
collect performance data. The machine was otherwise completely idle.
Before (first column is CPU_CLK_UNHALTED counts, an arbitrary scale where
smaller is better; second column is percent of all samples):
23633380  59.4095  cc1
 4338756  10.9068  vmlinux
 3817502   9.5964  libc-2.3.2.so
 3028414   7.6128  perl
 1074030   2.6999  as
After:
23767768  57.9481  cc1
 5383616  13.1258  vmlinux
 3839303   9.3606  libc-2.3.2.so
 3058201   7.4562  perl
 1072668   2.6153  as
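To make the comparison concrete, here is a minimal Python sketch that computes the relative change per image from the counts in the two tables above. It is not part of the original measurement; the names `before`, `after`, `percent_change`, and `changes` are made up for illustration.

```python
# Per-image CPU_CLK_UNHALTED counts, copied from the before/after tables.
before = {
    "cc1": 23633380,
    "vmlinux": 4338756,
    "libc-2.3.2.so": 3817502,
    "perl": 3028414,
    "as": 1074030,
}
after = {
    "cc1": 23767768,
    "vmlinux": 5383616,
    "libc-2.3.2.so": 3839303,
    "perl": 3058201,
    "as": 1072668,
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


changes = {name: percent_change(before[name], after[name]) for name in before}

# Largest increase first.
for name, delta in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {delta:+7.2f}%")
```

On these numbers cc1 itself rises by well under 1%, while vmlinux rises by about 24%.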
So, at first glance, this does look notably slower. Breaking it down
a bit more, before:
samples  %       app name       symbol name
796285   2.5049  cc1            yyparse
741056   2.3311  cc1            _cpp_lex_direct
617899   1.9437  cc1            ht_lookup
531869   1.6731  vmlinux        fast_clear_page
509135   1.6016  cc1            for_each_rtx
487438   1.5333  cc1            ggc_alloc_stat
392484   1.2346  libc-2.3.2.so  _int_malloc
385878   1.2138  cc1            constrain_operands
382554   1.2034  cc1            htab_find_slot_with_hash
350330   1.1020  cc1            cse_insn
348599   1.0966  cc1            _cpp_clean_line
341925   1.0756  vmlinux        mark_offset_tsc
339794   1.0689  libc-2.3.2.so  memset
337738   1.0624  libc-2.3.2.so  strcmp
after:
samples  %       app name       symbol name
769567   2.3327  cc1            yyparse
750881   2.2760  cc1            _cpp_lex_direct
631608   1.9145  vmlinux        mark_offset_tsc
621143   1.8828  cc1            ht_lookup
536384   1.6259  vmlinux        fast_clear_page
501450   1.5200  cc1            for_each_rtx
496292   1.5043  cc1            ggc_alloc_stat
390526   1.1837  libc-2.3.2.so  _int_malloc
383197   1.1615  cc1            htab_find_slot_with_hash
382171   1.1584  cc1            constrain_operands
352961   1.0699  cc1            _cpp_clean_line
344353   1.0438  cc1            cse_insn
340974   1.0335  libc-2.3.2.so  memset
339150   1.0280  libc-2.3.2.so  strcmp
There isn't an obvious culprit here. In particular, none of the
functions in c-decl.c is a bottleneck, and fast_clear_page is not
being called *that* much more often, so I doubt we are allocating very
much more memory.
Mind, I have no idea why mark_offset_tsc has doubled its cost.
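A quick way to see which symbols moved is to compute the after/before ratio for each one. The following Python sketch does that with the sample counts from the two per-symbol listings; it is an illustration only, and the names `before_syms`, `after_syms`, and `ratios` are hypothetical.

```python
# Per-symbol sample counts, copied from the before/after listings.
before_syms = {
    "yyparse": 796285,
    "_cpp_lex_direct": 741056,
    "ht_lookup": 617899,
    "fast_clear_page": 531869,
    "for_each_rtx": 509135,
    "ggc_alloc_stat": 487438,
    "_int_malloc": 392484,
    "constrain_operands": 385878,
    "htab_find_slot_with_hash": 382554,
    "cse_insn": 350330,
    "_cpp_clean_line": 348599,
    "mark_offset_tsc": 341925,
    "memset": 339794,
    "strcmp": 337738,
}
after_syms = {
    "yyparse": 769567,
    "_cpp_lex_direct": 750881,
    "mark_offset_tsc": 631608,
    "ht_lookup": 621143,
    "fast_clear_page": 536384,
    "for_each_rtx": 501450,
    "ggc_alloc_stat": 496292,
    "_int_malloc": 390526,
    "htab_find_slot_with_hash": 383197,
    "constrain_operands": 382171,
    "_cpp_clean_line": 352961,
    "cse_insn": 344353,
    "memset": 340974,
    "strcmp": 339150,
}

# Ratio > 1 means more samples after the patch than before.
ratios = {sym: after_syms[sym] / before_syms[sym] for sym in before_syms}

# Biggest growth first.
for sym, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sym:26s} {ratio:6.3f}x")
```

On these numbers mark_offset_tsc comes out at roughly 1.85x, and every other listed symbol stays within a few percent of its earlier count.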
Thoughts?
zw