[Bug target/67396] [4.9/5.0 regression] Performance regression compiling variadic function with many arguments
ppluzhnikov at google dot com
gcc-bugzilla@gcc.gnu.org
Sun Aug 30 17:14:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67396
--- Comment #3 from Paul Pluzhnikov <ppluzhnikov at google dot com> ---
(In reply to Andrew Pinski from comment #2)
> Can you provide -ftime-report ?
Sure:
perl gen_bz18872.pl 2000 > t.c && gcc-svn-r227321/bin/gcc -c -O2 t.c -ftime-report
Execution times (seconds)
phase setup                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    1066 kB (19%) ggc
phase parsing              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1486 kB (27%) ggc
phase opt and generate     : 102.09 (100%) usr   0.01 (100%) sys 102.29 (100%) wall    2997 kB (54%) ggc
garbage collection         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
CFG verifier               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
df use-def / def-use chains:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
preprocessing              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     948 kB (17%) ggc
lexical analysis           :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
parser (global)            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     297 kB ( 5%) ggc
tree STMT verifier         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
dominance computation      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
expand                     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     484 kB ( 9%) ggc
dead store elim1           : 101.85 (100%) usr   0.01 (100%) sys 101.96 (100%) wall     171 kB ( 3%) ggc
dead store elim2           :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall     171 kB ( 3%) ggc
CSE 2                      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
integrated RA              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      24 kB ( 0%) ggc
reload CSE regs            :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     280 kB ( 5%) ggc
thread pro- & epilogue     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       1 kB ( 0%) ggc
peephole 2                 :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
scheduling 2               :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    1558 kB (28%) ggc
initialize rtl             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      12 kB ( 0%) ggc
rest of compilation        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       2 kB ( 0%) ggc
unaccounted todo           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
verify RTL sharing         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
TOTAL                      : 102.10             0.01           102.31               5558 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
perf report says:
Samples: 409K of event 'cycles', Event count (approx.): 374080784459
32.25% cc1 cc1 [.] find_base_term(rtx_def*)
27.15% cc1 cc1 [.] get_addr(rtx_def*) [clone .part.39]
16.57% cc1 cc1 [.] rtx_equal_for_memref_p(rtx_def const*, rtx_def const*)
11.37% cc1 cc1 [.] memrefs_conflict_p(int, rtx_def*, int, rtx_def*, long) [clone .constprop.113]
5.76% cc1 cc1 [.] addr_side_effect_eval(rtx_def*, int, int) [clone .constprop.114]
2.81% cc1 cc1 [.] ix86_find_base_term(rtx_def*)
2.52% cc1 cc1 [.] cselib_sp_based_value_p(cselib_val*)
1.11% cc1 cc1 [.] cselib_have_permanent_equivalences()
0.13% cc1 cc1 [.] record_store(rtx_def*, dse_bb_info_type*)
...