This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Is -fnew-ra ready for real testing?
- From: Brad Lucier <lucier at math dot purdue dot edu>
- To: matzmich at cs dot tu-berlin dot de (Michael Matz)
- Cc: lucier at math dot purdue dot edu (Brad Lucier), gcc at gcc dot gnu dot org, feeley at iro dot umontreal dot ca
- Date: Sat, 20 Jul 2002 22:50:36 -0500 (EST)
- Subject: Re: Is -fnew-ra ready for real testing?
Here is the compiler I used:
banach-43% gcc -v
Reading specs from /home/c/lucier/local/gcc-test/lib/gcc-lib/sparcv9-sun-solaris2.8/3.2/specs
Configured with: ../configure --prefix=/home/c/lucier/local/gcc-test --enable-languages=c --enable-checking=no sparcv9-sun-solaris2.8
Thread model: posix
gcc version 3.2 20020720 (experimental)
The input file is at
http://www.math.purdue.edu/~lucier/_io.i.gz
Running cc1 on it yielded:
banach-40% /home/c/lucier/local/gcc-test/lib/gcc-lib/sparcv9-sun-solaris2.8/3.2/cc1 -fnew-ra -m64 -O1 -fschedule-insns2 -fno-strict-aliasing -fno-math-errno -mcpu=ultrasparc -mtune=ultrasparc _io.i
___H__20___io {GC 76786k -> 27318k} {GC 44700k -> 26261k} {GC 34496k -> 26570k} {GC 37726k -> 29087k} {GC 50822k -> 28993k} {GC 52634k -> 28763k} {GC 39486k -> 31964k} {GC 55279k -> 34103k} ___init_proc ____20___io
Execution times (seconds)
garbage collection : 10.38 ( 0%) usr 0.15 ( 0%) sys 19.12 ( 0%) wall
cfg construction : 35.40 ( 0%) usr 4.35 ( 1%) sys 40.12 ( 0%) wall
cfg cleanup : 75.98 ( 0%) usr 0.00 ( 0%) sys 76.00 ( 0%) wall
trivially dead code : 7.28 ( 0%) usr 0.00 ( 0%) sys 7.62 ( 0%) wall
life analysis : 339.25 ( 2%) usr 0.06 ( 0%) sys 344.12 ( 2%) wall
life info update : 51.28 ( 0%) usr 0.00 ( 0%) sys 51.25 ( 0%) wall
preprocessing : 4.40 ( 0%) usr 2.60 ( 1%) sys 7.75 ( 0%) wall
lexical analysis : 3.36 ( 0%) usr 4.93 ( 1%) sys 7.50 ( 0%) wall
parser : 10.38 ( 0%) usr 2.65 ( 1%) sys 13.25 ( 0%) wall
expand : 4.13 ( 0%) usr 0.18 ( 0%) sys 4.38 ( 0%) wall
varconst : 1.49 ( 0%) usr 0.03 ( 0%) sys 1.62 ( 0%) wall
integration : 1.11 ( 0%) usr 0.02 ( 0%) sys 1.12 ( 0%) wall
jump : 44.31 ( 0%) usr 0.00 ( 0%) sys 44.25 ( 0%) wall
CSE : 11.08 ( 0%) usr 0.00 ( 0%) sys 11.25 ( 0%) wall
loop analysis : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
branch prediction : 801.27 ( 4%) usr 1.20 ( 0%) sys 803.25 ( 4%) wall
flow analysis : 2.97 ( 0%) usr 0.00 ( 0%) sys 3.12 ( 0%) wall
combiner : 14.60 ( 0%) usr 0.00 ( 0%) sys 14.62 ( 0%) wall
if-conversion : 3.89 ( 0%) usr 0.00 ( 0%) sys 3.88 ( 0%) wall
local alloc : 16881.28 (92%) usr 394.29 (96%) sys 17848.62 (92%) wall
global alloc : 13.67 ( 0%) usr 0.10 ( 0%) sys 18.75 ( 0%) wall
reload CSE regs : 62.17 ( 0%) usr 0.13 ( 0%) sys 67.25 ( 0%) wall
flow 2 : 1.45 ( 0%) usr 0.00 ( 0%) sys 1.25 ( 0%) wall
if-conversion 2 : 3.86 ( 0%) usr 0.01 ( 0%) sys 4.75 ( 0%) wall
rename registers : 8.96 ( 0%) usr 0.00 ( 0%) sys 9.25 ( 0%) wall
scheduling 2 : 11.13 ( 0%) usr 0.00 ( 0%) sys 11.00 ( 0%) wall
delay branch sched : 9.41 ( 0%) usr 0.00 ( 0%) sys 9.50 ( 0%) wall
shorten branches : 0.90 ( 0%) usr 0.00 ( 0%) sys 1.00 ( 0%) wall
final : 4.29 ( 0%) usr 0.02 ( 0%) sys 4.12 ( 0%) wall
rest of compilation : 8.68 ( 0%) usr 0.01 ( 0%) sys 9.25 ( 0%) wall
TOTAL : 18428.51 410.74 19439.25
So there does seem to be a hot spot ;-): local alloc accounts for 92% of the user time.
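For anyone who wants to check the percentages, here is a rough sketch of how the usr column of the timing report could be reduced to shares of the total. The helper names (usr_seconds, usr_shares) and the regular expression are made up for this illustration and are not part of GCC; only three rows and the TOTAL line from the report are included.

```python
import re

# A few rows taken from the timing report above; only the usr column is used.
REPORT = """\
life analysis : 339.25 ( 2%) usr 0.06 ( 0%) sys 344.12 ( 2%) wall
branch prediction : 801.27 ( 4%) usr 1.20 ( 0%) sys 803.25 ( 4%) wall
local alloc : 16881.28 (92%) usr 394.29 (96%) sys 17848.62 (92%) wall
TOTAL : 18428.51 410.74 19439.25
"""

# Hypothetical pattern: pass name, a colon, then the first number (usr seconds).
ROW = re.compile(r"^(?P<name>[^:]+?)\s*:\s*(?P<usr>\d+\.\d+)")


def usr_seconds(report):
    """Map each pass name (and TOTAL) to its usr seconds."""
    times = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            times[m.group("name")] = float(m.group("usr"))
    return times


def usr_shares(report):
    """Return pass -> percent of total usr time, largest share first."""
    times = usr_seconds(report)
    total = times.pop("TOTAL")
    ranked = sorted(times.items(), key=lambda kv: -kv[1])
    return {name: 100.0 * t / total for name, t in ranked}


if __name__ == "__main__":
    for name, share in usr_shares(REPORT).items():
        print(f"{name:20s} {share:5.1f}%")
```

With these rows, local alloc comes out first, at roughly 92% of the usr total, which matches the figure printed by cc1.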
The gprof output is at
http://www.math.purdue.edu/~lucier/_io.out.gz
It begins:
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
38.80 696.78 696.78 internal_mcount
21.75 1087.39 390.61 294854188 0.00 0.00 alias
3.27 1146.11 58.72 113963455 0.00 0.00 m16m
2.21 1185.77 39.66 8607198 0.00 0.00 et_forest_common_ancestor
1.76 1217.45 31.68 85377366 0.00 0.00 eshup1
1.71 1248.14 30.69 159163 0.00 0.00 live_in
1.69 1278.45 30.31 138969436 0.00 0.00 live_in_edge
1.62 1307.52 29.07 139883812 0.00 0.00 add_conflict_edge
1.33 1331.48 23.96 147700043 0.00 0.00 bitmap_operation
1.28 1354.41 22.93 80040592 0.00 0.00 record_conflict
1.16 1375.30 20.90 83334362 0.00 0.00 esubm
1.03 1393.80 18.50 7575849 0.00 0.00 edivm
1.02 1412.17 18.37 173967163 0.00 0.00 eshup6
0.90 1428.27 16.10 55044663 0.00 0.00 eshdn1
0.89 1444.19 15.92 286502 0.00 0.00 try_forward_edges
0.85 1459.46 15.27 43 0.36 0.36 find_unreachable_blocks
0.84 1474.58 15.12 17 0.89 2.65 calculate_global_regs_live
0.78 1488.55 13.97 24 0.58 0.58 calc_idoms
0.74 1501.90 13.36 146052286 0.00 0.00 bitmap_set_bit
0.74 1515.14 13.24 22722 0.00 0.00 combine
0.69 1527.58 12.44 7 1.78 1.79 simplify
0.67 1539.57 11.99 89354871 0.00 0.00 emovi
0.63 1550.81 11.24 24 0.47 0.47 calc_dfs_tree_nonrec
0.58 1561.22 10.42 42656049 0.00 0.00 eaddm
0.55 1571.07 9.85 moncontrol
0.50 1580.07 9.00 29496756 0.00 0.00 emdnorm
0.49 1588.83 8.77 16616747 0.00 0.00 cached_make_edge
0.48 1597.36 8.53 _mcount
0.47 1605.80 8.44 31923947 0.00 0.00 splay
0.47 1614.17 8.37 55540952 0.00 0.00 decrement_degree
0.45 1622.34 8.17 69960976 0.00 0.00 hard_regs_intersect_p
0.45 1630.46 8.12 8 1.02 20.40 propagate_freq
0.36 1637.01 6.55 52189680 0.00 0.00 emovo
0.33 1643.00 6.00 217709929 0.00 0.00 eisnan
0.32 1648.76 5.76 141146593 0.00 0.00 ra_alloc
0.29 1653.93 5.17 83338568 0.00 0.00 ecmpm
0.28 1658.89 4.97 44611212 0.00 0.00 enormlz
0.27 1663.71 4.82 73128 0.00 0.00 calculate_dont_begin
0.24 1668.04 4.33 107078806 0.00 0.00 eisinf
0.24 1672.31 4.27 45316353 0.00 0.00 ecleaz
0.23 1676.51 4.20 34284597 0.00 0.00 eshdn6
0.23 1680.61 4.11 419937542 0.00 0.00 bitmap_element_link
0.22 1684.55 3.94 18912 0.00 0.00 clear_table
0.21 1688.29 3.74 6 0.62 0.62 flow_dfs_compute_reverse_execute
0.21 1691.99 3.70 6 0.62 0.62 mark_dfs_back_edges
0.20 1695.52 3.53 16616747 0.00 0.00 free_edge
0.19 1699.00 3.48 18512201 0.00 0.00 eshift
0.19 1702.43 3.43 34699 0.00 0.00 purge_dead_edges
0.19 1705.78 3.35 14414489 0.00 0.00 earith
0.18 1709.07 3.29 16585816 0.00 0.00 make_label_edge
0.16 1712.01 2.94 8301486 0.00 0.00 alloc_aux_for_edge
0.16 1714.83 2.82 5 0.56 0.56 flow_depth_first_order_compute
...
-----------------------------------------------
0.00 593.72 3/3 rest_of_compilation [9]
[11] 33.2 0.00 593.72 3 reg_alloc [11]
0.00 573.09 4/4 one_pass [12]
0.00 5.32 3/9 life_analysis [69]
0.00 4.93 3/52 cleanup_cfg <cycle 10> [46]
0.00 4.93 3/52 update_life_info <cycle 10> [40]
0.00 4.07 4/4 df_analyse [103]
0.03 0.47 4/7 regclass [171]
0.08 0.16 4/4 check_df [263]
0.00 0.22 3/3 emit_colors [272]
0.00 0.13 1/17 delete_trivially_dead_insns [127]
0.09 0.00 4/4 create_insn_info [385]
0.00 0.07 1/1 reg_scan_update [451]
0.05 0.00 3/11 count_or_remove_death_notes [278]
0.00 0.03 4/4 reset_lists [551]
0.00 0.02 3/3 init_ra [607]
0.00 0.01 3/3 delete_moves [751]
0.01 0.00 3/13 delete_dead_jumptables [517]
0.00 0.00 1/7 compute_bb_for_insn [579]
0.00 0.00 3/6 recompute_reg_usage [888]
0.00 0.00 6/6 setup_renumber [963]
0.00 0.00 3/3 remove_suspicious_death_notes [985]
0.00 0.00 4/42 allocate_reg_info [646]
0.00 0.00 3/3 free_all_mem [1026]
0.00 0.00 4/4 ra_build_free [1075]
0.00 0.00 15/74010 get_insns [647]
0.00 0.00 16/305866 max_reg_num [546]
0.00 0.00 4/4 alloc_mem [1349]
0.00 0.00 3/3 df_finish [1373]
0.00 0.00 13/147415 ra_debug_msg [1441]
0.00 0.00 4/4 dump_ra [1957]
0.00 0.00 4/20 free_dlist [1847]
0.00 0.00 4/4 free_mem [1963]
0.00 0.00 3/32123 get_last_insn [1545]
0.00 0.00 3/3 ra_rewrite_init [2037]
0.00 0.00 3/3 df_init [1986]
0.00 0.00 3/6 fixup_abnormal_edges [1920]
0.00 0.00 3/3 dump_static_insn_cost [1992]
0.00 0.00 3/3 allocate_initial_values [1975]
0.00 0.00 3/3 dump_cost [1990]
0.00 0.00 3/3 dump_constraints [1989]
-----------------------------------------------
0.00 573.09 4/4 reg_alloc [11]
[12] 32.1 0.00 573.09 4 one_pass [12]
0.00 389.39 4/4 ra_colorize_graph [14]
0.00 105.13 1/1 actual_spill [27]
0.00 76.64 4/4 build_i_graph [33]
0.00 1.93 4/12 check_colors [95]
0.00 0.00 4/213308 get_max_uid [838]
0.00 0.00 3/3 dump_igraph_machine [1991]
-----------------------------------------------
0.00 0.00 6/294854188 spill_is_free [1254]
0.00 0.00 6/294854188 emit_loads [1176]
0.00 0.00 2836/294854188 spill_coalescing [944]
0.00 0.00 2838/294854188 delete_moves [751]
0.01 0.00 5674/294854188 aggressive_coalesce [864]
0.01 0.00 5674/294854188 check_uncoalesced_moves [867]
0.01 0.00 11006/294854188 detect_web_parts_to_rebuild [612]
0.03 0.00 22722/294854188 reset_lists [551]
0.06 0.00 45444/294854188 assign_colors [23]
0.07 0.00 54880/294854188 sort_and_combine_web_pairs [30]
0.08 0.00 56825/294854188 insert_stores [404]
0.19 0.00 146203/294854188 emit_colors [272]
0.28 0.00 213571/294854188 extended_coalesce_2 [44]
5.80 0.00 4378479/294854188 check_colors [95]
8.81 0.00 6652968/294854188 update_spill_colors [86]
61.74 0.00 46607536/294854188 colorize_one_web [24]
77.82 0.00 58745499/294854188 try_recolor_web [20]
96.19 0.00 72609307/294854188 rewrite_program2 [28]
139.49 0.00 105292714/294854188 calculate_dont_begin [21]
[13] 21.9 390.61 0.00 294854188 alias [13]
-----------------------------------------------
0.00 389.39 4/4 one_pass [12]
[14] 21.8 0.00 389.39 4 ra_colorize_graph [14]
0.00 221.61 4/4 recolor_spills [17]
0.00 63.82 4/8 assign_colors [23]
0.12 43.87 4/4 extended_coalesce_2 [44]
0.01 43.55 4/8 sort_and_combine_web_pairs [30]
12.44 0.07 7/7 simplify [77]
0.00 3.87 8/12 check_colors [95]
0.02 0.00 4/4 build_worklists [614]
0.00 0.01 4/4 aggressive_coalesce [864]
0.00 0.01 4/4 check_uncoalesced_moves [867]
0.00 0.00 3/3 select_spill [1394]
0.00 0.00 12/16 dump_graph_cost [1854]
0.00 0.00 4/4 break_coalesced_spills [1951]
0.00 0.00 3/147415 ra_debug_msg [1441]
...
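As a side note on the call-graph entry for alias [13] above: the called/total counts show which callers dominate. The sketch below copies the four largest caller counts and the total from that entry and computes each caller's share of the calls. The variable and function names are chosen for this illustration only.

```python
# Call counts copied from the gprof call-graph entry for alias [13].
TOTAL_ALIAS_CALLS = 294854188

# The four largest callers of alias(); the smaller ones are left out.
TOP_CALLERS = {
    "calculate_dont_begin": 105292714,
    "rewrite_program2": 72609307,
    "try_recolor_web": 58745499,
    "colorize_one_web": 46607536,
}


def call_shares(callers, total):
    """Return caller -> percent of all calls to the callee."""
    return {name: 100.0 * count / total for name, count in callers.items()}


if __name__ == "__main__":
    shares = call_shares(TOP_CALLERS, TOTAL_ALIAS_CALLS)
    for name, share in shares.items():
        print(f"{name:22s} {share:5.1f}%")
    print(f"{'top four together':22s} {sum(shares.values()):5.1f}%")
```

By this count, calculate_dont_begin alone makes a bit over a third of the calls to alias, and the four callers together make about 96% of them.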
Brad