This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Out-of-sight compile times for calculate_loop_depth
- To: gcc at gcc dot gnu dot org
- Subject: Out-of-sight compile times for calculate_loop_depth
- From: Brad Lucier <lucier at math dot purdue dot edu>
- Date: Mon, 7 Feb 2000 10:21:31 -0500 (EST)
- Cc: lucier at math dot purdue dot edu
With this compiler:
popov-50% /export/u10/egcs-profile/bin/gcc -v
Reading specs from /export/u10/egcs-profile/lib/gcc-lib/alphaev6-unknown-linux-gnu/2.96/specs
gcc version 2.96 20000127 (experimental)
and the input at
http://www.math.purdue.edu/~lucier/_meroon.i.gz
I get the following compile times:
popov-754% /export/u10/egcs-profile/lib/gcc-lib/alphaev6-unknown-linux-gnu/2.96/cc1 -mcpu=ev6 -fno-math-errno -mieee -fPIC -O1 _meroon.i
___H__20___meroon {GC 167029k -> 34257k in 0.801} {GC 58812k -> 37713k in 0.921} {GC 51743k -> 39567k in 0.975} ___init_proc {GC 72784k -> 3025k in 0.065} ____20___meroon
time in parse: 23.625056 (3%)
time in integration: 0.000000 (0%)
time in jump: 81.106576 (9%)
time in cse: 9.978624 (1%)
time in gcse: 0.000000 (0%)
time in loop: 0.039040 (0%)
time in cse2: 0.000000 (0%)
time in branch-prob: 0.000000 (0%)
time in flow: 738.114640 (81%) <=!!!!!!!!!!!!!!!
time in combine: 13.319472 (1%)
time in regmove: 0.000000 (0%)
time in sched: 0.000000 (0%)
time in local-alloc: 4.859504 (1%)
time in global-alloc: 15.603312 (2%)
time in flow2: 7.224352 (1%)
time in peephole2: 0.000000 (0%)
time in sched2: 0.000000 (0%)
time in shorten-branch: 0.551440 (0%)
time in final: 3.018768 (0%)
time in varconst: 0.008784 (0%)
time in symout: 0.000000 (0%)
time in dump: 0.000000 (0%)
time in gc: 2.763056 (0%)
The routines that account for more than 5% of the run time are:
Each sample counts as 0.000976562 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 36.66    104.63   104.63    59615     1.76     1.76  sbitmap_intersection_of_preds
 21.87    167.07    62.44 21551141     0.00     0.00  sbitmap_a_subset_b_p
  6.41    185.37    18.30     6596     2.77     2.77  flow_loop_exits_find
  5.15    200.08    14.71     6596     2.23     2.23  flow_loop_pre_header_find
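
For context: the name of the top entry, sbitmap_intersection_of_preds, and its
caller compute_flow_dominators suggest an iterative bit-vector dominator
computation, in which each block's dominator set is the intersection of its
predecessors' sets plus the block itself. The following is only a minimal
sketch of that general technique, not GCC's code. It assumes at most 64 blocks
so that one uint64_t holds a whole set, and the names compute_dominators and
MAX_BLOCKS and the predecessor-mask representation are hypothetical.

```c
#include <stdint.h>

#define MAX_BLOCKS 64  /* assumption: one 64-bit word holds a whole set */

/* preds[b] is a bit mask of the predecessors of block b.
   On return, dom[b] is a bit mask of the blocks that dominate b.
   Block 0 is assumed to be the entry block; 1 <= n <= MAX_BLOCKS.  */
static void
compute_dominators (int n, const uint64_t *preds, uint64_t *dom)
{
  uint64_t all = (n == MAX_BLOCKS) ? ~(uint64_t) 0
                                   : (((uint64_t) 1 << n) - 1);
  int changed = 1;

  /* The entry block is dominated only by itself; every other block
     starts out "dominated by everything".  */
  dom[0] = 1;
  for (int b = 1; b < n; b++)
    dom[b] = all;

  /* Repeat until no set changes.  */
  while (changed)
    {
      changed = 0;
      for (int b = 1; b < n; b++)
        {
          /* Intersect the dominator sets of all predecessors.  */
          uint64_t meet = all;
          for (int p = 0; p < n; p++)
            if (preds[b] & ((uint64_t) 1 << p))
              meet &= dom[p];

          /* A block always dominates itself.  */
          meet |= (uint64_t) 1 << b;

          if (meet != dom[b])
            {
              dom[b] = meet;
              changed = 1;
            }
        }
    }
}
```

With real bitmaps each intersection touches a number of words proportional to
the number of basic blocks, so one pass over a large function is already
expensive, and the pass is repeated until nothing changes.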
The interesting part of the detailed function-by-function (call graph) report is:
-----------------------------------------------
               0.00   202.77               3/3      rest_of_compilation [5]
[6]    71.0    0.00   202.77                 3  calculate_loop_depth [6]
               0.07   202.70               3/3      flow_loops_find [7]
               0.00     0.00               3/3      flow_loops_free [788]
-----------------------------------------------
               0.07   202.70               3/3      calculate_loop_depth [6]
[7]    71.0    0.07   202.70                 3  flow_loops_find [7]
               0.07   105.36               3/3      compute_flow_dominators [8]
               0.00    64.10               3/3      flow_loops_tree_build [10]
              18.30     0.00         6596/6596      flow_loop_exits_find [16]
              14.71     0.00         6596/6596      flow_loop_pre_header_find [17]
               0.11     0.00               3/6      sbitmap_vector_alloc [206]
               0.04     0.00         6596/6596      flow_loop_nodes_find [407]
               0.01     0.00               1/1      flow_depth_first_order_compute [593]
               0.00     0.00         6598/6608      sbitmap_alloc [649]
               0.00     0.00               3/3      flow_loops_level_compute [724]
               0.00     0.00           2/26511      sbitmap_zero [716]
               0.00     0.00          1/202709      xmalloc [434]
               0.00     0.00          1/112725      xcalloc [553]
-----------------------------------------------
               0.07   105.36               3/3      flow_loops_find [7]
[8]    36.9    0.07   105.36                 3  compute_flow_dominators [8]
             104.63     0.01       59615/59615      sbitmap_intersection_of_preds [9]
               0.60     0.00       59618/59618      sbitmap_a_and_b [113]
               0.11     0.00               3/6      sbitmap_vector_alloc [206]
               0.00     0.00               3/3      sbitmap_vector_zero [728]
               0.00     0.00               3/3      sbitmap_vector_ones [792]
               0.00     0.00          3/202709      xmalloc [434]
               0.00     0.00           3/26511      sbitmap_zero [716]
-----------------------------------------------
             104.63     0.01       59615/59615      compute_flow_dominators [8]
[9]    36.7  104.63     0.01             59615  sbitmap_intersection_of_preds [9]
               0.01     0.00       59615/59615      sbitmap_copy [596]
-----------------------------------------------
               0.00    64.10               3/3      flow_loops_find [7]
[10]   22.5    0.00    64.10                 3  flow_loops_tree_build [10]
               0.31    63.79         6595/6595      flow_loop_tree_node_add [11]
-----------------------------------------------
                                            21      flow_loop_tree_node_add [11]
               0.31    63.79         6595/6595      flow_loops_tree_build [10]
[11]   22.5    0.31    63.79           6595+21  flow_loop_tree_node_add [11]
               1.35    62.44 21551141/21551141      flow_loop_nested_p [12]
                                            21      flow_loop_tree_node_add [11]
-----------------------------------------------
               1.35    62.44 21551141/21551141      flow_loop_tree_node_add [11]
[12]   22.3    1.35    62.44          21551141  flow_loop_nested_p [12]
              62.44     0.00 21551141/21551141      sbitmap_a_subset_b_p [13]
-----------------------------------------------
              62.44     0.00 21551141/21551141      flow_loop_nested_p [12]
[13]   21.9   62.44     0.00          21551141  sbitmap_a_subset_b_p [13]
-----------------------------------------------
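
The call counts above are consistent with roughly one nesting test per pair of
loops: 6596 loops give 6596 * 6595 / 2 = 21750310 unordered pairs, close to the
21551141 calls to sbitmap_a_subset_b_p. At 62.44 seconds in total, that is
about 2.9 microseconds per call. Below is a minimal sketch of a word-wise
subset test and of the pair count. The names bitmap_subset_p and pair_count
are hypothetical, and this is not GCC's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if every bit set in A is also set in B, else 0.
   Both bitmaps are N_WORDS words long.  The cost of one call is
   proportional to N_WORDS, i.e. to the number of basic blocks.  */
static int
bitmap_subset_p (const uint64_t *a, const uint64_t *b, size_t n_words)
{
  for (size_t i = 0; i < n_words; i++)
    if (a[i] & ~b[i])
      return 0;
  return 1;
}

/* Number of unordered pairs among N items.  */
static unsigned long long
pair_count (unsigned long long n)
{
  return n * (n - 1) / 2;
}
```

Each test is cheap on its own; it is the number of tests, growing with the
square of the loop count, that adds up.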
I also timed it with a profiled egcs-1.1.2, but the two are so far apart that
there really is no comparison, so the results aren't worth reporting.
Brad Lucier