This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Out-of-sight compile times for calculate_loop_depth


With this compiler:

popov-50% /export/u10/egcs-profile/bin/gcc -v
Reading specs from /export/u10/egcs-profile/lib/gcc-lib/alphaev6-unknown-linux-gnu/2.96/specs
gcc version 2.96 20000127 (experimental)

and the input at

http://www.math.purdue.edu/~lucier/_meroon.i.gz

I get the following compile times:

popov-754% /export/u10/egcs-profile/lib/gcc-lib/alphaev6-unknown-linux-gnu/2.96/cc1 -mcpu=ev6 -fno-math-errno -mieee -fPIC -O1  _meroon.i
 ___H__20___meroon {GC 167029k -> 34257k in 0.801} {GC 58812k -> 37713k in 0.921} {GC 51743k -> 39567k in 0.975} ___init_proc {GC 72784k -> 3025k in 0.065} ____20___meroon
time in parse: 23.625056 (3%)
time in integration: 0.000000 (0%)
time in jump: 81.106576 (9%)
time in cse: 9.978624 (1%)
time in gcse: 0.000000 (0%)
time in loop: 0.039040 (0%)
time in cse2: 0.000000 (0%)
time in branch-prob: 0.000000 (0%)
time in flow: 738.114640 (81%)       <=!!!!!!!!!!!!!!!
time in combine: 13.319472 (1%)
time in regmove: 0.000000 (0%)
time in sched: 0.000000 (0%)
time in local-alloc: 4.859504 (1%)
time in global-alloc: 15.603312 (2%)
time in flow2: 7.224352 (1%)
time in peephole2: 0.000000 (0%)
time in sched2: 0.000000 (0%)
time in shorten-branch: 0.551440 (0%)
time in final: 3.018768 (0%)
time in varconst: 0.008784 (0%)
time in symout: 0.000000 (0%)
time in dump: 0.000000 (0%)
time in gc: 2.763056 (0%)

The routines that account for more than 5% of the runtime are:

Each sample counts as 0.000976562 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 36.66    104.63   104.63    59615     1.76     1.76  sbitmap_intersection_of_preds
 21.87    167.07    62.44 21551141     0.00     0.00  sbitmap_a_subset_b_p
  6.41    185.37    18.30     6596     2.77     2.77  flow_loop_exits_find
  5.15    200.08    14.71     6596     2.23     2.23  flow_loop_pre_header_find
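
For readers unfamiliar with why an intersection-of-predecessors routine can dominate like this, here is a minimal sketch of the textbook iterative bit-vector dominator computation. It assumes compute_flow_dominators works along these lines, which this report does not confirm. The names block, MAX_PREDS and compute_dominators are hypothetical, and each dominator set is squeezed into a single 64-bit word, so the sketch only handles up to 64 blocks. A real bitmap for n blocks needs about n/64 words, so every intersection costs time proportional to the number of blocks, and the whole pass grows at least quadratically with function size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PREDS 4   /* hypothetical limit, for illustration only */

/* Hypothetical basic block: just a list of predecessor indices. */
typedef struct {
    size_t n_preds;
    size_t preds[MAX_PREDS];
} block;

/* Iterate dom[b] = {b} | AND over preds p of dom[p] until nothing changes.
   Block 0 is the entry block.  Bit i of dom[b] means "block i dominates b".
   Returns the number of intersection steps performed.
   Requires 1 <= n_blocks <= 64 because each set is one 64-bit word.  */
static unsigned long
compute_dominators(const block *blocks, size_t n_blocks, uint64_t *dom)
{
    assert(n_blocks >= 1 && n_blocks <= 64);

    unsigned long intersections = 0;
    uint64_t all = (n_blocks == 64) ? ~UINT64_C(0)
                                    : (UINT64_C(1) << n_blocks) - 1;

    dom[0] = UINT64_C(1);             /* the entry dominates only itself */
    for (size_t b = 1; b < n_blocks; b++)
        dom[b] = all;                 /* start from "everything dominates" */

    int changed = 1;
    while (changed) {
        changed = 0;
        for (size_t b = 1; b < n_blocks; b++) {
            uint64_t meet = all;
            for (size_t k = 0; k < blocks[b].n_preds; k++)
                meet &= dom[blocks[b].preds[k]];
            intersections++;

            uint64_t next = meet | (UINT64_C(1) << b);
            if (next != dom[b]) {
                dom[b] = next;
                changed = 1;
            }
        }
    }
    return intersections;
}
```

On a four-block diamond (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3) the sets settle after one pass and a second pass confirms the fixed point.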

And the interesting part of the detailed function-by-function report is:

-----------------------------------------------
                0.00  202.77       3/3           rest_of_compilation [5]
[6]     71.0    0.00  202.77       3         calculate_loop_depth [6]
                0.07  202.70       3/3           flow_loops_find [7]
                0.00    0.00       3/3           flow_loops_free [788]
-----------------------------------------------
                0.07  202.70       3/3           calculate_loop_depth [6]
[7]     71.0    0.07  202.70       3         flow_loops_find [7]
                0.07  105.36       3/3           compute_flow_dominators [8]
                0.00   64.10       3/3           flow_loops_tree_build [10]
               18.30    0.00    6596/6596        flow_loop_exits_find [16]
               14.71    0.00    6596/6596        flow_loop_pre_header_find [17]
                0.11    0.00       3/6           sbitmap_vector_alloc [206]
                0.04    0.00    6596/6596        flow_loop_nodes_find [407]
                0.01    0.00       1/1           flow_depth_first_order_compute [593]
                0.00    0.00    6598/6608        sbitmap_alloc [649]
                0.00    0.00       3/3           flow_loops_level_compute [724]
                0.00    0.00       2/26511       sbitmap_zero [716]
                0.00    0.00       1/202709      xmalloc [434]
                0.00    0.00       1/112725      xcalloc [553]
-----------------------------------------------
                0.07  105.36       3/3           flow_loops_find [7]
[8]     36.9    0.07  105.36       3         compute_flow_dominators [8]
              104.63    0.01   59615/59615       sbitmap_intersection_of_preds [9]
                0.60    0.00   59618/59618       sbitmap_a_and_b [113]
                0.11    0.00       3/6           sbitmap_vector_alloc [206]
                0.00    0.00       3/3           sbitmap_vector_zero [728]
                0.00    0.00       3/3           sbitmap_vector_ones [792]
                0.00    0.00       3/202709      xmalloc [434]
                0.00    0.00       3/26511       sbitmap_zero [716]
-----------------------------------------------
              104.63    0.01   59615/59615       compute_flow_dominators [8]
[9]     36.7  104.63    0.01   59615         sbitmap_intersection_of_preds [9]
                0.01    0.00   59615/59615       sbitmap_copy [596]
-----------------------------------------------
                0.00   64.10       3/3           flow_loops_find [7]
[10]    22.5    0.00   64.10       3         flow_loops_tree_build [10]
                0.31   63.79    6595/6595        flow_loop_tree_node_add [11]
-----------------------------------------------
                                  21             flow_loop_tree_node_add [11]
                0.31   63.79    6595/6595        flow_loops_tree_build [10]
[11]    22.5    0.31   63.79    6595+21      flow_loop_tree_node_add [11]
                1.35   62.44 21551141/21551141     flow_loop_nested_p [12]
                                  21             flow_loop_tree_node_add [11]
-----------------------------------------------
                1.35   62.44 21551141/21551141     flow_loop_tree_node_add [11]
[12]    22.3    1.35   62.44 21551141         flow_loop_nested_p [12]
               62.44    0.00 21551141/21551141     sbitmap_a_subset_b_p [13]
-----------------------------------------------
               62.44    0.00 21551141/21551141     flow_loop_nested_p [12]
[13]    21.9   62.44    0.00 21551141         sbitmap_a_subset_b_p [13]
-----------------------------------------------
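
The call counts above hint at quadratic behaviour: flow_loop_nested_p is called 21551141 times for 6596 loops, and 6596*6595/2 = 21750310, so nearly every pair of loops appears to be compared. The sketch below illustrates that pattern only; it is not the code in GCC. The names bitset, bitset_subset_p and count_pairwise_tests are hypothetical stand-ins, and the assumption that the loop tree is built by testing pairs of loops for node-set containment is inferred from the profile, not checked against the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an sbitmap: a bit set stored as 64-bit words. */
typedef struct {
    size_t n_words;
    const uint64_t *words;
} bitset;

/* Return 1 if every bit set in a is also set in b.
   Both sets must have the same n_words; the cost is O(n_words).  */
static int
bitset_subset_p(const bitset *a, const bitset *b)
{
    for (size_t i = 0; i < a->n_words; i++)
        if (a->words[i] & ~b->words[i])
            return 0;
    return 1;
}

/* Compare every unordered pair of loops once, as a naive nesting pass
   would.  Returns the number of subset tests, which is n(n-1)/2, and
   stores in *nested_pairs how many pairs (i, j) with i < j had loop j
   contained in loop i.  */
static unsigned long
count_pairwise_tests(const bitset *loops, size_t n_loops,
                     unsigned long *nested_pairs)
{
    assert(nested_pairs != NULL);

    unsigned long tests = 0;
    *nested_pairs = 0;
    for (size_t i = 0; i < n_loops; i++)
        for (size_t j = i + 1; j < n_loops; j++) {
            tests++;
            if (bitset_subset_p(&loops[j], &loops[i]))
                (*nested_pairs)++;
        }
    return tests;
}
```

Since each test also walks a bitmap whose length grows with the number of basic blocks, the total work under these assumptions scales roughly as (loops squared) times (blocks / 64).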

I also timed the same input with a profiled egcs-1.1.2, but there really is
no comparison between the two, so it isn't worth reporting those results.

Brad Lucier
