This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Out-of-sight compile times for calculate_loop_depth
- To: lucier at math dot purdue dot edu, m dot hayes at elec dot canterbury dot ac dot nz
- Subject: Re: Out-of-sight compile times for calculate_loop_depth
- From: Brad Lucier <lucier at math dot purdue dot edu>
- Date: Wed, 9 Feb 2000 12:05:09 -0500 (EST)
- Cc: law at cygnus dot com, gcc at gcc dot gnu dot org
> From mph@elec.canterbury.ac.nz Wed Feb 9 05:36:34 2000
> Could you please try the attached patch on your program. This should
> improve the loop tree construction time. Currently it has quadratic
> behaviour in the worst case. This patch should nail this problem in
> the interim although I've got an inkling for an even better scheme.
It made some difference; after your patch, I got:
/export/u10/egcs-profile/lib/gcc-lib/alphaev6-unknown-linux-gnu/2.96/cc1 -mcpu=ev6 -fno-math-errno -mieee -fPIC -O1 _meroon.i
___H__20___meroon {GC 167029k -> 34257k in 0.803} {GC 58812k -> 37713k in 0.927} {GC 51743k -> 39567k in 0.990} ___init_proc {GC 72784k -> 3025k in 0.064} ____20___meroon
time in parse: 22.902816 (3%)
time in jump: 81.758544 (9%)
time in cse: 9.965936 (1%)
time in loop: 0.040992 (0%)
time in flow: 692.931696 (80%) <= down from 738 seconds
time in combine: 13.778192 (2%)
time in local-alloc: 4.948320 (1%)
time in global-alloc: 15.723360 (2%)
time in flow2: 7.622560 (1%)
time in shorten-branch: 0.572912 (0%)
time in final: 3.108560 (0%)
time in varconst: 0.006832 (0%)
time in gc: 2.784528 (0%)
and
  %   cumulative   self               self      total
 time   seconds   seconds     calls  ms/call    ms/call  name
 53.32    136.89   136.89     59615     2.30       2.30  sbitmap_intersection_of_preds
  6.80    154.34    17.45      6596     2.65       2.65  flow_loop_exits_find
  6.04    169.84    15.50      6596     2.35       2.35  flow_loop_pre_header_find
  2.23    175.56     5.72         8   714.60    4413.22  jump_optimize_1
  2.18    181.15     5.59  10438001     0.00       0.00  rtx_renumbered_equal_p
  2.03    186.36     5.21   9035782     0.00       0.00  find_cross_jump
  1.90    191.24     4.88     25373     0.19       0.19  delete_from_jump_chain
  1.53    195.16     3.92  12537008     0.00       0.00  next_active_insn
  1.27    198.43     3.27     70351     0.05       0.05  make_edge
  0.84    200.59     2.17    147364     0.01       0.01  record_one_conflict
  0.75    202.53     1.93         1  1931.64  256480.41  yyparse
with some details:
-----------------------------------------------
               0.00   170.87           3/3      rest_of_compilation [5]
[6]    66.6    0.00   170.87             3  calculate_loop_depth [6]
               0.06   170.82           3/3      flow_loops_find [7]
               0.00     0.00           3/3      flow_loops_free [1222]
-----------------------------------------------
               0.06   170.82           3/3      calculate_loop_depth [6]
[7]    66.6    0.06   170.82             3  flow_loops_find [7]
               0.06   137.59           3/3      compute_flow_dominators [8]
              17.45     0.00     6596/6596      flow_loop_exits_find [12]
              15.50     0.00     6596/6596      flow_loop_pre_header_find [13]
               0.11     0.00           3/6      sbitmap_vector_alloc [197]
               0.04     0.00     6596/6596      flow_loop_nodes_find [377]
               0.00     0.03           3/3      flow_loops_tree_build [428]
               0.01     0.00           1/1      flow_depth_first_order_compute [584]
               0.01     0.00     6598/6608      sbitmap_alloc [622]
               0.00     0.00       2/26511      sbitmap_zero [611]
               0.00     0.00      1/202709      xmalloc [422]
               0.00     0.00      1/112725      xcalloc [545]
               0.00     0.00           3/3      flow_loops_level_compute [1223]
-----------------------------------------------
               0.06   137.59           3/3      flow_loops_find [7]
[8]    53.6    0.06   137.59             3  compute_flow_dominators [8]
             136.89     0.01   59615/59615      sbitmap_intersection_of_preds [9]
               0.58     0.00   59618/59618      sbitmap_a_and_b [115]
               0.11     0.00           3/6      sbitmap_vector_alloc [197]
               0.00     0.01           3/3      sbitmap_vector_zero [661]
               0.00     0.00           3/3      sbitmap_vector_ones [801]
               0.00     0.00       3/26511      sbitmap_zero [611]
               0.00     0.00      3/202709      xmalloc [422]
-----------------------------------------------
             136.89     0.01   59615/59615      compute_flow_dominators [8]
[9]    53.3  136.89     0.01         59615  sbitmap_intersection_of_preds [9]
               0.01     0.00   59615/59615      sbitmap_copy [614]
-----------------------------------------------
So you did kill the flow_loops_tree_build/sbitmap_a_subset_b_p times,
which is good. For some reason, sbitmap_intersection_of_preds seems to
take a lot more time now.
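
For readers unfamiliar with why an intersection-of-predecessors step can
dominate the profile, here is a minimal sketch of the textbook iterative
dominator computation over bit vectors. It is not GCC's
compute_flow_dominators; the names compute_dominators, bits and
diamond_preds are hypothetical, and Python integers stand in for sbitmaps.
Every block visit intersects the dominator sets of all its predecessors,
and each set holds one bit per basic block, so one pass over n blocks
touches on the order of n * n bits.

```python
def compute_dominators(preds, entry=0):
    """Iterative dominator sets, using Python ints as bit vectors.

    preds[b] lists the predecessor blocks of block b.
    Bit d of dom[b] is set when block d dominates block b.
    Returns the dominator sets and the number of intersection steps.
    """
    n = len(preds)
    all_blocks = (1 << n) - 1
    dom = [all_blocks] * n          # start from "everything dominates"
    dom[entry] = 1 << entry         # the entry block dominates only itself
    intersections = 0
    changed = True
    while changed:                  # repeat until a fixed point is reached
        changed = False
        for b in range(n):
            if b == entry:
                continue
            meet = all_blocks
            for p in preds[b]:      # intersect over all predecessors
                meet &= dom[p]
            intersections += 1
            new = meet | (1 << b)   # a block always dominates itself
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom, intersections


def bits(mask):
    """Return the block numbers whose bits are set in mask."""
    return [i for i in range(mask.bit_length()) if mask >> i & 1]


# Hypothetical diamond-shaped CFG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
diamond_preds = [[], [0], [0], [1, 2]]
dom, intersections = compute_dominators(diamond_preds)
for block, mask in enumerate(dom):
    print(block, bits(mask))
print("intersections:", intersections)
```

In this sketch the intersection step runs once per non-entry block per
pass, which matches the general shape of the profile above, where
sbitmap_intersection_of_preds is called 59615 times from only 3 calls to
compute_flow_dominators.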
Brad