This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: if-conversion a performance bottleneck


> Please try the attached diff (against current CVS) to see whether it also
> makes a difference for you ;)

Your changes to flow.c have cut the number of calls to 
sbitmap_intersection_of_succs from 40604 to 24225, so they are definitely
worthwhile.  Bootstrapped on alphaev6-unknown-linux-gnu.

Brad

Here are the new statistics with your changes:
Execution times (seconds)
 garbage collection    :   1.30 ( 1%) usr   0.00 ( 0%) sys   1.30 ( 1%) wall
 parser                :   6.45 ( 3%) usr   0.19 (15%) sys   6.64 ( 3%) wall
 varconst              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 integration           :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 jump                  :  20.06 ( 9%) usr   0.84 (66%) sys  20.90 ( 9%) wall
 CSE                   :   2.77 ( 1%) usr   0.00 ( 0%) sys   2.77 ( 1%) wall
 global CSE            :   5.51 ( 2%) usr   0.01 ( 1%) sys   5.52 ( 2%) wall
 loop analysis         :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.24 ( 0%) wall
 CSE 2                 :   2.29 ( 1%) usr   0.00 ( 0%) sys   2.29 ( 1%) wall
 flow analysis         :  52.11 (23%) usr   0.06 ( 5%) sys  52.16 (23%) wall
 combiner              :   2.80 ( 1%) usr   0.00 ( 0%) sys   2.80 ( 1%) wall
 if-conversion         :  44.42 (20%) usr   0.02 ( 2%) sys  44.43 (20%) wall
 regmove               :   0.50 ( 0%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall
 scheduling            :   6.26 ( 3%) usr   0.01 ( 1%) sys   6.27 ( 3%) wall
 local alloc           :   1.49 ( 1%) usr   0.00 ( 0%) sys   1.49 ( 1%) wall
 global alloc          :   2.78 ( 1%) usr   0.07 ( 6%) sys   2.86 ( 1%) wall
 reload CSE regs       :   7.92 ( 4%) usr   0.01 ( 1%) sys   7.93 ( 3%) wall
 flow 2                :  19.36 ( 9%) usr   0.00 ( 0%) sys  19.35 ( 9%) wall
 if-conversion 2       :  36.38 (16%) usr   0.01 ( 1%) sys  36.38 (16%) wall
 peephole 2            :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 schedulding 2         :   8.88 ( 4%) usr   0.00 ( 0%) sys   8.88 ( 4%) wall
 shorten branches      :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 final                 :   3.22 ( 1%) usr   0.00 ( 0%) sys   3.22 ( 1%) wall
 symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 rest of compilation   :   1.07 ( 0%) usr   0.00 ( 0%) sys   1.07 ( 0%) wall
 TOTAL                 : 226.10             1.27           227.32


Flat profile:

Each sample counts as 0.000976562 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 42.54     69.80    69.80    24225     2.88     2.88  sbitmap_intersection_of_succs
 17.24     98.09    28.29    15025     1.88     1.88  sbitmap_intersection_of_preds
  5.03    106.34     8.25       42   196.36   196.36  mark_critical_edges
  3.98    112.86     6.52 25085231     0.00     0.00  bitmap_operation
  3.56    118.70     5.84    10248     0.57     0.62  compute_block_backward_dependences
  3.44    124.35     5.65        9   627.60 11546.70  compute_flow_dominators
  2.37    128.24     3.89       21   185.36   185.93  delete_unreachable_blocks
  2.31    132.03     3.79        6   631.02  1760.19  calculate_global_regs_live
...
-----------------------------------------------
                1.88   32.76       3/9           flow_loops_find [9]
                3.77   65.51       6/9           if_convert [8]
[6]     63.3    5.65   98.27       9         compute_flow_dominators [6]
               69.80    0.00   24225/24225       sbitmap_intersection_of_succs [7]
               28.29    0.00   15025/15025       sbitmap_intersection_of_preds [10]
                0.16    0.00   39259/39259       sbitmap_a_and_b [135]
                0.02    0.00       9/27          sbitmap_vector_alloc [255]
                0.00    0.00       9/18          sbitmap_vector_zero [771]
                0.00    0.00       9/74716       sbitmap_zero [725]
                0.00    0.00       9/9           sbitmap_vector_ones [1365]
-----------------------------------------------
               69.80    0.00   24225/24225       compute_flow_dominators [6]
[7]     42.5   69.80    0.00   24225         sbitmap_intersection_of_succs [7]
                0.00    0.00   24225/39250       sbitmap_copy [612]
-----------------------------------------------
                0.00   69.40      12/12          rest_of_compilation [5]
[8]     42.3    0.00   69.40      12         if_convert [8]
                3.77   65.51       6/9           compute_flow_dominators [6]
                0.00    0.06      12/43          free_basic_block_vars [112]
                0.03    0.00      12/48          compute_bb_for_insn [147]
                0.00    0.01   20509/20509       find_if_header [391]
                0.01    0.00       6/27          sbitmap_vector_alloc [255]
                0.00    0.00       1/10258       update_life_info [12]
                0.00    0.00       1/37          allocate_reg_info [540]
                0.00    0.00       1/20500       count_or_remove_death_notes [36]
                0.00    0.00       1/995         sbitmap_alloc [679]
                0.00    0.00       2/145008      max_reg_num [374]
                0.00    0.00       1/74716       sbitmap_zero [725]
                0.00    0.00      12/5225        get_max_uid [1134]
-----------------------------------------------
                2.96   36.17       3/3           rest_of_compilation [5]
[9]     23.8    2.96   36.17       3         flow_loops_find [9]
                1.88   32.76       3/9           compute_flow_dominators [6]
                0.76    0.00     964/964         flow_loop_exits_find [38]
                0.75    0.00       1/1           flow_depth_first_order_compute [39]
                0.01    0.00       3/27          sbitmap_vector_alloc [255]
                0.00    0.00     964/964         flow_loop_pre_header_find [627]
                0.00    0.00     966/995         sbitmap_alloc [679]
                0.00    0.00       3/3           flow_loops_tree_build [747]
                0.00    0.00     964/964         flow_loop_nodes_find [791]
                0.00    0.00     964/964         sbitmap_last_set_bit [834]
                0.00    0.00       2/74716       sbitmap_zero [725]
                0.00    0.00     964/964         sbitmap_first_set_bit [1174]
                0.00    0.00       3/3           flow_loops_level_compute [1435]
-----------------------------------------------
               28.29    0.00   15025/15025       compute_flow_dominators [6]
[10]    17.2   28.29    0.00   15025         sbitmap_intersection_of_preds [10]
                0.00    0.00   15025/39250       sbitmap_copy [612]
-----------------------------------------------
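
For anyone reading the call graph without flow.c open: nearly all of the time
under compute_flow_dominators is spent in the two sbitmap_intersection_of_*
helpers, which is what one would expect from an iterative bit-vector
dominator solver. The following is only a minimal sketch of that general
technique, not the code in flow.c. The function name compute_dominators, the
MAX_BLOCKS limit and the one-word-per-block predecessor mask are all
assumptions made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCKS 64   /* assumption: one 64-bit word per set */

/* Hypothetical CFG encoding: bit p of preds[b] is set when block p is a
   predecessor of block b.  Block 0 is the entry block.
   On return, bit d of dom[b] is set when block d dominates block b.
   Returns the number of passes over the blocks until nothing changed.  */
int compute_dominators(int n, const uint64_t *preds, uint64_t *dom)
{
    assert(n >= 1 && n <= MAX_BLOCKS);

    uint64_t all = (n == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << n) - 1);
    int passes = 0;
    int changed;

    dom[0] = UINT64_C(1);           /* the entry block dominates itself only */
    for (int b = 1; b < n; b++)
        dom[b] = all;               /* optimistic start: everything dominates */

    do {
        changed = 0;
        passes++;
        for (int b = 1; b < n; b++) {
            /* Intersection over all predecessors: this is the step that the
               sbitmap_intersection_of_preds calls in the profile correspond
               to in spirit.  */
            uint64_t tmp = all;
            for (int p = 0; p < n; p++)
                if ((preds[b] >> p) & 1)
                    tmp &= dom[p];

            tmp |= UINT64_C(1) << b;    /* every block dominates itself */
            if (tmp != dom[b]) {
                dom[b] = tmp;
                changed = 1;
            }
        }
    } while (changed);

    return passes;
}
```

In this sketch each pass performs one intersection per block, and with real
bitmaps each intersection touches a number of words proportional to the
number of basic blocks, so the cost per pass grows roughly quadratically with
the size of the function. The same scheme run over successors instead of
predecessors yields post-dominators, which is presumably why both
sbitmap_intersection_of_succs and sbitmap_intersection_of_preds appear under
compute_flow_dominators above.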
