This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: test patch for computed gotos


> Ah, never mind.  I'll try to profile reorder_blocks and see if things can
> be speeded up there.

With a slightly simpler file:

http://www.math.purdue.edu/~lucier/all2.i.gz

gcc version 3.4 20030305 gives the following times:

popov-65% /export/home/lucier/programs/gcc/objdir/gcc/cc1 -fPIC -O1 -fno-trapping-math -fomit-frame-pointer -mieee -fno-math-errno -mcpu=ev6 -fschedule-insns2 -fno-strict-aliasing -freorder-blocks all.i
<lots of function names deleted>
Execution times (seconds)
 cfg construction      :   1.71 ( 0%) usr   0.10 ( 3%) sys   1.81 ( 0%) wall
 cfg cleanup           :   7.81 ( 2%) usr   0.01 ( 0%) sys   7.82 ( 2%) wall
 trivially dead code   :   4.54 ( 1%) usr   0.00 ( 0%) sys   4.55 ( 1%) wall
 life analysis         :  17.93 ( 4%) usr   0.02 ( 0%) sys  17.95 ( 4%) wall
 life info update      :   6.44 ( 2%) usr   0.00 ( 0%) sys   6.44 ( 2%) wall
 alias analysis        :   2.53 ( 1%) usr   0.03 ( 1%) sys   2.57 ( 1%) wall
 register scan         :   1.38 ( 0%) usr   0.00 ( 0%) sys   1.38 ( 0%) wall
 rebuild jump labels   :   0.63 ( 0%) usr   0.00 ( 0%) sys   0.63 ( 0%) wall
 preprocessing         :   4.27 ( 1%) usr   0.40 (10%) sys   4.69 ( 1%) wall
 lexical analysis      :   5.73 ( 1%) usr   0.90 (23%) sys   6.64 ( 2%) wall
 parser                :  12.27 ( 3%) usr   0.75 (19%) sys  13.13 ( 3%) wall
 expand                :   6.63 ( 2%) usr   0.12 ( 3%) sys   6.75 ( 2%) wall
 varconst              :   0.86 ( 0%) usr   0.02 ( 1%) sys   0.88 ( 0%) wall
 integration           :   1.48 ( 0%) usr   0.04 ( 1%) sys   1.52 ( 0%) wall
 jump                  :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall
 CSE                   :   8.66 ( 2%) usr   0.02 ( 0%) sys   8.67 ( 2%) wall
 loop analysis         :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall
 branch prediction     :   6.12 ( 1%) usr   0.02 ( 1%) sys   6.14 ( 1%) wall
 flow analysis         :   0.46 ( 0%) usr   0.00 ( 0%) sys   0.47 ( 0%) wall
 combiner              :  14.78 ( 4%) usr   0.04 ( 1%) sys  14.82 ( 4%) wall
 if-conversion         :   2.71 ( 1%) usr   0.00 ( 0%) sys   2.71 ( 1%) wall
 local alloc           :   3.41 ( 1%) usr   0.01 ( 0%) sys   3.42 ( 1%) wall
 global alloc          :   9.04 ( 2%) usr   0.21 ( 5%) sys   9.25 ( 2%) wall
 reload CSE regs       :  15.84 ( 4%) usr   0.06 ( 1%) sys  15.90 ( 4%) wall
 flow 2                :   1.47 ( 0%) usr   0.02 ( 0%) sys   1.49 ( 0%) wall
 if-conversion 2       :   4.26 ( 1%) usr   0.00 ( 0%) sys   4.26 ( 1%) wall
 rename registers      :   2.96 ( 1%) usr   0.02 ( 0%) sys   2.98 ( 1%) wall
 scheduling 2          :  12.26 ( 3%) usr   0.07 ( 2%) sys  12.41 ( 3%) wall
 reorder blocks        : 246.99 (59%) usr   0.91 (23%) sys 247.93 (59%) wall
 shorten branches      :   1.31 ( 0%) usr   0.00 ( 0%) sys   1.32 ( 0%) wall
 final                 :   5.06 ( 1%) usr   0.08 ( 2%) sys   5.16 ( 1%) wall
 rest of compilation   :   7.22 ( 2%) usr   0.05 ( 1%) sys   7.29 ( 2%) wall
 TOTAL                 : 417.08             3.93           421.27

The profile results are fairly simple: cached_make_edge takes a long
time for this problem (about 56% of the profiled time, with 4455128 of
its 4615324 calls coming from cfg_layout_duplicate_bb).  Is the edge
cache enabled?  Do we need to take the quadratic path through
cached_make_edge each time?
(This is on a 500MHz alphaev6.)
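
To make the "quadratic path" question concrete, here is a small standalone sketch. It is not GCC code. It assumes that, without a cache, the duplicate-edge check walks the source block's successor list, which is what would make a block with many successors (as computed gotos produce) expensive. All names here (toy_block, toy_edge, toy_make_edge, toy_scan_cost) are made up for illustration, and the "seen" array only stands in for whatever per-block cache the real code may use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy CFG types; these are NOT GCC's data structures. */
struct toy_edge {
    int dest;                   /* index of the destination block */
    struct toy_edge *next;      /* next edge in the successor list */
};

struct toy_block {
    struct toy_edge *succ;      /* singly linked successor list */
    unsigned char *seen;        /* optional cache: seen[d] != 0 if edge to d exists */
};

static long scan_steps;         /* successor-list nodes visited by duplicate checks */

/* Add the edge b -> dest unless it already exists.
   Returns 1 if added, 0 if it was a duplicate, -1 if allocation failed. */
static int toy_make_edge(struct toy_block *b, int dest)
{
    struct toy_edge *e;

    if (b->seen != NULL) {
        /* Cached path: constant-time existence test. */
        if (b->seen[dest])
            return 0;
    } else {
        /* Uncached path: linear walk over every existing successor. */
        for (e = b->succ; e != NULL; e = e->next) {
            scan_steps++;
            if (e->dest == dest)
                return 0;
        }
    }

    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->dest = dest;
    e->next = b->succ;
    b->succ = e;
    if (b->seen != NULL)
        b->seen[dest] = 1;
    return 1;
}

/* Give one block n distinct successors and report how many list nodes
   the duplicate checks visited.  Returns -1 if the cache cannot be allocated. */
long toy_scan_cost(int n, int use_cache)
{
    struct toy_block b = { NULL, NULL };
    struct toy_edge *e, *next;
    int d;

    assert(n >= 0);
    if (use_cache) {
        b.seen = calloc(n > 0 ? (size_t) n : 1, 1);
        if (b.seen == NULL)
            return -1;
    }

    scan_steps = 0;
    for (d = 0; d < n; d++)
        if (toy_make_edge(&b, d) < 0)
            break;

    /* Release the successor list and the cache. */
    for (e = b.succ; e != NULL; e = next) {
        next = e->next;
        free(e);
    }
    free(b.seen);
    return scan_steps;
}
```

In this toy model, adding n distinct successors to one block without the cache visits 0 + 1 + ... + (n-1) = n(n-1)/2 list nodes, while the cached variant visits none.  Whether the real slowdown comes from the same effect is exactly the open question above.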

Flat profile:

Each sample counts as 0.000976562 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 55.96    106.17   106.17  4615324     0.00     0.00  cached_make_edge
  3.21    112.25     6.08                             htab_traverse
  1.61    115.32     3.06   124722     0.00     0.00  et_forest_common_ancestor
  1.31    117.81     2.49      486     0.01     0.01  compute_alignments
  1.24    120.15     2.34    26877     0.00     0.00  bb_to_key
  1.17    122.37     2.22     3888     0.00     0.00  calc_dfs_tree_nonrec
  1.03    124.33     1.95     8172     0.00     0.00  find_unreachable_blocks
  0.97    126.16     1.83    27619     0.00     0.00  find_if_block
  0.94    127.94     1.78   807650     0.00     0.00  constrain_operands
  0.87    129.60     1.66   370361     0.00     0.00  try_forward_edges
  0.82    131.15     1.55        1     1.55   180.42  yyparse
...
-----------------------------------------------
                0.01  170.94     498/498         c_expand_body_1 [7]
[8]     90.1    0.01  170.94     498         rest_of_compilation [8]
                0.00  109.52     486/486         reorder_basic_blocks [10]
                0.02    8.69    5347/8325        cleanup_cfg <cycle 7> [19]
...
-----------------------------------------------
                0.00  109.52     486/486         rest_of_compilation [8]
[10]    57.7    0.00  109.52     486         reorder_basic_blocks [10]
                1.13  104.60     484/484         connect_traces [12]
                0.00    2.48     484/484         find_traces [55]
                0.00    0.97     484/484         cfg_layout_finalize [109]
                0.00    0.26     484/484         cfg_layout_initialize [244]
                0.01    0.06     484/484         set_edge_can_fallthru_flag [453]
                0.01    0.00     484/1455        mark_dfs_back_edges [683]
                0.00    0.00     484/484         record_effective_endpoints [1101]
                0.00    0.00     484/484         break_superblocks [1196]
                0.00    0.00       1/1           get_uncond_jump_length [1560]
                0.00    0.00     484/5893        hook_bool_void_false [1722]
-----------------------------------------------
                0.08    0.00    3597/4615324     make_single_succ_edge [416]
                0.10    0.00    4137/4615324     force_nonfallthru_and_redirect [206]
                1.17    0.01   51030/4615324     make_edges [33]
                2.33    0.02  101432/4615324     make_label_edge [59]
              102.48    0.96 4455128/4615324     cfg_layout_duplicate_bb [14]
[11]    56.5  106.17    0.99 4615324         cached_make_edge [11]
                0.99    0.00 4608732/5147236     pool_alloc [100]
-----------------------------------------------
                1.13  104.60     484/484         reorder_basic_blocks [10]
[12]    55.7    1.13  104.60     484         connect_traces [12]
                0.00  104.38    3273/3273        copy_bb [13]
                0.00    0.22    3673/5382        copy_bb_p [227]
-----------------------------------------------

