This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



optimization/6007: cfg cleanup tremendous performance hog with -O1



>Number:         6007
>Category:       optimization
>Synopsis:       cfg cleanup tremendous performance hog with -O1
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    unassigned
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 19 12:26:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     B. Lucier
>Release:        3.1 20020318 (prerelease)
>Organization:
>Environment:
sparc-sun-solaris2.8
Solaris as/ld
>Description:
The input file is

http://www.math.purdue.edu/~lucier/GNATS/GNATS-4/all.i.gz

Compiling this program takes about 16 hours and about 3 GB of
memory on a 500 MHz UltraSPARC II.  This is with a cc1 built
with profiling enabled.

A compile should not take anywhere near this long with -O1.
-O1 is for local optimizations; it should be the
optimization level one can use when a routine is too large
to compile with -O2.

This is definitely a regression from 2.95*.

banach-169% /home/c/lucier/local/gcc-3.1/lib/gcc-lib/sparc-sun-solaris2.8/3.1/cc1 -fpreprocessed all.i -mptr64 -mstack-bias -mno-v8plus -dumpbase all.c -m64 -mcpu=ultrasparc -mtune=ultrasparc -O1 -Wall -W -Wno-unused -version -fPIC -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -o all.s
GNU CPP version 3.1 20020318 (prerelease) (cpplib) (sparc ELF)
GNU C version 3.1 20020318 (prerelease) (sparc-sun-solaris2.8)
        compiled by GNU C version 3.1 20020318 (prerelease).
options passed:  -fpreprocessed -mptr64 -mstack-bias -mno-v8plus -m64
 -mcpu=ultrasparc -mtune=ultrasparc -O1 -Wall -W -Wno-unused -fPIC
 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing
options enabled:  -fdefer-pop -fomit-frame-pointer -fthread-jumps
 -fpeephole -ffunction-cse -fkeep-static-consts -freg-struct-return
 -fdelayed-branch -fgcse-lm -fgcse-sm -fschedule-insns2 -fsched-interblock
 -fsched-spec -fbranch-count-reg -fPIC -fcprop-registers -fcommon
 -fgnu-linker -fargument-alias -fmerge-constants -fident
 -fguess-branch-probability -ftrapping-math -mepilogue -mptr64 -m64
 -mstack-bias -mcpu=ultrasparc -mtune=ultrasparc
 ___H__20_all {GC 104370k -> 44446k} {GC 57858k -> 45642k} {GC 60148k -> 47950k} {GC 85130k -> 47937k} {GC 82606k -> 23072k} {GC 31452k -> 25858k} {GC 33844k -> 26890k} {GC 44252k -> 26464k} ___init_proc ____20_all
Execution times (seconds)
 garbage collection    :  12.87 ( 0%) usr   0.04 ( 0%) sys  17.00 ( 0%) wall
 cfg construction      : 341.31 ( 1%) usr 108.04 (47%) sys 449.00 ( 1%) wall
 cfg cleanup           :65630.05 (98%) usr   3.19 ( 1%) sys65995.00 (97%) wall
 life analysis         : 301.17 ( 0%) usr   0.02 ( 0%) sys 304.00 ( 0%) wall
 life info update      :   4.24 ( 0%) usr   0.00 ( 0%) sys   6.00 ( 0%) wall
 preprocessing         :   4.62 ( 0%) usr   4.01 ( 2%) sys   8.00 ( 0%) wall
 lexical analysis      :   5.38 ( 0%) usr   9.45 ( 4%) sys  16.00 ( 0%) wall
 parser                :  19.31 ( 0%) usr   5.38 ( 2%) sys  22.00 ( 0%) wall
 expand                :  10.77 ( 0%) usr   0.52 ( 0%) sys  11.00 ( 0%) wall
 varconst              :   2.87 ( 0%) usr   0.04 ( 0%) sys   3.00 ( 0%) wall
 integration           :   1.91 ( 0%) usr   0.08 ( 0%) sys   3.00 ( 0%) wall
 jump                  :   1.85 ( 0%) usr   0.03 ( 0%) sys   2.00 ( 0%) wall
 CSE                   :  26.05 ( 0%) usr   0.00 ( 0%) sys  25.00 ( 0%) wall
 loop analysis         :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 flow analysis         : 435.58 ( 1%) usr  93.09 (40%) sys 530.00 ( 1%) wall
 combiner              :  37.52 ( 0%) usr   0.02 ( 0%) sys  37.00 ( 0%) wall
 if-conversion         :  35.11 ( 0%) usr   0.01 ( 0%) sys  36.00 ( 0%) wall
 local alloc           :  10.35 ( 0%) usr   0.00 ( 0%) sys  11.00 ( 0%) wall
 global alloc          :  85.55 ( 0%) usr   4.51 ( 2%) sys 134.00 ( 0%) wall
 reload CSE regs       : 220.85 ( 0%) usr   0.02 ( 0%) sys 223.00 ( 0%) wall
 flow 2                :  11.20 ( 0%) usr   0.04 ( 0%) sys  15.00 ( 0%) wall
 if-conversion 2       :   0.62 ( 0%) usr   0.03 ( 0%) sys   3.00 ( 0%) wall
 rename registers      :   6.03 ( 0%) usr   0.07 ( 0%) sys   9.00 ( 0%) wall
 scheduling 2          :  12.87 ( 0%) usr   0.00 ( 0%) sys  12.00 ( 0%) wall
 delay branch sched    :   8.40 ( 0%) usr   0.00 ( 0%) sys   8.00 ( 0%) wall
 shorten branches      :   0.82 ( 0%) usr   0.00 ( 0%) sys   3.00 ( 0%) wall
 final                 :   1.52 ( 0%) usr   0.03 ( 0%) sys   6.00 ( 0%) wall
 rest of compilation   :  11.21 ( 0%) usr   2.34 ( 1%) sys 188.00 ( 0%) wall
 TOTAL                 :67240.23           230.97          68076.00
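
To make the imbalance in the table above easier to check, here is a minimal sketch that parses a few of its lines and computes the share of wall time spent in one pass. The helper names (`parse_passes`, `parse_total`) are made up for this illustration and are not part of GCC; the sample lines are copied from the report.

```python
import re

# A few lines copied from the execution-time table in this report.
# Wide values can run into the preceding "sys" label ("sys65995.00"),
# so the pattern below treats the whitespace before each number as optional.
REPORT = """\
 cfg construction      : 341.31 ( 1%) usr 108.04 (47%) sys 449.00 ( 1%) wall
 cfg cleanup           :65630.05 (98%) usr   3.19 ( 1%) sys65995.00 (97%) wall
 flow analysis         : 435.58 ( 1%) usr  93.09 (40%) sys 530.00 ( 1%) wall
 TOTAL                 :67240.23           230.97          68076.00
"""

PASS_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>[\d.]+)\s*\(\s*\d+%\)\s*usr"
    r"\s*(?P<sys>[\d.]+)\s*\(\s*\d+%\)\s*sys"
    r"\s*(?P<wall>[\d.]+)\s*\(\s*\d+%\)\s*wall\s*$"
)
TOTAL_LINE = re.compile(r"^\s*TOTAL\s*:\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$")


def parse_passes(text):
    """Return {pass name: (usr, sys, wall)} in seconds for each per-pass line."""
    passes = {}
    for line in text.splitlines():
        m = PASS_LINE.match(line)
        if m:
            passes[m["name"]] = (
                float(m["usr"]), float(m["sys"]), float(m["wall"])
            )
    return passes


def parse_total(text):
    """Return (usr, sys, wall) in seconds from the TOTAL line, or None."""
    for line in text.splitlines():
        m = TOTAL_LINE.match(line)
        if m:
            return tuple(float(x) for x in m.groups())
    return None


if __name__ == "__main__":
    passes = parse_passes(REPORT)
    total = parse_total(REPORT)
    wall = passes["cfg cleanup"][2]
    print(f"cfg cleanup: {wall / 3600:.1f} h wall, "
          f"{100 * wall / total[2]:.0f}% of total wall time")
```

The TOTAL line is parsed separately because it carries no percentages or usr/sys/wall labels.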

Here is the top of the profile.  The whole gprof output
can be found at:

http://www.math.purdue.edu/~lucier/GNATS/GNATS-4/all.i.gprof.gz

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 37.31   1106.44  1106.44 2123667007     0.00     0.00  try_crossjump_to_edge
 11.27   1440.70   334.26                             internal_mcount
  6.85   1643.67   202.97   395788     0.00     0.00  cselib_invalidate_regno
  6.53   1837.39   193.72                             htab_traverse
  4.67   1975.95   138.56     4987     0.03     0.03  propagate_freq
  2.87   2061.08    85.13       29     2.94     2.94  find_unreachable_blocks
  2.50   2135.09    74.01       15     4.93     4.94  calc_idoms
  2.48   2208.53    73.44   468802     0.00     0.00  try_forward_edges
  2.46   2281.48    72.95 173160573     0.00     0.00  cached_make_edge
  2.41   2353.01    71.53 175996207     0.00     0.00  bitmap_operation
  2.14   2416.55    63.54        9     7.06    15.68  calculate_global_regs_live
  2.14   2480.01    63.46       15     4.23     4.23  calc_dfs_tree_nonrec
  1.90   2536.38    56.37 173039017     0.00     0.00  make_label_edge
  1.60   2583.90    47.52        3    15.84    31.63  flow_loops_find
  0.98   2613.04    29.14 173160566     0.00     0.00  free_edge
  0.97   2641.89    28.85        5     5.77     5.77  mark_dfs_back_edges
  0.70   2662.75    20.86        3     6.95    64.74  estimate_bb_frequencies
  0.60   2680.56    17.81                             __lshrdi3
  0.56   2697.23    16.67   266211     0.00     0.00  record_one_conflict
  0.55   2713.54    16.31     5058     0.00     0.00  flow_loop_exit_edges_find
  0.55   2729.73    16.19    64492     0.00     0.00  purge_dead_edges
  0.51   2744.83    15.10 43310765     0.00     0.00  remove_edge
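
The call count for try_crossjump_to_edge (over 2.1 billion) suggests quadratic behavior. The sketch below is back-of-the-envelope arithmetic only, under an assumption that this report does not confirm: that the calls come from comparing every unordered pair of edges entering a single block. On that assumption it finds the largest edge count n with n(n-1)/2 not exceeding the observed calls. Since cfg cleanup runs more than once and over many blocks, the result is at best an upper bound on the fan-in of any one block.

```python
import math

# Calls to try_crossjump_to_edge, copied from the flat profile above.
CALLS = 2_123_667_007


def pairs(n):
    """Number of unordered pairs among n edges."""
    return n * (n - 1) // 2


def max_edges_for(calls):
    """Largest n such that pairs(n) <= calls.

    ASSUMPTION: one call per unordered pair of edges into a single block;
    this is an illustrative model, not something the profile itself shows.
    """
    # Solve n(n-1)/2 <= calls for n, then correct for integer rounding.
    n = (1 + math.isqrt(1 + 8 * calls)) // 2
    while pairs(n) > calls:
        n -= 1
    return n


if __name__ == "__main__":
    n = max_edges_for(CALLS)
    print(f"{CALLS} calls is consistent with all pairs among about {n} edges")
```

Under this model the bound is roughly 65,000 edges. A very large routine could plausibly reach that, which would be consistent with cfg cleanup dominating the compile time.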

>How-To-Repeat:

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:

