[Bug tree-optimization/54488] tree loop invariant motion uses an excessive amount of memory

evgeniya.maenkova at gmail dot com gcc-bugzilla@gcc.gnu.org
Sun Oct 19 16:02:00 GMT 2014


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54488

--- Comment #3 from Evgeniya Maenkova <evgeniya.maenkova at gmail dot com> ---
Could you please clarify your comment #36
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46590#c36) in PR46590? I mean "LIM
is now the pass that pushes memory usage to 1.8GB - all other optimization
passes are happy with just ~800MB."

How did you measure the LIM impact (1.8GB)? Was it by running -ftime-report
with and without the LIM optimization? Or was it top? Or some other tool that
shows the memory footprint of each optimization pass?
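
One way to get a whole-process number that is not limited to GC memory is to read the peak resident set size of the compiler process after it exits. Below is a minimal Python sketch, assuming a Unix host. The helper name peak_rss_of is made up for illustration, and this is not necessarily how the 1.8GB figure in PR46590 was obtained.

```python
import resource
import subprocess
import sys


def peak_rss_of(cmd):
    """Run cmd to completion and return (exit status, peak RSS of children).

    ru_maxrss for RUSAGE_CHILDREN is a high-water mark over all children
    this process has waited for, so use a fresh Python process for each
    measurement. The unit is kilobytes on Linux and bytes on macOS.
    """
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


def main(argv):
    # Hypothetical usage: python peak_rss.py <compiler command line...>
    status, rss = peak_rss_of(argv[1:])
    print("exit status:", status)
    print("peak RSS (kB on Linux):", rss)
    return status
```

Calling main(sys.argv) once with the normal f951 command line and once with -fno-tree-loop-im added would give a rough estimate of the LIM share. That is a difference of whole-process peaks, not a true per-pass footprint.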

I can see this for the full test case from PR46590 (based on -ftime-report we
see ~550MB in total, but this is only GC memory, right?):

 MAIN__ main
Analyzing compilation unit
 {GC 41969k -> 36104k} {GC 72488k -> 52189k} {GC 70118k -> 55388k}Performing interprocedural optimizations
 <*free_lang_data> <visibility> <early_local_cleanups> {GC 81054k -> 73552k} <free-inline-summary> <whole-program> <profile_estimate> <devirt> <cp> <inline> <pure-const> <static-var> <single-use> <comdats>Assembling functions:
 MAIN__ {GC 137793k -> 78714k} {GC 107079k -> 70685k} {GC 97203k -> 77229k} {GC 100487k -> 71679k} {GC 148921k -> 129443k} {GC 190666k -> 128409k} {GC 168488k -> 127341k} main
Execution times (seconds)
 phase setup             :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.08 ( 0%) wall     107 kB ( 0%) ggc
 phase parsing           :  13.79 ( 1%) usr   0.15 ( 0%) sys  13.94 ( 1%) wall   41869 kB ( 7%) ggc
 phase lang. deferred    :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 phase opt and generate  :2103.63 (99%) usr  89.40 (100%) sys2195.36 (99%) wall  519770 kB (93%) ggc
 phase finalize          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 garbage collection      :   5.42 ( 0%) usr   0.04 ( 0%) sys   5.48 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction  :   0.91 ( 0%) usr   0.00 ( 0%) sys   0.88 ( 0%) wall    7644 kB ( 1%) ggc
 callgraph optimization  :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 ipa dead code removal   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 ipa cp                  :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall     256 kB ( 0%) ggc
 ipa inlining heuristics :   2.54 ( 0%) usr   0.00 ( 0%) sys   2.54 ( 0%) wall    3253 kB ( 1%) ggc
 ipa profile             :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   0.42 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall       0 kB ( 0%) ggc
 cfg construction        :   0.31 ( 0%) usr   0.01 ( 0%) sys   0.33 ( 0%) wall    2278 kB ( 0%) ggc
 cfg cleanup             :   1.75 ( 0%) usr   0.05 ( 0%) sys   1.90 ( 0%) wall     752 kB ( 0%) ggc
 CFG verifier            :  22.85 ( 1%) usr   0.10 ( 0%) sys  22.83 ( 1%) wall       0 kB ( 0%) ggc
 trivially dead code     :   1.12 ( 0%) usr   0.00 ( 0%) sys   1.10 ( 0%) wall       0 kB ( 0%) ggc
 df scan insns           :   2.79 ( 0%) usr   0.12 ( 0%) sys   2.92 ( 0%) wall       0 kB ( 0%) ggc
 df multiple defs        :   0.75 ( 0%) usr   0.01 ( 0%) sys   0.77 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs        :  76.35 ( 4%) usr  68.69 (77%) sys 144.67 ( 7%) wall       0 kB ( 0%) ggc
 df live regs            :   6.93 ( 0%) usr   0.79 ( 1%) sys   7.65 ( 0%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   2.74 ( 0%) usr   0.69 ( 1%) sys   3.41 ( 0%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   1.04 ( 0%) usr   0.00 ( 0%) sys   1.10 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   4.37 ( 0%) usr   0.00 ( 0%) sys   4.36 ( 0%) wall    6401 kB ( 1%) ggc
 register information    :   0.56 ( 0%) usr   0.00 ( 0%) sys   0.55 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis          :   3.12 ( 0%) usr   0.00 ( 0%) sys   3.15 ( 0%) wall    9226 kB ( 2%) ggc
 alias stmt walking      : 632.31 (30%) usr   2.05 ( 2%) sys 635.16 (29%) wall    2380 kB ( 0%) ggc
 register scan           :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.21 ( 0%) wall     274 kB ( 0%) ggc
 rebuild jump labels     :   0.54 ( 0%) usr   0.01 ( 0%) sys   0.54 ( 0%) wall       0 kB ( 0%) ggc
 parser (global)         :  13.79 ( 1%) usr   0.15 ( 0%) sys  13.94 ( 1%) wall   41869 kB ( 7%) ggc
 early inlining heuristics:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 inline parameters       :   1.06 ( 0%) usr   0.00 ( 0%) sys   1.07 ( 0%) wall       1 kB ( 0%) ggc
 integration             :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify           :   5.23 ( 0%) usr   0.04 ( 0%) sys   5.26 ( 0%) wall   38114 kB ( 7%) ggc
 tree eh                 :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction   :   0.49 ( 0%) usr   0.02 ( 0%) sys   0.52 ( 0%) wall   13855 kB ( 2%) ggc
 tree CFG cleanup        :  93.20 ( 4%) usr   0.37 ( 0%) sys  93.55 ( 4%) wall    3894 kB ( 1%) ggc
 tree tail merge         :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall       0 kB ( 0%) ggc
 tree VRP                :  15.11 ( 1%) usr   0.66 ( 1%) sys  15.87 ( 1%) wall   15190 kB ( 3%) ggc
 tree copy propagation   :   1.96 ( 0%) usr   0.01 ( 0%) sys   1.96 ( 0%) wall       7 kB ( 0%) ggc
 tree PTA                :  68.62 ( 3%) usr   0.35 ( 0%) sys  68.97 ( 3%) wall    5002 kB ( 1%) ggc
 tree PHI insertion      :   0.34 ( 0%) usr   0.01 ( 0%) sys   0.34 ( 0%) wall    6280 kB ( 1%) ggc
 tree SSA rewrite        :   2.01 ( 0%) usr   0.01 ( 0%) sys   2.01 ( 0%) wall   12549 kB ( 2%) ggc
 tree SSA other          :   0.56 ( 0%) usr   0.26 ( 0%) sys   0.82 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental    :   3.62 ( 0%) usr   0.05 ( 0%) sys   3.78 ( 0%) wall    9579 kB ( 2%) ggc
 tree operand scan       :   2.71 ( 0%) usr   0.27 ( 0%) sys   3.02 ( 0%) wall   12294 kB ( 2%) ggc
 dominator optimization  : 637.95 (30%) usr   0.02 ( 0%) sys 637.98 (29%) wall   11954 kB ( 2%) ggc
 tree SRA                :   2.15 ( 0%) usr   0.01 ( 0%) sys   2.14 ( 0%) wall     172 kB ( 0%) ggc
 isolate eroneous paths  :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 tree CCP                :  24.10 ( 1%) usr   0.02 ( 0%) sys  24.14 ( 1%) wall    2100 kB ( 0%) ggc
 tree PHI const/copy prop:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       7 kB ( 0%) ggc
 tree split crit edges   :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall    1860 kB ( 0%) ggc
 tree reassociation      :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall       0 kB ( 0%) ggc
 tree PRE                :  11.70 ( 1%) usr   0.35 ( 0%) sys  13.19 ( 1%) wall   10433 kB ( 2%) ggc
 tree FRE                :  16.29 ( 1%) usr   0.12 ( 0%) sys  15.63 ( 1%) wall   12925 kB ( 2%) ggc
 tree code sinking       :   0.72 ( 0%) usr   0.00 ( 0%) sys   0.72 ( 0%) wall     291 kB ( 0%) ggc
 tree linearize phis     :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       1 kB ( 0%) ggc
 tree forward propagate  :   0.82 ( 0%) usr   0.02 ( 0%) sys   0.87 ( 0%) wall     613 kB ( 0%) ggc
 tree phiprop            :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE   :   0.99 ( 0%) usr   0.24 ( 0%) sys   1.14 ( 0%) wall     127 kB ( 0%) ggc
 tree aggressive DCE     :   4.79 ( 0%) usr   0.10 ( 0%) sys   5.04 ( 0%) wall    6137 kB ( 1%) ggc
 tree buildin call DCE   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE                :   5.18 ( 0%) usr   0.00 ( 0%) sys   5.17 ( 0%) wall    2048 kB ( 0%) ggc
 PHI merge               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree loop bounds        :   1.12 ( 0%) usr   0.00 ( 0%) sys   1.14 ( 0%) wall    2311 kB ( 0%) ggc
 tree loop invariant motion:   0.57 ( 0%) usr   0.00 ( 0%) sys   0.57 ( 0%) wall       0 kB ( 0%) ggc
 tree canonical iv       :   1.54 ( 0%) usr   0.02 ( 0%) sys   1.53 ( 0%) wall    3193 kB ( 1%) ggc
 scev constant prop      :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall     514 kB ( 0%) ggc
 complete unrolling      :  45.19 ( 2%) usr   1.07 ( 1%) sys  46.24 ( 2%) wall   54512 kB (10%) ggc
 tree iv optimization    :  16.47 ( 1%) usr   0.11 ( 0%) sys  16.67 ( 1%) wall   15757 kB ( 3%) ggc
 tree copy headers       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA uncprop        :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 tree rename SSA copies  :   0.29 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier       :  48.93 ( 2%) usr   0.00 ( 0%) sys  49.03 ( 2%) wall       0 kB ( 0%) ggc
 tree STMT verifier      : 119.16 ( 6%) usr   0.00 ( 0%) sys 119.09 ( 5%) wall       0 kB ( 0%) ggc
 tree switch conversion  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree strlen optimization:   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier      :   2.82 ( 0%) usr   0.00 ( 0%) sys   2.84 ( 0%) wall       0 kB ( 0%) ggc
 dominance frontiers     :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation   :   3.47 ( 0%) usr   0.02 ( 0%) sys   3.36 ( 0%) wall       0 kB ( 0%) ggc
 control dependences     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 out of ssa              :   1.27 ( 0%) usr   0.00 ( 0%) sys   1.27 ( 0%) wall       3 kB ( 0%) ggc
 expand vars             :   0.70 ( 0%) usr   0.00 ( 0%) sys   0.70 ( 0%) wall    2685 kB ( 0%) ggc
 expand                  :   9.20 ( 0%) usr   0.07 ( 0%) sys   9.47 ( 0%) wall   75267 kB (13%) ggc
 post expand cleanups    :   0.25 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall       0 kB ( 0%) ggc
 lower subreg            :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 forward prop            :   2.40 ( 0%) usr   0.02 ( 0%) sys   2.42 ( 0%) wall    4462 kB ( 1%) ggc
 CSE                     :   5.95 ( 0%) usr   0.00 ( 0%) sys   5.94 ( 0%) wall    3584 kB ( 1%) ggc
 dead code elimination   :   1.08 ( 0%) usr   0.00 ( 0%) sys   1.10 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1        :   3.79 ( 0%) usr   0.03 ( 0%) sys   3.83 ( 0%) wall   13064 kB ( 2%) ggc
 dead store elim2        :   6.27 ( 0%) usr   0.00 ( 0%) sys   6.27 ( 0%) wall   13104 kB ( 2%) ggc
 loop init               :  50.96 ( 2%) usr   0.01 ( 0%) sys  50.95 ( 2%) wall   28100 kB ( 5%) ggc
 loop invariant motion   :   5.97 ( 0%) usr  12.23 (14%) sys  18.62 ( 1%) wall      14 kB ( 0%) ggc
 loop fini               :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 CPROP                   :   2.88 ( 0%) usr   0.00 ( 0%) sys   2.90 ( 0%) wall       0 kB ( 0%) ggc
 PRE                     :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 CSE 2                   :   5.33 ( 0%) usr   0.01 ( 0%) sys   5.33 ( 0%) wall    2902 kB ( 1%) ggc
 branch prediction       :   1.63 ( 0%) usr   0.00 ( 0%) sys   1.63 ( 0%) wall    1851 kB ( 0%) ggc
 combiner                :   3.75 ( 0%) usr   0.01 ( 0%) sys   3.85 ( 0%) wall    5136 kB ( 1%) ggc
 if-conversion           :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 integrated RA           :  17.72 ( 1%) usr   0.08 ( 0%) sys  17.86 ( 1%) wall   34933 kB ( 6%) ggc
 LRA non-specific        :   5.26 ( 0%) usr   0.03 ( 0%) sys   5.27 ( 0%) wall    5148 kB ( 1%) ggc
 LRA virtuals elimination:   2.09 ( 0%) usr   0.00 ( 0%) sys   2.11 ( 0%) wall   10650 kB ( 2%) ggc
 LRA create live ranges  :   0.49 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall     857 kB ( 0%) ggc
 LRA hard reg assignment :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 reload                  :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 reload CSE regs         :   7.30 ( 0%) usr   0.00 ( 0%) sys   7.29 ( 0%) wall   20497 kB ( 4%) ggc
 ree                     :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 thread pro- & epilogue  :   1.16 ( 0%) usr   0.00 ( 0%) sys   1.16 ( 0%) wall       3 kB ( 0%) ggc
 if-conversion 2         :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 combine stack adjustments:   0.19 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall      84 kB ( 0%) ggc
 peephole 2              :   0.86 ( 0%) usr   0.00 ( 0%) sys   0.84 ( 0%) wall     405 kB ( 0%) ggc
 hard reg cprop          :   4.49 ( 0%) usr   0.05 ( 0%) sys   4.53 ( 0%) wall       2 kB ( 0%) ggc
 scheduling 2            :  15.49 ( 1%) usr   0.06 ( 0%) sys  15.55 ( 1%) wall    4668 kB ( 1%) ggc
 machine dep reorg       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 reorder blocks          :   1.08 ( 0%) usr   0.02 ( 0%) sys   1.15 ( 0%) wall    3889 kB ( 1%) ggc
 shorten branches        :   0.94 ( 0%) usr   0.00 ( 0%) sys   0.94 ( 0%) wall       0 kB ( 0%) ggc
 final                   :   2.83 ( 0%) usr   0.03 ( 0%) sys   2.90 ( 0%) wall   12173 kB ( 2%) ggc
 variable output         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       2 kB ( 0%) ggc
 tree if-combine         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 straight-line strength reduction:   0.85 ( 0%) usr   0.00 ( 0%) sys   0.84 ( 0%) wall     276 kB ( 0%) ggc
 rest of compilation     :   3.58 ( 0%) usr   0.00 ( 0%) sys   3.70 ( 0%) wall    1791 kB ( 0%) ggc
 remove unused locals    :   0.92 ( 0%) usr   0.00 ( 0%) sys   0.93 ( 0%) wall       0 kB ( 0%) ggc
 address taken           :   0.60 ( 0%) usr   0.00 ( 0%) sys   0.61 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :   3.97 ( 0%) usr   0.04 ( 0%) sys   4.14 ( 0%) wall       0 kB ( 0%) ggc
 verify loop closed      :   0.72 ( 0%) usr   0.00 ( 0%) sys   0.75 ( 0%) wall       0 kB ( 0%) ggc
 verify RTL sharing      :  21.04 ( 1%) usr   0.00 ( 0%) sys  21.04 ( 1%) wall       0 kB ( 0%) ggc
 repair loop structures  :   1.50 ( 0%) usr   0.00 ( 0%) sys   1.58 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 :2117.50            89.56          2209.41             561747 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
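
To rank the passes in a -ftime-report listing like this one by GC memory or by time, the rows can be parsed with a small script. Below is a sketch in Python, assuming the row layout of this report (name, usr, sys, wall, ggc). The names TimeRow, parse_time_report and top_by are hypothetical, and the regular expression is written to tolerate rows that the mailer wrapped onto two lines.

```python
import re
from typing import List, NamedTuple


class TimeRow(NamedTuple):
    name: str
    usr_s: float
    sys_s: float
    wall_s: float
    ggc_kb: int


# One report row: "name : usr (p%) usr sys (p%) sys wall (p%) wall N kB (p%) ggc".
# \s* between fields also matches newlines, so mail-wrapped rows still parse.
ROW = re.compile(
    r"^[ \t]*(?P<name>[^:\n]+?)[ \t]*:"
    r"\s*(?P<usr>[\d.]+)\s*\(\s*\d+%\)\s*usr"
    r"\s*(?P<sys>[\d.]+)\s*\(\s*\d+%\)\s*sys"
    r"\s*(?P<wall>[\d.]+)\s*\(\s*\d+%\)\s*wall"
    r"\s*(?P<ggc_kb>\d+)\s*kB\s*\(\s*\d+%\)\s*ggc",
    re.MULTILINE,
)


def parse_time_report(text: str) -> List[TimeRow]:
    """Extract every timing row; the TOTAL line and GC traces are skipped."""
    rows = []
    for m in ROW.finditer(text):
        rows.append(
            TimeRow(
                name=m.group("name").strip(),
                usr_s=float(m.group("usr")),
                sys_s=float(m.group("sys")),
                wall_s=float(m.group("wall")),
                ggc_kb=int(m.group("ggc_kb")),
            )
        )
    return rows


def top_by(rows: List[TimeRow], field: str, n: int = 5) -> List[TimeRow]:
    """Return the n rows with the largest value of the given field."""
    return sorted(rows, key=lambda r: getattr(r, field), reverse=True)[:n]
```

For example, top_by(parse_time_report(open("max_out").read()), "ggc_kb") would list the largest GC consumers. The "phase ..." and "parser (global)" rows are aggregates and overlap with the per-pass rows, and the ggc column counts only garbage-collected allocations.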

My command line was:
time /users/egm/gcc_objdir/libexec/gcc/i686-pc-linux-gnu/5.0.0/f951 -ftime-report -O2 gener-max.f90 > max_out 2>&1

real    36m49.416s
user    35m17.508s
sys     1m29.654s


