[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

hubicka at ucw dot cz gcc-bugzilla@gcc.gnu.org
Sat Jul 22 19:30:00 GMT 2006



------- Comment #14 from hubicka at ucw dot cz  2006-07-22 19:30 -------
Subject: Re:  [4.1/4.2 regression] A file that can not be compiled in
reasonable time/space

Hi,
with the attached patch I can cure the regmove quadratic behaviour, and
the time report is not so unreasonable now:

 gnu_dev_major gnu_dev_minor gnu_dev_makedev max min f fx fy fz add addl addr sub subl subr mul mull mulr divl ipow fi
Analyzing compilation unitPerforming intraprocedural optimizations
Assembling functions:
 max min add addl addr sub subl subr mul mull mulr divl ipow fz fy fx f fi {GC 126177k -> 85112k} {GC 327625k -> 39474k}
Execution times (seconds)
 garbage collection    :   0.83 ( 0%) usr   0.00 ( 0%) sys   0.82 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.16 ( 0%) usr   0.02 ( 1%) sys   0.16 ( 0%) wall    1147 kB ( 0%) ggc
 callgraph optimization:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     533 kB ( 0%) ggc
 ipa reference         :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 ipa type escape       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.45 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall       0 kB ( 0%) ggc
 life analysis         :  21.38 ( 3%) usr   0.02 ( 1%) sys  21.39 ( 3%) wall    1120 kB ( 0%) ggc
 life info update      :   0.54 ( 0%) usr   0.00 ( 0%) sys   0.61 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.87 ( 0%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall    4266 kB ( 1%) ggc
 register scan         :   0.42 ( 0%) usr   0.00 ( 0%) sys   0.40 ( 0%) wall     150 kB ( 0%) ggc
 rebuild jump labels   :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.27 ( 0%) usr   0.06 ( 2%) sys   0.36 ( 0%) wall     471 kB ( 0%) ggc
 lexical analysis      :   0.04 ( 0%) usr   0.05 ( 2%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   0.12 ( 0%) usr   0.03 ( 1%) sys   0.17 ( 0%) wall    3207 kB ( 1%) ggc
 inline heuristics     :  15.14 ( 2%) usr   0.01 ( 0%) sys  15.26 ( 2%) wall    1486 kB ( 0%) ggc
 integration           :  21.35 ( 3%) usr   0.12 ( 4%) sys  21.71 ( 3%) wall   33445 kB ( 8%) ggc
 tree gimplify         :   0.18 ( 0%) usr   0.01 ( 0%) sys   0.19 ( 0%) wall    3341 kB ( 1%) ggc
 tree eh               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1338 kB ( 0%) ggc
 tree CFG cleanup      :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      20 kB ( 0%) ggc
 tree VRP              :   0.38 ( 0%) usr   0.01 ( 0%) sys   0.42 ( 0%) wall      11 kB ( 0%) ggc
 tree copy propagation :   0.23 ( 0%) usr   0.01 ( 0%) sys   0.28 ( 0%) wall     222 kB ( 0%) ggc
 tree store copy prop  :   0.11 ( 0%) usr   0.01 ( 0%) sys   0.14 ( 0%) wall       4 kB ( 0%) ggc
 tree find ref. vars   :   0.10 ( 0%) usr   0.01 ( 0%) sys   0.11 ( 0%) wall    8137 kB ( 2%) ggc
 tree PTA              :   1.29 ( 0%) usr   0.04 ( 1%) sys   1.36 ( 0%) wall      57 kB ( 0%) ggc
 tree alias analysis   :   1.89 ( 0%) usr   0.20 ( 7%) sys   2.10 ( 0%) wall       0 kB ( 0%) ggc
 tree PHI insertion    :   1.68 ( 0%) usr   0.01 ( 0%) sys   1.70 ( 0%) wall      18 kB ( 0%) ggc
 tree SSA rewrite      :   0.62 ( 0%) usr   0.04 ( 1%) sys   0.65 ( 0%) wall   17084 kB ( 4%) ggc
 tree SSA other        :   0.48 ( 0%) usr   0.08 ( 3%) sys   0.56 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :   1.20 ( 0%) usr   0.00 ( 0%) sys   1.24 ( 0%) wall       0 kB ( 0%) ggc
 tree operand scan     :   1.48 ( 0%) usr   0.34 (11%) sys   1.93 ( 0%) wall   15634 kB ( 4%) ggc
 dominator optimization:   1.05 ( 0%) usr   0.05 ( 2%) sys   1.05 ( 0%) wall    2698 kB ( 1%) ggc
 tree SRA              :   1.05 ( 0%) usr   0.09 ( 3%) sys   1.15 ( 0%) wall   24835 kB ( 6%) ggc
 tree STORE-CCP        :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.11 ( 0%) wall       4 kB ( 0%) ggc
 tree CCP              :   0.51 ( 0%) usr   0.02 ( 1%) sys   0.56 ( 0%) wall     154 kB ( 0%) ggc
 tree reassociation    :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 tree PRE              : 296.46 (45%) usr   0.49 (16%) sys 298.81 (45%) wall   19481 kB ( 5%) ggc
 tree FRE              :   0.96 ( 0%) usr   0.05 ( 2%) sys   1.00 ( 0%) wall    7991 kB ( 2%) ggc
 tree forward propagate:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.54 ( 0%) usr   0.00 ( 0%) sys   0.54 ( 0%) wall       0 kB ( 0%) ggc
 tree aggressive DCE   :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       8 kB ( 0%) ggc
 tree SSA uncprop      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA to normal    :  27.19 ( 4%) usr   0.01 ( 0%) sys  27.33 ( 4%) wall      22 kB ( 0%) ggc
 tree rename SSA copies:   0.15 ( 0%) usr   0.01 ( 0%) sys   0.16 ( 0%) wall       0 kB ( 0%) ggc
 dominance frontiers   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 expand                :   2.96 ( 0%) usr   0.09 ( 3%) sys   3.05 ( 0%) wall   24095 kB ( 6%) ggc
 jump                  :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 CSE                   :   1.87 ( 0%) usr   0.00 ( 0%) sys   1.88 ( 0%) wall     118 kB ( 0%) ggc
 global CSE            :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 CPROP 1               :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall    1620 kB ( 0%) ggc
 PRE                   :  21.36 ( 3%) usr   0.01 ( 0%) sys  21.41 ( 3%) wall     200 kB ( 0%) ggc
 CPROP 2               :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall     390 kB ( 0%) ggc
 bypass jumps          :   0.36 ( 0%) usr   0.00 ( 0%) sys   0.37 ( 0%) wall     389 kB ( 0%) ggc
 CSE 2                 :   1.05 ( 0%) usr   0.00 ( 0%) sys   1.07 ( 0%) wall      72 kB ( 0%) ggc
 branch prediction     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       1 kB ( 0%) ggc
 flow analysis         :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 combiner              :   0.87 ( 0%) usr   0.01 ( 0%) sys   0.88 ( 0%) wall    1745 kB ( 0%) ggc
 if-conversion         :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       3 kB ( 0%) ggc
 regmove               :  21.69 ( 3%) usr   0.02 ( 1%) sys  21.78 ( 3%) wall       2 kB ( 0%) ggc
 local alloc           :   7.60 ( 1%) usr   0.00 ( 0%) sys   7.62 ( 1%) wall    1480 kB ( 0%) ggc
 global alloc          :  16.47 ( 2%) usr   0.35 (12%) sys  16.91 ( 3%) wall   16915 kB ( 4%) ggc
 reload CSE regs       : 107.52 (16%) usr   0.15 ( 5%) sys 108.55 (16%) wall    4783 kB ( 1%) ggc
 flow 2                :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall     225 kB ( 0%) ggc
 peephole 2            :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall       0 kB ( 0%) ggc
 rename registers      :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.39 ( 0%) wall       0 kB ( 0%) ggc
 scheduling 2          :  75.09 (11%) usr   0.53 (18%) sys  76.86 (12%) wall  206227 kB (51%) ggc
 machine dep reorg     :   0.36 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%) wall       0 kB ( 0%) ggc
 reorder blocks        :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall      15 kB ( 0%) ggc
 reg stack             :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      37 kB ( 0%) ggc
 final                 :   0.66 ( 0%) usr   0.02 ( 1%) sys   0.74 ( 0%) wall    1156 kB ( 0%) ggc
 TOTAL                 : 659.57             2.99           668.06              407297 kB

PRE is somewhat slow, but I will leave this to Danny.

For scheduling the situation is quite clear - we have huge basic blocks
and produce a huge number of dependencies.  For reload, I am also not
really surprised, since the code produced is a regalloc nightmare and
reload manages to create very large bitmaps that result in quadratic
behaviour.
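
To illustrate why huge basic blocks lead to so many dependencies, here is a
toy model in Python. It is only a sketch and not GCC's scheduler code: the
(reads, writes) representation of an instruction and the helper names
count_dependencies and worst_case_block are assumptions made for this
example. In the worst case, where every instruction conflicts with every
earlier one in the block, n instructions produce n*(n-1)/2 dependency edges.

```python
def count_dependencies(block):
    """Count dependency edges in a toy basic block.

    `block` is a list of (reads, writes) pairs, each a set of resource
    names. This representation is hypothetical, not GCC's RTL.
    """
    edges = 0
    for j, (reads_j, writes_j) in enumerate(block):
        for reads_i, writes_i in block[:j]:
            true_dep = writes_i & reads_j      # read after write
            anti_dep = reads_i & writes_j      # write after read
            output_dep = writes_i & writes_j   # write after write
            if true_dep or anti_dep or output_dep:
                edges += 1
    return edges


def worst_case_block(n):
    """Build n instructions that all read and write the same resource."""
    return [({"mem"}, {"mem"}) for _ in range(n)]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, count_dependencies(worst_case_block(n)))
```

A real scheduler prunes many of these edges, so this shows only the
worst-case growth rate, not the counts GCC produces for this testcase.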

Since Danny asked for allocpools:

Alloc-pool Kind        Pools  Allocated      Peak        Leak
-------------------------------------------------------------
Value sets                18    2230608    1929200          0
Bitmap sets               18       9504       8432          0
Value set nodes           18    2032208    1768488          0
Binary tree nodes         18    1291320     783992          0
value                     48    3875872    1246744          0
et_occ pool              127     238144      48040          0
et_node pool             127     159680      36024          0
Reference tree nodes      18    1430880    1437864          0
Expression tree nodes     18     426240     428840          0
elt_list                  48    3639816     397672          0
List tree nodes           18     511488     516880          0
elt_loc_list              48   14186784     975240          0
Comparison tree nodes     18       4520       4832          0
original_copy             26         48         88          0
Constraint pool          108    4335432    1501136          0
Unary tree nodes          18         96        968          0
Variable info pool       108   12261704    4550848          0
Constraint edges         108       2112        496          0
operand entry pool        36        512        248          0
cselib_val_list           48   11627616     974144          0
-------------------------------------------------------------
Total                    994   58264584

Memory consumption is now dominated by the scheduler's dependency info:

ggc-common.c:193 (ggc_calloc)                       6303224: 1.9%    5139976:12.3%    1863696: 8.8%    1073688:21.8%        530
gimplify.c:453 (create_tmp_var_raw)                 7325032: 2.2%          0: 0.0%     889240: 4.2%          0: 0.0%      93344
genrtl.c:17 (gen_rtx_fmt_ee)                        9819384: 2.9%          0: 0.0%     138900: 0.7%          0: 0.0%     829857
tree-dfa.c:186 (create_stmt_ann)                    9970168: 2.9%     763932: 1.8%       3692: 0.0%          0: 0.0%     206496
tree-ssanames.c:147 (make_ssa_name)                 9740544: 2.9%          0: 0.0%    2373936:11.2%          0: 0.0%     252385
bitmap.c:139 (bitmap_element_allocate)             18876340: 5.6%          0: 0.0%          0: 0.0%          0: 0.0%     674155
genrtl.c:32 (gen_rtx_fmt_ue)                      193579104:57.2%          0: 0.0%          0: 0.0%          0: 0.0%   16131592
Total                                             338496482         41839722         21146495          4929007         22457179
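
The figures above can be cross-checked with a little arithmetic. The
following Python sketch assumes that the first numeric column is a byte
count and that the last column is the number of allocations; the report's
header is not included here, so treat those column meanings as assumptions.
The helper name summarize is made up for this example.

```python
# (bytes in first column, count in last column), copied from the rows above.
ROWS = {
    "genrtl.c:32 (gen_rtx_fmt_ue)": (193579104, 16131592),
    "bitmap.c:139 (bitmap_element_allocate)": (18876340, 674155),
    "genrtl.c:17 (gen_rtx_fmt_ee)": (9819384, 829857),
}
TOTAL_BYTES = 338496482  # first column of the "Total" row


def summarize(rows, total_bytes):
    """Return {site: (percent of total bytes, bytes per allocation)}."""
    summary = {}
    for site, (nbytes, count) in rows.items():
        share = 100.0 * nbytes / total_bytes
        per_alloc = nbytes / count
        summary[site] = (share, per_alloc)
    return summary


if __name__ == "__main__":
    for site, (share, per_alloc) in summarize(ROWS, TOTAL_BYTES).items():
        print(f"{site:42s} {share:5.1f}%  {per_alloc:6.2f} bytes/alloc")
```

Under those assumptions, gen_rtx_fmt_ue accounts for 57.2% of the bytes in
the first column at exactly 12 bytes per allocation, i.e. about 16 million
small objects.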

I am now looking into the -O3 compilation, which crashes in into-ssa
because of an overly large stack.

Honza


------- Comment #15 from hubicka at ucw dot cz  2006-07-22 19:30 -------
Created an attachment (id=11920)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11920&action=view)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


