This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



why 6Gb RAM not enough to compile a 14Mb source [MELT]?


Hello All,

My MELT branch (http://gcc.gnu.org/wiki/MiddleEndLispTranslator) contains a big source file, warm-basilys-0.c. It is "self"-generated, about 14 Mbytes and almost 280 KLOC (in rev136334). It ends with a big initialization routine of 100 KLOC, which mostly fills a 5000-member structure (each member being itself a small structure) and calls a few routines. This initialization routine has a simple control structure (no deeply nested blocks or loops).

gcc (either gcc-4.1, 4.2 or 4.3 from Debian, or the bootstrapped trunk rev136331) can compile this file without any optimisation, i.e. with -O0 -g3, in about 16 seconds and with less than 1 GB of RAM.

But on my 6 GB machine (Core2, 2400MHz, Debian/Sid/AMD64) the cc1 process with -O2 (either 4.2, 4.3 or the trunk) eats nearly 10 GB of virtual memory and thrashes (using 4.8 GB of RAM and 1% CPU time while waiting for swap I/O). The same happens with -O1; -Os is a bit better.

The time to run the
./built-melt-cc-script warm-basilys-0.c warm-basilys-0.so
which compiles warm-basilys-0.c with -O2 -fPIC is

(you can set the MELT_EXTRACFLAGS environment variable to pass
real    84m23.594s
user    6m23.496s
sys     1m5.032s
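
To put these figures in perspective, here is a small sketch that converts the three `time` values above into seconds and computes how much of the elapsed time the CPU was actually busy. The helper name `to_seconds` is only illustrative.

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '84m23.594s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# Values copied from the timing output above.
real = to_seconds("84m23.594s")
user = to_seconds("6m23.496s")
sys_ = to_seconds("1m5.032s")

# Fraction of the wall-clock time spent executing (user + system).
cpu_share = (user + sys_) / real
print(f"CPU busy for {cpu_share:.1%} of the wall-clock time")
```

The CPU was busy for roughly 9% of the run, so the rest of the 84 minutes was presumably spent waiting, which fits the swapping described above.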

I am attaching the -ftime-report output for information. Measured by user time, one of the most demanding passes is tree operand scan (21%).

I find this report misleading on the total memory consumption (1591718 kB, about 1.6 GB). The top command shows that cc1 needs nearly 10 GB of process address space and uses nearly 5 GB of RAM (and thrashes).

I won't be annoyed by this for long, since I'll soon split the warm-basilys.bysl file (and hence the generated files) into several distinct files. Until then, -O0 is enough for me.

Are there any specific flags to pass to gcc to lower the RAM consumption (even at the expense of generated code quality)?

Are there any pragmas to disable (or lower) optimisation of a single routine?

My intuition (and experience) is that the time and space consumption of gcc -O2 (or even -O1) is nearly quadratic in the size of the longest routine.
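
One way to check this intuition would be to generate routines of increasing size, time their compilation, and fit the growth exponent on a log-log scale. The sketch below is only an assumption about how such an experiment could look: `make_source`, `time_compile` and `fit_exponent` are hypothetical helpers, and the generated code is a much simplified stand-in for the real initialization routine.

```python
import math
import os
import subprocess
import tempfile
import time

def make_source(n_members: int) -> str:
    """Return C source with one routine that fills an n-member structure."""
    fields = "\n".join(f"  struct item m{i};" for i in range(n_members))
    stores = "\n".join(
        f"  big.m{i}.tag = {i}; big.m{i}.ptr = &big;" for i in range(n_members)
    )
    return (
        "struct item { int tag; void *ptr; };\n"
        f"struct big_s {{\n{fields}\n}} big;\n"
        f"void init_all(void)\n{{\n{stores}\n}}\n"
    )

def time_compile(n_members: int, flags=("-O2", "-fPIC")) -> float:
    """Compile a generated routine with gcc and return the elapsed seconds."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "init.c")
        with open(src, "w") as out:
            out.write(make_source(n_members))
        start = time.perf_counter()
        subprocess.run(
            ["gcc", *flags, "-c", src, "-o", os.path.join(tmp, "init.o")],
            check=True,
        )
        return time.perf_counter() - start

def fit_exponent(sizes, seconds) -> float:
    """Least-squares slope of log(seconds) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in seconds]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

A possible use is `sizes = [1000, 2000, 4000, 8000]` followed by `fit_exponent(sizes, [time_compile(n) for n in sizes])`. A result near 2 would support the quadratic guess, and a result near 1 would point to roughly linear behaviour.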

Thanks for reading.


--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***

Execution times (seconds)
 garbage collection    :   7.16 ( 2%) usr   0.45 ( 1%) sys  47.16 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction:  16.83 ( 4%) usr   0.10 ( 0%) sys  16.87 ( 0%) wall   41478 kB ( 3%) ggc
 callgraph optimization:   9.82 ( 3%) usr   0.11 ( 0%) sys   9.95 ( 0%) wall    9184 kB ( 1%) ggc
 ipa reference         :   0.25 ( 0%) usr   0.02 ( 0%) sys   0.26 ( 0%) wall      52 kB ( 0%) ggc
 ipa pure const        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup           :   2.76 ( 1%) usr   0.03 ( 0%) sys   2.91 ( 0%) wall    5120 kB ( 0%) ggc
 CFG verifier          :  11.22 ( 3%) usr   0.69 ( 1%) sys 177.08 ( 3%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.75 ( 0%) usr   0.00 ( 0%) sys   0.80 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   3.01 ( 1%) usr   0.49 ( 1%) sys  34.85 ( 1%) wall       0 kB ( 0%) ggc
 df live regs          :   3.46 ( 1%) usr   0.06 ( 0%) sys   3.57 ( 0%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   2.12 ( 1%) usr   0.00 ( 0%) sys   2.16 ( 0%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   1.61 ( 0%) usr   0.02 ( 0%) sys   1.75 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   1.07 ( 0%) usr   0.04 ( 0%) sys   1.10 ( 0%) wall   15075 kB ( 1%) ggc
 register information  :   0.51 ( 0%) usr   0.01 ( 0%) sys   0.45 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   1.05 ( 0%) usr   0.01 ( 0%) sys   0.91 ( 0%) wall   19781 kB ( 1%) ggc
 register scan         :   0.25 ( 0%) usr   0.01 ( 0%) sys   0.23 ( 0%) wall     163 kB ( 0%) ggc
 rebuild jump labels   :   0.53 ( 0%) usr   0.00 ( 0%) sys   0.53 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   1.24 ( 0%) usr   0.56 ( 1%) sys   1.93 ( 0%) wall   46597 kB ( 3%) ggc
 lexical analysis      :   0.30 ( 0%) usr   0.81 ( 1%) sys   1.29 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.70 ( 0%) usr   0.49 ( 1%) sys   2.24 ( 0%) wall  123365 kB ( 8%) ggc
 inline heuristics     :   0.63 ( 0%) usr   0.01 ( 0%) sys   0.62 ( 0%) wall    5491 kB ( 0%) ggc
 integration           :   2.11 ( 1%) usr   0.22 ( 0%) sys   2.25 ( 0%) wall  168932 kB (11%) ggc
 tree gimplify         :   1.86 ( 0%) usr   0.05 ( 0%) sys   1.78 ( 0%) wall  109046 kB ( 7%) ggc
 tree eh               :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.22 ( 0%) usr   0.01 ( 0%) sys   0.23 ( 0%) wall   69444 kB ( 4%) ggc
 tree CFG cleanup      :   3.42 ( 1%) usr   0.03 ( 0%) sys   4.15 ( 0%) wall    7307 kB ( 0%) ggc
 tree VRP              :   3.69 ( 1%) usr   0.24 ( 0%) sys  11.89 ( 0%) wall  115325 kB ( 7%) ggc
 tree copy propagation :   1.80 ( 0%) usr   0.05 ( 0%) sys   3.50 ( 0%) wall    3511 kB ( 0%) ggc
 tree find ref. vars   :   0.12 ( 0%) usr   0.01 ( 0%) sys   0.12 ( 0%) wall    9570 kB ( 1%) ggc
 tree PTA              :   2.59 ( 1%) usr   0.61 ( 1%) sys  57.50 ( 1%) wall   17158 kB ( 1%) ggc
 tree alias analysis   :   1.13 ( 0%) usr   0.33 ( 1%) sys  26.66 ( 1%) wall    2461 kB ( 0%) ggc
 tree call clobbering  :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall      10 kB ( 0%) ggc
 tree flow sensitive alias:   0.46 ( 0%) usr   0.00 ( 0%) sys   0.53 ( 0%) wall   10992 kB ( 1%) ggc
 tree flow insensitive alias:   8.41 ( 2%) usr   0.06 ( 0%) sys   8.96 ( 0%) wall       0 kB ( 0%) ggc
 tree memory partitioning:   0.38 ( 0%) usr   0.01 ( 0%) sys   0.41 ( 0%) wall     111 kB ( 0%) ggc
 tree PHI insertion    :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall     119 kB ( 0%) ggc
 tree SSA rewrite      :   1.44 ( 0%) usr   0.03 ( 0%) sys   1.46 ( 0%) wall   44376 kB ( 3%) ggc
 tree SSA other        :   0.09 ( 0%) usr   0.09 ( 0%) sys   0.27 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :   2.11 ( 1%) usr   0.14 ( 0%) sys   4.59 ( 0%) wall    4795 kB ( 0%) ggc
 tree operand scan     :  80.93 (21%) usr   0.92 ( 1%) sys  82.92 ( 2%) wall   71551 kB ( 4%) ggc
 dominator optimization:   3.97 ( 1%) usr   0.06 ( 0%) sys   3.92 ( 0%) wall   84156 kB ( 5%) ggc
 tree SRA              :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 tree STORE-CCP        :   0.47 ( 0%) usr   0.05 ( 0%) sys   0.69 ( 0%) wall     992 kB ( 0%) ggc
 tree CCP              :   0.93 ( 0%) usr   0.00 ( 0%) sys   0.94 ( 0%) wall    1205 kB ( 0%) ggc
 tree PHI const/copy prop:   0.06 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall      77 kB ( 0%) ggc
 tree split crit edges :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall   21401 kB ( 1%) ggc
 tree reassociation    :   0.43 ( 0%) usr   0.01 ( 0%) sys   0.45 ( 0%) wall     236 kB ( 0%) ggc
 tree PRE              :  13.92 ( 4%) usr  52.21 (81%) sys 4339.32 (86%) wall  109776 kB ( 7%) ggc
 tree FRE              :   4.18 ( 1%) usr   2.51 ( 4%) sys   6.69 ( 0%) wall   61570 kB ( 4%) ggc
 tree code sinking     :   0.53 ( 0%) usr   0.03 ( 0%) sys   1.54 ( 0%) wall    1578 kB ( 0%) ggc
 tree linearize phis   :   0.16 ( 0%) usr   0.01 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate:   0.36 ( 0%) usr   0.03 ( 0%) sys   0.35 ( 0%) wall    2466 kB ( 0%) ggc
 tree phiprop          :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.93 ( 0%) usr   0.01 ( 0%) sys   0.91 ( 0%) wall      20 kB ( 0%) ggc
 tree aggressive DCE   :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.35 ( 0%) usr   0.01 ( 0%) sys   0.33 ( 0%) wall     562 kB ( 0%) ggc
 PHI merge             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 loop invariant motion :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       6 kB ( 0%) ggc
 complete unrolling    :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall     316 kB ( 0%) ggc
 tree iv optimization  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       7 kB ( 0%) ggc
 tree loop init        :   0.29 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall     281 kB ( 0%) ggc
 tree loop fini        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree copy headers     :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall     524 kB ( 0%) ggc
 tree SSA uncprop      :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA to normal    :  52.85 (14%) usr   0.27 ( 0%) sys  53.12 ( 1%) wall   25180 kB ( 2%) ggc
 tree rename SSA copies:   0.22 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier     :  21.08 ( 6%) usr   0.19 ( 0%) sys  21.67 ( 0%) wall    4603 kB ( 0%) ggc
 tree STMT verifier    :  47.77 (12%) usr   1.47 ( 2%) sys  49.16 ( 1%) wall       0 kB ( 0%) ggc
 callgraph verifier    :   0.86 ( 0%) usr   0.00 ( 0%) sys   0.93 ( 0%) wall    2891 kB ( 0%) ggc
 dominance frontiers   :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation :   3.59 ( 1%) usr   0.04 ( 0%) sys   3.55 ( 0%) wall       0 kB ( 0%) ggc
 control dependences   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 expand                :  11.91 ( 3%) usr   0.31 ( 0%) sys  21.34 ( 0%) wall  172552 kB (11%) ggc
 lower subreg          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 jump                  :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 forward prop          :   0.71 ( 0%) usr   0.01 ( 0%) sys   0.87 ( 0%) wall   18126 kB ( 1%) ggc
 CSE                   :   4.33 ( 1%) usr   0.03 ( 0%) sys   4.51 ( 0%) wall    7344 kB ( 0%) ggc
 dead code elimination :   0.63 ( 0%) usr   0.00 ( 0%) sys   0.58 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1      :   1.24 ( 0%) usr   0.00 ( 0%) sys   1.27 ( 0%) wall   14629 kB ( 1%) ggc
 dead store elim2      :   0.65 ( 0%) usr   0.01 ( 0%) sys   0.65 ( 0%) wall   11488 kB ( 1%) ggc
 loop analysis         :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.21 ( 0%) wall     278 kB ( 0%) ggc
 global CSE            :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 CPROP 1               :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall    4114 kB ( 0%) ggc
 PRE                   :   0.34 ( 0%) usr   0.00 ( 0%) sys   0.46 ( 0%) wall    3000 kB ( 0%) ggc
 CPROP 2               :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall    3110 kB ( 0%) ggc
 bypass jumps          :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.24 ( 0%) wall    2539 kB ( 0%) ggc
 CSE 2                 :   4.29 ( 1%) usr   0.02 ( 0%) sys   4.21 ( 0%) wall    5306 kB ( 0%) ggc
 branch prediction     :   0.66 ( 0%) usr   0.01 ( 0%) sys   0.67 ( 0%) wall    3048 kB ( 0%) ggc
 combiner              :   1.60 ( 0%) usr   0.01 ( 0%) sys   1.72 ( 0%) wall   22097 kB ( 1%) ggc
 if-conversion         :   0.70 ( 0%) usr   0.01 ( 0%) sys   0.78 ( 0%) wall     456 kB ( 0%) ggc
 regmove               :   0.91 ( 0%) usr   0.01 ( 0%) sys   0.87 ( 0%) wall     118 kB ( 0%) ggc
 local alloc           :   4.45 ( 1%) usr   0.01 ( 0%) sys   4.49 ( 0%) wall   11555 kB ( 1%) ggc
 global alloc          :   9.35 ( 2%) usr   0.03 ( 0%) sys   9.42 ( 0%) wall   37993 kB ( 2%) ggc
 reload CSE regs       :   1.83 ( 0%) usr   0.02 ( 0%) sys   1.90 ( 0%) wall   30852 kB ( 2%) ggc
 thread pro- & epilogue:   0.24 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall    1494 kB ( 0%) ggc
 if-conversion 2       :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall     143 kB ( 0%) ggc
 peephole 2            :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 0%) wall    2505 kB ( 0%) ggc
 rename registers      :   0.93 ( 0%) usr   0.00 ( 0%) sys   0.94 ( 0%) wall      93 kB ( 0%) ggc
 scheduling 2          :   2.72 ( 1%) usr   0.01 ( 0%) sys   2.75 ( 0%) wall    1617 kB ( 0%) ggc
 machine dep reorg     :   0.34 ( 0%) usr   0.00 ( 0%) sys   0.40 ( 0%) wall     385 kB ( 0%) ggc
 reorder blocks        :   0.72 ( 0%) usr   0.00 ( 0%) sys   0.66 ( 0%) wall    6485 kB ( 0%) ggc
 final                 :   1.07 ( 0%) usr   0.02 ( 0%) sys   1.16 ( 0%) wall    8151 kB ( 1%) ggc
 symout                :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall    2181 kB ( 0%) ggc
 tree if-combine       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 382.44            64.16          5061.26            1591718 kB
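
A report of this size is easier to read once the passes are ranked. The following sketch parses rows in the format shown above and sorts them by wall-clock time. Only four rows are embedded here as a sample, and the regular expression is an assumption based on this particular report, not on any documented format.

```python
import re

# One report row: name, then usr / sys / wall seconds, then ggc kilobytes.
# Whitespace is optional everywhere because wide values can fuse columns.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>[\d.]+)\s*\(\s*\d+%\)\s*usr"
    r"\s*(?P<sys>[\d.]+)\s*\(\s*\d+%\)\s*sys"
    r"\s*(?P<wall>[\d.]+)\s*\(\s*\d+%\)\s*wall"
    r"\s*(?P<ggc>\d+)\s*kB"
)

# Four rows copied from the report above.
SAMPLE = """\
 CFG verifier          :  11.22 ( 3%) usr   0.69 ( 1%) sys 177.08 ( 3%) wall       0 kB ( 0%) ggc
 tree operand scan     :  80.93 (21%) usr   0.92 ( 1%) sys  82.92 ( 2%) wall   71551 kB ( 4%) ggc
 tree PRE              :  13.92 ( 4%) usr  52.21 (81%) sys4339.32 (86%) wall  109776 kB ( 7%) ggc
 tree SSA to normal    :  52.85 (14%) usr   0.27 ( 0%) sys  53.12 ( 1%) wall   25180 kB ( 2%) ggc
"""

TOTAL_WALL = 5061.26  # wall-clock total from the TOTAL row of the report

def rank_by_wall(report: str):
    """Return (name, usr, sys, wall) tuples, slowest wall time first."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m:  # lines that do not look like a pass row (e.g. TOTAL) are skipped
            rows.append(
                (m["name"], float(m["usr"]), float(m["sys"]), float(m["wall"]))
            )
    return sorted(rows, key=lambda r: r[3], reverse=True)

ranked = rank_by_wall(SAMPLE)
for name, usr, sys_, wall in ranked:
    print(f"{name:22s} usr {usr:8.2f}  sys {sys_:7.2f}  wall {wall:8.2f}  "
          f"({wall / TOTAL_WALL:.0%} of total wall)")
```

Ranked this way, tree PRE stands out: it uses little user time but accounts for most of the wall-clock and system time, which would be consistent with that pass being the point where the process starts swapping.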
