This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: better wpa [1/n]: merge types during read-in

From: Jan Hubicka <hubicka at ucw dot cz>
To: Richard Guenther <richard dot guenther at gmail dot com>
Cc: Michael Matz <matz at suse dot de>, gcc-patches at gcc dot gnu dot org
Date: Fri, 22 Apr 2011 13:58:50 +0200
Subject: Re: better wpa [1/n]: merge types during read-in
References: <Pine.LNX.4.64.1104192124520.1989@wotan.suse.de> <BANLkTi=5eze-ZVy+B94hEG-hQB1DyRx-0g@mail.gmail.com> <Pine.LNX.4.64.1104201421330.1989@wotan.suse.de> <Pine.LNX.4.64.1104211544500.1989@wotan.suse.de> <BANLkTimJ3FApWeQ-Mu-AvPVDFk1vfT0rDw@mail.gmail.com>

Hi,
I ran the patch on Mozilla.  Without the patch the timings are:
Execution times (seconds)
 garbage collection    :  20.19 ( 3%) usr   0.02 ( 0%) sys  20.22 ( 3%) wall       0 kB ( 0%) ggc
 callgraph optimization:   3.53 ( 1%) usr   0.01 ( 0%) sys   3.53 ( 1%) wall   15248 kB ( 1%) ggc
 varpool construction  :   0.77 ( 0%) usr   0.02 ( 0%) sys   0.80 ( 0%) wall   51607 kB ( 4%) ggc
 ipa cp                :   2.12 ( 0%) usr   0.10 ( 1%) sys   2.23 ( 0%) wall  119701 kB (10%) ggc
 ipa lto gimple in     :   0.07 ( 0%) usr   0.02 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out    :  11.63 ( 2%) usr   1.01 ( 8%) sys  12.63 ( 2%) wall       0 kB ( 0%) ggc
 ipa lto decl in       : 182.15 (28%) usr   4.06 (32%) sys 188.10 (28%) wall  392863 kB (31%) ggc
 ipa lto decl out      : 149.86 (23%) usr   0.32 ( 3%) sys 150.25 (22%) wall       0 kB ( 0%) ggc
 ipa lto decl init I/O :   0.14 ( 0%) usr   0.03 ( 0%) sys   0.16 ( 0%) wall      31 kB ( 0%) ggc
 ipa lto cgraph I/O    :   2.09 ( 0%) usr   0.27 ( 2%) sys   2.37 ( 0%) wall  428623 kB (34%) ggc
 ipa lto decl merge    : 219.70 (33%) usr   1.93 (15%) sys 221.75 (33%) wall  162687 kB (13%) ggc
 ipa lto cgraph merge  :   2.68 ( 0%) usr   0.00 ( 0%) sys   2.69 ( 0%) wall   15895 kB ( 1%) ggc
 whopr wpa             :   1.65 ( 0%) usr   0.04 ( 0%) sys   1.71 ( 0%) wall       1 kB ( 0%) ggc
 whopr wpa I/O         :   2.20 ( 0%) usr   4.55 (36%) sys   7.20 ( 1%) wall       0 kB ( 0%) ggc
 ipa reference         :   4.12 ( 1%) usr   0.00 ( 0%) sys   4.09 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile           :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   3.15 ( 0%) usr   0.04 ( 0%) sys   3.19 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.56 ( 0%) usr   0.00 ( 0%) sys   1.56 ( 0%) wall   37684 kB ( 3%) ggc
 inline heuristics     :  47.26 ( 7%) usr   0.05 ( 0%) sys  47.33 ( 7%) wall   21988 kB ( 2%) ggc
 callgraph verifier    :   0.42 ( 0%) usr   0.04 ( 0%) sys   0.47 ( 0%) wall       0 kB ( 0%) ggc
 varconst              :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo      :   1.19 ( 0%) usr   0.00 ( 0%) sys   1.17 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 657.07            12.64           672.26            1247550 kB

Note that the total GGC use in this report seems obviously wrong.  The peak
GGC report reads: {GC 4079042k -> 4043085k}

With the patch:
Execution times (seconds)
 garbage collection    :  13.85 ( 3%) usr   0.02 ( 0%) sys  13.88 ( 3%) wall       0 kB ( 0%) ggc
 callgraph optimization:   2.40 ( 0%) usr   0.00 ( 0%) sys   2.40 ( 0%) wall   15248 kB ( 1%) ggc
 varpool construction  :   0.69 ( 0%) usr   0.03 ( 0%) sys   0.71 ( 0%) wall   51621 kB ( 4%) ggc
 ipa cp                :   1.86 ( 0%) usr   0.11 ( 1%) sys   1.97 ( 0%) wall  119697 kB ( 9%) ggc
 ipa lto gimple in     :   0.04 ( 0%) usr   0.02 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out    :  11.86 ( 2%) usr   0.92 ( 9%) sys  12.80 ( 2%) wall       0 kB ( 0%) ggc
 ipa lto decl in       : 287.52 (54%) usr   3.49 (35%) sys 291.13 (54%) wall  713694 kB (51%) ggc
 ipa lto decl out      : 127.76 (24%) usr   0.94 ( 9%) sys 128.79 (24%) wall       0 kB ( 0%) ggc
 ipa lto decl init I/O :   0.13 ( 0%) usr   0.02 ( 0%) sys   0.15 ( 0%) wall      31 kB ( 0%) ggc
 ipa lto cgraph I/O    :   1.66 ( 0%) usr   0.29 ( 3%) sys   1.94 ( 0%) wall  428623 kB (30%) ggc
 ipa lto decl merge    :  18.12 ( 3%) usr   0.13 ( 1%) sys  18.26 ( 3%) wall     978 kB ( 0%) ggc
 ipa lto cgraph merge  :   1.90 ( 0%) usr   0.00 ( 0%) sys   1.91 ( 0%) wall   15143 kB ( 1%) ggc
 whopr wpa             :   1.99 ( 0%) usr   0.05 ( 0%) sys   2.01 ( 0%) wall       1 kB ( 0%) ggc
 whopr wpa I/O         :   2.40 ( 0%) usr   3.77 (38%) sys   6.47 ( 1%) wall       0 kB ( 0%) ggc
 ipa reference         :   4.56 ( 1%) usr   0.00 ( 0%) sys   4.58 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile           :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   3.33 ( 1%) usr   0.03 ( 0%) sys   3.36 ( 1%) wall       0 kB ( 0%) ggc
 parser                :   1.85 ( 0%) usr   0.03 ( 0%) sys   1.87 ( 0%) wall   37684 kB ( 3%) ggc
 inline heuristics     :  47.34 ( 9%) usr   0.04 ( 0%) sys  47.42 ( 9%) wall   21988 kB ( 2%) ggc
 tree CFG construction :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       1 kB ( 0%) ggc
 callgraph verifier    :   0.45 ( 0%) usr   0.05 ( 0%) sys   0.55 ( 0%) wall       0 kB ( 0%) ggc
 varconst              :   0.00 ( 0%) usr   0.03 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo      :   1.38 ( 0%) usr   0.00 ( 0%) sys   1.37 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 531.66            10.05           542.31            1405930 kB

Peak memory use with the patch is 2688637k -> 2136908k.  So that is about a 50%
improvement in GGC memory (we need about another 4G for non-GGC memory,
probably largely in the mmap pool) and a 23% improvement in compile time.
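
The arithmetic behind these percentages can be rechecked with a short sketch.
All figures are copied from the two reports above.  Which pair of numbers the
50% and 23% were derived from is an assumption here, so the script prints a few
plausible readings; the helper names reduction and speedup are made up for
illustration, and reading the GC line as before -> after one collection is
also an assumption.

```python
# Figures copied from the two timing reports above.
# Times are "usr" seconds; memory is the peak "{GC before -> after}" line in kB.
without_patch = {"total": 657.07, "decl_in": 182.15, "decl_merge": 219.70,
                 "gc_before": 4079042, "gc_after": 4043085}
with_patch = {"total": 531.66, "decl_in": 287.52, "decl_merge": 18.12,
              "gc_before": 2688637, "gc_after": 2136908}


def reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


def speedup(before, after):
    """Fraction by which the old time exceeds the new time."""
    return before / after - 1.0


time_reduction = reduction(without_patch["total"], with_patch["total"])
time_speedup = speedup(without_patch["total"], with_patch["total"])
ggc_reduction_before_gc = reduction(without_patch["gc_before"], with_patch["gc_before"])
ggc_reduction_after_gc = reduction(without_patch["gc_after"], with_patch["gc_after"])

# The patch moves type merging into read-in, so compare the two phases together.
read_plus_merge_old = without_patch["decl_in"] + without_patch["decl_merge"]
read_plus_merge_new = with_patch["decl_in"] + with_patch["decl_merge"]

print(f"total usr time reduction : {time_reduction:.1%}")
print(f"total usr time speedup   : {time_speedup:.1%}")
print(f"peak GGC, before-GC side : {ggc_reduction_before_gc:.1%}")
print(f"peak GGC, after-GC side  : {ggc_reduction_after_gc:.1%}")
print(f"decl in + decl merge     : {read_plus_merge_old:.2f}s -> {read_plus_merge_new:.2f}s")
```

Read this way, the 23% figure lines up with the speedup ratio of the two TOTAL
rows, and the roughly 50% figure is closest to the after-collection side of the
peak GGC lines.  The last line shows that "decl in" gets slower because it now
absorbs most of the work that "decl merge" used to do.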

So great job!  And as a note to myself, the inliner facelifting made the
inliner 3.5 times slower here.  This is obviously because of recomputing
badness.  I do have a plan for this.

Note that this is a non-debugging build.  We are still way above my original
results from the GCC Summit paper, which were:
 TOTAL                 : 186.41             8.27           195.10           3491946 kB
I think most of the slowdown was caused by making free-lang-data not free
stuff that might make dwarf2out ICE.  In those results, decl in was 48s,
decl merge 45s, decl out 48s, and the inliner 15s.

Honza
