This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: better wpa [1/n]: merge types during read-in


On Fri, Apr 22, 2011 at 1:58 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> I ran the patch on Mozilla.  Without the patch it is:
> Execution times (seconds)
>  garbage collection    :  20.19 ( 3%) usr   0.02 ( 0%) sys  20.22 ( 3%) wall       0 kB ( 0%) ggc
>  callgraph optimization:   3.53 ( 1%) usr   0.01 ( 0%) sys   3.53 ( 1%) wall   15248 kB ( 1%) ggc
>  varpool construction  :   0.77 ( 0%) usr   0.02 ( 0%) sys   0.80 ( 0%) wall   51607 kB ( 4%) ggc
>  ipa cp                :   2.12 ( 0%) usr   0.10 ( 1%) sys   2.23 ( 0%) wall  119701 kB (10%) ggc
>  ipa lto gimple in     :   0.07 ( 0%) usr   0.02 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
>  ipa lto gimple out    :  11.63 ( 2%) usr   1.01 ( 8%) sys  12.63 ( 2%) wall       0 kB ( 0%) ggc
>  ipa lto decl in       : 182.15 (28%) usr   4.06 (32%) sys 188.10 (28%) wall  392863 kB (31%) ggc
>  ipa lto decl out      : 149.86 (23%) usr   0.32 ( 3%) sys 150.25 (22%) wall       0 kB ( 0%) ggc
>  ipa lto decl init I/O :   0.14 ( 0%) usr   0.03 ( 0%) sys   0.16 ( 0%) wall      31 kB ( 0%) ggc
>  ipa lto cgraph I/O    :   2.09 ( 0%) usr   0.27 ( 2%) sys   2.37 ( 0%) wall  428623 kB (34%) ggc
>  ipa lto decl merge    : 219.70 (33%) usr   1.93 (15%) sys 221.75 (33%) wall  162687 kB (13%) ggc
>  ipa lto cgraph merge  :   2.68 ( 0%) usr   0.00 ( 0%) sys   2.69 ( 0%) wall   15895 kB ( 1%) ggc
>  whopr wpa             :   1.65 ( 0%) usr   0.04 ( 0%) sys   1.71 ( 0%) wall       1 kB ( 0%) ggc
>  whopr wpa I/O         :   2.20 ( 0%) usr   4.55 (36%) sys   7.20 ( 1%) wall       0 kB ( 0%) ggc
>  ipa reference         :   4.12 ( 1%) usr   0.00 ( 0%) sys   4.09 ( 1%) wall       0 kB ( 0%) ggc
>  ipa profile           :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall       0 kB ( 0%) ggc
>  ipa pure const        :   3.15 ( 0%) usr   0.04 ( 0%) sys   3.19 ( 0%) wall       0 kB ( 0%) ggc
>  parser                :   1.56 ( 0%) usr   0.00 ( 0%) sys   1.56 ( 0%) wall   37684 kB ( 3%) ggc
>  inline heuristics     :  47.26 ( 7%) usr   0.05 ( 0%) sys  47.33 ( 7%) wall   21988 kB ( 2%) ggc
>  callgraph verifier    :   0.42 ( 0%) usr   0.04 ( 0%) sys   0.47 ( 0%) wall       0 kB ( 0%) ggc
>  varconst              :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
>  unaccounted todo      :   1.19 ( 0%) usr   0.00 ( 0%) sys   1.17 ( 0%) wall       0 kB ( 0%) ggc
>  TOTAL                 : 657.07            12.64           672.26            1247550 kB
>
> Note that the total GGC use seems obviously wrong.  The peak GGC report reads:
> {GC 4079042k -> 4043085k}
>
> with the patch
> Execution times (seconds)
>  garbage collection    :  13.85 ( 3%) usr   0.02 ( 0%) sys  13.88 ( 3%) wall       0 kB ( 0%) ggc
>  callgraph optimization:   2.40 ( 0%) usr   0.00 ( 0%) sys   2.40 ( 0%) wall   15248 kB ( 1%) ggc
>  varpool construction  :   0.69 ( 0%) usr   0.03 ( 0%) sys   0.71 ( 0%) wall   51621 kB ( 4%) ggc
>  ipa cp                :   1.86 ( 0%) usr   0.11 ( 1%) sys   1.97 ( 0%) wall  119697 kB ( 9%) ggc
>  ipa lto gimple in     :   0.04 ( 0%) usr   0.02 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
>  ipa lto gimple out    :  11.86 ( 2%) usr   0.92 ( 9%) sys  12.80 ( 2%) wall       0 kB ( 0%) ggc
>  ipa lto decl in       : 287.52 (54%) usr   3.49 (35%) sys 291.13 (54%) wall  713694 kB (51%) ggc
>  ipa lto decl out      : 127.76 (24%) usr   0.94 ( 9%) sys 128.79 (24%) wall       0 kB ( 0%) ggc
>  ipa lto decl init I/O :   0.13 ( 0%) usr   0.02 ( 0%) sys   0.15 ( 0%) wall      31 kB ( 0%) ggc
>  ipa lto cgraph I/O    :   1.66 ( 0%) usr   0.29 ( 3%) sys   1.94 ( 0%) wall  428623 kB (30%) ggc
>  ipa lto decl merge    :  18.12 ( 3%) usr   0.13 ( 1%) sys  18.26 ( 3%) wall     978 kB ( 0%) ggc
>  ipa lto cgraph merge  :   1.90 ( 0%) usr   0.00 ( 0%) sys   1.91 ( 0%) wall   15143 kB ( 1%) ggc
>  whopr wpa             :   1.99 ( 0%) usr   0.05 ( 0%) sys   2.01 ( 0%) wall       1 kB ( 0%) ggc
>  whopr wpa I/O         :   2.40 ( 0%) usr   3.77 (38%) sys   6.47 ( 1%) wall       0 kB ( 0%) ggc
>  ipa reference         :   4.56 ( 1%) usr   0.00 ( 0%) sys   4.58 ( 1%) wall       0 kB ( 0%) ggc
>  ipa profile           :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
>  ipa pure const        :   3.33 ( 1%) usr   0.03 ( 0%) sys   3.36 ( 1%) wall       0 kB ( 0%) ggc
>  parser                :   1.85 ( 0%) usr   0.03 ( 0%) sys   1.87 ( 0%) wall   37684 kB ( 3%) ggc
>  inline heuristics     :  47.34 ( 9%) usr   0.04 ( 0%) sys  47.42 ( 9%) wall   21988 kB ( 2%) ggc
>  tree CFG construction :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       1 kB ( 0%) ggc
>  callgraph verifier    :   0.45 ( 0%) usr   0.05 ( 0%) sys   0.55 ( 0%) wall       0 kB ( 0%) ggc
>  varconst              :   0.00 ( 0%) usr   0.03 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
>  unaccounted todo      :   1.38 ( 0%) usr   0.00 ( 0%) sys   1.37 ( 0%) wall       0 kB ( 0%) ggc
>  TOTAL                 : 531.66            10.05           542.31            1405930 kB
>
> and peak memory use 2688637k -> 2136908k.  So 50% GGC memory (we need about
> another 4G for non-GGC memory, probably largely in the mmap pool) and 23%
> compile time improvements.
>
> So great job! And as a note to myself, the inliner facelifting made it 3.5 times
> slower here.  It is obviously because of recomputing badness.  I do have a plan
> for this.
>
> Note that this is a non-debugging build.  We are still way above my original
> results from the GCC Summit paper, which were
>  TOTAL                 : 186.41             8.27           195.10           3491946 kB
> I think most of the slowdown was caused by making free-lang-data not free stuff
> that might make dwarf2out ICE.  DECL in was 48s, merge 45s, decl out 48s,
> inliner 15s.
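
[Editorial note: the 50% and 23% figures quoted above appear to be before/after
ratios rather than fractional reductions. The sketch below redoes the arithmetic
from the two TOTAL rows and the two peak GC reports. It assumes the first number
of each "{GC a -> b}" report is the peak to compare; the helper names are
illustrative only and do not come from GCC.]

```python
# Totals copied from the two time reports quoted above.
before = {"usr": 657.07, "wall": 672.26, "peak_gc_kb": 4079042}
after = {"usr": 531.66, "wall": 542.31, "peak_gc_kb": 2688637}


def ratio_gain_percent(old, new):
    """How much larger the old value is than the new one, in percent."""
    return (old / new - 1.0) * 100.0


def reduction_percent(old, new):
    """How much the value shrank relative to the old one, in percent."""
    return (old - new) / old * 100.0


if __name__ == "__main__":
    for key in ("usr", "wall", "peak_gc_kb"):
        gain = ratio_gain_percent(before[key], after[key])
        cut = reduction_percent(before[key], after[key])
        print(f"{key:>10}: old/new - 1 = {gain:5.1f}%   reduction = {cut:5.1f}%")
```

[Read as old/new ratios, the user time and peak GC figures come out near the
quoted 23% and 50%; read as reductions relative to the old value, they are
closer to 19% and 34%.]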

Yes, that's very likely.  If we get around to redoing the LTO option-saving
code, we might want to forbid combining a -g0 compile with a -g link (dropping
-g at link time as soon as we see a single module compiled with -g0).  Then we
can free some more stuff, at least with -g0, though I'm not sure -g0 matters
in practice.

Maybe we can shift numbers back from ipa lto decl in to ipa lto decl merge
by some timevar adjustments?

Richard.

> Honza
>

