compiling kdebindings hangs after a while.
the attached testcase (680k) takes a very long time to compile.

abuild@tangelo:~> /usr/bin/time ./install_gcc41-1-O1/libexec/gcc/powerpc-unknown-linux-gnu/4.1.0/cc1plus -fpreprocessed /tmp/sipqtpart0.ii -quiet -dumpbase sipqtpart0.cpp -auxbase-strip sipqtpart0.o -O2 -O2 -Wall -Wall -Wall -W -version -fmessage-length=0 -fPIC -fmessage-length=0 -o sipqtpart0.s -O2 -v
ignoring nonexistent directory "/home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../powerpc-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../include/c++/4.1.0
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../include/c++/4.1.0/powerpc-unknown-linux-gnu
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../include/c++/4.1.0/backward
 /usr/local/include
 /home/abuild/install_gcc41-1-O1/include
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/include
 /usr/include
End of search list.
GNU C++ version 4.1.0 20050429 (experimental) (powerpc-unknown-linux-gnu)
        compiled by GNU C version 4.1.0 20050429 (experimental).
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
/usr/lib/qt3/include/qnetworkprotocol.h:58: warning: 'class QNetworkProtocolFactoryBase' has virtual functions but non-virtual destructor
/usr/lib/qt3/include/qtooltip.h:86: warning: 'class QToolTip' has virtual functions but non-virtual destructor
/usr/lib/qt3/include/qfiledialog.h:78: warning: 'class QFilePreview' has virtual functions but non-virtual destructor
16616.21user 2.85system 4:37:01elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+275763minor)pagefaults 0swaps

gcc 4.0 -O0 == 0:02:15 h:mm:ss
gcc 4.0 -O1 == 1:06:00
gcc 4.0 -O2 == 4:05:00
gcc 4.1 -O2 == 4:37:00

This is on a 1.5GHz POWER5.
./install_gcc40-1-O1/bin/g++ -v
Using built-in specs.
Target: powerpc-unknown-linux-gnu
Configured with: /home/abuild/src/gcc-4_0-branch/configure --prefix=/home/abuild/install_gcc40-1-O1 --enable-threads=posix --enable-languages=c,c++ --enable-checking --with-system-zlib --enable-shared --enable-__cxa_atexit --disable-nls
Thread model: posix
gcc version 4.0.1 20050429 (prerelease)

I'm trying mainline with --disable-checking at the moment.
Created attachment 8772
sipqtpart0.ii.bz2
In a profile on ppc-darwin at -O0, we see that a lot (10% or so) of the time is spent in reload or walk_tree.
The profile at -O2 says that may_alias is taking 50% of the time, and this is with "4.1.0 20050323".
Yup, compute_flow_insensitive_aliasing is taking forever on these files (I stopped it at >2 hours for the TV_ALIAS_ANALYSIS timevar).
Probably another reason we shouldn't compute aliasing 5 times :)
(In reply to comment #4)
> Yup, compute_flow_insensitive_aliasing is taking forever on these files (I
> stopped it at >2 hours for the TV_ALIAS_ANALYSIS timevar)
> Probably another reason we shouldn't compute aliasing 5 times :)

But two hours/5 is still high.
gcc 4.1 with --disable-checking took 3:28:00 h:mm:ss
gcc-3_4-branch takes only 5 minutes to complete.
This small testcase exhibits similar behaviour (though profile says most of time spent in SSA verifier):

-------------------------------------------------------------------------------
#define A0(a) a,
#define A1(a) A0(a##0) A0(a##1) A0(a##2) A0(a##3) A0(a##4) A0(a##5) A0(a##6)
#define A2(a) A1(a##0) A1(a##1) A1(a##2) A1(a##3) A1(a##4) A1(a##5) A1(a##6)
#define A3(a) A2(a##0) A2(a##1) A2(a##2) A2(a##3) A2(a##4) A2(a##5) A2(a##6)
#define A4(a) A3(a##0) A3(a##1) A3(a##2) A3(a##3) A3(a##4) A3(a##5) A3(a##6)
#define A5(a) A4(a##0) A4(a##1) A4(a##2) A4(a##3) A4(a##4) A4(a##5) A4(a##6)

#define F0(a) int a (void) { bar (table); }
#define F1(a) F0(a##0) F0(a##1) F0(a##2) F0(a##3) F0(a##4) F0(a##5) F0(a##6)
#define F2(a) F1(a##0) F1(a##1) F1(a##2) F1(a##3) F1(a##4) F1(a##5) F1(a##6)
#define F3(a) F2(a##0) F2(a##1) F2(a##2) F2(a##3) F2(a##4) F2(a##5) F2(a##6)
//#define F4(a) F3(a##0) F3(a##1) F3(a##2) F3(a##3) F3(a##4) F3(a##5) F3(a##6)
//#define F5(a) F4(a##0) F4(a##1) F4(a##2) F4(a##3) F4(a##4) F4(a##5) F4(a##6)

int A5(j) *table [] = { A5(&j) 0 };
void bar (int **);

F3(f);
-------------------------------------------------------------------------------
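For a sense of scale: each macro level in the testcase above expands the level below it seven times, so A5(j) declares 7^5 = 16807 ints (and table gets that many initializer entries plus the terminating 0), while F3(f) defines 7^3 = 343 functions that each pass table to bar. If every one of those functions ends up pulling in everything named in table's initializer, that is 343 * 16807 = 5764801 variable references. The helper below is only an illustration of that arithmetic; the name tower_expansions is made up and does not come from the testcase or from GCC.

```c
#include <assert.h>

/* Number of leaf expansions produced by a macro tower in which each
   level expands the level below it `fanout` times, `depth` levels deep.
   For the testcase: A5 is fanout 7, depth 5; F3 is fanout 7, depth 3.
   (Illustrative helper, not part of the testcase.) */
unsigned long tower_expansions(unsigned fanout, unsigned depth)
{
    unsigned long n = 1;

    assert(fanout > 0);
    for (unsigned i = 0; i < depth; i++)
        n *= fanout;
    return n;
}
```

tower_expansions(7, 5) gives the number of j variables and tower_expansions(7, 3) the number of f functions.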
Subject: Re: [4.0/4.1 regression] very long compile times with large cpp file from kdebindings

On May 1, 2005, at 11:33 PM, belyshev at depni dot sinp dot msu dot ru wrote:

> (though profile says most of time spent in SSA verifier):
>

Did you forget to configure with --disable-checking :).

-- Pinski
with CFLAGS=-O2 on ppc and --disable-checking:

==> 344.log <==
268.28user 0.82system 4:29.15elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+104891minor)pagefaults 0swaps

==> 401.log <==
9658.50user 6.57system 2:41:06elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (10major+197022minor)pagefaults 0swaps

==> 410.log <==
12455.82user 10.93system 3:27:49elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (5major+215621minor)pagefaults 0swaps

on i686-linux, 3GHz xeon:

GNU C++ version 3.4.4 20050430 (prerelease) (i686-pc-linux-gnu)
205.85user 1.89system 3:29.32elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (3major+95172minor)pagefaults 0swaps

GNU C++ version 4.0.1 20050429 (prerelease) (i686-pc-linux-gnu)
6245.58user 4.21system 1:44:21elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (74major+309929minor)pagefaults 0swaps

GNU C++ version 4.1.0 20050429 (experimental) (i686-pc-linux-gnu)
6409.50user 10.53system 1:51:28elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (4867major+351033minor)pagefaults 0swaps
For -O0 on the mainline on powerpc-darwin, we have:

 parser                :  17.61 (14%) usr  11.85 (22%) sys  31.13 (16%) wall  342512 kB (29%) ggc
 name lookup           :  14.05 (11%) usr  22.61 (42%) sys  37.95 (20%) wall   18474 kB ( 2%) ggc
 expand                :  19.81 (16%) usr   2.91 ( 5%) sys  23.61 (12%) wall  371057 kB (32%) ggc
 global alloc          :  17.84 (15%) usr   0.71 ( 1%) sys  20.29 (11%) wall  108491 kB ( 9%) ggc
 final                 :  10.47 ( 9%) usr   3.06 ( 6%) sys  16.27 ( 8%) wall   30424 kB ( 3%) ggc
 tree gimplify         :   4.29 ( 4%) usr   0.52 ( 1%) sys   5.12 ( 3%) wall   73535 kB ( 6%) ggc

This is also a memory hog.

This has a different pattern for OVL than PR 8361 and PR 12850: 4.7 or so.
Most of the time is spent checking for duplicates (so that they are not added twice) in tree-ssa-alias.c:1625-1627. Again, maybe a hash table or some flag marking a variable as already aliased would help.
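To make the suggestion concrete, here is a minimal sketch in C of the two approaches: scanning the existing members on every insert (quadratic overall) versus keeping a "seen" marker per variable so the duplicate check is constant time. This is not the code in tree-ssa-alias.c; struct alias_set and all function names here are hypothetical, and it assumes variables can be numbered 0..universe-1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a may-alias set; not GCC's data structure. */
struct alias_set {
    unsigned *ids;        /* members, in insertion order */
    size_t count;         /* number of members */
    unsigned char *seen;  /* seen[id] != 0 iff id is already a member */
    size_t universe;      /* valid ids are 0 .. universe-1 */
};

/* The slow check: walk every existing member.  Doing this on each
   insert costs O(n) per insert, O(n^2) for n inserts. */
bool contains_linear(const unsigned *ids, size_t count, unsigned id)
{
    for (size_t i = 0; i < count; i++)
        if (ids[i] == id)
            return true;
    return false;
}

/* Allocate storage for a set over `universe` ids.  Returns false on
   allocation failure. */
bool alias_set_init(struct alias_set *s, size_t universe)
{
    assert(universe > 0);
    s->ids = malloc(universe * sizeof *s->ids);
    s->seen = calloc(universe, sizeof *s->seen);
    s->count = 0;
    s->universe = universe;
    if (!s->ids || !s->seen) {
        free(s->ids);
        free(s->seen);
        s->ids = NULL;
        s->seen = NULL;
        return false;
    }
    return true;
}

/* The fast check: consult the marker instead of scanning.
   Returns true if id was newly added, false if it was a duplicate. */
bool alias_set_add(struct alias_set *s, unsigned id)
{
    assert(id < s->universe);
    if (s->seen[id])
        return false;
    s->seen[id] = 1;
    s->ids[s->count++] = id;  /* cannot overflow: each id is added once */
    return true;
}

void alias_set_free(struct alias_set *s)
{
    free(s->ids);
    free(s->seen);
    s->ids = NULL;
    s->seen = NULL;
    s->count = 0;
}
```

The marker array trades memory proportional to the number of variables for an O(1) membership test; a hash table would make the same trade when ids are sparse.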
Note -O0 compile time is faster in 4.0 than in 3.4.
I have a patch which I am testing which should fix this by reducing the number of referenced variables, which in turn reduces virtual operands.
Now we get the following -ftime-report for -O2 on powerpc-darwin with cc1plus compiled with -O0 and with checking still enabled, a 40x decrease:

 garbage collection    :  20.72 (10%) usr   0.75 ( 1%) sys  29.13 ( 6%) wall       0 kB ( 0%) ggc
 callgraph construction:   9.23 ( 5%) usr   1.00 ( 2%) sys  13.52 ( 3%) wall   25403 kB ( 4%) ggc
 callgraph optimization:   0.10 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :   3.70 ( 2%) usr   0.38 ( 1%) sys   6.37 ( 1%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   0.64 ( 0%) usr   0.18 ( 0%) sys   1.14 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   4.47 ( 2%) usr   6.35 (12%) sys  13.55 ( 3%) wall    2112 kB ( 0%) ggc
 parser                :  64.77 (33%) usr  15.36 (29%) sys 195.42 (37%) wall  299871 kB (42%) ggc
 name lookup           :  22.59 (11%) usr  23.24 (43%) sys  76.74 (15%) wall   16057 kB ( 2%) ggc
 inline heuristics     :   0.24 ( 0%) usr   0.01 ( 0%) sys   0.46 ( 0%) wall     662 kB ( 0%) ggc
 integration           :   0.00 ( 0%) usr   0.02 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify         :  20.46 (10%) usr   0.89 ( 2%) sys  41.40 ( 8%) wall   57280 kB ( 8%) ggc
 tree eh               :   0.76 ( 0%) usr   0.15 ( 0%) sys   1.19 ( 0%) wall    4251 kB ( 1%) ggc
 tree CFG construction :   1.87 ( 1%) usr   0.51 ( 1%) sys   2.91 ( 1%) wall   61517 kB ( 9%) ggc
 tree CFG cleanup      :   2.34 ( 1%) usr   0.44 ( 1%) sys   4.01 ( 1%) wall      36 kB ( 0%) ggc
 tree STMT verifier    :   8.10 ( 4%) usr   0.39 ( 1%) sys  11.28 ( 2%) wall       0 kB ( 0%) ggc
 expand                :  34.94 (18%) usr   2.55 ( 5%) sys  74.96 (14%) wall  240800 kB (34%) ggc
 varconst              :   2.88 ( 1%) usr   1.34 ( 2%) sys  51.80 (10%) wall    2382 kB ( 0%) ggc
 final                 :   0.50 ( 0%) usr   0.12 ( 0%) sys   1.10 ( 0%) wall       0 kB ( 0%) ggc
 symout                :   0.00 ( 0%) usr   0.03 ( 0%) sys   0.30 ( 0%) wall      28 kB ( 0%) ggc
 TOTAL                 : 198.38            53.88           526.74             711174 kB

It also fixes the C testcase in comment #8.
Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2005-10/msg00737.html
Subject: Bug 21304

CVSROOT:        /cvs/gcc
Module name:    gcc
Changes by:     pinskia@gcc.gnu.org     2005-10-14 03:01:42

Modified files:
        gcc            : ChangeLog tree-dfa.c

Log message:
        2005-10-13  Andrew Pinski  <pinskia@physics.uc.edu>

        PR tree-opt/21304
        * tree-dfa.c (add_referenced_var): Only look at decls which have
        TREE_CONSTANT or TREE_READONLY set instead of if !TREE_PUBLIC or
        !TREE_CONSTANT.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.10155&r2=2.10156
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-dfa.c.diff?cvsroot=gcc&r1=2.64&r2=2.65
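Read literally, the ChangeLog entry above swaps one condition for another. The sketch below paraphrases just those two conditions as stand-alone C predicates so the effect on a declaration like table from the small testcase (public, not constant, not read-only) is easy to see. It is not the code from tree-dfa.c: struct decl_flags and both function names are hypothetical, the three booleans merely stand in for TREE_PUBLIC, TREE_CONSTANT and TREE_READONLY, and any other checks the real function performs are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags standing in for TREE_PUBLIC, TREE_CONSTANT and
   TREE_READONLY on a declaration; not GCC's tree representation. */
struct decl_flags {
    bool is_public;
    bool is_constant;
    bool is_readonly;
};

/* Condition as described for the old code:
   "!TREE_PUBLIC or !TREE_CONSTANT". */
bool looks_at_decl_before(const struct decl_flags *d)
{
    assert(d != NULL);
    return !d->is_public || !d->is_constant;
}

/* Condition as described for the new code:
   "TREE_CONSTANT or TREE_READONLY set". */
bool looks_at_decl_after(const struct decl_flags *d)
{
    assert(d != NULL);
    return d->is_constant || d->is_readonly;
}
```

For a public, non-constant, non-read-only array, the old predicate is true and the new one is false, which matches the stated goal of reducing the number of referenced variables.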
Fixed on the mainline. If someone wants to backport the patch, that is fine with me, but I don't have time to do it.
(In reply to comment #17)
Oh and you need also to backport:

2005-03-03  Jan Hubicka  <jh@suse.cz>

        * tree-dfa.c (add_referenced_var): Don't walk initializer of external
        and non-constant public variables.

http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00209.html
Fixed in GCC-4.1.1 and higher. Won't fix in GCC-4.0.x