Bug 21304 - [4.0 regression] very long compile times with large cpp file from kdebindings
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.0.1
Importance: P2 normal
Target Milestone: 4.1.1
Assignee: Not yet assigned to anyone
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: compile-time-hog, memory-hog, patch
Depends on:
Blocks:
 
Reported: 2005-04-30 19:53 UTC by olh
Modified: 2007-01-18 04:05 UTC

See Also:
Host:
Target:
Build:
Known to work: 3.4.4 4.1.0 4.1.1 4.1.2
Known to fail: 4.0.0 4.0.1 4.0.2
Last reconfirmed: 2006-01-15 04:14:21


Attachments
sipqtpart0.ii.bz2 (680.11 KB, application/x-bzip2)
2005-04-30 19:54 UTC, olh

Description olh 2005-04-30 19:53:49 UTC
Compiling kdebindings appears to hang after a while.
The attached testcase (680k) takes a very long time to compile.

abuild@tangelo:~> /usr/bin/time ./install_gcc41-1-O1/libexec/gcc/powerpc-unknown-linux-gnu/4.1.0/cc1plus -fpreprocessed /tmp/sipqtpart0.ii -quiet -dumpbase sipqtpart0.cpp -auxbase-strip sipqtpart0.o -O2 -O2 -Wall -Wall -Wall -W -version -fmessage-length=0 -fPIC -fmessage-length=0 -o sipqtpart0.s -O2 -v
ignoring nonexistent directory "/home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../powerpc-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../include/c++/4.1.0
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../include/c++/4.1.0/powerpc-unknown-linux-gnu
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/../../../../include/c++/4.1.0/backward
 /usr/local/include
 /home/abuild/install_gcc41-1-O1/include
 /home/abuild/install_gcc41-1-O1/lib/gcc/powerpc-unknown-linux-gnu/4.1.0/include
 /usr/include
End of search list.
GNU C++ version 4.1.0 20050429 (experimental) (powerpc-unknown-linux-gnu)
        compiled by GNU C version 4.1.0 20050429 (experimental).
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
/usr/lib/qt3/include/qnetworkprotocol.h:58: warning: 'class QNetworkProtocolFactoryBase' has virtual functions but non-virtual destructor
/usr/lib/qt3/include/qtooltip.h:86: warning: 'class QToolTip' has virtual functions but non-virtual destructor
/usr/lib/qt3/include/qfiledialog.h:78: warning: 'class QFilePreview' has virtual functions but non-virtual destructor
16616.21user 2.85system 4:37:01elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+275763minor)pagefaults 0swaps

gcc 4.0 -O0  == 0:02:15 h:mm:ss
gcc 4.0 -O1  == 1:06:00
gcc 4.0 -O2  == 4:05:00
gcc 4.1 -O2  == 4:37:00 

This is on a 1.5GHz POWER5.

./install_gcc40-1-O1/bin/g++ -v
Using built-in specs.
Target: powerpc-unknown-linux-gnu
Configured with: /home/abuild/src/gcc-4_0-branch/configure --prefix=/home/abuild/install_gcc40-1-O1 --enable-threads=posix --enable-languages=c,c++ --enable-checking --with-system-zlib --enable-shared --enable-__cxa_atexit --disable-nls
Thread model: posix
gcc version 4.0.1 20050429 (prerelease)

I'm trying mainline with --disable-checking at the moment.
Comment 1 olh 2005-04-30 19:54:47 UTC
Created attachment 8772
sipqtpart0.ii.bz2
Comment 2 Andrew Pinski 2005-04-30 20:19:33 UTC
A profile on ppc-darwin at -O0 shows that a lot of the time (10% or so) is spent in reload or walk_tree.
Comment 3 Andrew Pinski 2005-04-30 20:25:46 UTC
The profile at -O2 says that may_alias is taking 50% of the time; this is with "4.1.0 20050323".
Comment 4 Daniel Berlin 2005-04-30 23:04:09 UTC
Yup, compute_flow_insensitive_aliasing is taking forever on these files (I
stopped it at >2 hours for the TV_ALIAS_ANALYSIS timevar)
Probably another reason we shouldn't compute aliasing 5 times :)
Comment 5 Andrew Pinski 2005-04-30 23:37:13 UTC
(In reply to comment #4)
> Yup, compute_flow_insensitive_aliasing is taking forever on these files (I
> stopped it at >2 hours for the TV_ALIAS_ANALYSIS timevar)
> Probably another reason we shouldn't compute aliasing 5 times :)

But two hours/5 is still high.
Comment 6 olh 2005-05-01 04:11:39 UTC
gcc 4.1 with --disable-checking took 3:28:00 h:mm:ss
Comment 7 olh 2005-05-01 08:10:37 UTC
gcc-3_4-branch takes only 5 minutes to complete.
Comment 8 Serge Belyshev 2005-05-02 03:33:56 UTC
This small testcase exhibits similar behaviour
(though the profile says most of the time is spent in the SSA verifier):

-------------------------------------------------------------------------------
#define A0(a) a, 
#define A1(a) A0(a##0) A0(a##1) A0(a##2) A0(a##3) A0(a##4) A0(a##5) A0(a##6)
#define A2(a) A1(a##0) A1(a##1) A1(a##2) A1(a##3) A1(a##4) A1(a##5) A1(a##6)
#define A3(a) A2(a##0) A2(a##1) A2(a##2) A2(a##3) A2(a##4) A2(a##5) A2(a##6)
#define A4(a) A3(a##0) A3(a##1) A3(a##2) A3(a##3) A3(a##4) A3(a##5) A3(a##6)
#define A5(a) A4(a##0) A4(a##1) A4(a##2) A4(a##3) A4(a##4) A4(a##5) A4(a##6)

#define F0(a) int a (void) { bar (table); }
#define F1(a) F0(a##0) F0(a##1) F0(a##2) F0(a##3) F0(a##4) F0(a##5) F0(a##6)
#define F2(a) F1(a##0) F1(a##1) F1(a##2) F1(a##3) F1(a##4) F1(a##5) F1(a##6)
#define F3(a) F2(a##0) F2(a##1) F2(a##2) F2(a##3) F2(a##4) F2(a##5) F2(a##6)
//#define F4(a) F3(a##0) F3(a##1) F3(a##2) F3(a##3) F3(a##4) F3(a##5) F3(a##6)
//#define F5(a) F4(a##0) F4(a##1) F4(a##2) F4(a##3) F4(a##4) F4(a##5) F4(a##6)

int A5(j) *table [] = { A5(&j) 0 };
void bar (int **);

F3(f);
-------------------------------------------------------------------------------
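For scale, here is a rough count of what the macros above expand to, assuming I am reading them correctly: every A and F level expands the level below it 7 times, so A5 produces 7^5 names and F3 produces 7^3 functions. The helper name `expansions` is made up for this illustration and is not part of the testcase.

```c
#include <assert.h>

/* Number of names produced by a macro tower in which each level
   expands the level below it `fanout` times (7 in the testcase).
   `expansions` is a hypothetical helper, used only for counting. */
unsigned long expansions(unsigned fanout, unsigned levels)
{
    assert(fanout > 0);
    unsigned long n = 1;
    for (unsigned i = 0; i < levels; i++)
        n *= fanout;    /* one more level multiplies the count by fanout */
    return n;
}
```

expansions(7, 5) gives 16807 variables whose addresses go into table, and expansions(7, 3) gives 343 functions that each pass table to bar. If every function ends up tracking every variable reachable from table's initializer, that is on the order of 343 x 16807 = 5764801 variable references, which would explain why such a small file is so slow to compile.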
Comment 9 Andrew Pinski 2005-05-02 05:14:08 UTC
Subject: Re:  [4.0/4.1 regression] very long compile times with large cpp file from kdebindings


On May 1, 2005, at 11:33 PM, belyshev at depni dot sinp dot msu dot ru 
wrote:

> (though profile says most of time spent in SSA verifier):
>

Did you forget to configure with --disable-checking? :)

-- Pinski

Comment 10 olh 2005-05-02 05:57:36 UTC
with CFLAGS=-O2 on ppc and --disable-checking:
==> 344.log <==
268.28user 0.82system 4:29.15elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+104891minor)pagefaults 0swaps

==> 401.log <==
9658.50user 6.57system 2:41:06elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (10major+197022minor)pagefaults 0swaps

==> 410.log <==
12455.82user 10.93system 3:27:49elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (5major+215621minor)pagefaults 0swaps

on i686-linux, 3GHz xeon:

GNU C++ version 3.4.4 20050430 (prerelease) (i686-pc-linux-gnu)
205.85user 1.89system 3:29.32elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (3major+95172minor)pagefaults 0swaps

GNU C++ version 4.0.1 20050429 (prerelease) (i686-pc-linux-gnu)
6245.58user 4.21system 1:44:21elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (74major+309929minor)pagefaults 0swaps

GNU C++ version 4.1.0 20050429 (experimental) (i686-pc-linux-gnu)
6409.50user 10.53system 1:51:28elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (4867major+351033minor)pagefaults 0swaps
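To put the numbers above side by side, here is a minimal sketch that turns two "user" times from /usr/bin/time into a slowdown factor. It assumes that 344.log, 401.log and 410.log correspond to 3.4.4, 4.0.1 and 4.1.0; the function name `slowdown` is made up for this illustration.

```c
#include <assert.h>

/* Slowdown of a compile relative to a baseline, both given as the
   "user" seconds reported by /usr/bin/time.
   `slowdown` is a hypothetical helper, not part of any tool. */
double slowdown(double user_secs, double baseline_secs)
{
    assert(baseline_secs > 0.0);
    return user_secs / baseline_secs;
}
```

With 3.4.4 as the baseline, slowdown(9658.50, 268.28) and slowdown(12455.82, 268.28) come out at roughly 36x and 46x on ppc, and slowdown(6245.58, 205.85) and slowdown(6409.50, 205.85) at roughly 30x and 31x on i686.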

Comment 11 Andrew Pinski 2005-07-23 23:50:22 UTC
For -O0 on the mainline on powerpc-darwin, we have:
 parser                :  17.61 (14%) usr  11.85 (22%) sys  31.13 (16%) wall  342512 kB (29%) ggc
 name lookup           :  14.05 (11%) usr  22.61 (42%) sys  37.95 (20%) wall   18474 kB ( 2%) ggc
 expand                :  19.81 (16%) usr   2.91 ( 5%) sys  23.61 (12%) wall  371057 kB (32%) ggc
 global alloc          :  17.84 (15%) usr   0.71 ( 1%) sys  20.29 (11%) wall  108491 kB ( 9%) ggc
 final                 :  10.47 ( 9%) usr   3.06 ( 6%) sys  16.27 ( 8%) wall   30424 kB ( 3%) ggc
 tree gimplify         :   4.29 ( 4%) usr   0.52 ( 1%) sys   5.12 ( 3%) wall   73535 kB ( 6%) ggc

This is a memory hog too.  This has a different pattern for OVL than PR 8361 and PR 12850:
4.7 or so.
Comment 12 Andrew Pinski 2005-07-24 00:04:54 UTC
Most of the time is spent checking for duplicates, to avoid adding them again, in tree-ssa-alias.c:1625-1627.

Again, maybe a hash table, or something to mark a variable as already aliased, would help.
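A minimal sketch of the difference, not taken from tree-ssa-alias.c (I have not checked what the code at those lines looks like): a linear scan before every insertion versus a per-id mark bit. The type `alias_set` and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a may-alias set: a list of variable ids
   plus a bitmap recording which ids are already present. */
struct alias_set {
    unsigned *ids;        /* ids in insertion order */
    size_t count;         /* number of ids stored */
    unsigned char *seen;  /* one bit per possible id */
    size_t universe;      /* every id must be < universe */
};

bool alias_set_init(struct alias_set *s, size_t universe)
{
    assert(universe > 0);
    s->ids = malloc(universe * sizeof *s->ids);
    s->seen = calloc((universe + 7) / 8, 1);
    s->count = 0;
    s->universe = universe;
    return s->ids != NULL && s->seen != NULL;
}

void alias_set_free(struct alias_set *s)
{
    free(s->ids);
    free(s->seen);
}

/* Slow variant: scan the whole list before every insertion, so each
   call costs O(count). It does not touch the bitmap. */
bool add_alias_linear(struct alias_set *s, unsigned id)
{
    assert(id < s->universe);
    for (size_t i = 0; i < s->count; i++)
        if (s->ids[i] == id)
            return false;           /* already present */
    s->ids[s->count++] = id;
    return true;
}

/* Marked variant: test and set one bit, so each call costs O(1). */
bool add_alias_marked(struct alias_set *s, unsigned id)
{
    assert(id < s->universe);
    unsigned char bit = (unsigned char)(1u << (id % 8));
    if (s->seen[id / 8] & bit)
        return false;               /* already marked */
    s->seen[id / 8] |= bit;
    s->ids[s->count++] = id;
    return true;
}
```

Use only one of the two add functions on a given set, since the linear variant does not update the bitmap. With n insertions the linear variant does on the order of n^2 comparisons in the worst case, while the marked variant does n bit tests; a hash table would give similar behaviour when the ids are not small dense integers.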
Comment 13 Andrew Pinski 2005-09-19 01:08:02 UTC
Note -O0 compile time is faster in 4.0 than in 3.4.
Comment 14 Andrew Pinski 2005-10-12 16:12:33 UTC
I am testing a patch which should fix this by reducing the number of referenced variables, which in turn reduces the number of virtual operands.
Now we get the following -ftime-report for -O2 on powerpc-darwin, with cc1plus compiled with -O0 and with checking still enabled, a 40x decrease:
 garbage collection    :  20.72 (10%) usr   0.75 ( 1%) sys  29.13 ( 6%) wall       0 kB ( 0%) ggc
 callgraph construction:   9.23 ( 5%) usr   1.00 ( 2%) sys  13.52 ( 3%) wall   25403 kB ( 4%) ggc
 callgraph optimization:   0.10 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :   3.70 ( 2%) usr   0.38 ( 1%) sys   6.37 ( 1%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   0.64 ( 0%) usr   0.18 ( 0%) sys   1.14 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   4.47 ( 2%) usr   6.35 (12%) sys  13.55 ( 3%) wall    2112 kB ( 0%) ggc
 parser                :  64.77 (33%) usr  15.36 (29%) sys 195.42 (37%) wall  299871 kB (42%) ggc
 name lookup           :  22.59 (11%) usr  23.24 (43%) sys  76.74 (15%) wall   16057 kB ( 2%) ggc
 inline heuristics     :   0.24 ( 0%) usr   0.01 ( 0%) sys   0.46 ( 0%) wall     662 kB ( 0%) ggc
 integration           :   0.00 ( 0%) usr   0.02 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify         :  20.46 (10%) usr   0.89 ( 2%) sys  41.40 ( 8%) wall   57280 kB ( 8%) ggc
 tree eh               :   0.76 ( 0%) usr   0.15 ( 0%) sys   1.19 ( 0%) wall    4251 kB ( 1%) ggc
 tree CFG construction :   1.87 ( 1%) usr   0.51 ( 1%) sys   2.91 ( 1%) wall   61517 kB ( 9%) ggc
 tree CFG cleanup      :   2.34 ( 1%) usr   0.44 ( 1%) sys   4.01 ( 1%) wall      36 kB ( 0%) ggc
 tree STMT verifier    :   8.10 ( 4%) usr   0.39 ( 1%) sys  11.28 ( 2%) wall       0 kB ( 0%) ggc
 expand                :  34.94 (18%) usr   2.55 ( 5%) sys  74.96 (14%) wall  240800 kB (34%) ggc
 varconst              :   2.88 ( 1%) usr   1.34 ( 2%) sys  51.80 (10%) wall    2382 kB ( 0%) ggc
 final                 :   0.50 ( 0%) usr   0.12 ( 0%) sys   1.10 ( 0%) wall       0 kB ( 0%) ggc
 symout                :   0.00 ( 0%) usr   0.03 ( 0%) sys   0.30 ( 0%) wall      28 kB ( 0%) ggc
 TOTAL                 : 198.38            53.88           526.74             711174 kB


It also fixes the C testcase in comment #8.
Comment 15 Andrew Pinski 2005-10-13 03:09:59 UTC
Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2005-10/msg00737.html
Comment 16 GCC Commits 2005-10-14 03:01:47 UTC
Subject: Bug 21304

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	pinskia@gcc.gnu.org	2005-10-14 03:01:42

Modified files:
	gcc            : ChangeLog tree-dfa.c 

Log message:
	2005-10-13  Andrew Pinski  <pinskia@physics.uc.edu>
	
	PR tree-opt/21304
	* tree-dfa.c (add_referenced_var): Only look at decls which
	have TREE_CONSTANT or TREE_READONLY set instead of if
	!TREE_PUBLIC or !TREE_CONSTANT.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.10155&r2=2.10156
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-dfa.c.diff?cvsroot=gcc&r1=2.64&r2=2.65

Comment 17 Andrew Pinski 2005-10-14 03:02:09 UTC
Fixed on the mainline. If someone wants to backport the patch, that is fine with me, but I don't have time to do it.
Comment 18 Andrew Pinski 2005-10-16 00:35:50 UTC
(In reply to comment #17)
Oh, and you also need to backport:
2005-03-03  Jan Hubicka  <jh@suse.cz>

        * tree-dfa.c (add_referenced_var): Don't walk initializer of external
        and non-constant public variables.

http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00209.html
Comment 19 Gabriel Dos Reis 2007-01-18 04:05:06 UTC
Fixed in GCC-4.1.1 and higher.
Won't fix in GCC-4.0.x