See http://gcc.gnu.org/ml/gcc-bugs/2004-07/msg02181.html (I didn't see how to attach a file in the Bugzilla form, so I first sent my bug to gcc-bugs@gcc.gnu.org). Regards, Andre
I wonder if this is because of the new unroller.
On my Athlon64 system, the compile takes 2 seconds without __DEBUGGING__ and 1 minute 40 seconds with it. I can reproduce the problem with -O2 -finline-functions, and it goes away if I compile with just -O2. It also goes away if I ifdef out the cerr uses in the average_n functions. The .s file increases in size by a factor of 5-7 when __DEBUGGING__ is defined, depending on the exact options used. This explains why the compile takes so much longer: we are generating and optimizing that much more code.
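The four flag combinations described above can be timed with a small script. This is only a sketch under stated assumptions: `testcase.cc` is a hypothetical file name (the thread does not name the test case at this point), `g++` is assumed to be on the PATH, and using `-S` with `-o /dev/null` is my choice for timing without keeping the output. Only `-O2`, `-finline-functions` and the `__DEBUGGING__` macro come from the comment.

```python
import subprocess
import time

def build_command(source, inline_functions, debugging, compiler="g++"):
    """Return the compiler command line for one flag combination."""
    cmd = [compiler, "-S", "-O2"]
    if inline_functions:
        cmd.append("-finline-functions")
    if debugging:
        cmd.append("-D__DEBUGGING__")
    # Discard the assembly output; only the elapsed time is of interest here.
    cmd.extend([source, "-o", "/dev/null"])
    return cmd

def time_compile(cmd):
    """Run one compile and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

def all_configurations(source):
    """Yield (label, command) for the four flag combinations."""
    for inline in (False, True):
        for debug in (False, True):
            label = f"inline={inline} debug={debug}"
            yield label, build_command(source, inline, debug)
```

To run it, loop over `all_configurations("testcase.cc")` and print `time_compile(cmd)` for each entry. If the report above holds, only the combination with both `-finline-functions` and `-D__DEBUGGING__` should stand out.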
Postponed until GCC 3.4.3.
Postponed until GCC 3.4.4.
Confirmed. GCC 4.0 doesn't have this problem.
Here's the time report on x86-64:

Execution times (seconds)
 garbage collection    :   3.72 ( 3%) usr   0.09 ( 1%) sys   4.03 ( 3%) wall
 callgraph construction:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 cfg construction      :   0.48 ( 0%) usr   0.04 ( 0%) sys   0.47 ( 0%) wall
 cfg cleanup           :   0.65 ( 1%) usr   0.05 ( 1%) sys   0.71 ( 1%) wall
 trivially dead code   :   0.68 ( 1%) usr   0.00 ( 0%) sys   0.66 ( 1%) wall
 life analysis         :   1.99 ( 2%) usr   0.02 ( 0%) sys   2.03 ( 2%) wall
 life info update      :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall
 alias analysis        :   1.08 ( 1%) usr   0.04 ( 0%) sys   1.16 ( 1%) wall
 register scan         :   0.52 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall
 rebuild jump labels   :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall
 preprocessing         :   0.05 ( 0%) usr   0.03 ( 0%) sys   0.77 ( 1%) wall
 parser                :   0.42 ( 0%) usr   0.11 ( 1%) sys   0.46 ( 0%) wall
 name lookup           :   0.09 ( 0%) usr   0.04 ( 0%) sys   0.19 ( 0%) wall
 expand                :   0.68 ( 1%) usr   0.05 ( 1%) sys   0.77 ( 1%) wall
 varconst              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 integration           :   0.91 ( 1%) usr   0.04 ( 0%) sys   0.94 ( 1%) wall
 jump                  :   1.72 ( 1%) usr   0.07 ( 1%) sys   1.81 ( 1%) wall
 CSE                   :   0.95 ( 1%) usr   0.00 ( 0%) sys   1.01 ( 1%) wall
 global CSE            :  29.65 (26%) usr   1.12 (13%) sys  30.85 (24%) wall
 loop analysis         :  50.84 (44%) usr   6.48 (76%) sys  59.33 (47%) wall
 bypass jumps          :   1.03 ( 1%) usr   0.10 ( 1%) sys   1.14 ( 1%) wall
 CSE 2                 :   0.38 ( 0%) usr   0.01 ( 0%) sys   0.37 ( 0%) wall
 branch prediction     :   7.13 ( 6%) usr   0.06 ( 1%) sys   7.25 ( 6%) wall
 flow analysis         :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall
 combiner              :   0.76 ( 1%) usr   0.01 ( 0%) sys   0.82 ( 1%) wall
 if-conversion         :   0.28 ( 0%) usr   0.01 ( 0%) sys   0.27 ( 0%) wall
 regmove               :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 local alloc           :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.44 ( 0%) wall
 global alloc          :   6.34 ( 6%) usr   0.11 ( 1%) sys   6.49 ( 5%) wall
 reload CSE regs       :   1.19 ( 1%) usr   0.01 ( 0%) sys   1.30 ( 1%) wall
 flow 2                :   0.19 ( 0%) usr   0.01 ( 0%) sys   0.19 ( 0%) wall
 if-conversion 2       :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall
 peephole 2            :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall
 rename registers      :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 scheduling 2          :   0.37 ( 0%) usr   0.00 ( 0%) sys   0.37 ( 0%) wall
 machine dep reorg     :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.09 ( 0%) wall
 reorder blocks        :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall
 shorten branches      :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall
 final                 :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall
 symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 rest of compilation   :   0.58 ( 1%) usr   0.00 ( 0%) sys   0.56 ( 0%) wall
 TOTAL                 : 114.68             8.51           126.68
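To make the hot spots in this report easier to see, here is a small sketch that recomputes the user-time shares of the five largest passes. The numbers are copied from the report; the dictionary and helper names are mine.

```python
# (pass name -> user seconds), copied from the time report above
PASSES = {
    "global CSE": 29.65,
    "loop analysis": 50.84,
    "branch prediction": 7.13,
    "global alloc": 6.34,
    "garbage collection": 3.72,
}
TOTAL_USR = 114.68  # the TOTAL row, user column

def share(seconds, total=TOTAL_USR):
    """Percentage of total user time spent in a pass."""
    return 100.0 * seconds / total

top_two = PASSES["global CSE"] + PASSES["loop analysis"]

# Print the passes from most to least expensive.
for name, secs in sorted(PASSES.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {secs:6.2f}s {share(secs):5.1f}%")
print(f"global CSE + loop analysis: {share(top_two):.1f}% of user time")
```

Global CSE and loop analysis together account for roughly 70% of the user time, so these two passes are where the extra code appears to hurt most.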
Subject: Fwd: [3.4 Regression] compile time regression, when adding cerr usage
Begin forwarded message:
> From: andre maute <andre.maute@gmx.de>
> Date: December 10, 2004 5:18:11 PM EST
> To: gcc-bugs@gcc.gnu.org
> Subject: [Bug rtl-optimization/16613] [3.4 Regression] compile time regression, when adding cerr usage
>
> Once more i couldn't upload an attachment
> with the bugzilla upload form, so i send it here.
>
> I'll refer to it later.
>
> Regards Andre
Created attachment 7723 [details] compiletimetest2.cc.gz
I don't think the compile time regression is solved in the current g++-4.0. I made some timing measurements with the attached file compiletimetest2.cc on a PIII 550. The __DEBUG__ tests only enable <iostream> and some cerr lines.

> g++ -v
Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.1/specs
Configured with: ../gcc-3.2.1/configure --prefix=/usr --enable-shared --enable-languages=c,c++ --enable-threads=posix --with-slibdir=/lib --enable-__cxa_atexit --enable-clocale=gnu
Thread model: posix
gcc version 3.2.1

> g++-3.3.5 -v
Reading specs from /opt/gcc-3.3.5/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/specs
Configured with: ../gcc-3.3.5/configure --prefix=/opt/gcc-3.3.5 --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --program-suffix=-3.3.5 --with-cpu=pentium3
Thread model: posix
gcc version 3.3.5

> g++-3.4.3 -v
Reading specs from /opt/gcc-3.4.3/lib/gcc/i686-pc-linux-gnu/3.4.3/specs
Configured with: ../gcc-3.4.3/configure --prefix=/opt/gcc-3.4.3 --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --program-suffix=-3.4.3 --with-arch=pentium3
Thread model: posix
gcc version 3.4.3

> g++-4.0-20041205 -v
Reading specs from /opt/gcc-4.0-20041205/lib/gcc/i686-pc-linux-gnu/4.0.0/specs
Configured with: ../gcc-4.0-20041205/configure --prefix=/opt/gcc-4.0-20041205 --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-nls --program-suffix=-4.0-20041205 --with-arch=pentium3
Thread model: posix
gcc version 4.0.0 20041205 (experimental)

> time g++ -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m9.957s
user 0m9.910s
sys 0m0.090s

> time g++ -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m13.544s
user 0m13.270s
sys 0m0.170s

> time g++-3.3.5 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m9.881s
user 0m9.740s
sys 0m0.130s

> time g++-3.3.5 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m9.881s
user 0m9.740s
sys 0m0.130s

> time g++-3.4.3 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m18.614s
user 0m18.240s
sys 0m0.310s

> time g++-3.4.3 -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m21.563s
user 0m21.050s
sys 0m0.510s

> time g++-4.0-20041205 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m24.983s
user 0m24.740s
sys 0m0.160s

> time g++-4.0-20041205 -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m31.269s
user 0m30.230s
sys 0m0.240s

Regards Andre
Sorry, I missed the following lines:

> time g++-3.3.5 -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m12.454s
user 0m12.210s
sys 0m0.230s

So g++-3.3.5 is really good, and I hope we won't see 40 s for g++-4.1 ;-)

Regards Andre
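The user times above can be compared directly once the bash `time` values are converted to seconds. A minimal sketch, with the values copied from the measurements in the two comments above; the helper name `parse_time` and the choice of 3.3.5 as the baseline are mine.

```python
import re

def parse_time(text):
    """Convert a bash `time` value such as '0m9.910s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if m is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# user times per compiler: (-D __NDEBUG__, -D __DEBUG__)
USER_TIMES = {
    "3.2.1": ("0m9.910s", "0m13.270s"),
    "3.3.5": ("0m9.740s", "0m12.210s"),
    "3.4.3": ("0m18.240s", "0m21.050s"),
    "4.0-20041205": ("0m24.740s", "0m30.230s"),
}

baseline = parse_time(USER_TIMES["3.3.5"][0])
for version, (plain, debug) in USER_TIMES.items():
    p, d = parse_time(plain), parse_time(debug)
    # Slowdown relative to 3.3.5, and the extra cost of the cerr lines.
    print(f"{version:14s} {p:6.2f}s ({p / baseline:.2f}x of 3.3.5), "
          f"cerr overhead {100 * (d - p) / p:.0f}%")
```

Read this way, the cerr lines cost a broadly similar percentage on every compiler, while the baseline itself roughly doubles from 3.3.5 to 3.4.3.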
(In reply to comment #9)
> g++-4.0-20041205 -v
> Reading specs from /opt/gcc-4.0-20041205/lib/gcc/i686-pc-linux-gnu/4.0.0/specs
> Configured with: ../gcc-4.0-20041205/configure --prefix=/opt/gcc-4.0-20041205
> --enable-shared --enable-languages=c,c++ --enable-threads=posix
> --enable-__cxa_atexit --enable-clocale=gnu --disable-nls
> --program-suffix=-4.0-20041205 --with-arch=pentium3
> Thread model: posix
> gcc version 4.0.0 20041205 (experimental)

You need to add --disable-checking to your configure flags in order to compare against a development branch. When release branches are made, checking gets turned off by default by configure, but it's on in non-release branches. The compiler is a *lot* slower with checking.
Now with --disable-checking in the configure parameters for gcc-4.0:

> g++-4.0-20041205-1-dc -v
Reading specs from /opt/gcc-4.0-20041205-1-dc/lib/gcc/i686-pc-linux-gnu/4.0.0/specs
Configured with: ../gcc-4.0-20041205/configure --prefix=/opt/gcc-4.0-20041205-1-dc --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-nls --program-suffix=-4.0-20041205-1-dc --with-arch=pentium3 --disable-checking
Thread model: posix
gcc version 4.0.0 20041205 (experimental)

> time g++-4.0-20041205-1-dc -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m18.300s
user 0m18.050s
sys 0m0.250s

> time g++-4.0-20041205-1-dc -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m21.368s
user 0m20.960s
sys 0m0.440s

This looks much better, but it is still not as good as g++-3.3.5.
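For reference, the effect of --disable-checking can be put into numbers. A sketch using only user times already posted in this thread; the structure and names are mine.

```python
# user seconds for gcc 4.0.0 20041205: (checking enabled, --disable-checking)
CHECKING = {
    "__NDEBUG__": (24.740, 18.050),
    "__DEBUG__": (30.230, 20.960),
}
# user seconds for g++-3.3.5 on the same file and machine
GCC_335 = {"__NDEBUG__": 9.740, "__DEBUG__": 12.210}

def slowdown(slow, fast):
    """How many times longer `slow` takes than `fast`."""
    return slow / fast

for macro, (checked, unchecked) in CHECKING.items():
    print(f"{macro}: checking costs {slowdown(checked, unchecked):.2f}x, "
          f"still {slowdown(unchecked, GCC_335[macro]):.2f}x slower than 3.3.5")
```

Checking accounts for a factor of about 1.4 here, which still leaves the unchecked 4.0 snapshot well behind 3.3.5.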
Here are my timings for 3.3.2 vs. the mainline on powerpc-darwin (yesterday's mainline, before a patch which should speed it up a little more):

[zhivago:~/src/localgccPRs] pinskia% time ~/gcc-3.3//bin/gcc pr16613.ii -S -D__DEBUG__ -O2
5.600u 0.340s 0:06.99 84.9% 0+0k 0+3io 0pf+0w
[zhivago:~/src/localgccPRs] pinskia% time ~/local3/bin/gcc pr16613.ii -S -D__DEBUG__ -O2
6.190u 0.590s 0:07.69 88.1% 0+0k 0+0io 0pf+0w

Plus, this is a mainline preprocessed source, so it looks like the mainline is slightly slower than 3.3.2 (about 10%).

For -O0:

[zhivago:~/src/localgccPRs] pinskia% time ~/gcc-3.3//bin/gcc pr16613.ii -S -D__DEBUG__
2.980u 0.260s 0:04.70 68.9% 0+0k 0+0io 0pf+0w
[zhivago:~/src/localgccPRs] pinskia% time ~/local3/bin/gcc pr16613.ii -S -D__DEBUG__
1.600u 0.270s 0:02.29 81.6% 0+0k 0+0io 0pf+0w

So we are much faster for -O0.
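The "about 10%" figure can be checked from the csh-style `time` lines above. A small sketch; the parser name is mine, and it reads only the leading user and system fields.

```python
import re

def parse_csh_time(line):
    """Extract (user, system) seconds from csh `time` output such as
    '5.600u 0.340s 0:06.99 84.9% ...'."""
    m = re.match(r"\s*(\d+\.\d+)u\s+(\d+\.\d+)s\b", line)
    if m is None:
        raise ValueError(f"not csh time output: {line!r}")
    return float(m.group(1)), float(m.group(2))

# User times for 3.3.2 and the mainline, at -O2 and at -O0.
O2_33, _ = parse_csh_time("5.600u 0.340s 0:06.99 84.9% 0+0k 0+3io 0pf+0w")
O2_main, _ = parse_csh_time("6.190u 0.590s 0:07.69 88.1% 0+0k 0+0io 0pf+0w")
O0_33, _ = parse_csh_time("2.980u 0.260s 0:04.70 68.9% 0+0k 0+0io 0pf+0w")
O0_main, _ = parse_csh_time("1.600u 0.270s 0:02.29 81.6% 0+0k 0+0io 0pf+0w")

print(f"-O2: mainline is {100 * (O2_main / O2_33 - 1):.1f}% slower than 3.3.2")
print(f"-O0: mainline is {O0_33 / O0_main:.2f}x faster than 3.3.2")
```

Only user time is compared here; system time is small in all four runs.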
I have run my compile time test again. Here are the timings; something has gotten worse after 2005/03/26.

> g++-4.0-20050326 -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.0-20050326/configure --prefix=/opt/gcc-4.0-20050326 --program-suffix=-4.0-20050326 --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-nls --disable-checking --with-arch=pentium3
Thread model: posix
gcc version 4.0.0 20050326 (prerelease)

> g++-4.0-20050409 -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.0-20050409/configure --prefix=/opt/gcc-4.0-20050409 --program-suffix=-4.0-20050409 --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-nls --disable-checking --with-arch=pentium3
Thread model: posix
gcc version 4.0.0 20050409 (prerelease)

> g++-4.0-20050430 -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.0-20050430/configure --prefix=/opt/gcc-4.0-20050430 --program-suffix=-4.0-20050430 --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-nls --disable-checking --with-arch=pentium3
Thread model: posix
gcc version 4.0.1 20050430 (prerelease)

> time g++-4.0-20050326 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m16.128s
user 0m15.775s
sys 0m0.282s

> time g++-4.0-20050326 -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m18.842s
user 0m18.326s
sys 0m0.488s

> time g++-4.0-20050409 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m52.158s
user 0m51.030s
sys 0m1.012s

> time g++-4.0-20050409 -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m55.566s
user 0m54.460s
sys 0m0.996s

> time g++-4.0-20050430 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m52.450s
user 0m51.277s
sys 0m0.982s

> time g++-4.0-20050430 -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m55.270s
user 0m54.364s
sys 0m0.906s

The assembly file generated with g++-4.0-20050409 is roughly twice as large as the one generated with g++-4.0-20050326, as seen with the option "-save-temps":

> g++-4.0-20050326 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps
> ls -al compiletimetest2.s
-rw-r--r-- 1 login500 users 621326 May 2 01:36 compiletimetest2.s

> g++-4.0-20050409 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps
> ls -al compiletimetest2.s
-rw-r--r-- 1 login500 users 1186872 May 2 01:34 compiletimetest2.s

> g++-4.0-20050430 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps
> ls -al compiletimetest2.s
-rw-r--r-- 1 login500 users 1186872 May 2 01:37 compiletimetest2.s

Using -ftime-report reveals only that nearly every optimization pass has become slower.

Regards Andre
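The growth in compile time and in assembly size between the snapshots can be compared side by side. A sketch with the numbers from the measurements above (user time for the -O3 -D __NDEBUG__ build, and the size of compiletimetest2.s); the names are mine.

```python
# measurements from above for the -O3 -D __NDEBUG__ build
SNAPSHOTS = {
    "20050326": {"user_s": 15.775, "asm_bytes": 621326},
    "20050409": {"user_s": 51.030, "asm_bytes": 1186872},
    "20050430": {"user_s": 51.277, "asm_bytes": 1186872},
}

def ratio(key, later, earlier="20050326"):
    """Growth of one measurement relative to the earlier snapshot."""
    return SNAPSHOTS[later][key] / SNAPSHOTS[earlier][key]

for snap in ("20050409", "20050430"):
    print(f"{snap}: {ratio('user_s', snap):.2f}x compile time, "
          f"{ratio('asm_bytes', snap):.2f}x assembly size")
```

Compile time grows by more than a factor of 3 while the assembly grows by a factor of about 1.9. This suggests the larger output alone may not account for the whole slowdown, though that is only an inference from these two numbers.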
I want to supplement my compile time tests, which show that a regression was introduced between 2005/03/26 and 2005/04/02.

> g++-4.0-20050402 -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.0-20050402/configure --prefix=/opt/gcc-4.0-20050402 --program-suffix=-4.0-20050402 --enable-shared --enable-languages=c,c++ --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-nls --disable-checking --with-arch=pentium3
Thread model: posix
gcc version 4.0.0 20050402 (prerelease)

> time g++-4.0-20050402 -c -O3 -D __NDEBUG__ compiletimetest2.cc
real 0m52.126s
user 0m51.187s
sys 0m0.777s

> time g++-4.0-20050402 -c -O3 -D __DEBUG__ compiletimetest2.cc
real 0m55.409s
user 0m54.280s
sys 0m0.935s

> g++-4.0-20050402 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps
> ls -al compiletimetest2.s
-rw-r--r-- 1 login500 users 1186149 May 5 22:05 compiletimetest2.s

Regards Andre
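The regression window follows mechanically from the four snapshot timings posted so far. A sketch, assuming a jump of more than 1.5x between adjacent snapshots counts as a regression; the threshold is an arbitrary choice of mine, and the data are the user times for -O3 -D __NDEBUG__.

```python
from datetime import date

# (snapshot date, user seconds for -O3 -D __NDEBUG__), from the comments above
RUNS = [
    (date(2005, 3, 26), 15.775),
    (date(2005, 4, 2), 51.187),
    (date(2005, 4, 9), 51.030),
    (date(2005, 4, 30), 51.277),
]

def regression_window(runs, threshold=1.5):
    """Return (last_good, first_bad) dates for the first adjacent pair whose
    time grows by more than `threshold`, or None if there is no such pair."""
    ordered = sorted(runs)
    for (d0, t0), (d1, t1) in zip(ordered, ordered[1:]):
        if t1 / t0 > threshold:
            return d0, d1
    return None

window = regression_window(RUNS)
print(window)
```

With these numbers the window is the week between the 20050326 and 20050402 snapshots. Narrowing it further would need builds from dates in between.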
Moving to 4.0.2 per Mark.
Won't fix for GCC-4.0.x