Bug 16613 - [4.0 Regression] compile time regression, when adding cerr usage
Summary: [4.0 Regression] compile time regression, when adding cerr usage
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 3.4.1
Importance: P2 normal
Target Milestone: 4.0.4
Assignee: Not yet assigned to anyone
URL:
Keywords: compile-time-hog
Depends on:
Blocks:
 
Reported: 2004-07-18 10:11 UTC by andre.maute
Modified: 2007-01-18 02:58 UTC
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-12-27 00:35:21


Attachments
compiletimetest2.cc.gz (6.58 KB, application/x-gzip)
2004-12-10 22:13 UTC, Andrew Pinski

Description andre.maute 2004-07-18 10:11:09 UTC
See http://gcc.gnu.org/ml/gcc-bugs/2004-07/msg02181.html 
 
Because I didn't see how to attach a file in the Bugzilla form,
I first sent my bug report to gcc-bugs@gcc.gnu.org.

Regards, Andre
Comment 1 Andrew Pinski 2004-07-19 08:34:35 UTC
I wonder if this is because of the new unroller.
Comment 2 Jim Wilson 2004-07-27 18:10:21 UTC
On my Athlon64 system, it takes 2 seconds to compile without __DEBUGGING__ and
1 minute 40 seconds with.

I can reproduce the problem with -O2 -finline-functions, and it goes away if I
compile with just -O2.  It also goes away if I ifdef out the cerr uses in the
average_n functions.

The .s file increases in size by a factor of 5-7 when __DEBUGGING__ is defined,
depending on the exact options used.  This explains why the compile takes so
much longer, because we are generating and optimizing so much more code.
Comment 3 Mark Mitchell 2004-08-29 18:47:33 UTC
Postponed until GCC 3.4.3.
Comment 4 Mark Mitchell 2004-11-01 00:46:10 UTC
Postponed until GCC 3.4.4.
Comment 5 Kriang Lerdsuwanakij 2004-11-28 05:27:39 UTC
Confirmed.  GCC 4.0 doesn't have this problem.
Comment 6 Eric Botcazou 2004-12-01 13:35:07 UTC
Here's the time report on x86-64:

Execution times (seconds)
 garbage collection    :   3.72 ( 3%) usr   0.09 ( 1%) sys   4.03 ( 3%) wall
 callgraph construction:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 cfg construction      :   0.48 ( 0%) usr   0.04 ( 0%) sys   0.47 ( 0%) wall
 cfg cleanup           :   0.65 ( 1%) usr   0.05 ( 1%) sys   0.71 ( 1%) wall
 trivially dead code   :   0.68 ( 1%) usr   0.00 ( 0%) sys   0.66 ( 1%) wall
 life analysis         :   1.99 ( 2%) usr   0.02 ( 0%) sys   2.03 ( 2%) wall
 life info update      :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall
 alias analysis        :   1.08 ( 1%) usr   0.04 ( 0%) sys   1.16 ( 1%) wall
 register scan         :   0.52 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall
 rebuild jump labels   :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall
 preprocessing         :   0.05 ( 0%) usr   0.03 ( 0%) sys   0.77 ( 1%) wall
 parser                :   0.42 ( 0%) usr   0.11 ( 1%) sys   0.46 ( 0%) wall
 name lookup           :   0.09 ( 0%) usr   0.04 ( 0%) sys   0.19 ( 0%) wall
 expand                :   0.68 ( 1%) usr   0.05 ( 1%) sys   0.77 ( 1%) wall
 varconst              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 integration           :   0.91 ( 1%) usr   0.04 ( 0%) sys   0.94 ( 1%) wall
 jump                  :   1.72 ( 1%) usr   0.07 ( 1%) sys   1.81 ( 1%) wall
 CSE                   :   0.95 ( 1%) usr   0.00 ( 0%) sys   1.01 ( 1%) wall
 global CSE            :  29.65 (26%) usr   1.12 (13%) sys  30.85 (24%) wall
 loop analysis         :  50.84 (44%) usr   6.48 (76%) sys  59.33 (47%) wall
 bypass jumps          :   1.03 ( 1%) usr   0.10 ( 1%) sys   1.14 ( 1%) wall
 CSE 2                 :   0.38 ( 0%) usr   0.01 ( 0%) sys   0.37 ( 0%) wall
 branch prediction     :   7.13 ( 6%) usr   0.06 ( 1%) sys   7.25 ( 6%) wall
 flow analysis         :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall
 combiner              :   0.76 ( 1%) usr   0.01 ( 0%) sys   0.82 ( 1%) wall
 if-conversion         :   0.28 ( 0%) usr   0.01 ( 0%) sys   0.27 ( 0%) wall
 regmove               :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 local alloc           :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.44 ( 0%) wall
 global alloc          :   6.34 ( 6%) usr   0.11 ( 1%) sys   6.49 ( 5%) wall
 reload CSE regs       :   1.19 ( 1%) usr   0.01 ( 0%) sys   1.30 ( 1%) wall
 flow 2                :   0.19 ( 0%) usr   0.01 ( 0%) sys   0.19 ( 0%) wall
 if-conversion 2       :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall
 peephole 2            :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall
 rename registers      :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 scheduling 2          :   0.37 ( 0%) usr   0.00 ( 0%) sys   0.37 ( 0%) wall
 machine dep reorg     :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.09 ( 0%) wall
 reorder blocks        :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall
 shorten branches      :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall
 final                 :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall
 symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 rest of compilation   :   0.58 ( 1%) usr   0.00 ( 0%) sys   0.56 ( 0%) wall
 TOTAL                 : 114.68             8.51           126.68
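
For readers who want to rank the passes in a report like the one above, here is a minimal sketch. It assumes the row layout shown in this comment (name, then usr/sys/wall with percentages); the helper names `parse_report` and `top_passes` are made up for illustration and are not part of GCC. Only a few rows are copied in to keep it short.

```python
import re

# A few rows copied from the time report above; the TOTAL row has no
# percentages, so the pattern below deliberately skips it.
REPORT = """\
 garbage collection    :   3.72 ( 3%) usr   0.09 ( 1%) sys   4.03 ( 3%) wall
 global CSE            :  29.65 (26%) usr   1.12 (13%) sys  30.85 (24%) wall
 loop analysis         :  50.84 (44%) usr   6.48 (76%) sys  59.33 (47%) wall
 branch prediction     :   7.13 ( 6%) usr   0.06 ( 1%) sys   7.25 ( 6%) wall
 global alloc          :   6.34 ( 6%) usr   0.11 ( 1%) sys   6.49 ( 5%) wall
 TOTAL                 : 114.68             8.51           126.68
"""

# Assumed row layout: "<name> : <usr> (NN%) usr <sys> (NN%) sys <wall> (NN%) wall"
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>[\d.]+) \(\s*\d+%\) usr"
    r"\s+(?P<sys>[\d.]+) \(\s*\d+%\) sys"
    r"\s+(?P<wall>[\d.]+) \(\s*\d+%\) wall\s*$"
)


def parse_report(text):
    """Return (name, usr, sys, wall) tuples for every per-pass row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append(
                (m["name"], float(m["usr"]), float(m["sys"]), float(m["wall"]))
            )
    return rows


def top_passes(text, n=3):
    """Return the n passes with the largest wall-clock time."""
    return sorted(parse_report(text), key=lambda r: r[3], reverse=True)[:n]


if __name__ == "__main__":
    for name, usr, sys_time, wall in top_passes(REPORT):
        print(f"{name:<20} wall {wall:6.2f}s")
```

On the figures above, loop analysis and global CSE together account for roughly 71% of the 126.68 s wall time.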
Comment 7 Andrew Pinski 2004-12-10 22:13:28 UTC
Subject: Fwd:  [3.4 Regression] compile time regression, when adding cerr usage



Begin forwarded message:

> From: andre maute <andre.maute@gmx.de>
> Date: December 10, 2004 5:18:11 PM EST
> To: gcc-bugs@gcc.gnu.org
> Subject: [Bug rtl-optimization/16613] [3.4 Regression] compile time 
> regression, when adding cerr usage
>
> Once more I couldn't upload an attachment
> with the Bugzilla upload form, so I am sending it here.
>
> I'll refer to it later.
>
> Regards Andre







Comment 8 Andrew Pinski 2004-12-10 22:13:29 UTC
Created attachment 7723 [details]
compiletimetest2.cc.gz
Comment 9 andre.maute 2004-12-10 22:36:16 UTC
I don't think the compile-time regression is solved in the current g++-4.0.
I made some compile-time measurements with the attached file compiletimetest2.cc
on a PIII 550. The __DEBUG__ builds only enable <iostream> and some cerr lines.
 
> g++ -v 
Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.1/specs 
Configured with: ../gcc-3.2.1/configure --prefix=/usr --enable-shared 
--enable-languages=c,c++ --enable-threads=posix --with-slibdir=/lib 
--enable-__cxa_atexit --enable-clocale=gnu
Thread model: posix 
gcc version 3.2.1 
 
> g++-3.3.5 -v 
Reading specs from /opt/gcc-3.3.5/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/specs 
Configured with: ../gcc-3.3.5/configure --prefix=/opt/gcc-3.3.5 
--enable-shared --enable-languages=c,c++ --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --program-suffix=-3.3.5 
--with-cpu=pentium3 
Thread model: posix 
gcc version 3.3.5 
 
> g++-3.4.3 -v 
Reading specs from /opt/gcc-3.4.3/lib/gcc/i686-pc-linux-gnu/3.4.3/specs 
Configured with: ../gcc-3.4.3/configure --prefix=/opt/gcc-3.4.3 
--enable-shared --enable-languages=c,c++ --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --program-suffix=-3.4.3 
--with-arch=pentium3 
Thread model: posix 
gcc version 3.4.3 
 
> g++-4.0-20041205 -v
Reading specs from /opt/gcc-4.0-20041205/lib/gcc/i686-pc-linux-gnu/4.0.0/specs 
Configured with: ../gcc-4.0-20041205/configure --prefix=/opt/gcc-4.0-20041205 
--enable-shared --enable-languages=c,c++ --enable-threads=posix 
--enable-__cxa_atexit --enable-clocale=gnu --disable-nls 
--program-suffix=-4.0-20041205 --with-arch=pentium3 
Thread model: posix 
gcc version 4.0.0 20041205 (experimental) 
 
 
> time g++ -c -O3 -D __NDEBUG__ compiletimetest2.cc 
real     0m9.957s    user     0m9.910s    sys      0m0.090s 
 
> time g++ -c -O3 -D __DEBUG__ compiletimetest2.cc 
real    0m13.544s    user    0m13.270s    sys      0m0.170s 
 
> time g++-3.3.5 -c -O3 -D __NDEBUG__ compiletimetest2.cc 
real     0m9.881s    user     0m9.740s    sys      0m0.130s 
 
> time g++-3.4.3 -c -O3 -D __NDEBUG__ compiletimetest2.cc 
real    0m18.614s    user    0m18.240s    sys      0m0.310s 
 
> time g++-3.4.3 -c -O3 -D __DEBUG__ compiletimetest2.cc 
real    0m21.563s    user    0m21.050s    sys      0m0.510s 
 
> time g++-4.0-20041205 -c -O3 -D __NDEBUG__ compiletimetest2.cc 
real    0m24.983s    user    0m24.740s    sys      0m0.160s 
 
> time g++-4.0-20041205 -c -O3 -D __DEBUG__ compiletimetest2.cc 
real    0m31.269s    user    0m30.230s    sys     0m0.240s 
 
 
Regards Andre 
Comment 10 andre.maute 2004-12-10 22:42:58 UTC
Sorry, I missed the following two lines:
 
> time g++-3.3.5 -c -O3 -D __DEBUG__ compiletimetest2.cc 
real    0m12.454s    user    0m12.210s    sys     0m0.230s 
 
So g++-3.3.5 is really good,
and I hope that we won't see 40 s for g++-4.1 ;-)
 
Regards Andre 
 
 
Comment 11 Daniel Berlin 2004-12-10 23:12:32 UTC
(In reply to comment #9)
>  g++-4.0-20041205 -v 
> Reading specs from /opt/gcc-4.0-20041205/lib/gcc/i686-pc-linux-gnu/4.0.0/specs 
> Configured with: ../gcc-4.0-20041205/configure --prefix=/opt/gcc-4.0-20041205 
> --enable-shared --enable-languages=c,c++ --enable-threads=posix 
> --enable-__cxa_atexit --enable-clocale=gnu --disable-nls 
> --program-suffix=-4.0-20041205 --with-arch=pentium3 
> Thread model: posix 
> gcc version 4.0.0 20041205 (experimental) 

You need to add --disable-checking to your configure flags in order to compare
against a development branch.
When release branches are made, checking gets turned off by default by
configure, but it's on in non-release branches.

The compiler is a *lot* slower with checking enabled.
Comment 12 andre.maute 2004-12-11 03:28:36 UTC
Now with --disable-checking in the configure parameters for gcc-4.0:
 
> g++-4.0-20041205-1-dc -v 
Reading specs 
from /opt/gcc-4.0-20041205-1-dc/lib/gcc/i686-pc-linux-gnu/4.0.0/specs 
Configured with: ../gcc-4.0-20041205/configure 
--prefix=/opt/gcc-4.0-20041205-1-dc --enable-shared --enable-languages=c,c++ 
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu 
--disable-nls --program-suffix=-4.0-20041205-1-dc --with-arch=pentium3 
--disable-checking 
Thread model: posix 
gcc version 4.0.0 20041205 (experimental) 
 
> time g++-4.0-20041205-1-dc -c -O3 -D __NDEBUG__ compiletimetest2.cc 
real    0m18.300s    user    0m18.050s    sys      0m0.250s 
> time g++-4.0-20041205-1-dc -c -O3 -D __DEBUG__ compiletimetest2.cc 
real    0m21.368s    user    0m20.960s    sys      0m0.440s 
 
This looks much better, but still not as good as with g++-3.3.5.
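
To make the comparison across comments 9, 10 and 12 easier to read, the "real" times can be reduced to two ratios per compiler. This is only a sketch of the arithmetic; the choice of g++-3.3.5 as baseline and the helper name `summarize` are assumptions made for illustration.

```python
# Wall-clock ("real") seconds copied from comments 9, 10 and 12,
# stored as (time with -D __NDEBUG__, time with -D __DEBUG__).
TIMINGS = {
    "3.2.1": (9.957, 13.544),
    "3.3.5": (9.881, 12.454),
    "3.4.3": (18.614, 21.563),
    "4.0-20041205 (checking)": (24.983, 31.269),
    "4.0-20041205 (--disable-checking)": (18.300, 21.368),
}

BASELINE = "3.3.5"  # assumed reference point for the comparison


def summarize(timings, baseline):
    """Compute per-compiler debug overhead and slowdown vs. the baseline."""
    base_ndebug, _ = timings[baseline]
    summary = {}
    for name, (ndebug, debug) in timings.items():
        summary[name] = {
            # Extra cost of enabling the cerr lines, as a fraction.
            "debug_overhead": debug / ndebug - 1.0,
            # How much slower the NDEBUG build is than the baseline compiler.
            "vs_baseline": ndebug / base_ndebug,
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(TIMINGS, BASELINE).items():
        print(
            f"{name:<36} debug +{row['debug_overhead']:5.1%}"
            f"  vs {BASELINE}: {row['vs_baseline']:.2f}x"
        )
```

With these numbers, the 4.0 snapshot built with --disable-checking is still about 1.85 times slower than g++-3.3.5 on the NDEBUG build.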
Comment 13 Andrew Pinski 2004-12-11 08:10:38 UTC
Here are my timings for 3.3.2 vs. the mainline on powerpc-darwin (yesterday's, before a patch which 
should speed it up a little more):
[zhivago:~/src/localgccPRs] pinskia% time ~/gcc-3.3//bin/gcc pr16613.ii -S -D__DEBUG__ -O2
5.600u 0.340s 0:06.99 84.9%     0+0k 0+3io 0pf+0w
[zhivago:~/src/localgccPRs] pinskia% time ~/local3/bin/gcc pr16613.ii -S -D__DEBUG__ -O2
6.190u 0.590s 0:07.69 88.1%     0+0k 0+0io 0pf+0w

Plus this is a mainline preprocessed source so it looks like the mainline is slightly slower than 3.3.2 
(about 10%).

For -O0:
[zhivago:~/src/localgccPRs] pinskia% time ~/gcc-3.3//bin/gcc pr16613.ii -S -D__DEBUG__
2.980u 0.260s 0:04.70 68.9%     0+0k 0+0io 0pf+0w
[zhivago:~/src/localgccPRs] pinskia% time ~/local3/bin/gcc pr16613.ii -S -D__DEBUG__
1.600u 0.270s 0:02.29 81.6%     0+0k 0+0io 0pf+0w

So we are much faster for -O0.
Comment 14 andre.maute 2005-05-01 23:42:47 UTC
I have run my compile-time test again.
Here are the timings; something has gotten worse after 2005/03/26.
   
> g++-4.0-20050326 -v   
Using built-in specs.   
Target: i686-pc-linux-gnu   
Configured with: ../gcc-4.0-20050326/configure --prefix=/opt/gcc-4.0-20050326   
--program-suffix=-4.0-20050326 --enable-shared --enable-languages=c,c++   
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu   
--disable-nls --disable-checking --with-arch=pentium3   
Thread model: posix   
gcc version 4.0.0 20050326 (prerelease)   
   
> g++-4.0-20050409 -v   
Using built-in specs.   
Target: i686-pc-linux-gnu   
Configured with: ../gcc-4.0-20050409/configure --prefix=/opt/gcc-4.0-20050409   
--program-suffix=-4.0-20050409 --enable-shared --enable-languages=c,c++   
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu   
--disable-nls --disable-checking --with-arch=pentium3   
Thread model: posix   
gcc version 4.0.0 20050409 (prerelease)   
   
> g++-4.0-20050430 -v   
Using built-in specs.   
Target: i686-pc-linux-gnu   
Configured with: ../gcc-4.0-20050430/configure --prefix=/opt/gcc-4.0-20050430   
--program-suffix=-4.0-20050430 --enable-shared --enable-languages=c,c++   
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu   
--disable-nls --disable-checking --with-arch=pentium3   
Thread model: posix   
gcc version 4.0.1 20050430 (prerelease)   
     
> time g++-4.0-20050326 -c -O3 -D __NDEBUG__ compiletimetest2.cc     
real    0m16.128s    user    0m15.775s    sys     0m0.282s     
     
> time g++-4.0-20050326 -c -O3 -D __DEBUG__ compiletimetest2.cc     
real    0m18.842s    user    0m18.326s    sys     0m0.488s     
   
   
> time g++-4.0-20050409 -c -O3 -D __NDEBUG__ compiletimetest2.cc   
real    0m52.158s    user    0m51.030s    sys     0m1.012s   
   
> time g++-4.0-20050409 -c -O3 -D __DEBUG__ compiletimetest2.cc   
real    0m55.566s    user    0m54.460s    sys     0m0.996s   
   
> time g++-4.0-20050430 -c -O3 -D __NDEBUG__ compiletimetest2.cc   
real    0m52.450s    user    0m51.277s    sys     0m0.982s   
   
> time g++-4.0-20050430 -c -O3 -D __DEBUG__ compiletimetest2.cc   
real    0m55.270s    user    0m54.364s    sys     0m0.906s   
  
The assembly file generated with g++-4.0-20050409 is about twice as large as that
generated with g++-4.0-20050326 (using the option "-save-temps").
  
> g++-4.0-20050326 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps  
> ls -al compiletimetest2.s
-rw-r--r--    1 login500  users      621326 May  2 01:36 compiletimetest2.s  
  
> g++-4.0-20050409 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps  
> ls -al compiletimetest2.s
-rw-r--r--    1 login500  users     1186872 May  2 01:34 compiletimetest2.s  
  
> g++-4.0-20050430 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps  
> ls -al compiletimetest2.s
-rw-r--r--    1 login500  users     1186872 May  2 01:37 compiletimetest2.s 
 
Using -ftime-report reveals only that nearly every optimization pass has gotten
slower.
 
Regards Andre 
Comment 15 andre.maute 2005-05-05 20:12:37 UTC
I want to supplement my compile-time tests, which show that a regression
was introduced between 2005/03/26 and 2005/04/02.
  
> g++-4.0-20050402 -v  
Using built-in specs.  
Target: i686-pc-linux-gnu  
Configured with: ../gcc-4.0-20050402/configure --prefix=/opt/gcc-4.0-20050402  
--program-suffix=-4.0-20050402 --enable-shared --enable-languages=c,c++  
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu  
--disable-nls --disable-checking --with-arch=pentium3  
Thread model: posix  
gcc version 4.0.0 20050402 (prerelease)  
  
> time g++-4.0-20050402 -c -O3 -D __NDEBUG__ compiletimetest2.cc  
real    0m52.126s    user    0m51.187s    sys     0m0.777s  
  
> time g++-4.0-20050402 -c -O3 -D __DEBUG__ compiletimetest2.cc  
real    0m55.409s    user    0m54.280s    sys     0m0.935s  
  
> g++-4.0-20050402 -c -O3 -D __NDEBUG__ compiletimetest2.cc -save-temps  
> ls -al compiletimetest2.s  
-rw-r--r--    1 login500  users     1186149 May  5 22:05 compiletimetest2.s  
  
Regards Andre  
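
The narrowing-down in comments 14 and 15 can be written as a small check over the snapshot data. This is a sketch only: the 1.5x threshold and the function name `first_regression` are arbitrary choices for illustration, and the figures are the NDEBUG "real" times and .s sizes reported above.

```python
# Snapshot date -> (real seconds for the -D __NDEBUG__ build, size of the .s file in bytes),
# copied from comments 14 and 15.
SNAPSHOTS = {
    "20050326": (16.128, 621326),
    "20050402": (52.126, 1186149),
    "20050409": (52.158, 1186872),
    "20050430": (52.450, 1186872),
}


def first_regression(snapshots, threshold=1.5):
    """Return (previous, current, time_ratio, size_ratio) for the first
    adjacent pair of snapshots whose compile time grows by more than
    `threshold`, or None if there is no such pair."""
    dates = sorted(snapshots)  # YYYYMMDD strings sort chronologically
    for prev, cur in zip(dates, dates[1:]):
        prev_time, prev_size = snapshots[prev]
        cur_time, cur_size = snapshots[cur]
        if cur_time / prev_time > threshold:
            return prev, cur, cur_time / prev_time, cur_size / prev_size
    return None


if __name__ == "__main__":
    prev, cur, time_ratio, size_ratio = first_regression(SNAPSHOTS)
    print(f"regression between {prev} and {cur}: "
          f"time x{time_ratio:.2f}, assembly size x{size_ratio:.2f}")
```

On these figures the compile time roughly triples and the assembly output nearly doubles between the 20050326 and 20050402 snapshots.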
Comment 16 Andrew Pinski 2005-07-22 21:13:21 UTC
Moving to 4.0.2 per Mark.
Comment 17 Gabriel Dos Reis 2007-01-18 02:58:58 UTC
Won't fix for GCC-4.0.x