This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug lto/45810] 40% slowdown when using LTO for a single-file program


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810

--- Comment #20 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 23:20:34 UTC ---
> This makes hookes_law estimate to be 91 instructions, so -finline-limit=183
> should be enough.

With the patch in comment #19, I instead find a threshold of -finline-limit=256.
On top of that, as shown by the timings below, the patch increases the threshold
for ac.f90 and breaks the vectorization for induct.f90.

Would the patch in comment #15 and an increase of the default value for
-finline-limit to 300 be acceptable at this stage (with the usual bells and
whistles: SPEC, ...)?

================================================================================
Date & Time     : 23 Jan 2011 23:18:23
Test Name       : pbharness
Compile Command : gfcp %n.f90 -Ofast -funroll-loops -ftree-loop-linear
-fomit-frame-pointer -finline-limit=300 -fwhole-program -flto -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      300.0
Target Error %  :      0.200
Minimum Repeats :     2
Maximum Repeats :     5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      3.15       50536      9.58       2  0.0156
      aermod    104.98     1652280     18.79       2  0.1011
         air      8.83       90048      6.99       5  0.7334
    capacita      5.95       89056     40.21       2  0.0174
     channel      1.65       34448      2.99       2  0.0502
       doduc     14.59      208056     27.91       2  0.0036
     fatigue      4.80       89264      4.72       2  0.0212
     gas_dyn     11.65      148176      4.66       5  0.4391
      induct     11.20      205976     22.34       2  0.0672
       linpk      1.59       21536     21.70       2  0.0299
        mdbx      5.78       84760     12.58       2  0.0119
          nf      7.60       83712     29.53       5  0.3854
     protein     11.69      163760     35.18       2  0.1109
      rnflow     15.23      167296     26.97       2  0.0890
    test_fpu     11.33      145848     11.06       5  0.3715
        tfft      1.13       22072      3.30       2  0.0607

Geometric Mean Execution Time =      12.89 seconds

================================================================================
Date & Time     : 23 Jan 2011 23:54:28
Test Name       : pbharness
Compile Command : gfcp %n.f90 -Ofast -funroll-loops -ftree-loop-linear
-fomit-frame-pointer -finline-limit=600 -fwhole-program -flto -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      300.0
Target Error %  :      0.200
Minimum Repeats :     2
Maximum Repeats :     5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      3.59       54576      8.10       2  0.0062
      aermod    103.73     1558344     18.91       2  0.0238
         air     10.47       89992      6.77       5  0.1563
    capacita      7.47      101344     40.08       2  0.0137
     channel      1.65       34448      2.97       5  0.5872
       doduc     15.82      216376     27.61       2  0.0000
     fatigue      5.10       89264      4.73       2  0.0000
     gas_dyn     12.09      152264      4.69       5  0.6428
      induct     11.10      205976     22.33       2  0.0403
       linpk      1.59       21536     21.72       2  0.0368
        mdbx      5.85       84760     12.58       2  0.0517
          nf     11.34      108280     28.98       2  0.1087
     protein     11.65      163760     35.18       3  0.1422
      rnflow     17.39      183696     26.71       2  0.0243
    test_fpu     11.49      145816     11.02       2  0.1226
        tfft      1.43       22072      3.29       2  0.0911

Geometric Mean Execution Time =      12.70 seconds
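
For readers who want to check the two summary figures, here is a minimal
sketch that recomputes the geometric mean of the "Ave Run" column for both
runs and lists the per-benchmark change between -finline-limit=300 and
-finline-limit=600. The run times are copied from the tables above. The names
(geometric_mean, relative_changes, LIMIT_300, LIMIT_600) are illustrative and
not part of pbharness, and it is an assumption that the harness takes a plain
unweighted geometric mean. Under that assumption the results should agree with
the reported values up to rounding.

```python
import math

# Benchmark names and average run times (seconds), copied from the two
# tables above. LIMIT_300 and LIMIT_600 are illustrative names only.
BENCHMARKS = [
    "ac", "aermod", "air", "capacita", "channel", "doduc", "fatigue",
    "gas_dyn", "induct", "linpk", "mdbx", "nf", "protein", "rnflow",
    "test_fpu", "tfft",
]
LIMIT_300 = [9.58, 18.79, 6.99, 40.21, 2.99, 27.91, 4.72, 4.66,
             22.34, 21.70, 12.58, 29.53, 35.18, 26.97, 11.06, 3.30]
LIMIT_600 = [8.10, 18.91, 6.77, 40.08, 2.97, 27.61, 4.73, 4.69,
             22.33, 21.72, 12.58, 28.98, 35.18, 26.71, 11.02, 3.29]


def geometric_mean(values):
    """Unweighted geometric mean, computed via logs to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_changes(names, before, after):
    """Percent change in run time per benchmark (negative means faster)."""
    return {n: 100.0 * (a - b) / b for n, b, a in zip(names, before, after)}


gm_300 = geometric_mean(LIMIT_300)
gm_600 = geometric_mean(LIMIT_600)
changes = relative_changes(BENCHMARKS, LIMIT_300, LIMIT_600)

if __name__ == "__main__":
    print(f"geometric mean, -finline-limit=300: {gm_300:.2f} s")
    print(f"geometric mean, -finline-limit=600: {gm_600:.2f} s")
    # Sort so the largest speedups come first.
    for name, pct in sorted(changes.items(), key=lambda kv: kv[1]):
        print(f"{name:>10s} {pct:+6.2f} %")
```

In this comparison most of the difference between the two means comes from
ac; the other benchmarks move by only a few percent in either direction.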

