This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug lto/45810] 40% slowdown when using LTO for a single-file program
- From: "dominiq at lps dot ens.fr" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 23 Jan 2011 23:20:36 +0000
- Subject: [Bug lto/45810] 40% slowdown when using LTO for a single-file program
- Auto-submitted: auto-generated
- References: <bug-45810-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #20 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 23:20:34 UTC ---
> This makes hookes_law estimate to be 91 instructions, so -finline-limit=183
> should be enough.
With the patch in comment #19, I instead find a threshold of -finline-limit=256.
On top of that, as shown by the timings below, the patch increases the threshold
for ac.f90 and breaks the vectorization for induct.f90.
Would the patch in comment #15, together with an increase of the default value of
-finline-limit to 300, be acceptable at this stage (with the usual bells and
whistles: SPEC, ...)?
================================================================================
Date & Time : 23 Jan 2011 23:18:23
Test Name : pbharness
Compile Command : gfcp %n.f90 -Ofast -funroll-loops -ftree-loop-linear
                  -fomit-frame-pointer -finline-limit=300 -fwhole-program -flto -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct
             linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 300.0
Target Error % : 0.200
Minimum Repeats : 2
Maximum Repeats : 5
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            3.15       50536     9.58        2  0.0156
aermod      104.98     1652280    18.79        2  0.1011
air           8.83       90048     6.99        5  0.7334
capacita      5.95       89056    40.21        2  0.0174
channel       1.65       34448     2.99        2  0.0502
doduc        14.59      208056    27.91        2  0.0036
fatigue       4.80       89264     4.72        2  0.0212
gas_dyn      11.65      148176     4.66        5  0.4391
induct       11.20      205976    22.34        2  0.0672
linpk         1.59       21536    21.70        2  0.0299
mdbx          5.78       84760    12.58        2  0.0119
nf            7.60       83712    29.53        5  0.3854
protein      11.69      163760    35.18        2  0.1109
rnflow       15.23      167296    26.97        2  0.0890
test_fpu     11.33      145848    11.06        5  0.3715
tfft          1.13       22072     3.30        2  0.0607
Geometric Mean Execution Time = 12.89 seconds
================================================================================
Date & Time : 23 Jan 2011 23:54:28
Test Name : pbharness
Compile Command : gfcp %n.f90 -Ofast -funroll-loops -ftree-loop-linear
                  -fomit-frame-pointer -finline-limit=600 -fwhole-program -flto -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct
             linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 300.0
Target Error % : 0.200
Minimum Repeats : 2
Maximum Repeats : 5
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            3.59       54576     8.10        2  0.0062
aermod      103.73     1558344    18.91        2  0.0238
air          10.47       89992     6.77        5  0.1563
capacita      7.47      101344    40.08        2  0.0137
channel       1.65       34448     2.97        5  0.5872
doduc        15.82      216376    27.61        2  0.0000
fatigue       5.10       89264     4.73        2  0.0000
gas_dyn      12.09      152264     4.69        5  0.6428
induct       11.10      205976    22.33        2  0.0403
linpk         1.59       21536    21.72        2  0.0368
mdbx          5.85       84760    12.58        2  0.0517
nf           11.34      108280    28.98        2  0.1087
protein      11.65      163760    35.18        3  0.1422
rnflow       17.39      183696    26.71        2  0.0243
test_fpu     11.49      145816    11.02        2  0.1226
tfft          1.43       22072     3.29        2  0.0911
Geometric Mean Execution Time = 12.70 seconds
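
For readers who want to check the summary lines or compare the two runs benchmark by benchmark, here is a minimal Python sketch. It is not part of pbharness. The dictionary names and helper functions are hypothetical, and the run times are simply copied from the "Ave Run (secs)" columns of the two reports above. It assumes the harness computes a plain (unweighted) geometric mean, which should reproduce the reported figures up to rounding.

```python
import math

# Average run times in seconds, copied from the two reports
# (-finline-limit=300 and -finline-limit=600). Names are illustrative only.
RUN_300 = {
    "ac": 9.58, "aermod": 18.79, "air": 6.99, "capacita": 40.21,
    "channel": 2.99, "doduc": 27.91, "fatigue": 4.72, "gas_dyn": 4.66,
    "induct": 22.34, "linpk": 21.70, "mdbx": 12.58, "nf": 29.53,
    "protein": 35.18, "rnflow": 26.97, "test_fpu": 11.06, "tfft": 3.30,
}
RUN_600 = {
    "ac": 8.10, "aermod": 18.91, "air": 6.77, "capacita": 40.08,
    "channel": 2.97, "doduc": 27.61, "fatigue": 4.73, "gas_dyn": 4.69,
    "induct": 22.33, "linpk": 21.72, "mdbx": 12.58, "nf": 28.98,
    "protein": 35.18, "rnflow": 26.71, "test_fpu": 11.02, "tfft": 3.29,
}


def geometric_mean(values):
    """Unweighted geometric mean, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def percent_change(before, after):
    """Per-benchmark run-time change in percent (negative means faster)."""
    return {
        name: 100.0 * (after[name] - before[name]) / before[name]
        for name in before
    }


print(f"geometric mean, limit=300: {geometric_mean(RUN_300.values()):.2f} s")
print(f"geometric mean, limit=600: {geometric_mean(RUN_600.values()):.2f} s")

# List the benchmarks from the largest speedup to the largest slowdown.
for name, change in sorted(percent_change(RUN_300, RUN_600).items(),
                           key=lambda item: item[1]):
    print(f"{name:9s} {change:+6.2f} %")
```

Sorting by relative change makes it easy to see which benchmarks react to the higher inline limit and which are essentially unchanged; differences of the same order as the "Estim Err %" column are best treated as noise.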