This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



10 to 20% speedup with -m64 on Intel Core2Duo


Some time ago I had a look at PR30388 and got the following results:

              g77 -O2     g95 -O2     gfc -O2   gfc -m64 -O2

MFLOPS:          1063        1061         858           1129

vs. g77:                                 -19%            +6%

Since the evening is quite calm I decided to check if this speedup with
-m64 is generic or not and I got the following timings for the Polyhedron
test suite:

================================================================================
Date & Time     : 27 Dec 2007 22:24:03
Test Name       : pbharness
Compile Command : gfc %n.f90 -m64 -O3 -ffast-math -funroll-loops -finline-limit=600 --param min-vect-loop-bound=2 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      300.0
Target Error %  :      0.200
Minimum Repeats :     2
Maximum Repeats :     5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      4.27       50712     13.10       2  0.0420
      aermod    100.72     1200712     30.19       2  0.0066
         air      6.68       73204      9.37       2  0.0267
    capacita      3.92       64520     56.49       2  0.0628
     channel      2.43       42752      2.29       2  0.0437
       doduc     14.42      179504     48.66       2  0.0021
     fatigue      5.69       76696     11.17       5  0.3700
     gas_dyn      6.32      700392     10.24       5  0.7605
      induct     12.79      160672     66.27       2  0.0053
       linpk      1.53       38400     27.54       2  0.0000
        mdbx      3.77       68856     15.16       2  0.0099
          nf     11.69      112312     31.63       2  0.0174
     protein     10.71      110048     46.78       2  0.0064
      rnflow     10.95      163144     37.28       2  0.0268
    test_fpu     10.08      150080     12.72       2  0.0314
        tfft      1.37       30488      2.79       2  0.1074

Geometric Mean Execution Time =      18.20 seconds

================================================================================
Date & Time     : 27 Dec 2007 22:44:36
Test Name       : pbharness
Compile Command : gfc %n.f90 -O3 -ffast-math -funroll-loops -finline-limit=600 --param min-vect-loop-bound=2 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      300.0
Target Error %  :      0.200
Minimum Repeats :     2
Maximum Repeats :     5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      4.48       46532     16.88       2  0.0207
      aermod    104.92     1288460     37.09       2  0.0081
         air      6.67       80956     11.36       5  0.0849
    capacita      3.79       68332     62.40       2  0.0048
     channel      2.65       50780      2.51       4  0.1828
       doduc     14.27      183264     57.41       2  0.0009
     fatigue      6.11       84564     14.02       2  0.0642
     gas_dyn      5.93      699872     12.01       5  0.2754
      induct     11.83      160132     73.59       2  0.0177
       linpk      1.67       46512     27.57       2  0.0145
        mdbx      3.84       72672     16.78       2  0.0149
          nf     16.73      157220     31.86       2  0.0016
     protein     11.62      113868     54.90       2  0.0337
      rnflow     11.87      187316     45.56       2  0.0889
    test_fpu     11.38      182544     14.56       2  0.0653
        tfft      1.44       34420      3.03       5  0.2973

Geometric Mean Execution Time =      20.86 seconds

================================================================================
Polyhedron Benchmark Validator
Copyright (C) Polyhedron Software Ltd - 2004 - All rights reserved

The results were obtained on an Intel Core2Duo 2.16 GHz with 2 GB of RAM
under Darwin9.1 with gfortran 4.3 at revision 131206.

Is this 10 to 20% speedup with -m64 expected, and how generic is it?
In the assembly code of the inner loop of the test case in PR30388,
the main differences I can see are in the addressing:
%eax, %ebp, ... in 32-bit mode and %rn, ... in 64-bit mode.
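
For anyone who wants to look at the per-benchmark ratios rather than only
the geometric means, here is a minimal sketch in Python. The run times are
copied from the "Ave Run" column of the two tables above; the names
times_m64, times_default, geometric_mean and speedups are only illustrative
and are not part of pbharness.

```python
from math import exp, log

# "Ave Run" times in seconds, copied from the two pbharness tables.
times_m64 = {
    "ac": 13.10, "aermod": 30.19, "air": 9.37, "capacita": 56.49,
    "channel": 2.29, "doduc": 48.66, "fatigue": 11.17, "gas_dyn": 10.24,
    "induct": 66.27, "linpk": 27.54, "mdbx": 15.16, "nf": 31.63,
    "protein": 46.78, "rnflow": 37.28, "test_fpu": 12.72, "tfft": 2.79,
}
# Same options without -m64 (the second run).
times_default = {
    "ac": 16.88, "aermod": 37.09, "air": 11.36, "capacita": 62.40,
    "channel": 2.51, "doduc": 57.41, "fatigue": 14.02, "gas_dyn": 12.01,
    "induct": 73.59, "linpk": 27.57, "mdbx": 16.78, "nf": 31.86,
    "protein": 54.90, "rnflow": 45.56, "test_fpu": 14.56, "tfft": 3.03,
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


def speedups(baseline, candidate):
    """Per-benchmark ratio baseline/candidate (above 1 means candidate is faster)."""
    return {name: baseline[name] / candidate[name] for name in baseline}


ratios = speedups(times_default, times_m64)

if __name__ == "__main__":
    # Print the benchmarks from largest to smallest gain with -m64.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:>9}  {ratio:5.2f}x")
    print(f"geometric mean, -m64    : {geometric_mean(times_m64.values()):6.2f} s")
    print(f"geometric mean, default : {geometric_mean(times_default.values()):6.2f} s")
    print(f"geometric mean of ratios: {geometric_mean(ratios.values()):6.2f}x")
```

The two geometric means should agree with the values printed by the harness
(18.20 and 20.86 seconds) up to rounding of the input times. The individual
ratios are not uniform: linpk and nf are almost unchanged, while ac shows
the largest gain.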

TIA

Dominique

