This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
Re: 32b intel fortran vs. 64b linux gfortran
- From: Edvardsen Kåre <kare dot edvardsen at uit dot no>
- To: "burnus at net-b dot de" <burnus at net-b dot de>
- Cc: "fortran at gcc dot gnu dot org" <fortran at gcc dot gnu dot org>
- Date: Mon, 14 May 2012 13:16:02 +0000
- Subject: Re: 32b intel fortran vs. 64b linux gfortran
- References: <1336994866.4783.19.camel@kare-desktop> <4FB0FCAA.3020707@net-b.de>
On Mon, 2012-05-14 at 14:38 +0200, Tobias Burnus wrote:
> On 05/14/2012 01:27 PM, Edvardsen Kåre wrote:
> > I'm trying to compare the performance of my code between 32b Intel
> > Fortran (Win7) and 64b Linux gfortran. What I see is that the argument
> > of the trigonometric functions (sin, cos, tan, etc.) only takes 7 digits
> > when compiled with my Windows 32b Intel Fortran, whereas on my 64b Linux
> > gfortran more digits are used.
>
> Can you give an example? The default REAL data type without extra flags
> should be the same:
>
>   real, volatile :: r
>   r = sqrt(2.0)
>   r = sin(r)
>   print '(f16.13)', r
>   end
>
> should print 0.9877659678459 with both compilers. (I assume that the
> math libraries give exactly, and not only nearly, the same result.)
>
> There is a difference for list-directed I/O, where gfortran prints more
> digits by default. For
>   print *, r
> one gets 0.9877660 with ifort and 0.987765968 with gfortran, but that's
> internally the same binary number.
>
> If one converts a binary FP number to a decimal number - or vice versa -
> one might have the problem that a number is not exactly representable.
> For instance, the decimal number 0.1 can be either rounded up or
> down as no binary number matches exactly; thus, if one prints the
> variable, one might get for '(f16.14)' either of the two lines
>
> 0.10000000149012
> 0.09999999403954
>
> The first line is what one typically gets for 0.1. (The second has been
> obtained by nearest(0.1,-1.0).) Thus, a compiler could simply print
> "0.1" instead as you couldn't distinguish between the numbers in 32-bit
> binary FP. It is simply an implementation choice whether one prints
> fewer (ifort) or more (gfortran) digits with "*" (list-directed I/O).
>
> When doing performance comparisons, recall that they have different
> defaults in terms of optimization and that also the same flag (-O2) can
> mean different things. For benchmarks, I use, e.g.,
> gfortran -march=native -ffast-math -funroll-loops -O3
> -finline-limit=600 -fstack-arrays -fno-protect-parens
> (and possibly compiling and linking with -flto) and
> ifort -fast
>
> For non-benchmarks, you should consider leaving out -ffast-math and
> -fno-protect-parens, as depending on the algorithm, they might lead to
> wrong results. Though, you might be lucky and your algorithm is stable
> enough for your input, such that the result is not or only negligibly
> affected. (See GCC man page/documentation and
> http://gcc.gnu.org/wiki/FloatingPointMath )
>
> (For ifort, you have to do something similar; for instance "-assume
> protect_parens" as that option is enabled by default but also something
> similar to -fno-fast-math; I think it could be -prec-div, but there
> might be more or the name could be different.)
>
>
> Note additionally that updating a compiler usually helps with the
> performance. Thus, the newest ifort should win (on average) against an
> old gfortran and vice versa. But for a single program, a factor of 2
> should not be surprising. When I compared GCC/gfortran 4.7 with ifort 12.0 and
> 12.1 using the Polyhedron benchmark, GCC was minutely (<~ 1%) faster
> than 12.0 while 12.1 was a bit less than 7% faster. And GCC 4.5 was 20%
> slower than 4.7.
>
> Tobias
Thanks, Tobias, for taking the time.
(I'm using gfortran 4.6.1 on my 64b Linux and ifort 12.1 on my 32b Win7.)
I will try out the various flags to see what happens. The reason for
asking about this was that I noticed by accident that the output from
tan(x) with 32b ifort gave a different result than with 64b gfortran.
Assigning x=0.5235987902 or x=0.5235988 gives exactly the same output on
my 32b ifort-compiled executable.
The same test on my 64b gfortran-compiled executable gives different
results, and from what I can see, the trigonometric argument
seems to be rounded off to 7 digits before the calculation in the 32b
ifort-compiled executable.
I will continue tomorrow with further testing, but for sure: hard coding
x=0.5235987902 or x=0.5235988 does not matter in my case for the result.
In detail, the interesting part of the code looks like this:
(substituting x with chi1 or chi2)
      if (chi1 .eq. chi2) then
         chi = 2.0*atan( ( r/tan(chi1) )**(1./proj_cone)
     &        * tan(chi1*0.5) )
      else
         chi = 2.0*atan( (r*proj_cone/sin(chi1))**(1./proj_cone)
     &        * tan(chi1*0.5))
      endif
      lat = (90.0-chi*deg_per_rad)*proj_hemi
This gives an inaccuracy in "lat", as the argument of tan(chi1) seems to be
rounded off to 7 digits on my Win7 machine before the calculation.
Cheers,
K.E.
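One thing that may be worth checking, offered as a sketch and not as a diagnosis of the code above: a Fortran real literal written without a kind suffix has default REAL precision, no matter how many digits appear in the source. If default REAL is 32-bit IEEE single precision, 0.5235987902 and 0.5235988 lie closer together than the spacing of representable numbers near 0.52, so they may already be the same value before tan() is called. The small test program below illustrates this. All variable names in it (x_long, x_short, d_long, d_short) are hypothetical and not taken from the original code, and it assumes no flag that changes the default REAL size (such as gfortran's -fdefault-real-8) is in effect.

```fortran
program literal_precision
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real     :: x_long, x_short      ! default REAL, assumed 32-bit IEEE
  real(dp) :: d_long, d_short      ! double precision, for comparison

  ! Literals without a kind suffix are default REAL, regardless of
  ! how many digits are written in the source.
  x_long  = 0.5235987902
  x_short = 0.5235988

  ! With a kind suffix the extra digits are kept.
  d_long  = 0.5235987902_dp
  d_short = 0.5235988_dp

  print *, 'default real literals equal: ', x_long == x_short
  print *, 'double literals equal:       ', d_long == d_short

  ! Distance to the next representable default REAL near 0.52.
  print '(a,es16.8)', ' spacing(x_short) = ', spacing(x_short)

  ! tan() evaluated in each precision.
  print '(a,f20.16)', ' tan, default real: ', tan(x_short)
  print '(a,f20.16)', ' tan, double:       ', tan(d_long)
end program literal_precision
```

Under the stated assumption, the first comparison prints T and the second prints F. If both compilers agree on that, the difference seen in "lat" would have to come from somewhere else, for example from values read at run time, from variables declared with a different kind, or from compiler flags.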