This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: 32b intel fortran vs. 64b linux gfortran


On Mon, 2012-05-14 at 14:38 +0200, Tobias Burnus wrote:
> On 05/14/2012 01:27 PM, Edvardsen Kåre wrote:
> > I'm trying to compare the performance of my code between 32b intel
> > fortran (Win7) and 64b linux gfortran. What I see is that the argument
> > for the trigonometric functions (sin, cos, tan, etc.) only takes 7 digits
> > on my Windows 32b intel fortran when compiled, whereas on my 64b linux
> > gfortran more digits are used.
> 
> Can you give an example? The default REAL data type without extra flags 
> should be the same:
> 
>    real, volatile :: r
>    r = sqrt(2.0)
>    r = sin (r)
>    print '(f16.13)', r
>    end
> 
> should print 0.9877659678459 with both compilers. (I assume that the 
> math libraries give exactly, and not only nearly, the same result.)
> 
> There is a difference for list-directed I/O, where gfortran prints more 
> digits by default. For
>    print *, r
> one has with ifort 0.9877660 and with gfortran 0.987765968 but that's 
> internally the same binary number.
> 
> If one converts a binary FP number to a decimal number - or vice versa - 
> one might have the problem that a number is not exactly representable. 
> For instance, the decimal number 0.1 can be either rounded up or 
> down as no binary number matches exactly; thus, if one prints the 
> variable, one might get for '(f16.14)' either of the two lines
> 
> 0.10000000149012
> 0.09999999403954
> 
> The first line is what one typically gets for 0.1. (The second has been 
> obtained by nearest(0.1,-1.0).) Thus, a compiler could simply print 
> "0.1" instead as you couldn't distinguish between the numbers in 32-bit 
> binary FP. It is simply an implementation choice whether one prints 
> fewer (ifort) or more (gfortran) digits with "*" (list-directed I/O).
> 
> When doing performance comparisons, recall that they have different 
> defaults in terms of optimization and that also the same flag (-O2) can 
> mean different things.  For benchmarks, I use, e.g.,
>    gfortran -march=native -ffast-math -funroll-loops -O3 
> -finline-limit=600 -fstack-arrays -fno-protect-parens
> (and possibly compiling and linking with -flto) and
>    ifort -fast
> 
> For non-benchmarks, you should consider leaving out -ffast-math and 
> -fno-protect-parens, as depending on the algorithm, they might lead to 
> wrong results. Though, you might be lucky and your algorithm is stable 
> enough for your input, such that the result is not or only negligibly 
> affected. (See GCC man page/documentation and 
> http://gcc.gnu.org/wiki/FloatingPointMath )
> 
> (For ifort, you have to do something similar; for instance "-assume 
> protect_parens", as that option is not enabled by default, but also something 
> similar to -fno-fast-math; I think it could be -prec-div, but there 
> might be more or the name could be different.)
> 
> 
> Note additionally that updating a compiler usually helps with the 
> performance. Thus, the newest ifort should win (on average) against an 
> old gfortran and vice versa. But for a single program, a factor of 2 should 
> not be surprising. When I compared GCC/gfortran 4.7 with ifort 12.0 and 
> 12.1 using the Polyhedron benchmark, GCC was minutely (<~ 1%) faster 
> than 12.0 while 12.1 was a bit less than 7% faster. And GCC 4.5 was 20% 
> slower than 4.7.
> 
> Tobias
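
The two points above (fewer or more printed digits for the same stored value, and 0.1 versus its lower neighbour) can be checked with a small sketch like the following. The module and function names are made up for illustration, and transfer(x, 0) assumes that default REAL and default INTEGER are both 32 bits wide, which should hold for gfortran and ifort when no extra flags are given:

```fortran
module real_bits
  implicit none
contains
  ! Return the raw bit pattern of a default REAL as a default INTEGER.
  ! Assumption: both kinds are 32 bits wide.
  pure integer function bit_pattern(x)
    real, intent(in) :: x
    bit_pattern = transfer(x, 0)
  end function bit_pattern
end module real_bits

program print_vs_bits
  use real_bits
  implicit none
  real :: r, below

  r = sin(sqrt(2.0))
  print *, r                        ! list-directed: digit count is up to the compiler
  print '(f16.13)', r               ! explicit format: digit count fixed by the format
  print '(z8.8)', bit_pattern(r)    ! the stored 32-bit pattern, in hex

  r = 0.1
  below = nearest(0.1, -1.0)        ! next representable number below 0.1
  print '(f16.14)', r
  print '(f16.14)', below
  print '(z8.8, 1x, z8.8)', bit_pattern(r), bit_pattern(below)
end program print_vs_bits
```

If the first hex line is identical for two compilers, any difference in the output of "print *, r" is only a matter of formatting. The last line shows that 0.1 and nearest(0.1,-1.0) are adjacent bit patterns.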

Thanks, Tobias, for taking the time.

(I'm using gfortran 4.6.1 on my 64b linux and ifort 12.1 on my 32b Win7)

I will try out the various flags to see what happens. The reason for
asking about this was that I noticed by accident that the output from
tan(x) with 32b ifort gave a different result than with 64b gfortran.

Assigning x=0.5235987902 or x=0.5235988 gives exactly the same output on
my 32b ifort compiled executable.

The same test on my 64b gfortran compiled executable gives different
results, and from what I can see, the trigonometric argument
seems to be rounded off to 7 digits before the calculation in the 32b ifort
compiled executable.
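
One way to tell whether the rounding happens inside tan or already when the literal is stored in x is a sketch like the following. It assumes x is a default (32-bit) REAL; the names sp, dp and same_in_single are made up for this sketch:

```fortran
module literal_check
  implicit none
  integer, parameter :: sp = kind(1.0)     ! default REAL
  integer, parameter :: dp = kind(1.0d0)   ! DOUBLE PRECISION
contains
  ! True if a and b become the same number once stored in a default REAL.
  pure logical function same_in_single(a, b)
    real(dp), intent(in) :: a, b
    same_in_single = real(a, sp) == real(b, sp)
  end function same_in_single
end module literal_check

program literal_demo
  use literal_check
  implicit none
  real(sp) :: xs1, xs2
  real(dp) :: xd1, xd2

  xs1 = 0.5235987902        ! default REAL literals
  xs2 = 0.5235988
  xd1 = 0.5235987902_dp     ! the same decimal values as double precision
  xd2 = 0.5235988_dp

  print '(a, l2)', 'equal as default REAL:     ', xs1 == xs2
  print '(a, l2)', 'equal as DOUBLE PRECISION: ', xd1 == xd2
  print '(a, es10.3)', 'spacing near x (single):   ', spacing(xs1)
  print '(a, 2f20.16)', 'tan, single: ', tan(xs1), tan(xs2)
  print '(a, 2f20.16)', 'tan, double: ', tan(xd1), tan(xd2)
end program literal_demo
```

As default REAL the two literals should compare equal with any compiler: they differ by about 1e-8, while neighbouring single-precision numbers near 0.52 are about 6e-8 apart. As DOUBLE PRECISION they differ. So if one build gives different tan results for the two values, x may be double precision there, for example through a different declaration or a flag such as -fdefault-real-8 (this is only a guess, to be checked against the actual build).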

I will continue tomorrow with further testing, but for sure: hard coding
x=0.5235987902 or x=0.5235988 does not matter in my case for the result.
In detail, the interesting part of the code looks like this:
(substituting x with chi1 or chi2)

        if (chi1 .eq. chi2) then
          chi = 2.0*atan( ( r/tan(chi1) )**(1./proj_cone)
     &          * tan(chi1*0.5) )
        else
          chi = 2.0*atan( (r*proj_cone/sin(chi1))**(1./proj_cone)
     &          * tan(chi1*0.5))
        endif
        lat = (90.0-chi*deg_per_rad)*proj_hemi

giving an inaccuracy in "lat", as the argument of tan(chi1) seems to be
rounded off to 7 digits on my Win7 before the calculation.

Cheers,
K.E.



