This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Re: Inaccuracies in the SUM intrinsic



-----Original Message-----
From: Janne Blomqvist <jblomqvi@cc.hut.fi>
To: Martin Reinecke <martin@MPA-Garching.MPG.DE>
Date: Mon, 20 Mar 2006 22:34:38 +0200
Subject: Re: Inaccuracies in the SUM intrinsic

On Mon, Mar 20, 2006 at 04:39:54PM +0100, Martin Reinecke wrote:
> Hi,
> 
> I'm not sure whether this can be considered a bug in gfortran, but the SUM
> intrinsic appears much more inaccurate than in other compilers.
> For the following program
> 
> subroutine check_sum (sz)
> integer sz
> real arr(sz)
> real mean
> 
> arr(:)=3.14159
> mean = sum(arr)/sz
> print *,"size: ",sz,", mean: ",mean
> end subroutine
> 
> program avgtest
> implicit none
> 
> call check_sum (1)
> call check_sum (10)
> call check_sum (100)
> call check_sum (1000)
> call check_sum (10000)
> call check_sum (100000)
> call check_sum (1000000)
> call check_sum (10000000)
> end
> 
> 
> gfortran from the current mainline produces this output:
> 
>  size:            1 , mean:    3.141590
>  size:           10 , mean:    3.141590
>  size:          100 , mean:    3.141590
>  size:         1000 , mean:    3.141600
>  size:        10000 , mean:    3.141134
>  size:       100000 , mean:    3.143259
>  size:      1000000 , mean:    3.170175
>  size:     10000000 , mean:    3.473953
> 
> 
> The Intel compiler produces
> 
>  size:            1 , mean:    3.141590
>  size:           10 , mean:    3.141590
>  size:          100 , mean:    3.141590
>  size:         1000 , mean:    3.141590
>  size:        10000 , mean:    3.141590
>  size:       100000 , mean:    3.141590
>  size:      1000000 , mean:    3.141590
>  size:     10000000 , mean:    3.141590
> 
> And the Fujitsu compiler prints:
> 
>  size:  1 , mean:  3.14159012
>  size:  10 , mean:  3.14159012
>  size:  100 , mean:  3.14159012
>  size:  1000 , mean:  3.14159012
>  size:  10000 , mean:  3.14159012
>  size:  100000 , mean:  3.14159012
>  size:  1000000 , mean:  3.14159012
>  size:  10000000 , mean:  3.14159012
> 
> No optimisation was used. With optimisation switched on,
> gfortran also produces the accurate results.
> 
> I understand that the inaccuracies are caused by accumulating numerical
> errors, and gfortran's implementation is probably in agreement with the 
> standard,
> but it may still be better to use at least double precision temporaries when
> calculating floating point sums in the SUM intrinsic. 

I had a look at the asm generated by gfortran with and without
optimization as well as ifort, and while I'm no asm expert it seems
that the difference is caused by gfortran optimized and ifort keeping
the accumulator variable in an fpu register, while gfortran without
optimization stores the value to memory at each iteration step.

So, this is more an effect of the well known fact that gcc without any
extra switches really means something close to "no optimization",
while most commercial compilers perform quite a lot of optimization
even if you don't explicitly ask for it.

Also, turning on sse2 code generation with ifort (-xW) shows that
ifort doesn't use a double precision accumulator either.

$ ifort -QxW martin.f90 /link /stack:200000000
$ ./martin
 size:            1 , mean:    3.141590
 size:           10 , mean:    3.141590
 size:          100 , mean:    3.141591
 size:         1000 , mean:    3.141592
 size:        10000 , mean:    3.141601
 size:       100000 , mean:    3.141032
 size:      1000000 , mean:    3.145857
 size:     10000000 , mean:    3.186140

result batched into 8 sums, each calculated without extended precision.  OP must have requested extra precision of certain compilers, and not others.

How about sum(dble(arr))   ?


Tim Prince


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]