This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
Re: Re: Inaccuracies in the SUM intrinsic
- From: "Timothy C Prince" <tprince at myrealbox dot com>
- To: jblomqvi at cc dot hut dot fi
- Cc: martin at MPA-Garching dot MPG dot DE, fortran at gcc dot gnu dot org
- Date: Mon, 20 Mar 2006 21:50:36 +0000
- Subject: Re: Re: Inaccuracies in the SUM intrinsic
-----Original Message-----
From: Janne Blomqvist <jblomqvi@cc.hut.fi>
To: Martin Reinecke <martin@MPA-Garching.MPG.DE>
Date: Mon, 20 Mar 2006 22:34:38 +0200
Subject: Re: Inaccuracies in the SUM intrinsic
On Mon, Mar 20, 2006 at 04:39:54PM +0100, Martin Reinecke wrote:
> Hi,
>
> I'm not sure whether this can be considered a bug in gfortran, but the SUM
> intrinsic appears much more inaccurate than in other compilers.
> For the following program
>
> subroutine check_sum (sz)
> integer sz
> real arr(sz)
> real mean
>
> arr(:)=3.14159
> mean = sum(arr)/sz
> print *,"size: ",sz,", mean: ",mean
> end subroutine
>
> program avgtest
> implicit none
>
> call check_sum (1)
> call check_sum (10)
> call check_sum (100)
> call check_sum (1000)
> call check_sum (10000)
> call check_sum (100000)
> call check_sum (1000000)
> call check_sum (10000000)
> end
>
>
> gfortran from the current mainline produces this output:
>
> size: 1 , mean: 3.141590
> size: 10 , mean: 3.141590
> size: 100 , mean: 3.141590
> size: 1000 , mean: 3.141600
> size: 10000 , mean: 3.141134
> size: 100000 , mean: 3.143259
> size: 1000000 , mean: 3.170175
> size: 10000000 , mean: 3.473953
>
>
> The Intel compiler produces
>
> size: 1 , mean: 3.141590
> size: 10 , mean: 3.141590
> size: 100 , mean: 3.141590
> size: 1000 , mean: 3.141590
> size: 10000 , mean: 3.141590
> size: 100000 , mean: 3.141590
> size: 1000000 , mean: 3.141590
> size: 10000000 , mean: 3.141590
>
> And the Fujitsu compiler prints:
>
> size: 1 , mean: 3.14159012
> size: 10 , mean: 3.14159012
> size: 100 , mean: 3.14159012
> size: 1000 , mean: 3.14159012
> size: 10000 , mean: 3.14159012
> size: 100000 , mean: 3.14159012
> size: 1000000 , mean: 3.14159012
> size: 10000000 , mean: 3.14159012
>
> No optimisation was used. With optimisation switched on,
> gfortran also produces the accurate results.
>
> I understand that the inaccuracies are caused by accumulating numerical
> errors, and gfortran's implementation is probably in agreement with the
> standard,
> but it may still be better to use at least double precision temporaries when
> calculating floating point sums in the SUM intrinsic.
I had a look at the asm generated by gfortran with and without
optimization as well as ifort, and while I'm no asm expert it seems
that the difference is caused by gfortran optimized and ifort keeping
the accumulator variable in an fpu register, while gfortran without
optimization stores the value to memory at each iteration step.
So, this is more an effect of the well known fact that gcc without any
extra switches really means something close to "no optimization",
while most commercial compilers perform quite a lot of optimization
even if you don't explicitly ask for it.
Also, turning on sse2 code generation with ifort (-xW) shows that
ifort doesn't use a double precision accumulator either.
$ ifort -QxW martin.f90 /link /stack:200000000
$ ./martin
size: 1 , mean: 3.141590
size: 10 , mean: 3.141590
size: 100 , mean: 3.141591
size: 1000 , mean: 3.141592
size: 10000 , mean: 3.141601
size: 100000 , mean: 3.141032
size: 1000000 , mean: 3.145857
size: 10000000 , mean: 3.186140
result batched into 8 sums, each calculated without extended precision. OP must have requested extra precision of certain compilers, and not others.
How about sum(dble(arr)) ?
Tim Prince