This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/33780] different results between O3 and O0
- From: "dominiq at lps dot ens dot fr" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 15 Oct 2007 15:51:19 -0000
- Subject: [Bug middle-end/33780] different results between O3 and O0
- References: <bug-33780-6642@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #3 from dominiq at lps dot ens dot fr 2007-10-15 15:51 -------
> that causes numerical results with CP2K to change going from -O0 to -O3.
If you expect the optimizer to actually transform your computation, you should
also expect some change in the numerical results, so put some tolerance in the
comparisons.
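As an illustration of what "some tolerance" could look like, here is a minimal
sketch of a comparison helper. The names tolerance_check, approx_equal, abs_tol
and rel_tol are hypothetical and do not come from CP2K or from the bug report;
the mixed absolute/relative form is only one common choice.

```fortran
module tolerance_check
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
contains
  ! Hypothetical helper: true when x and y agree to within
  ! abs_tol + rel_tol*max(|x|,|y|).
  pure function approx_equal(x, y, abs_tol, rel_tol) result(ok)
    real(real64), intent(in) :: x, y, abs_tol, rel_tol
    logical :: ok
    ok = abs(x - y) <= abs_tol + rel_tol*max(abs(x), abs(y))
  end function approx_equal
end module tolerance_check
```

When the expected value is close to zero, as it is for the results printed
further down in this message, only the absolute part matters: a call such as
approx_equal(f_O0, f_O3, 1.0d-11, 0.0d0) would treat 0.0 and -1.14E-12 as equal.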
How small the tolerance can be depends on your problem. I have modified your
code to compute the polynomial in three different ways (yours, plus two others,
the third being the one recommended for "numerical stability", if I did not
make any errors):
FUNCTION F(r,a,e,pf,qf)
  REAL*8 :: a(2:15)
  REAL*8 :: r,f,e, pr,pf,qf,rr
  f=0
  pf=0
  pr=r
  rr=1.0d0/r
  ! f:  explicit powers r**(i-1)
  ! pf: same sum, with the power accumulated in pr
  DO i = 2, 15
    f = f + a(i)/(r**(i-1)*REAL(i-1,8))
    pf = pf + a(i)/(pr*REAL(i-1,8))
    pr = r*pr
  END DO
  f=f-e
  pf=pf-e
  ! qf: nested (Horner) evaluation in rr = 1/r
  qf=a(15)/REAL(14,8)
  do i = 14, 2, -1
    qf=rr*qf+a(i)/REAL(i-1,8)
  end do
  qf=rr*qf-e
END FUNCTION F
PROGRAM TEST
  REAL*8 :: a(2:15)=(/-195.771601327700D0, 15343.7861339500D0, &
                      -530864.458651600D0, 10707934.3905800D0, &
                      -140099704.789000D0, 1250943273.78500D0, &
                      -7795458330.67600D0, 33955897217.3100D0, &
                      -101135640744.000D0, 193107995718.700D0, &
                      -193440560940.000D0, -4224406093.91800D0, &
                      217192386506.500D0, -157581228915.500D0/)
  REAL*8 :: r=4.51556282882533D0
  REAL*8 :: e=-2.21199635966809471000D0
  REAL*8 :: pf, qf
  REAL*8 :: f
  write(6,*) f(r,a,e,pf,qf)
  write(6,*) pf, qf
END PROGRAM TEST
[karma] f90/bug% gfc pr33780_red.f90
[karma] f90/bug% a.out
0.0000000000000000
1.36957112317759311E-012 7.66053886991358013E-012
[karma] f90/bug% gfc -O3 pr33780_red.f90
[karma] f90/bug% a.out
-1.14397380457376130E-012
1.36957112317759311E-012 1.04861186749364478E-011
So for your problem the tolerance should be ~1.0E-11. Since pf and qf don't
use powers, this is not a question of inlining, but of how the operations are
shuffled to "optimize" your computation.
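To show why shuffling operations alone can change a result, here is a small
sketch, independent of the code above. Floating-point addition is not
associative, so the same three numbers summed in two different orders need not
give the same double-precision value. The module and function names
(summation_order, sum_left, sum_right) are made up for this illustration.

```fortran
module summation_order
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
contains
  ! (a + b) + c: the parentheses fix the order of the additions.
  pure function sum_left(a, b, c) result(s)
    real(real64), intent(in) :: a, b, c
    real(real64) :: s
    s = (a + b) + c
  end function sum_left

  ! a + (b + c): same operands, different grouping.
  pure function sum_right(a, b, c) result(s)
    real(real64), intent(in) :: a, b, c
    real(real64) :: s
    s = a + (b + c)
  end function sum_right
end module summation_order
```

With IEEE 754 double-precision arithmetic, sum_left(0.1d0, 0.2d0, 0.3d0) and
sum_right(0.1d0, 0.2d0, 0.3d0) typically differ in the last bit, about 1E-16.
In a sum of large terms of alternating sign, like the polynomial above, such
rounding differences are scaled by the size of the terms, which is how they can
end up at the 1E-12 level.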
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33780