This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.
Re: Polyhedron benchmark on Opteron
- From: dominiq at lps dot ens dot fr (Dominique Dhumieres)
- To: fortran at gcc dot gnu dot org
- Date: Sat, 30 Sep 2006 17:22:17 +0200 (CEST)
- Subject: Re: Polyhedron benchmark on Opteron
> By fixing those silly loops where the last value is set to the next to
> last value inside the loop, rather than afterwards, gfortran can chop off
> 0.30 seconds,
You are probably speaking of:
DO N = 0, NP1
  BAREA(N) = AREA(RBOUND(N))
  IF (N == NP1) BAREA(N) = BAREA(N-1)
END DO
and
DO N = 1, NP1
  CAREA(N) = AREA(RADIUS(N))
  IF (N == NP1) CAREA(N) = CAREA(N-1)
END DO
Hand-optimizing these loops gave me no measurable gain (any difference was
within the timing noise).
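
For reference, here is a minimal sketch of the kind of hand optimization discussed above: the loop stops one element short and the last value is copied after it. The array size, the radii, and the body of AREA are placeholders invented for illustration; the benchmark's real definitions are not shown in this thread.

```fortran
program hoist_last_iteration
  implicit none
  integer, parameter :: np1 = 8        ! hypothetical size, for illustration only
  real :: rbound(0:np1), barea_a(0:np1), barea_b(0:np1)
  integer :: n

  ! Hypothetical boundary radii (assumption, not the benchmark's data).
  do n = 0, np1
    rbound(n) = 0.5*real(n)
  end do

  ! Original form: the test on N is evaluated in every iteration.
  do n = 0, np1
    barea_a(n) = area(rbound(n))
    if (n == np1) barea_a(n) = barea_a(n-1)
  end do

  ! Hand-optimized form: stop one element short, then copy the last value.
  do n = 0, np1 - 1
    barea_b(n) = area(rbound(n))
  end do
  barea_b(np1) = barea_b(np1-1)

  ! Both forms should fill the array identically.
  if (any(barea_a /= barea_b)) error stop 'results differ'
  print *, 'both forms agree'

contains

  ! Placeholder standing in for the benchmark's AREA function (assumption).
  pure real function area(r)
    real, intent(in) :: r
    area = 3.14159265*r*r
  end function area

end program hoist_last_iteration
```

The rewritten loop also skips one call to AREA for the last element, whose result the original form discards anyway.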
> leaving the monster array assignment with vector sqrt in
> eos as the one performance differentiation.
If you are speaking of
VOL(:NP1) = DX(:NP1)/3.0*(BAREA(:NP1-1)+SQRT(BAREA(:NP1-1)*BAREA(1&
& :NP1))+BAREA(1:NP1))
I did not see any improvement by replacing it with
do n = 1, np1
  vol(n) = dx(n)*(barea(n-1)+sqrt(barea(n-1)*barea(n))+barea(n))/3.0
end do
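
The correspondence between the array-section form and the explicit loop can be checked with a small sketch. It assumes that BAREA is declared with lower bound 0 (as the DO N = 0, NP1 loop suggests) and that VOL and DX start at 1; the declarations and data below are invented for illustration. Because the division by 3.0 is applied at a different point in the two forms, the results may differ in the last bits, so the comparison uses a tolerance.

```fortran
program vol_forms
  implicit none
  integer, parameter :: np1 = 6        ! hypothetical size
  real :: dx(np1), barea(0:np1)        ! bounds are an assumption
  real :: vol_a(np1), vol_b(np1)
  integer :: n

  ! Hypothetical data, for illustration only.
  do n = 0, np1
    barea(n) = 1.0 + real(n)**2
  end do
  dx = 0.25

  ! Array-section form: BAREA(:NP1-1) covers 0..NP1-1, BAREA(1:NP1) covers 1..NP1.
  vol_a(:np1) = dx(:np1)/3.0*(barea(:np1-1) + sqrt(barea(:np1-1)*barea(1:np1)) &
                              + barea(1:np1))

  ! Explicit loop form: element n pairs barea(n-1) with barea(n).
  do n = 1, np1
    vol_b(n) = dx(n)*(barea(n-1) + sqrt(barea(n-1)*barea(n)) + barea(n))/3.0
  end do

  ! The forms agree up to rounding, not necessarily bit for bit.
  if (any(abs(vol_a - vol_b) > 1.0e-5*abs(vol_a))) error stop 'forms disagree'
  print *, 'max relative difference:', maxval(abs(vol_a - vol_b)/abs(vol_a))
end program vol_forms
```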
However, replacing
TEMP(:NODES) = IENER(:NODES)/SHEAT
PRES(:NODES) = (CGAMMA - 1.0)*DENS(:NODES)*IENER(:NODES)
GAMMA(:NODES) = CGAMMA
CS(:NODES) = SQRT(CGAMMA*PRES(:NODES)/DENS(:NODES))
with
GAMMA(:NODES) = CGAMMA
const = (CGAMMA - 1.0)*CGAMMA
RSHEAT = 1.0/SHEAT
do n = 1, nodes
  TEMP(n) = IENER(n)*RSHEAT
  PRES(n) = (CGAMMA - 1.0)*DENS(n)*IENER(n)
  CS(n) = SQRT(const*IENER(n))
end do
gave me a significant saving: from ~17 s to ~13 s. I did not try to separate
the contribution of the "loop fusion" (which "good" compilers should detect)
from that of the removal of the unnecessary divisions (I am not sure that any
compiler could, or is allowed to, perform this optimization).
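
The rewrite relies on a cancellation: since PRES = (CGAMMA - 1.0)*DENS*IENER, the expression CGAMMA*PRES/DENS reduces to (CGAMMA - 1.0)*CGAMMA*IENER, so DENS drops out of CS. The sketch below compares the two forms on invented data (the values of CGAMMA and SHEAT and the array contents are assumptions, and GAMMA is omitted because it does not affect the comparison). The two forms are equal algebraically but can round differently, which is presumably why a compiler that preserves strict floating-point results would not make this substitution without being asked; GCC's -ffast-math relaxes such constraints.

```fortran
program eos_forms
  implicit none
  integer, parameter :: nodes = 5                   ! hypothetical size
  real, parameter :: cgamma = 1.4, sheat = 717.5    ! hypothetical constants
  real :: iener(nodes), dens(nodes)
  real :: temp_a(nodes), pres_a(nodes), cs_a(nodes)
  real :: temp_b(nodes), pres_b(nodes), cs_b(nodes)
  real :: const, rsheat
  integer :: n

  ! Hypothetical state, for illustration only.
  do n = 1, nodes
    iener(n) = 2.0e5 + 1.0e4*real(n)
    dens(n)  = 1.0 + 0.1*real(n)
  end do

  ! Original form: three array assignments, two divisions per element.
  temp_a(:nodes) = iener(:nodes)/sheat
  pres_a(:nodes) = (cgamma - 1.0)*dens(:nodes)*iener(:nodes)
  cs_a(:nodes)   = sqrt(cgamma*pres_a(:nodes)/dens(:nodes))

  ! Rewritten form: one fused loop, divisions hoisted or cancelled.
  const  = (cgamma - 1.0)*cgamma
  rsheat = 1.0/sheat
  do n = 1, nodes
    temp_b(n) = iener(n)*rsheat
    pres_b(n) = (cgamma - 1.0)*dens(n)*iener(n)
    cs_b(n)   = sqrt(const*iener(n))
  end do

  ! Compare with a tolerance: the forms may differ in the last bits.
  if (any(abs(temp_a - temp_b) > 1.0e-5*abs(temp_a))) error stop 'TEMP differs'
  if (any(abs(pres_a - pres_b) > 1.0e-5*abs(pres_a))) error stop 'PRES differs'
  if (any(abs(cs_a - cs_b) > 1.0e-5*abs(cs_a))) error stop 'CS differs'
  print *, 'max relative difference in CS:', maxval(abs(cs_a - cs_b)/abs(cs_a))
end program eos_forms
```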
Dominique