By fixing those silly loops where the last value is set to the next to
last value inside the loop, rather than afterwards, gfortran can chop off
0.30 seconds,
You are probably referring to:
   DO N = 0, NP1
      BAREA(N) = AREA(RBOUND(N))
      IF (N == NP1) BAREA(N) = BAREA(N-1)
   END DO
and
   DO N = 1, NP1
      CAREA(N) = AREA(RADIUS(N))
      IF (N == NP1) CAREA(N) = CAREA(N-1)
   END DO
I did not see much gain from hand-optimizing these loops (any difference
was within the timing noise).
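
For reference, a minimal sketch of the hand optimization I mean for the first
loop: the loop stops at NP1-1 and the last element is copied afterwards. The
AREA function and the RBOUND values below are placeholders (the real ones are
not shown here), and the rewrite is only equivalent if AREA has no side
effects, since the call AREA(RBOUND(NP1)) is no longer made.

```fortran
program hoist_last_value
  implicit none
  integer, parameter :: np1 = 6          ! hypothetical size, for illustration
  real :: rbound(0:np1), barea(0:np1), barea_ref(0:np1)
  integer :: n

  ! Hypothetical radii, only so that the loops have data to work on
  do n = 0, np1
     rbound(n) = real(n + 1)
  end do

  ! Original form: the last value is overwritten inside the loop
  do n = 0, np1
     barea_ref(n) = area(rbound(n))
     if (n == np1) barea_ref(n) = barea_ref(n-1)
  end do

  ! Hand-optimized form: no test in the loop body, last value set afterwards
  do n = 0, np1 - 1
     barea(n) = area(rbound(n))
  end do
  barea(np1) = barea(np1 - 1)

  ! Both forms perform the same arithmetic, so exact comparison is fine here
  if (any(barea /= barea_ref)) then
     print *, 'MISMATCH'
  else
     print *, 'identical'
  end if

contains

  ! Placeholder for the real AREA function (assumed pure)
  pure real function area(r)
    real, intent(in) :: r
    area = 3.14159265 * r * r
  end function area

end program hoist_last_value
```

The second loop (CAREA, starting at N = 1) can be rewritten the same way.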
leaving the monster array assignment with vector sqrt in
eos as the one performance differentiation.
If you are referring to
   VOL(:NP1) = DX(:NP1)/3.0*(BAREA(:NP1-1) + SQRT(BAREA(:NP1-1)*BAREA(1:NP1)) &
               + BAREA(1:NP1))
I did not see any improvement by replacing it by
   do n = 1, np1
      vol(n) = dx(n)*(barea(n-1)+sqrt(barea(n-1)*barea(n))+barea(n))/3.0
   end do
However replacing
TEMP(:NODES) = IENER(:NODES)/SHEAT
PRES(:NODES) = (CGAMMA - 1.0)*DENS(:NODES)*IENER(:NODES)
GAMMA(:NODES) = CGAMMA
CS(:NODES) = SQRT(CGAMMA*PRES(:NODES)/DENS(:NODES))
by
   GAMMA(:NODES) = CGAMMA
   const = (CGAMMA - 1.0)*CGAMMA
   RSHEAT = 1.0/SHEAT
   do n = 1, nodes
      TEMP(n) = IENER(n)*RSHEAT
      PRES(n) = (CGAMMA - 1.0)*DENS(n)*IENER(n)
      CS(n) = SQRT(const*IENER(n))
   end do
gave me a significant saving: from ~17 s to ~13 s. I did not try to work out
how much of the saving comes from the "loop fusion" (which "good" compilers
should detect) and how much from the removal of the unnecessary division (I
am not sure that any compiler could do this optimization, or would be
allowed to).
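
The two forms are algebraically the same, since CGAMMA*PRES/DENS reduces to
CGAMMA*(CGAMMA - 1.0)*IENER when DENS is nonzero, but they need not agree to
the last bit in floating point. A small sketch to check how far apart they
are; the array size and the values of CGAMMA, SHEAT, DENS and IENER are made
up for the test and are not taken from the real code:

```fortran
program eos_fused_check
  implicit none
  integer, parameter :: nodes = 1000               ! hypothetical size
  real, parameter :: cgamma = 1.4, sheat = 717.5   ! hypothetical constants
  real, dimension(nodes) :: dens, iener
  real, dimension(nodes) :: temp_a, pres_a, cs_a   ! array-assignment results
  real, dimension(nodes) :: temp_l, pres_l, cs_l   ! fused-loop results
  real :: const, rsheat
  integer :: n

  ! Made-up positive input data
  do n = 1, nodes
     dens(n)  = 1.0 + 0.001*real(n)
     iener(n) = 2.0e5 + 10.0*real(n)
  end do

  ! Original form: separate array assignments, two divisions per node
  temp_a = iener/sheat
  pres_a = (cgamma - 1.0)*dens*iener
  cs_a   = sqrt(cgamma*pres_a/dens)

  ! Rewritten form: one fused loop, divisions replaced by multiplications
  const  = (cgamma - 1.0)*cgamma
  rsheat = 1.0/sheat
  do n = 1, nodes
     temp_l(n) = iener(n)*rsheat
     pres_l(n) = (cgamma - 1.0)*dens(n)*iener(n)
     cs_l(n)   = sqrt(const*iener(n))
  end do

  ! Largest relative differences between the two forms
  print *, 'max rel. diff TEMP:', maxval(abs(temp_l - temp_a)/abs(temp_a))
  print *, 'max rel. diff PRES:', maxval(abs(pres_l - pres_a)/abs(pres_a))
  print *, 'max rel. diff CS:  ', maxval(abs(cs_l - cs_a)/abs(cs_a))
end program eos_fused_check
```

I would expect any differences in TEMP and CS to be at the level of
single-precision rounding; if so, that would explain why a compiler does not
make this substitution unless it is told that such rounding changes are
acceptable.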