This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



[gomp] omp performance question


Hi all,
hopefully this is not off-topic; if it is, please let me know where
else to ask. Thanks.

I am toying with the OpenMP implementation available in gfortran-4.2
(prerelease). After carefully profiling my program (gprof,
valgrind/callgrind), I identified two sections of code where approx
95% of execution time is spent, 47.x% each. Both sections have nested
DO loops similar to:

 sum(:) = 0.0
 DO l = 0, lmax
   tmp(:) = 0.0
   DO m = 0, l
     tmp(:) = tmp(:) + ...
   END DO
   sum(:) = sum(:) + tmp(:) + ...
 END DO

Therefore, I concluded that OMP PARALLEL DO could improve matters, since
appropriate SMP hardware is available. Contrary to intuition, I found:

Single-threaded (FCFLAGS=-O1) timings on x86_64, dual CPU (dual core
each):
real    64m36.502s
user    64m36.886s
sys     0m0.040s

Same machine, OpenMP enabled (FCFLAGS="-O1 -fopenmp"):
real    67m16.611s
user    112m22.885s
sys     25m44.641s

Due to an ICE in the Intel Fortran compiler (see [1-3]), I have no
second compiler to compare these timings against. Could someone with
more experience with the GNU OpenMP implementation comment on the
actual code snippet given below [4]?

Daniel


[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29629
[2] http://www.openmp.org/pipermail/omp/2006/000551.html
[3] http://www.openmp.org/pipermail/omp/2006/000552.html

[4] The actual computation, OpenMP statements included. Besides
CONJG(), no functions or subroutines are called, everything is
precomputed and stored within arrays. This function is easily called
20,000,000 times during an annealing procedure.

FUNCTION intensity(sa, s)
 USE math, ONLY: PI
 USE dammin_dam, ONLY: dam

 TYPE(simulated_annealing), INTENT(in) :: sa
 REAL(DBL), DIMENSION(:), INTENT(in)   :: s

 REAL(DBL), DIMENSION(size(s))    :: intensity

 COMPLEX(DBL), DIMENSION(size(s)) :: Alm, Al0
 REAL(DBL), DIMENSION(size(s))    :: sumAlm
 INTEGER      :: l, m

 intensity = 0.0

 !$OMP PARALLEL DO PRIVATE(l, m, Al0, Alm, sumAlm), REDUCTION(+:intensity)
 DO l = 0, sa%max_harmonics        ! sa%max_harmonics ~ 10 to 20
   sumAlm = 0.0
   DO m = 1, l
     Alm    = sa%current%alm(l, m, :)
     sumAlm = sumAlm + 2.0 * Alm * CONJG(Alm)
   END DO
   Al0       = sa%current%alm(l, 0, :)
   intensity = intensity + Al0 * CONJG(Al0) + sumAlm
 END DO
 !$OMP END PARALLEL DO

 intensity = 2.0 * PI**2 * intensity
END FUNCTION
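
In case it helps anyone reproduce the timings, here is a minimal standalone
sketch of the same loop pattern. It is not the actual program: the array
sizes (lmax = 15, ns = 64), the call count (ncalls) and the random fill are
assumptions made up for illustration, and the derived type sa is replaced by
a plain array alm. The parallel region is entered once per call, as in the
function above.

```fortran
! Standalone timing sketch; all sizes and names here are illustrative
! assumptions, not taken from the real program.
PROGRAM omp_intensity_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: DBL = KIND(1.0D0)
  INTEGER, PARAMETER :: lmax = 15, ns = 64, ncalls = 100000
  COMPLEX(DBL) :: alm(0:lmax, 0:lmax, ns)
  REAL(DBL)    :: re(0:lmax, 0:lmax, ns), im(0:lmax, 0:lmax, ns)
  REAL(DBL)    :: intensity(ns), checksum
  INTEGER      :: l, m, n, t0, t1, rate

  ! Fill the coefficient array with arbitrary data.
  CALL RANDOM_NUMBER(re)
  CALL RANDOM_NUMBER(im)
  alm = CMPLX(re, im, KIND=DBL)

  checksum = 0.0_DBL
  CALL SYSTEM_CLOCK(t0, rate)
  DO n = 1, ncalls
    intensity = 0.0_DBL
    ! One parallel region per call, with an array reduction, as in [4].
    !$OMP PARALLEL DO PRIVATE(l, m) REDUCTION(+:intensity)
    DO l = 0, lmax
      DO m = 1, l
        intensity = intensity &
                  + 2.0_DBL * REAL(alm(l, m, :) * CONJG(alm(l, m, :)), DBL)
      END DO
      intensity = intensity + REAL(alm(l, 0, :) * CONJG(alm(l, 0, :)), DBL)
    END DO
    !$OMP END PARALLEL DO
    ! Accumulate a checksum so the work cannot be optimized away.
    checksum = checksum + SUM(intensity)
  END DO
  CALL SYSTEM_CLOCK(t1)

  PRINT *, 'checksum =', checksum
  PRINT *, 'seconds  =', REAL(t1 - t0, DBL) / REAL(rate, DBL)
END PROGRAM omp_intensity_sketch
```

Building it once with "gfortran -O1" and once with "gfortran -O1 -fopenmp"
and comparing the reported wall-clock seconds should show whether the
per-call cost of the parallel region matters at these (assumed) sizes.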

