This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.


Re: [gomp] timings (was: [gomp] omp performance question)


Daniel Franke wrote:
This is to summarize the OMP timings I posted yesterday.

A relevant code fragment and explanations may be found here:
http://gcc.gnu.org/ml/fortran/2006-10/msg00753.html

$> uname -mo
x86_64 GNU/Linux

Four cores, dual CPU, dual core each.


Single-threaded, optimized build (FCFLAGS="-O1"):

real     64m36.502s
user     64m36.886s
sys       0m00.040s

OpenMP-enabled builds (FCFLAGS="-O1 -fopenmp"):

                  | OMP_DYNAMIC=FALSE | OMP_DYNAMIC=TRUE |
------------------+-------------------+------------------+
                  |    57m36.233s     |    165m54.954s   |
OMP_NUM_THREADS=4 |    91m59.685s     |     98m13.528s   |
                  |    26m31.735s     |    109m26.858s   |
------------------+-------------------+------------------+
                  |    85m54.649s     |    168m46.983s   |
OMP_NUM_THREADS=8 |   125m20.442s     |     97m53.903s   |
                  |    48m15.253s     |    113m00.108s   |
------------------+-------------------+------------------+

Processes that ran with OMP_DYNAMIC=TRUE employed three threads
(most of the time). With the default values of
OMP_NUM_THREADS/OMP_DYNAMIC, i.e. without specifying them explicitly,
the resulting timings are:

real     67m16.611s
user    112m22.885s
sys      25m44.641s

All these numbers are one-shot measurements; no variances were measured or are available.

Hints and suggestions are still highly welcome =)

Daniel


If you run 2 threads, does it make a difference whether you assign both to one socket or spread them out, one per socket (e.g. using taskset)? Why do you care about performance with 8 threads? If you do care, what are you doing to pair them efficiently?
Wouldn't it perform better if you organized the data so that the inner loops could run at stride 1? If the current version runs better with threads paired properly on shared caches, that would tend to confirm that you have a cache sharing problem.
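
The two-thread placement comparison could be set up roughly as below. This is only a sketch under assumptions: the program name ./a.out and the helper run_pinned are placeholders, and the CPU numbers assume that CPUs 0 and 1 sit on the same socket while CPUs 0 and 2 sit on different ones. The real numbering is machine-specific and should be checked first, e.g. via the "physical id" fields in /proc/cpuinfo. Note that taskset only restricts the process to the given CPU set; the scheduler may still move threads around within that set.

```shell
#!/bin/sh
# Hypothetical program name; replace with the real OpenMP binary.
PROG=${PROG:-./a.out}

# ASSUMPTION: CPUs 0 and 1 share a socket, CPUs 0 and 2 do not.
# Verify on the actual machine before trusting any timings.
SAME_SOCKET=${SAME_SOCKET:-0,1}
SPREAD=${SPREAD:-0,2}

# run_pinned CPULIST COMMAND [ARGS...]
# Runs COMMAND with two OpenMP threads, restricted to the CPUs in CPULIST.
run_pinned() {
    cpus=$1
    shift
    echo "== 2 threads on CPUs $cpus =="
    OMP_NUM_THREADS=2 OMP_DYNAMIC=FALSE taskset -c "$cpus" "$@"
}

# Only run the comparison if the binary actually exists.
if [ -x "$PROG" ]; then
    run_pinned "$SAME_SOCKET" "$PROG"
    run_pinned "$SPREAD" "$PROG"
fi
```

Timing both calls (for instance with the shell's time) and comparing the results would show whether thread placement matters for this code.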


