This is to summarize the OMP timings I posted yesterday.
A relevant code fragment and explanations may be found here:
http://gcc.gnu.org/ml/fortran/2006-10/msg00753.html
$> uname -mo
x86_64 GNU/Linux
Four cores, dual CPU, dual core each.
Single threaded, optimized build (FCFLAGS="-O1"):
real 64m36.502s
user 64m36.886s
sys 0m00.040s
OpenMP enabled builds (FCFLAGS="-O1 -fopenmp"):
                   | OMP_DYNAMIC=FALSE | OMP_DYNAMIC=TRUE |
-------------------+-------------------+------------------+
                   |     57m36.233s    |   165m54.954s    |
 OMP_NUM_THREADS=4 |     91m59.685s    |    98m13.528s    |
                   |     26m31.735s    |   109m26.858s    |
-------------------+-------------------+------------------+
                   |     85m54.649s    |   168m46.983s    |
 OMP_NUM_THREADS=8 |    125m20.442s    |    97m53.903s    |
                   |     48m15.253s    |   113m00.108s    |
-------------------+-------------------+------------------+
Processes that ran with OMP_DYNAMIC=TRUE employed three threads
(most of the time). With the default values of
OMP_NUM_THREADS/OMP_DYNAMIC, i.e. without specifying them explicitly,
the resulting timings are:
real 67m16.611s
user 112m22.885s
sys 25m44.641s
All of these numbers come from single runs; no variances were measured.
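
To make the numbers easier to compare, here is a rough Python sketch that
converts the timings to seconds and relates the wall-clock times to the
single-threaded run. It assumes that the first value in each table cell is
the "real" time (same real/user/sys order as in the other listings); the
table itself does not say so. The names to_seconds, speedups and busy_cores
are made up for this illustration.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '64m36.502s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Wall-clock time of the single-threaded -O1 build.
BASELINE_REAL = "64m36.502s"

# ASSUMPTION: the first value of each table cell is the "real" time.
REAL_TIMES = {
    "4 threads, OMP_DYNAMIC=FALSE": "57m36.233s",
    "4 threads, OMP_DYNAMIC=TRUE": "165m54.954s",
    "8 threads, OMP_DYNAMIC=FALSE": "85m54.649s",
    "8 threads, OMP_DYNAMIC=TRUE": "168m46.983s",
    "defaults": "67m16.611s",
}


def speedups(baseline: str, runs: dict) -> dict:
    """Speedup of each run relative to the baseline (values > 1 are faster)."""
    base = to_seconds(baseline)
    return {label: base / to_seconds(stamp) for label, stamp in runs.items()}


def busy_cores(real: str, user: str, sys: str) -> float:
    """Average number of busy cores, estimated as (user + sys) / real."""
    return (to_seconds(user) + to_seconds(sys)) / to_seconds(real)


if __name__ == "__main__":
    for label, factor in speedups(BASELINE_REAL, REAL_TIMES).items():
        print(f"{label:<30} {factor:5.2f}x")
    # The default run is explicitly labelled real/user/sys in the listing.
    cores = busy_cores("67m16.611s", "112m22.885s", "25m44.641s")
    print(f"defaults: about {cores:.2f} cores busy on average")
```

Under that assumption, only the OMP_NUM_THREADS=4, OMP_DYNAMIC=FALSE run
beats the single-threaded build in wall-clock time. Since these are single
runs, small differences should not be over-interpreted.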
Hints and suggestions are still highly welcome =)
Daniel