[gomp3] Real OpenMP 3.0 tasking support
Johannes Singler
singler@ira.uka.de
Wed Jun 11 15:05:00 GMT 2008
Jakub Jelinek wrote:
> On Wed, Jun 11, 2008 at 02:26:04PM +0200, Johannes Singler wrote:
>> Are nested (possibly recursive) tasks supported (i. e. actually executed
>> in parallel)?
>
> To be precise, there are no such things as "nested" tasks.
> Tasks have a parent-child hierarchy for #pragma omp taskwait purposes
> (that waits on all immediate children of a task if any)
<snipped>
> Please see the libgomp.c/sort-1.c and libgomp.fortran/strassen.f90
> testcases/benchmarks to see tasking in action.
I looked into sort-1.c. Here are some results on a machine with two
quad-core processors (8 cores in total):
singler@i10pc121:~/scratch> OMP_NUM_THREADS=1 ./a.out
Threads: 1
sort1: 3.11416
sort2: 3.11768
sort3: 3.12816
singler@i10pc121:~/scratch> OMP_NUM_THREADS=2 ./a.out
Threads: 2
sort1: 1.61807
sort2: 1.81284
sort3: 1.62286
singler@i10pc121:~/scratch> OMP_NUM_THREADS=4 ./a.out
Threads: 4
sort1: 0.905296
sort2: 2.50576
sort3: 1.62746
singler@i10pc121:~/scratch> OMP_NUM_THREADS=8 ./a.out
Threads: 8
sort1: 0.546678
sort2: 1.35682
sort3: 1.64447
sort1 scales nicely, while sort2 behaves erratically (probably because
it is completely unbalanced). sort3 (the one using the task construct)
seems to be stuck at a speedup of slightly less than 2, no matter
whether 2, 4, or 8 threads are used. This is why I suspect that only two
threads do actual work (possibly leaving the last block to sort to the
master thread).
Adding
printf("%d ", omp_get_thread_num());
to the task created in line 319 shows that only two different thread IDs
are used in the 4-thread and 8-thread cases, e.g.
1 3 3 1 3 3 3 1 3 1 1 3 3 3 1 1 1 1 3 3 3 1 3 3 3 3 3 3 3 3 1 3 1 1 ...
I know that one should not rely on the value of omp_get_thread_num() too
much in a task, but IMHO this is a strong hint.
-- Johannes