[gomp3] Real OpenMP 3.0 tasking support

Johannes Singler singler@ira.uka.de
Wed Jun 11 15:05:00 GMT 2008


Jakub Jelinek wrote:
> On Wed, Jun 11, 2008 at 02:26:04PM +0200, Johannes Singler wrote:
>> Are nested (possibly recursive) tasks supported (i.e. actually executed 
>> in parallel)?
> 
> To be precise, there are no such things as "nested" tasks.
> Tasks have a parent-child hierarchy for #pragma omp taskwait purposes
> (that waits on all immediate children of a task if any) 

<snipped>

> Please see the libgomp.c/sort-1.c and libgomp.fortran/strassen.f90
> testcases/benchmarks to see tasking in action.

I looked into sort-1.c. Some results on a machine with two quad-core 
CPUs (8 cores in total):

singler@i10pc121:~/scratch> OMP_NUM_THREADS=1 ./a.out
Threads: 1
sort1: 3.11416
sort2: 3.11768
sort3: 3.12816
singler@i10pc121:~/scratch> OMP_NUM_THREADS=2 ./a.out
Threads: 2
sort1: 1.61807
sort2: 1.81284
sort3: 1.62286
singler@i10pc121:~/scratch> OMP_NUM_THREADS=4 ./a.out
Threads: 4
sort1: 0.905296
sort2: 2.50576
sort3: 1.62746
singler@i10pc121:~/scratch> OMP_NUM_THREADS=8 ./a.out
Threads: 8
sort1: 0.546678
sort2: 1.35682
sort3: 1.64447

sort1 scales nicely, while sort2 behaves erratically (probably because 
it is completely unbalanced): it even gets slower going from 2 to 4 
threads. sort3 (the one using the task construct) seems to be stuck at a 
speedup of a bit less than 2, no matter whether 2, 4 or 8 threads are 
used. This is why I suspect that only two threads do actual work 
(possibly leaving the last block to sort to the master thread).

Adding

     printf("%d ", omp_get_thread_num());

to the task created in line 319 shows that only two different thread IDs 
are used in the 4- and 8-thread cases, e.g.

1 3 3 1 3 3 3 1 3 1 1 3 3 3 1 1 1 1 3 3 3 1 3 3 3 3 3 3 3 3 1 3 1 1 ...

I know that one should not rely on the value of omp_get_thread_num() too 
much in a task, but IMHO this is a strong hint.

-- Johannes


