This is the mail archive of the gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
- From: 김규래<msca8h at naver dot com>
- To: <gcc at gcc dot gnu dot org>
- Cc: Jakub Jelinek<jakub at redhat dot com>
- Date: Thu, 06 Jun 2019 03:25:24 +0900
- Subject: Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
- References: <firstname.lastname@example.org> <20190603182101.GS19695@tucnak>
Hi, thanks for the detailed explanation.
I think I now get the picture.
Judging from my current understanding, the task-parallelism currently works as follows:
1. Tasks are placed in a global shared queue.
2. Workers consume the tasks by polling the queue in a while loop, just as in self-scheduling (dynamic scheduling).
Then the improvements, including work-stealing, would be:
1. Each worker holds a dedicated task queue, reducing resource contention.
2. The tasks are distributed among the queues in a round-robin fashion.
3. Work-stealing resolves the remaining load imbalance.
If the above statements are correct, I guess the task priority should be given some special treatment?
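The three steps above can be sketched as follows. This is only a single-threaded illustration of the scheduling policy, not libgomp code: the names (struct deque, submit, pop_own, steal, next_task) and the sizes are made up for the example, and a real runtime would need locking or atomics around every queue access.

```c
#include <assert.h>
#include <stddef.h>

#define NWORKERS 4   /* hypothetical team size */
#define QCAP 64      /* hypothetical per-queue capacity, no wrap-around */

/* Hypothetical per-worker queue (step 1); not a libgomp structure.  */
struct deque
{
  int items[QCAP];
  size_t head;       /* oldest task, the end thieves take from */
  size_t tail;       /* one past the newest task, the owner's end */
};

static struct deque queues[NWORKERS];
static size_t next_worker;   /* round-robin cursor */

/* Step 2: hand each new task to the next queue in round-robin order.  */
static void
submit (int task)
{
  struct deque *q = &queues[next_worker];
  assert (q->tail < QCAP);
  q->items[q->tail++] = task;
  next_worker = (next_worker + 1) % NWORKERS;
}

/* The owner takes the newest task from its own queue.  */
static int
pop_own (size_t self, int *task)
{
  struct deque *q = &queues[self];
  if (q->head == q->tail)
    return 0;
  *task = q->items[--q->tail];
  return 1;
}

/* Step 3: an idle worker takes the oldest task of the first
   non-empty victim it finds.  */
static int
steal (size_t self, int *task)
{
  for (size_t i = 1; i < NWORKERS; i++)
    {
      struct deque *q = &queues[(self + i) % NWORKERS];
      if (q->head != q->tail)
        {
          *task = q->items[q->head++];
          return 1;
        }
    }
  return 0;
}

/* A worker prefers its own queue and steals only when it is empty.
   Returns 0 when no task is left anywhere.  */
static int
next_task (size_t self, int *task)
{
  return pop_own (self, task) || steal (self, task);
}
```

With six tasks submitted, worker 2 owns only task 2; every further call to next_task (2, ...) is served by stealing from the other queues. Task priorities are not modelled here at all.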
From: "Jakub Jelinek"<email@example.com>
Sent: 2019-06-04 (Tue) 03:21:01 (GMT+09:00)
Subject: Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
On Tue, Jun 04, 2019 at 03:01:13AM +0900, 김규래 wrote:
> I've been studying the libgomp task parallelism system.
> I have a few questions.
> First, tracing the events shows that only the main thread calls GOMP_task.
No, any thread can call GOMP_task, in particular the thread that encountered
the #pragma omp task construct.
The GOMP_task function then decides based on the clauses of the construct
(passed in various ways through the arguments of that function) whether it
will be included (executed by the encountering thread), or queued for
later execution. In the latter case, it will be scheduled during a barrier
(implicit or explicit), see gomp_barrier_handle_tasks called from the
bar.[ch] code, or at other spots, e.g. during taskwait construct
(GOMP_taskwait) or at the end of taskgroup (GOMP_taskgroup_end).
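A small user-level sketch of the two cases described above, assuming the file is compiled with -fopenmp (without it the pragmas are ignored and the loop simply runs serially). The function name sum_with_tasks is made up for the example. The if clause is one of the clauses that influences the decision: when it evaluates to false, the task is executed right away by the encountering thread; otherwise the runtime may queue it.

```c
#include <assert.h>

/* Sums 0 .. n-1, creating one task per element.  */
static long
sum_with_tasks (int n)
{
  long total = 0;
  assert (n >= 0);

  #pragma omp parallel
  #pragma omp single
  {
    for (int i = 0; i < n; i++)
      {
        /* Even i: the task may be queued for later execution.
           Odd i: the if clause is false, so the encountering thread
           executes the task immediately.  */
        #pragma omp task if(i % 2 == 0) shared(total) firstprivate(i)
        {
          #pragma omp atomic
          total += i;
        }
      }

    /* Queued child tasks are completed no later than here.  */
    #pragma omp taskwait
  }
  return total;
}
```

In this sketch only the thread executing the single region encounters the task construct, which would match a trace where one thread makes all the GOMP_task calls; the other threads of the team can still pick up the queued tasks while they wait at the barrier that ends the single region.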
> How do the other worker threads enter the libgomp runtime?
If you never encounter a parallel, teams or target construct, then there is
just one thread that does everything (well, the library is written such that
if you by hand pthread_create, each such thread acts as a separate initial
thread from OpenMP POV).
Threads are created e.g. during a parallel construct (GOMP_parallel), which
for non-nested parallelism, as the standard requires, reuses existing
threads if possible or spawns new ones. See mainly team.c: gomp_team_start
is the function that spawns new threads or wakes the ones waiting for
work, and gomp_thread_start in the same file is the function actually run
by the threads that libgomp creates.
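As a rough illustration, assuming compilation with -fopenmp, a parallel construct like the one below is the point where the additional threads show up. The function name team_size is made up for the example; without -fopenmp the pragmas are ignored and the function returns 1.

```c
#include <assert.h>

/* Counts how many threads execute the parallel region.  */
static int
team_size (void)
{
  int threads = 0;

  /* Every thread of the team runs this block once; the threads other
     than the encountering one are those created or woken up by the
     runtime for this region.  */
  #pragma omp parallel
  {
    #pragma omp atomic
    threads++;
  }

  assert (threads >= 1);
  return threads;
}
```

The number returned depends on the machine and on settings such as OMP_NUM_THREADS, so it should not be relied on beyond being at least 1.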
> I can't find the entry point of the worker threads from the event tracing and the assembly dump.
> Second, how is the task priority set?
By the user, through priority clause, passed to GOMP_task and then taken
into account when handling tasks in the various queues.
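A minimal sketch of setting the priority from user code, assuming -fopenmp and a compiler that supports OpenMP 4.5; the function name run_with_priorities is made up for the example. Note that the priority is only a hint, and that values above the max-task-priority setting (OMP_MAX_TASK_PRIORITY, which defaults to 0) are capped, so the order in which the tasks actually run is not something the example can promise.

```c
#include <assert.h>

/* Creates n tasks with increasing priority and returns how many ran.  */
static int
run_with_priorities (int n)
{
  int done = 0;
  assert (n >= 0);

  #pragma omp parallel
  #pragma omp single
  {
    for (int i = 0; i < n; i++)
      {
        /* Larger values ask the runtime to prefer this task.  */
        #pragma omp task priority(i) shared(done)
        {
          #pragma omp atomic
          done++;
        }
      }
  }
  /* All tasks have completed by the end of the parallel region.  */
  return done;
}
```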