This is the mail archive of the
mailing list for the GCC project.
Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
On Wed, Jun 5, 2019 at 9:25 PM 김규래 <firstname.lastname@example.org> wrote:
> Hi, thanks for the detailed explanation.
> I think I now get the picture.
> Judging from my current understanding, the task-parallelism currently works as follows:
> 1. Tasks are placed in a global shared queue.
> 2. Workers consume the tasks by bashing the queue in a while loop, just as in self-scheduling (dynamic scheduling).
> The improvements, including work-stealing, would then be:
> 1. Each worker holds a dedicated task queue, reducing resource contention.
> 2. The tasks are distributed in a round-robin fashion.
For nested task submission (does OpenMP support that?) you probably
want to submit to the local queue rather than round-robin, no?
> 3. Work-stealing resolves the remaining load imbalance.
> If the above statements are correct, I guess the task priority should be given some special treatment?
> Ray Kim
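The scheme proposed above (per-worker queues, round-robin distribution, stealing on an empty local queue) can be sketched roughly as follows. This is only an illustration, not libgomp code: every name here (`ws_push`, `ws_next`, `QUEUE_CAP`, `NUM_WORKERS`, and so on) is hypothetical, the queues are fixed-size rings guarded by a simple spin lock, and task priorities are ignored entirely. Nested submissions would call `ws_push` with the submitting worker's own index instead of going through the round-robin path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 64    /* hypothetical fixed capacity per worker */
#define NUM_WORKERS 4   /* hypothetical team size */

struct task { void (*fn)(void *); void *arg; };

/* One queue per worker; a spin lock keeps the sketch short. */
struct worker_queue {
    atomic_flag lock;
    size_t head;    /* index of the oldest task (the steal end) */
    size_t count;   /* number of queued tasks */
    struct task slots[QUEUE_CAP];
};

static struct worker_queue queues[NUM_WORKERS];
static atomic_uint next_worker;   /* round-robin cursor */

static void q_lock(struct worker_queue *q) {
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void q_unlock(struct worker_queue *q) {
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

void ws_init(void) {
    for (int i = 0; i < NUM_WORKERS; i++) {
        atomic_flag_clear(&queues[i].lock);
        queues[i].head = 0;
        queues[i].count = 0;
    }
    atomic_store(&next_worker, 0);
}

/* Append a task to worker w's queue; returns false if the queue is full. */
bool ws_push(int w, struct task t) {
    assert(w >= 0 && w < NUM_WORKERS);
    struct worker_queue *q = &queues[w];
    bool ok = false;
    q_lock(q);
    if (q->count < QUEUE_CAP) {
        q->slots[(q->head + q->count) % QUEUE_CAP] = t;
        q->count++;
        ok = true;
    }
    q_unlock(q);
    return ok;
}

/* Top-level submission: spread tasks over the workers in turn. */
bool ws_submit_round_robin(struct task t) {
    unsigned w = atomic_fetch_add(&next_worker, 1) % NUM_WORKERS;
    return ws_push((int)w, t);
}

/* The owner takes its newest task (LIFO). */
bool ws_pop_local(int w, struct task *out) {
    struct worker_queue *q = &queues[w];
    bool ok = false;
    q_lock(q);
    if (q->count > 0) {
        q->count--;
        *out = q->slots[(q->head + q->count) % QUEUE_CAP];
        ok = true;
    }
    q_unlock(q);
    return ok;
}

/* A thief takes the victim's oldest task (FIFO). */
static bool steal_from(int victim, struct task *out) {
    struct worker_queue *q = &queues[victim];
    bool ok = false;
    q_lock(q);
    if (q->count > 0) {
        *out = q->slots[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        ok = true;
    }
    q_unlock(q);
    return ok;
}

/* Local queue first; if it is empty, try every other worker once. */
bool ws_next(int w, struct task *out) {
    if (ws_pop_local(w, out))
        return true;
    for (int i = 1; i < NUM_WORKERS; i++)
        if (steal_from((w + i) % NUM_WORKERS, out))
            return true;
    return false;
}
```

Popping locally from the newest end while stealing from the oldest end is a common choice in work-stealing schedulers, because the owner and the thief then mostly touch opposite ends of the queue. A real implementation would likely replace the spin lock with a lock-free deque and would need a separate answer for priorities.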
> -----Original Message-----
> From: "Jakub Jelinek"<email@example.com>
> To: "김규래"<firstname.lastname@example.org>;
> Cc: <email@example.com>;
> Sent: 2019-06-04 (Tue) 03:21:01 (GMT+09:00)
> Subject: Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
> On Tue, Jun 04, 2019 at 03:01:13AM +0900, 김규래 wrote:
> > Hi,
> > I've been studying the libgomp task parallelism system.
> > I have a few questions.
> > First, tracing the events shows that only the main thread calls GOMP_task.
> No, any thread can call GOMP_task, in particular the thread that encountered
> the #pragma omp task construct.
> The GOMP_task function then decides based on the clauses of the construct
> (passed in various ways through the arguments of that function) whether it
> will be included (executed by the encountering thread), or queued for
> later execution. In the latter case, it will be scheduled during a barrier
> (implicit or explicit), see gomp_barrier_handle_tasks called from the
> bar.[ch] code, or at other spots, e.g. during a taskwait construct
> (GOMP_taskwait) or at the end of a taskgroup (GOMP_taskgroup_end).
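From the user's side, the difference between a task that is run immediately and one that is queued can be seen with the `if` clause. The following is a minimal sketch, not taken from libgomp; the function name `sum_with_tasks` and its `defer` parameter are made up for illustration. Build with `-fopenmp` to get real tasks; without it the pragmas are ignored and the loop simply runs serially with the same result.

```c
#include <assert.h>

/* Sum 1..n with one task per element.
   defer != 0: tasks may be queued and picked up later by any thread
               in the team, at the taskwait or at a barrier.
   defer == 0: if(0) makes each task undeferred, so the encountering
               thread runs it right away instead of queueing it. */
long sum_with_tasks(int n, int defer)
{
    long total = 0;
    assert(n >= 0);

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 1; i <= n; i++) {
            #pragma omp task if(defer) shared(total) firstprivate(i)
            {
                #pragma omp atomic
                total += i;
            }
        }
        /* Wait for the child tasks created above; queued tasks may be
           executed by the waiting thread here. */
        #pragma omp taskwait
    }
    return total;
}
```

Calling `sum_with_tasks(100, 1)` and `sum_with_tasks(100, 0)` should both give 5050; only where and when the task bodies run differs.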
> > How do the other worker threads enter the libgomp runtime?
> If you never encounter a parallel, teams or target construct, then there is
> just one thread that does everything (well, the library is written such that
> if you call pthread_create by hand, each such thread acts as a separate
> initial thread from the OpenMP point of view).
> Threads are created e.g. during a parallel construct (GOMP_parallel), which,
> for non-nested parallelism, reuses existing threads if possible, as the
> standard requires, or spawns new ones. See mainly team.c: gomp_team_start
> is the function that spawns new threads or wakes the ones waiting for
> work, and gomp_thread_start in the same file is the function actually run
> by the threads that libgomp creates.
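A small way to observe this from user code is to count how many threads execute a parallel region. This is only a sketch; `count_team_threads` is a hypothetical helper, not part of libgomp. Without `-fopenmp` the pragmas are ignored and the function returns 1, which matches the single-thread case described above.

```c
#include <assert.h>

/* Returns the number of threads that executed the parallel region.
   Each thread in the team runs the region body exactly once, so the
   counter ends up equal to the team size. */
int count_team_threads(void)
{
    int n = 0;

    #pragma omp parallel
    {
        #pragma omp atomic
        n++;
    }

    /* At least the encountering thread always takes part. */
    assert(n >= 1);
    return n;
}
```

Setting a breakpoint inside the region body and looking at the backtrace of a non-main thread is one way to see where worker threads enter the runtime.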
> > I can't find the entry point of the worker threads from the event tracing and the assembly dump.
> > Second, how is the task priority set?
> By the user, through the priority clause, passed to GOMP_task and then taken
> into account when handling tasks in the various queues.
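In user code the priority clause looks like the sketch below; `run_with_priorities` is a made-up name for illustration. The priority is only a scheduling hint, so the order in which the tasks run is not guaranteed, and the OpenMP specification caps the value at the maximum task priority, which is controlled by the `OMP_MAX_TASK_PRIORITY` environment variable and defaults to 0. The example therefore only checks that every task ran.

```c
#include <assert.h>

/* Creates n tasks, giving task i the priority hint i, and returns how
   many of them executed. Higher values ask the runtime to prefer the
   task, but the execution order is not guaranteed. */
int run_with_priorities(int n)
{
    int done = 0;
    assert(n >= 0);

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 0; i < n; i++) {
            #pragma omp task priority(i) shared(done)
            {
                #pragma omp atomic
                done++;
            }
        }
    }   /* the barrier at the end of the region completes all tasks */

    return done;
}
```

Running this with `OMP_MAX_TASK_PRIORITY` set to a value of at least n - 1 lets the runtime distinguish all the hints; with the default of 0 they are all treated alike.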