This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: My current idea for improving libgomp


Hey Sho,

> I totally agree with this point.
> Currently, I'm planning to implement tied tasks using the breadth-first
> scheduler described in section 3.1 of "Evaluation of OpenMP Task
> Scheduling Strategies" by the Nanos Group.
> http://www.sarc-ip.org/files/null/Workshop/1234128788173__TSchedStrat-iwomp08.pdf
> 
> That is:
> * A team has one team queue which contains tasks.
> * A team has some (user-level) threads.
> * A thread can have one running task.
> * A thread has a private queue which contains tasks.
> * When a task is created, it is queued in the team queue.
> * Each thread steals tasks from the team queue and inserts them in its
>   private queue.
> * Once a tied task has started executing on a thread, it is queued only
>   in that thread's private queue when it encounters `taskwait'.
> * Each thread runs a task from its private queue.
> 
> But I'm not sure how to achieve good load balancing or what kind of
> cutoff strategy to use.
> As for load balancing, I'll read the Nanos4 implementation and ask the
> Nanos Group about it.
> (Of course, your advice is welcome too :-) )
> 
> As for cutoff, basically I can choose the `max-tasks' strategy or the
> `max-levels' strategy.
> When the number of tasks or recursion levels exceeds this value, the
> scheduler stops deferring tasks and executes each one immediately, like
> statements in a sequential program.
> But "Evaluation of OpenMP Task Scheduling Strategies" says the best
> cutoff strategy differs from application to application.

Right, start with distributing the queues and then think about load
balancing.
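
To make the queue layout from the quoted list concrete, here is a minimal
single-threaded sketch in C. It is not libgomp code: the names (struct task,
struct task_queue, team_spawn, worker_step, QUEUE_CAP) are hypothetical, and
the policy of refilling only when the private queue is empty is an
assumption. Locking and the re-queueing of tied tasks at `taskwait' are left
out on purpose.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 64  /* hypothetical fixed capacity, for illustration only */

/* Hypothetical task record; libgomp's real task structure is different. */
struct task {
    void (*fn)(void *);
    void *arg;
};

/* Fixed-capacity FIFO ring buffer of task pointers. */
struct task_queue {
    struct task *slots[QUEUE_CAP];
    size_t head;   /* index of the next task to pop */
    size_t count;  /* number of tasks currently queued */
};

static void queue_push(struct task_queue *q, struct task *t)
{
    assert(q->count < QUEUE_CAP);
    q->slots[(q->head + q->count) % QUEUE_CAP] = t;
    q->count++;
}

static struct task *queue_pop(struct task_queue *q)
{
    if (q->count == 0)
        return NULL;
    struct task *t = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return t;
}

struct team   { struct task_queue team_queue; };     /* shared by all threads */
struct worker { struct task_queue private_queue; };  /* one per thread */

/* Task creation: a new task goes into the shared team queue. */
static void team_spawn(struct team *tm, struct task *t)
{
    queue_push(&tm->team_queue, t);
}

/* One scheduling step: if the private queue is empty, move one task over
 * from the team queue (assumed policy), then run one task from the private
 * queue. Returns 1 if a task ran, 0 if the worker found nothing to do.
 * A real implementation would need to synchronize access to team_queue. */
static int worker_step(struct team *tm, struct worker *w)
{
    if (w->private_queue.count == 0) {
        struct task *stolen = queue_pop(&tm->team_queue);
        if (stolen != NULL)
            queue_push(&w->private_queue, stolen);
    }
    struct task *t = queue_pop(&w->private_queue);
    if (t == NULL)
        return 0;
    t->fn(t->arg);
    return 1;
}

/* Example task body: increments the int that arg points to. */
static void increment(void *arg)
{
    ++*(int *)arg;
}
```

Spawning two `increment' tasks and calling worker_step() three times runs
both tasks and then reports an idle worker. The shared team queue is the
only point of contention in this layout, so that is where load balancing
would need attention later.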

I would say don't worry too much about cut-offs at this point. Finding a
good cut-off strategy that works without drawbacks is pretty much an
open research problem. Just spawn the tasks and focus on efficient task
creation and scheduling. In my experience, going from a centralized to a
distributed task pool already makes a huge difference.
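
For reference, the `max-levels' idea from the quoted text can be sketched at
application level as below. This is only an illustration under assumptions:
a runtime would make the decision internally, and MAX_LEVELS, fib_seq,
fib_task and fib are hypothetical names with an arbitrary threshold. Built
without OpenMP support, the pragmas are ignored and the code runs
sequentially with the same result.

```c
#include <assert.h>

#define MAX_LEVELS 4  /* hypothetical cut-off depth, not a recommended value */

/* Plain sequential recursion, used once the cut-off is reached. */
static long fib_seq(int n)
{
    return n < 2 ? n : fib_seq(n - 1) + fib_seq(n - 2);
}

/* Spawns tasks only while the recursion level is below MAX_LEVELS;
 * deeper calls run inline, like statements in a sequential program. */
static long fib_task(int n, int level)
{
    if (n < 2)
        return n;
    if (level >= MAX_LEVELS)
        return fib_seq(n);

    long a, b;
    #pragma omp task shared(a) firstprivate(n, level)
    a = fib_task(n - 1, level + 1);
    #pragma omp task shared(b) firstprivate(n, level)
    b = fib_task(n - 2, level + 1);
    #pragma omp taskwait  /* a and b must be complete before they are read */
    return a + b;
}

/* Entry point: one thread creates the root of the task tree. */
long fib(int n)
{
    long result;
    assert(n >= 0);
    #pragma omp parallel
    #pragma omp single
    result = fib_task(n, 0);
    return result;
}
```

Which threshold works well depends on the application, which is why it is
reasonable to leave this out of a first implementation.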

To get a better overview of other implementations, which you can compare
to libgomp, I recommend a couple of papers. For example:

- OpenMP Tasks in IBM XL Compilers, X. Teruel et al.
- Support for OpenMP Tasks in Nanos v4, X. Teruel et al.
- OpenMP 3.0 Tasking Implementation in OpenUH, C. Addison et al.
- A Runtime Implementation of OpenMP Tasks, J. LaGrone et al.

You should be able to find copies online. If not, let me know.

-Andreas

