This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: GSoC openMP task scheduling Advice
- From: "guray.ozen" <guray dot ozen at gmail dot com>
- To: Tobias Burnus <burnus at net-b dot de>
- Cc: gcc at gcc dot gnu dot org
- Date: Wed, 1 May 2013 18:50:30 +0200
- Subject: Re: GSoC openMP task scheduling Advice
- References: <CA+ga0G4+2b3vgF7WmisYURxVnqAsijuFQjPVzMEgO7uvioWaMQ at mail dot gmail dot com> <CA+ga0G5ocagrNE6ATh5WUerUU+s9k52WG_S+BVkpTbm-JxU6bQ at mail dot gmail dot com> <517F79F6 dot 1090402 at net-b dot de>
Dear All,
Thank you for your reply, Tobias.
By the way, Mr. Jakub, I hope my approach makes sense to you.
I changed the GOMP_SPINCOUNT value and got a better speedup.
I attached my traces, which were profiled with Extrae and Paraver. Light
blue means idle, dark blue means running, and yellow means scheduling
and fork/join. The first trace is from the Intel compiler with the
default configuration. The 2nd trace is with spincount=10, the 3rd with
spincount=100, and the 5th with spincount=infinity. In addition, you can
see the running time in nanoseconds in the bottom right corner.
https://raw.github.com/grypp/gcc-gsoc-taskscheduler/master/openmp.png
In my opinion gcc is a little bit slower than Intel because of task
scheduling. Some threads (for example threads 4, 9 and 10 in the
multisort-omp-12-spin10 trace, or threads 7, 9 and 11 in the
multisort-omp-12-infinity trace) wait too long for other threads, as
you can see in my trace image.
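
For readers who do not know the workload, a recursive task-parallel
sort of the same general kind can be sketched as below. This is only
an assumption about the overall shape, not the multisort benchmark
source. CUTOFF, task_sort and parallel_sort are hypothetical names
chosen for the illustration. Without -fopenmp the pragmas are ignored
and the code simply runs serially.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cutoff: below this size, sort serially instead of
   creating more tasks. */
#define CUTOFF 1024

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Merge the sorted halves a[0..mid) and a[mid..n), using tmp[0..n)
   as scratch space. */
static void merge(int *a, int *tmp, size_t mid, size_t n)
{
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < n)
        tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *a);
}

/* Each half becomes an untied task; the parent waits at taskwait,
   which is where a thread may sit idle if it finds nothing to run. */
static void task_sort(int *a, int *tmp, size_t n)
{
    if (n <= CUTOFF) {
        qsort(a, n, sizeof *a, cmp_int);
        return;
    }
    size_t mid = n / 2;
    #pragma omp task untied
    task_sort(a, tmp, mid);
    #pragma omp task untied
    task_sort(a + mid, tmp + mid, n - mid);
    #pragma omp taskwait
    merge(a, tmp, mid, n);
}

void parallel_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) {            /* fall back to a serial sort */
        qsort(a, n, sizeof *a, cmp_int);
        return;
    }
    /* One thread creates the root of the task tree; the rest of the
       team picks up tasks as they become available. */
    #pragma omp parallel
    #pragma omp single
    task_sort(a, tmp, n);
    free(tmp);
}
```

Calling parallel_sort(v, n) on an int array sorts it in ascending
order. How evenly the tasks spread over the threads depends on the
runtime's task scheduler.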
My proposal is to change the task scheduler algorithm or add a new one.
The first thing I really want to add is a work-stealing mechanism. This
way I can decrease task waiting time and provide load balancing. Task
stealing can also use a strategy in which an untied task can steal its
parent task. For example, parent-task stealing is very good for the
parallel multisort algorithm and for recursive tasks. If the parent
task cannot be stolen, then the default task-stealing technique is
used. In addition, I can change the task scheduler as follows: when a
task is created, the creating task is suspended and the executing
thread switches to the newly created task. When a task is suspended, it
is placed in a per-thread local pool. That way I can provide better
data locality.
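
The per-thread pool and the stealing order described above can be
sketched roughly as follows. This is a minimal single-threaded
illustration of the access pattern only: it has no synchronization (a
real scheduler would need locks or atomics) and it is not how libgomp
is implemented. All names (task_t, task_pool_t, pool_push, pool_pop,
pool_steal, next_task, POOL_CAP) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_CAP 64                    /* hypothetical fixed capacity */

typedef struct { int id; } task_t;     /* stand-in for a real task */

/* One pool per thread, stored as a ring buffer. */
typedef struct {
    task_t *slot[POOL_CAP];
    size_t head;    /* index of the oldest task (the steal end) */
    size_t count;   /* number of queued tasks */
} task_pool_t;

static void pool_init(task_pool_t *p)
{
    p->head = 0;
    p->count = 0;
}

/* Owner: queue a suspended task at the newest end. */
static bool pool_push(task_pool_t *p, task_t *t)
{
    if (p->count == POOL_CAP)
        return false;
    p->slot[(p->head + p->count) % POOL_CAP] = t;
    p->count++;
    return true;
}

/* Owner: take the newest task (LIFO), which is most likely to still
   have its data in the local cache. */
static task_t *pool_pop(task_pool_t *p)
{
    if (p->count == 0)
        return NULL;
    p->count--;
    return p->slot[(p->head + p->count) % POOL_CAP];
}

/* Thief: take the oldest task (FIFO). In a recursive algorithm this
   tends to be the one closest to the root, so it carries more work. */
static task_t *pool_steal(task_pool_t *p)
{
    if (p->count == 0)
        return NULL;
    task_t *t = p->slot[p->head];
    p->head = (p->head + 1) % POOL_CAP;
    p->count--;
    return t;
}

/* Pick the next task for thread `self`: try the local pool first,
   then try to steal from the other threads in round-robin order. */
static task_t *next_task(task_pool_t *pools, size_t nthreads, size_t self)
{
    assert(self < nthreads);
    task_t *t = pool_pop(&pools[self]);
    for (size_t i = 1; t == NULL && i < nthreads; i++)
        t = pool_steal(&pools[(self + i) % nthreads]);
    return t;
}
```

With this layout an idle thread calls next_task instead of waiting:
it runs its own newest task if it has one, and otherwise takes the
oldest task from another thread's pool.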
I'm working on these topics, but I am not sure whether my idea is good
or not. Perhaps I need to change my approach. Could you please help me?
I would like to work on OpenMP in gcc.
Regards,
Güray Özen
Polytechnic University of Catalonia
2013/4/30 Tobias Burnus <burnus@net-b.de>:
> guray.ozen wrote:
>>
>> I thought gcc tasks/threads spend more time waiting idle than the
>> Intel compiler's threads.
>
>
> Regarding busy waits, you could try to tune the values of the GOMP_SPINCOUNT
> environment variable. Search for "@node GOMP_SPINCOUNT" in
> http://gcc.gnu.org/viewcvs/gcc/branches/gomp-4_0-branch/libgomp/libgomp.texi?view=co&content-type=text%2Fplain
> for details.
>
> If you have enough cores which are available, there shouldn't be a problem
> with idle. (Except with tasks where one could argue that the threads should
> do task stealing instead.)
>
> Tobias,
> who leaves the other questions to Jakub