The question about using OpenMP taskloop feature in gcc

Tue Nov 21 09:24:00 GMT 2017

Hi Ruoyao,

I see. Thanks very much for your time and detailed explanation!
Best Regards
Nan Xiao

On Tue, Nov 21, 2017 at 4:56 PM, Xi Ruoyao <ryxi@stu.xidian.edu.cn> wrote:
> On 2017-11-21 16:50 +0800, Xi Ruoyao wrote:
>> On 2017-11-21 15:49 +0800, Nan Xiao wrote:
>> >
>> > #include <omp.h>
>> > #include <stdio.h>
>> >
>> > int main(void) {
>> >     #pragma omp parallel for
>> >     for (auto i = 0; i < 10; i++) {
>> >         int sum = 0;
>> >         #pragma omp taskloop shared(sum)
>> >         for (auto j = 0; j < 1000000; j++) {
>> >             sum += j;
>> >         }
>> >         printf("%d\n", sum);
>> >     }
>> >      return 0;
>> > }
>>
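>> There are two bugs in your code.  First, signed integer overflow is
>> undefined behaviour and may produce an arbitrary result: the sum of
>> 0..999999 is 499999500000, which does not fit in a 32-bit int.  Second,
>> the accesses to the shared variable sum race with each other, so the
>> result may vary with scheduling.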
>
> Fix:
>
>     #pragma omp parallel for
>     for (auto i = 0; i < 10; i++) {
>         long long sum = 0;
>         #pragma omp taskloop shared(sum)
>         for (auto j = 0; j < 1000000; j++) {
>             __atomic_add_fetch(&sum, j, __ATOMIC_RELAXED);
>         }
>         printf("%lld\n", sum);
>     }
>
> This would generate a "lock addq" instruction for "sum" (on x86-64), so
> each update is an atomic read-modify-write on memory instead of loading
> "sum" into a register.
> --
> Xi Ruoyao <ryxi@stu.xidian.edu.cn>
> School of Aerospace Science and Technology, Xidian University