[Bug target/105873] [amdgcn][OpenMP] task reductions fail with "team master not responding; slave thread aborting"

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Jun 7 13:52:08 GMT 2022


--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I think it might be interesting to see which private values are used when:
#pragma omp declare target
int
foo (void)
{
  int result = 0;
  /* Two slots per task: address of the private copy, and its value.  */
  void **buf = __builtin_malloc (8192 * 2 * sizeof (void *));
#pragma omp taskgroup task_reduction(+: result)
  {
    for(int i = 0; i < 8192; ++i) {
#pragma omp task in_reduction(+: result)
      {
        result += 1;
        buf[2 * i] = &result;
        buf[2 * i + 1] = (void *) (__INTPTR_TYPE__) result;
      }
    }
  }
  /* Print only after the taskgroup, so output does not perturb the tasks.  */
  for(int i = 0; i < 8192; ++i)
    __builtin_printf ("%d %p %d\n", i, buf[2 * i],
                      (int) (__INTPTR_TYPE__) buf[2 * i + 1]);
  __builtin_free (buf);
  return result;
}
#pragma omp end declare target

int
main ()
{
  int r = 0;
  #pragma omp target parallel map(from:r) num_threads(8)
  #pragma omp single
  r = foo ();
  if (r != 8192)
    __builtin_abort ();
  return 0;
}

If it also reproduces with a value smaller than 8192, it would be nice to use
the smallest value that still reproduces.
If it hangs before printing anything, it would be nice to find out where.
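
One way to look for a smaller reproducing value is to pass the task count in
as a parameter. The following is only a sketch: count_tasks and check_size are
hypothetical names, not part of the report, and the buf bookkeeping is left
out to keep it short.

```c
/* Hypothetical helper: the same reduction as foo, but with the number of
   tasks passed in so that it can be lowered step by step.  */
#pragma omp declare target
int
count_tasks (int n)
{
  int result = 0;
#pragma omp taskgroup task_reduction(+: result)
  {
    for (int i = 0; i < n; ++i)
      {
#pragma omp task in_reduction(+: result)
        result += 1;
      }
  }
  return result;
}
#pragma omp end declare target

/* Hypothetical driver: runs count_tasks (n) in a target region with 8
   threads (falling back to the host if no offloading is available) and
   returns 1 when the reduction yields n, 0 otherwise.  */
int
check_size (int n)
{
  int r = -1;
#pragma omp target parallel map(from: r) num_threads(8)
#pragma omp single
  r = count_tasks (n);
  return r == n;
}
```

Calling check_size with 8192, 4096, 2048 and so on should narrow down the
smallest failing size; a hang will of course not return at all, so printing n
before each call would be needed to see which size hangs.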
The code in foo calls GOMP_taskgroup_start and
GOMP_taskgroup_reduction_register (the latter should allocate the 8 private
copies of result), then in the task body it calls GOMP_task_reduction_remap,
and finally, at the end of the taskgroup, it calls GOMP_taskgroup_end, performs
the reduction from those 8 private copies serially, and then calls
GOMP_taskgroup_reduction_unregister to free them.
Adding some printfs to the library side somewhere might help, e.g. to find out
if GOMP_taskgroup_end has been reached (it should be reached by a single thread
only) and whether it has finished.
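
As a rough mental model of the sequence described above (register, remap in
each task, serial combine at the end), here is a toy sketch in plain C. It is
not libgomp code: every name in it is made up for illustration, and the
round-robin assignment of tasks to threads is an assumption.

```c
#define NTHREADS 8	/* matches num_threads(8) in the test case */

/* Toy model only; none of these names exist in libgomp.  */
struct reduction_model
{
  int *orig;		/* the original variable (result) */
  int priv[NTHREADS];	/* one private copy per thread */
};

/* Models "register": each thread gets a copy set to the identity of +.  */
void
model_register (struct reduction_model *m, int *orig)
{
  m->orig = orig;
  for (int t = 0; t < NTHREADS; ++t)
    m->priv[t] = 0;
}

/* Models "remap": a task running on THREAD gets that thread's copy.  */
int *
model_remap (struct reduction_model *m, int thread)
{
  return &m->priv[thread % NTHREADS];
}

/* Models the serial combine done at the end of the taskgroup.  */
void
model_combine (struct reduction_model *m)
{
  for (int t = 0; t < NTHREADS; ++t)
    *m->orig += m->priv[t];
}

/* Runs NTASKS increments, assigning tasks to threads round-robin
   (an assumption; a real scheduler is free to do otherwise).  */
int
model_run (int ntasks)
{
  int result = 0;
  struct reduction_model m;
  model_register (&m, &result);
  for (int i = 0; i < ntasks; ++i)
    *model_remap (&m, i % NTHREADS) += 1;
  model_combine (&m);
  return result;
}
```

Under this model the addresses printed by the test case would take at most 8
distinct values, and the values stored per address would only ever increase,
so anything else in the real output would point at the remap step.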
