This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Non-deterministic failure in cactusADM using OpenMP


Hi,

I have been investigating non-deterministic failures in cactusADM (when 
autopar is enabled with a low threshold) on a Power7 multi-core machine.

The failure also occurs in several other SPEC2006 benchmarks 
when the threshold is lowered to allow more loops to be parallelized.

The scenario is that the program hangs when one of the threads exits 
while the others remain waiting on a team barrier (futex_wait).

I disabled autopar completely and manually parallelized (using OpenMP 
pragmas) only one loop in cactusADM.
The code before and after my changes is attached below.
Running cactusADM with this modified loop produces exactly the same problem.
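
For readers without the attachment, here is a minimal sketch of what
manually parallelizing a single loop with an OpenMP pragma looks like.
It is not the cactusADM loop: the function scale_add and its parameters
are hypothetical and only illustrate the shape of the change. The point
of interest is the implicit barrier at the end of the construct, where
every thread in the team must arrive before any of them continues.

```c
#include <assert.h>

/* Hypothetical stand-in for the loop in the attachment.
   Computes out[i] = a[i] + k * b[i] for i in [0, n).  */
void
scale_add (double *out, const double *a, const double *b, double k, int n)
{
  int i;

  assert (n >= 0);

  /* Iterations are independent, so they can be divided among the
     threads of the team.  Built without -fopenmp the pragma is
     ignored and the loop runs serially with the same result.  */
#pragma omp parallel for
  for (i = 0; i < n; i++)
    out[i] = a[i] + k * b[i];

  /* OpenMP places an implicit barrier here: no thread leaves the
     parallel region until all threads of the team have arrived.  */
}
```

Building with gcc -fopenmp enables the pragma. A thread that leaves
this region early, as described in this report, would leave the rest
of the team waiting at that barrier.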

This makes me more confident that the problem is indeed in libgomp and 
not in autopar, and is probably a race condition.
One of the threads somehow passes the two barriers that it is supposed to 
be stuck on (the team barrier and the docking barrier) 
and exits while the other threads are waiting for its arrival at the team 
barrier.

The barriers in libgomp are implemented using futex:

static inline void
futex_wait (int *addr, int val)
{
  long err = sys_futex0 (addr, gomp_futex_wait, val);
  if (__builtin_expect (err == ENOSYS, 0))
    {
      gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
      gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
      sys_futex0 (addr, gomp_futex_wait, val);
    }
}


static inline long
sys_futex0 (int *addr, int op, int val)
{
  register long int r0  __asm__ ("r0");
  register long int r3  __asm__ ("r3");
  register long int r4  __asm__ ("r4");
  register long int r5  __asm__ ("r5");
  register long int r6  __asm__ ("r6");

  r0 = SYS_futex;
  r3 = (long) addr;
  r4 = op;
  r5 = val;
  r6 = 0;

  /* ??? The powerpc64 sysdep.h file clobbers ctr; the powerpc32 sysdep.h
     doesn't.  It doesn't much matter for us.  In the interest of unity,
     go ahead and clobber it always.  */

  __asm volatile ("sc; mfcr %0"
                  : "=r"(r0), "=r"(r3), "=r"(r4), "=r"(r5), "=r"(r6)
                  : "r"(r0), "r"(r3), "r"(r4), "r"(r5), "r"(r6)
                  : "r7", "r8", "r9", "r10", "r11", "r12",
                    "cr0", "ctr", "memory");
  if (__builtin_expect (r0 & (1 << 28), 0))
    return r3;
  return 0;
}
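
The error check at the end of sys_futex0 relies on the PowerPC Linux
system call convention: the kernel reports failure by setting the
summary-overflow (SO) bit of condition register field CR0 and leaving a
positive errno value in r3. After "mfcr", that SO bit is bit 1 << 28 of
the copied register. A minimal portable sketch of the same decoding,
assuming the raw register values are passed in as plain integers (the
names sc_result and CR0_SO_MASK are mine, not libgomp's):

```c
/* SO bit of CR0 as it appears in the value copied by "mfcr".  */
#define CR0_SO_MASK (1UL << 28)

/* Hypothetical helper mirroring the tail of sys_futex0.
   cr is the condition register after the "sc" instruction and
   r3 is the value the kernel left in r3.  Returns the positive
   errno value on failure and 0 on success.  */
long
sc_result (unsigned long cr, long r3)
{
  if (cr & CR0_SO_MASK)
    return r3;   /* failure: r3 holds errno, e.g. ENOSYS */
  return 0;      /* success: the return value is discarded */
}
```

This is why futex_wait can compare the result directly against ENOSYS
rather than against a negative value.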




I've opened this PR:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50977
for this problem.

These failures prevent changing autopar's cost model to allow more 
parallelization to take place, which has shown great performance potential.
Therefore, any help or comments would be appreciated.
Thanks,
Razys









Attachment: cactusADM.rtf
Description: RTF file

