Patch for PR libgomp/37938, IA64 specific.

Wed Nov 12 23:10:00 GMT 2008

On Wed, Nov 12, 2008 at 08:00:30AM -0800, Steve Ellcey wrote:
> The thing that seems a bit confusing to me is that we use
> __sync_bool_compare_and_swap to lock but __sync_lock_test_and_set to
> unlock in the gomp mutex lock/unlock routines.  If we wanted the most
> efficient mutex lock/unlock it would seem that we would want to use
> sync_lock_test_and_set for the mutex set (acquire semantics) and
> sync_lock_release (release semantics) for the mutex release.  I think

Only if we needed just unlocked/locked state.  But we want 3 states
(0 unlocked, 1 locked, 2 locked with possible waiters) to avoid
unnecessary trips into the kernel.
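
To illustrate the point about the extra state, here is a minimal single-file sketch, not the libgomp source. All demo_* names are made up for this example, and demo_wake merely counts calls where a real implementation would issue a futex wake. It assumes the usual encoding of 0 = unlocked, 1 = locked, 2 = locked with possible waiters.

```c
#include <assert.h>

/* Hypothetical names for illustration; not libgomp's declarations. */
enum { DEMO_UNLOCKED = 0, DEMO_LOCKED = 1, DEMO_CONTENDED = 2 };

typedef int demo_mutex_t;

/* Counts how often unlock would have had to enter the kernel. */
static int demo_wake_calls;

/* Stand-in for a futex wake; a real one would make a system call. */
static void demo_wake (demo_mutex_t *m)
{
  (void) m;
  demo_wake_calls++;
}

/* Fast-path lock attempt: 0 -> 1.  Returns nonzero on success. */
static int demo_trylock (demo_mutex_t *m)
{
  return __sync_bool_compare_and_swap (m, DEMO_UNLOCKED, DEMO_LOCKED);
}

/* What a waiter would do before sleeping: record 1 -> 2 so the
   owner knows somebody may need waking.  */
static void demo_mark_contended (demo_mutex_t *m)
{
  (void) __sync_val_compare_and_swap (m, DEMO_LOCKED, DEMO_CONTENDED);
}

/* Unlock: fetch the old state while storing 0; only the contended
   state requires the (expensive) wake.  */
static void demo_unlock (demo_mutex_t *m)
{
  __sync_synchronize ();
  int old = __sync_lock_test_and_set (m, DEMO_UNLOCKED);
  assert (old != DEMO_UNLOCKED);   /* unlocking an unlocked mutex is a bug */
  if (__builtin_expect (old == DEMO_CONTENDED, 0))
    demo_wake (m);
}
```

With only two states the unlock path could not tell whether anyone is waiting, so it would have to issue the wake every time; with the third state the uncontended case stays entirely in user space.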

> the reason we don't do this is we want to get the old value of the mutex
> when releasing it in order to see if it is 1 or 2 in order to see if we
> should wake another process up with futex.  On IA64 there is no atomic
> exchange instruction with release semantics.  There is compare and
> exchange with acquire or release semantics and an exchange with acquire
> semantics, but no exchange with release semantics.

I think just using
  __sync_synchronize ();
  int val = __sync_lock_test_and_set (mutex, 0);
  if (__builtin_expect (val > 1, 0))
    gomp_mutex_unlock_slow (mutex);
in ia64/mutex.h (gomp_mutex_unlock) will be the fastest, as a cmpxchg*.rel
based unlock would need an extra memory load to get the value to compare
against, and a retry loop in case the compare failed.
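
For comparison, a rough sketch of the two shapes being weighed, written with the generic __sync builtins. The function names are invented for this example, and nothing here shows which instructions the compiler actually emits on IA64; it only shows where the extra load and the loop come from in the compare-and-swap form.

```c
/* Variant A: explicit barrier followed by an atomic exchange.
   Returns the previous mutex value.  */
static int unlock_with_exchange (int *mutex)
{
  __sync_synchronize ();
  return __sync_lock_test_and_set (mutex, 0);
}

/* Variant B: compare-and-swap.  The old value has to be loaded first
   so there is something to compare against, and the swap has to be
   retried if another thread changed the mutex in between.  */
static int unlock_with_cas_loop (int *mutex)
{
  int old;
  do
    old = *(volatile int *) mutex;                       /* extra load */
  while (!__sync_bool_compare_and_swap (mutex, old, 0)); /* retry loop */
  return old;
}
```

Both variants leave the mutex at 0 and hand back the old value, so the caller can still test val > 1 to decide whether the slow path is needed.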

	Jakub