[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

mwahab at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Apr 16 12:13:00 GMT 2015


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #33 from mwahab at gcc dot gnu.org ---
(In reply to torvald from comment #32)
> (In reply to James Greenhalgh from comment #28)
> > (In reply to torvald from comment #24)
> > > 3) We could do something just on ARM (and scan other archs for similar
> > > issues).  That's perhaps the cleanest option.
> > 
> > Which leaves 3). From Andrew's two proposed solutions:
> 
> 3) Also seems best to me.  2) is worst, 1) is too much of a stick.
> 
> > This also gives us an easier route to fixing any issues with the
> > acquire/release __sync primitives (__sync_lock_test_and_set and
> > __sync_lock_release) if we decide that these also need to be stronger than
> > their C++11 equivalents.
> 
> I don't think we have another case of different __sync vs. __atomics
> semantics in case of __sync_lock_test_and_set.  The current specification
> makes it clear that this is an acquire barrier, and how it describes the
> semantics (ie, loads and stores that are program-order before the acquire op
> can move to after it), this seems to be consistent with the effects C11
> specifies for acquire MO (with perhaps the distinction that C11 is clear
> that acquire needs to be paired with some release op to create an ordering
> constraint).

Thanks, I suspect that the acquire barrier may not be as much of an issue
as I had remembered. (The issue came up while I was trying to understand the
C11 semantics.)

The test case (aarch64) I have is:
----
int foo = 0;
int bar = 0;
int T5(void)
{
  int x = __sync_lock_test_and_set(&foo, 1);
  return bar;
}
----
.L11:
    ldaxr    w2, [x0]     ; load-acquire
    stxr    w3, w1, [x0] ; store
    cbnz    w3, .L11
    ldr    w0, [x0, 4]  ; load
    ret
----
My concern was that the final load (the ldr of bar) could be speculated ahead
of the store (the stxr). Since the store marks the end of the barrier, that
could make it appear as if the load had completed before the acquire-barrier.

In retrospect, I don't think that there will be a problem because any load that
could be moved would have to end up with the same value as if it had not moved.
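
For comparison with the C11-style builtins discussed above, here is a minimal
sketch that puts the test case next to a counterpart written with
__atomic_exchange_n and __ATOMIC_ACQUIRE. The names T5_sync and T5_atomic are
hypothetical and only used for illustration. The sketch assumes, as the GCC
documentation describes, that __sync_lock_test_and_set is an acquire barrier.
It does not show what code any particular GCC version generates for either
function.

```c
/* Sketch only: T5_sync mirrors the test case; T5_atomic is a hypothetical
   counterpart using the __atomic builtin with acquire memory order. */
int foo = 0;
int bar = 0;

int T5_sync(void)
{
  /* Atomic exchange, documented as an acquire barrier. */
  int old = __sync_lock_test_and_set(&foo, 1);
  (void)old;   /* previous value is not used, as in the test case */
  return bar;  /* load that follows the acquire operation */
}

int T5_atomic(void)
{
  /* Atomic exchange with explicit acquire memory order. */
  int old = __atomic_exchange_n(&foo, 1, __ATOMIC_ACQUIRE);
  (void)old;
  return bar;
}
```

Compiling both functions for aarch64 and comparing the output is one way to
check whether the two forms are treated the same for this case.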