[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

torvald at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Apr 16 11:54:00 GMT 2015


--- Comment #32 from torvald at gcc dot gnu.org ---
(In reply to James Greenhalgh from comment #28)
> (In reply to torvald from comment #24)
> > 3) We could do something just on ARM (and scan other arcs for similar
> > issues).  That's perhaps the cleanest option.
> Which leaves 3). From Andrew's two proposed solutions:

3) Also seems best to me.  2) is worst, 1) is too much of a stick.

> This also gives us an easier route to fixing any issues with the
> acquire/release __sync primitives (__sync_lock_test_and_set and
> __sync_lock_release) if we decide that these also need to be stronger than
> their C++11 equivalents.

I don't think we have another case of different __sync vs. __atomics semantics
in case of __sync_lock_test_and_set.  The current specification makes it clear
that this is an acquire barrier, and how it describes the semantics (ie, loads
and stores that are program-order before the acquire op can move to after it) ,
this seems to be consistent with the effects C11 specifies for acquire MO (with
perhaps the distinction that C11 is clear that acquire needs to be paired with
some release op to create an ordering constraint).

I'd say this basically also applies to __sync_lock_release, with the exception
that the current documentation does not mention that stores can be speculated
to before the barrier.  That seems to be an artefact of a TSO model.
However, I don't think this matters much because what the release barrier
allows one to do is reasoning that if one sees the barrier to have taken place
(eg, observe that the lock has been released), then also all ops before the
barrier will be visible.
It does not guarantee that if one observes an effect that is after the barrier
in program order, that the barrier itself will necessarily have taken effect. 
To be able to make this observation, one would have to ensure using __sync ops
that the other effect after the barrier is indeed after the barrier, which
would mean using an release op for the other effect -- which would take care of

If everyone agrees with this reasoning, we probably should add documentation
explaining this.

However, I guess some people relying on data races in their programs could
(mis?)understand the __sync_lock_release semantics to mean that it is a means
to get the equivalent of a C11 release *fence* -- which it is not because the
fence would apply to the (erroneously non-atomic) store after the barrier,
which could one lead to believe that if one observes the store after the
barrier, the fence must also be in effect.  Thoughts?

More information about the Gcc-bugs mailing list