GCC interpretation of C11 atomics (DR 459)
Mon Feb 26 18:35:00 GMT 2018
On Mon, 2018-02-26 at 14:53 +0000, Ruslan Nikolaev via gcc wrote:
> Thank you for more comments, my response is below.
> On Mon, 26 Feb 2018, Szabolcs Nagy wrote:
> > rmw load is only valid if the implementation can
> > guarantee that atomic objects are never read-only.
> But per response from WG14 regarding DR 459 which I quoted, the standard does not seem to define behavior for read-only memory
> (and const qualifier should not suggest that). RMW, according to them, is fine for atomic_load.
... does not imply this latter statement. The statement you cited is
about what the standard itself requires, not what makes sense for a
particular implementation. For example, one could build an
implementation that does not have any read-only memory and doesn't
distinguish between loads and atomic RMW operations; in such a case, it
wouldn't make sense for the standard to require it. OTOH though, if
read-only memory exists, it makes sense for an implementation to try to
Consider trying to use atomics for memory mapped read-only from another
process, for example to observe output from that other process. You
don't want to make it read-write for security reasons, for example.
Atomic operations designated as lock-free by the implementation are
supposed to be address-free too, which targets the use case of mapping
memory from somewhere else. So, in such a case, using the wide CAS for
atomic loads breaks a reasonable assumption. Moreover, it's also a
special case, in that 32-bit atomics do work as intended.
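
To make the read-only case concrete, here is a minimal sketch in C. It
assumes a POSIX system, and it only simulates the "mapped from another
process" scenario inside one process: the page is written once and then
write permission is dropped with mprotect(). The helper name
load_from_readonly_page is hypothetical and not part of any library.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std=c11 */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: store `value` in a fresh page, make the page
   read-only, then read the value back with a 32-bit atomic load.
   Returns 0 on success and -1 if any system call fails. */
static int load_from_readonly_page(uint32_t value, uint32_t *out)
{
    assert(out != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    void *mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    _Atomic uint32_t *word = mem;
    atomic_init(word, value);

    int rc = -1;
    if (mprotect(mem, (size_t)page, PROT_READ) == 0) {
        /* Assumption: this compiles to an ordinary load instruction.
           A load implemented as a CAS would attempt a write here and
           fault on the read-only page. */
        *out = atomic_load_explicit(word, memory_order_acquire);
        rc = 0;
    }

    munmap(mem, (size_t)page);
    return rc;
}
```

With a real second process one would map shared memory with PROT_READ
instead; the point is only that the load must not need write access.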
Also, I believe the vast majority of synchronization code makes implicit
assumptions about the performance of atomic load operations, notably
that concurrent loads don't create contention, or at least much less
than concurrent writes. The behavior you favor would violate that, and
there's no portable way to distinguish one from the other.
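
As an illustration of the difference, the following sketch shows the two
ways a load could be implemented. The function names are hypothetical,
and this is not how GCC or libatomic is actually structured. Both
functions return the same value, but the second one needs write access
to the object and takes the cache line exclusively, which is what
creates contention between concurrent readers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* An ordinary atomic load: read-only access to the object. */
static uint64_t load_plain(_Atomic uint64_t *obj)
{
    return atomic_load_explicit(obj, memory_order_seq_cst);
}

/* A "load" emulated with compare-and-swap.  If the object holds 0, the
   CAS succeeds and stores 0 again; otherwise it fails and copies the
   current value into `expected`.  Either way `expected` ends up equal
   to the value of the object, but the operation is a read-modify-write. */
static uint64_t load_via_cas(_Atomic uint64_t *obj)
{
    uint64_t expected = 0;
    atomic_compare_exchange_strong(obj, &expected, expected);
    return expected;
}
```

A caller cannot tell the two apart from the result alone, which is why
the choice has to be made (and documented) by the implementation.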
Thus, GCC only declares operations as lock-free if atomic loads of the
particular size/alignment are natively supported, and with the
performance properties one would associate with just a load on the
particular arch. If an atomic load and an atomic CAS are supported,
that's fine; if there's just a CAS, that's not enough.
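
A program can ask the implementation what it declares. The sketch below
uses the standard atomic_is_lock_free() for a pointer-sized object and
the GCC/Clang builtin __atomic_always_lock_free() for a double-width
one. The helper names are hypothetical, and the answers depend on the
target and compiler flags, so no particular result is assumed here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: does the implementation declare pointer-sized
   atomics lock-free?  Uses only standard C11. */
static bool word_atomics_lock_free(void)
{
    _Atomic(void *) p = NULL;
    return atomic_is_lock_free(&p);
}

/* Hypothetical helper: same question for an object twice the size of a
   pointer.  __atomic_always_lock_free is a GCC/Clang builtin, not
   standard C; a null pointer argument means "typical alignment". */
static bool double_width_always_lock_free(void)
{
    return __atomic_always_lock_free(2 * sizeof(void *), 0);
}
```

Code that wants to rely on cheap, read-only loads should check this
rather than assume that the existence of a wide CAS instruction implies
lock-free atomics of that size.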
I see your point in wanting to have a builtin or such for the 64-bit atomic
CAS. However, IMO, this doesn't fit into the world of C11/C++11
atomics, and thus rather should be accessible through a separate