[PATCH] Fix memory orders description in atomic ops built-ins docs.

Matthew Wahab matthew.wahab@arm.com
Mon May 18 17:07:00 GMT 2015


Hello,

On 15/05/15 17:22, Torvald Riegel wrote:
> This patch improves the documentation of the built-ins for atomic
> operations.

The "memory model" to "memory order" change does improve things but I think that
the patch has some problems. As it stands, it makes some of the descriptions
quite difficult to understand and seems to assume more familiarity with the
details of the C++11 specification than might be expected.

Generally, the memory order descriptions seem to be targeted at language
designers but don't provide much for anybody trying to understand how to
implement or to use the built-ins. Adding a less formal, programmer's view to
some of the descriptions would help. That would make the descriptions more than
just illustrative, but I'd suggest that would be appropriate for the GCC manual.

I'm also not sure about the use of C++11 terms in some of the descriptions. In
particular, using happens-before seems wrong, both because happens-before isn't
described anywhere in the GCC manual and because it has a specific meaning in
the C++11 specification that doesn't apply to the GCC built-ins (which C++11
doesn't know about).

Some more comments below.

Regards,
Matthew


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 6004681..5b2ded8 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8853,19 +8853,19 @@ are not prevented from being speculated to before the barrier.

  [...]  If the data type size maps to one
-of the integral sizes that may have lock free support, the generic
-version uses the lock free built-in function.  Otherwise an
+of the integral sizes that may support lock-freedom, the generic
+version uses the lock-free built-in function.  Otherwise an
  external call is left to be resolved at run time.

=====
This is a slightly awkward sentence. Maybe it could be replaced with something
on the lines of "The generic function uses the lock-free built-in function when
the data-type size makes that possible, otherwise an external call is left to be
resolved at run-time."
=====
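To illustrate the dispatch being described (a hypothetical sketch, not taken
from the patch): for an integral size with lock-free support, the generic
__atomic_load resolves to the lock-free built-in, whereas an oddly-sized type
would instead become an external libatomic call.

```c
/* Sketch, assuming a target where int has lock-free support.  The
   generic form takes a pointer for the result; here it compiles down
   to the lock-free __atomic_load_n.  */
static int small = 5;

int
load_small (void)
{
  int out;
  __atomic_load (&small, &out, __ATOMIC_SEQ_CST);
  return out;
}
```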

-The memory models integrate both barriers to code motion as well as
-synchronization requirements with other threads.  They are listed here
-in approximately ascending order of strength.
+An atomic operation can both constrain code motion by the compiler and
+be mapped to a hardware instruction for synchronization between threads
+(e.g., a fence).  [...]

=====
This is a little unclear (and inaccurate, aarch64 can use two instructions
for fences). I also thought that atomic operations constrain code motion by the
hardware. Maybe break the link with the compiler and hardware: "An atomic
operation can both constrain code motion and act as a synchronization point
between threads".
=====

  @table  @code
  @item __ATOMIC_RELAXED
-No barriers or synchronization.
+Implies no inter-thread ordering constraints.

====
It may be useful to be explicit that there are no restrictions on code motion.
====
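For instance, the common use of relaxed ordering is a statistics counter where
only atomicity matters (a hypothetical example, not from the patch; the names
are mine):

```c
/* With __ATOMIC_RELAXED only the atomicity of each access is
   guaranteed; the compiler and hardware remain free to reorder the
   increment relative to surrounding independent accesses.  */
static unsigned long hits;

void
count_hit (void)
{
  __atomic_fetch_add (&hits, 1, __ATOMIC_RELAXED);
}

unsigned long
read_hits (void)
{
  return __atomic_load_n (&hits, __ATOMIC_RELAXED);
}
```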

  @item __ATOMIC_CONSUME
-Data dependency only for both barrier and synchronization with another
-thread.
+This is currently implemented using the stronger @code{__ATOMIC_ACQUIRE}
+memory order because of a deficiency in C++11's semantics for
+@code{memory_order_consume}.

=====
It would be useful to have a description of what __ATOMIC_CONSUME was
meant to do, as well as the fact that it currently just maps to
__ATOMIC_ACQUIRE. (Or maybe just drop it from the documentation until it's
fixed.)
=====
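The intended use case, as I understand it, is a data-dependent load through a
just-read pointer (RCU-style readers). A hypothetical sketch, with the caveat
that GCC currently promotes the consume to an acquire, so this is at least as
strong as an acquire load:

```c
/* The second load is data-dependent on the consume load of the
   pointer; consume was meant to order exactly such dependent
   accesses, more cheaply than acquire on weakly-ordered targets.  */
static int value = 7;
static int *shared_ptr = &value;

int
read_through_ptr (void)
{
  int *p = __atomic_load_n (&shared_ptr, __ATOMIC_CONSUME);
  return *p;
}
```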

  @item __ATOMIC_ACQUIRE
-Barrier to hoisting of code and synchronizes with release (or stronger)
-semantic stores from another thread.
+Creates an inter-thread happens-before constraint from the release (or
+stronger) semantic store to this acquire load.  Can prevent hoisting
+of code to before the operation.

=====
As noted before, it's not clear what the "inter-thread happens-before"
means in this context.

Here and elsewhere:
"Can prevent <motion> of code" is ambiguous: it doesn't say under what
conditions code would or wouldn't be prevented from moving.
=====
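A concrete message-passing sketch might make the pairing clearer to a
programmer than the formal term does (hypothetical example, not from the patch;
assumes POSIX threads):

```c
#include <pthread.h>

/* The writer publishes `payload` with a release store to `ready`; once
   the reader's acquire load sees `ready` set, it is guaranteed to see
   the payload written before the release store.  */
static int payload;
static int ready;

static void *
writer (void *arg)
{
  payload = 42;                                   /* plain store */
  __atomic_store_n (&ready, 1, __ATOMIC_RELEASE); /* publish */
  return 0;
}

static void *
reader (void *arg)
{
  while (!__atomic_load_n (&ready, __ATOMIC_ACQUIRE))
    ;                                             /* spin until published */
  *(int *) arg = payload;                         /* guaranteed to be 42 */
  return 0;
}

int
run_message_pass (void)
{
  pthread_t w, r;
  int seen = 0;

  payload = 0;
  ready = 0;
  pthread_create (&r, 0, reader, &seen);
  pthread_create (&w, 0, writer, 0);
  pthread_join (w, 0);
  pthread_join (r, 0);
  return seen;
}
```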

-Note that the scope of a C++11 memory model depends on whether or not
-the function being called is a @emph{fence} (such as
-@samp{__atomic_thread_fence}).  In a fence, all memory accesses are
-subject to the restrictions of the memory model.  When the function is
-an operation on a location, the restrictions apply only to those
-memory accesses that could affect or that could depend on the
-location.
+Note that in the C++11 memory model, @emph{fences} (e.g.,
+@samp{__atomic_thread_fence}) take effect in combination with other
+atomic operations on specific memory locations (e.g., atomic loads);
+operations on specific memory locations do not necessarily affect other
+operations in the same way.

====
It's very unclear what this paragraph is saying. It seems to suggest that
fences only work in combination with other operations. But that doesn't seem
right, since an __atomic_thread_fence (with an appropriate memory order) can be
dropped into any piece of code and will act in the way that memory barriers are
commonly understood to work.
====
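That is, a standalone fence orders the accesses around it without being tied to
one location, as in this hypothetical publish sketch (names are mine, not from
the patch):

```c
/* The release fence orders the plain store to `data_a` before the
   relaxed store to `flag_b`; no single memory location carries the
   ordering, which is the usual understanding of a memory barrier.  */
static int data_a, flag_b;

void
publish (void)
{
  data_a = 1;
  __atomic_thread_fence (__ATOMIC_RELEASE);
  __atomic_store_n (&flag_b, 1, __ATOMIC_RELAXED);
}
```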

  @@ -9131,5 +9135,5 @@ if (_atomic_always_lock_free (sizeof (long long), 0))
  @deftypefn {Built-in Function} bool __atomic_is_lock_free (size_t size, void *ptr)

  This built-in function returns true if objects of @var{size} bytes always
+generate lock-free atomic instructions for the target architecture.  If
+it is not known to be lock-free, a call is made to a runtime routine named
====
Probably unrelated to the point of this patch, but does this mean
"If this built-in function is not known to be lock-free ..."?
====
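For reference, the two query built-ins differ in that the `always` form is a
compile-time constant taking only a size, while `is_lock_free` is a run-time
query that can consult an object's address (a sketch, assuming a target where
int is lock-free, which holds on common targets):

```c
/* Run-time query; passing the object's address lets the answer
   account for its alignment.  A null pointer asks about a typically
   aligned object of that size.  */
int
int_is_lock_free (void)
{
  int obj;
  return __atomic_is_lock_free (sizeof obj, &obj);
}
```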


