[PATCH] Fix memory orders description in atomic ops built-ins docs.

Torvald Riegel triegel@redhat.com
Fri May 22 18:16:00 GMT 2015

On Fri, 2015-05-22 at 17:41 +0100, Matthew Wahab wrote:
> On 21/05/15 19:26, Torvald Riegel wrote:
> > On Thu, 2015-05-21 at 16:45 +0100, Matthew Wahab wrote:
> >> On 19/05/15 20:20, Torvald Riegel wrote:
> >>> On Mon, 2015-05-18 at 17:36 +0100, Matthew Wahab wrote:
> >>>> Hello,
> >>>>
> >>>> On 15/05/15 17:22, Torvald Riegel wrote:
> >>>>> This patch improves the documentation of the built-ins for atomic
> >>>>> operations.
> >>>>
> I think we're talking at cross-purposes and not really getting anywhere. I've replied 
> to some of your comments below, but it's mostly a restatement of points already made.

OK.  I have a few more comments below.

> >> We seem to have different views about the purpose of the manual page. I'm treating it
> >> as a description of the built-in functions provided by gcc to generate the code
> >> needed to implement the C++11 model. That is, the built-ins are distinct from C++11
> >> and their descriptions should be, as far as possible, independent of the methods used
> >> in the C++11 specification to describe the C++11 memory model.
> >
> > OK.  But we'd need a *precise* specification of what they do if we'd
> > want to make them separate from the C++11 memory model.  And we don't
> > have that, would you agree?
> There is a difference between the sort of description that is needed for a formal 
> specification and the sort that would be needed for a programmer's manual. The best 
> example of this that I can think of is the Standard ML definition 
> (http://sml-family.org). That is a mathematical (so precise) definition that is 
> invaluable if you want an unambiguous specification of the language. But it's useless 
> for anybody who just wants to use Standard ML to write programs. For that, you need 
> to go to the imprecise descriptions that are given in books about SML and in the 
> documentation for SML compilers and libraries.
> The problem with using the formal SML definition is the same as with using the formal 
> C++11 definition: most of it is detail needed to make things in the formal 
> specification come out the right way. That detail, about things that are internal to 
> the definition of the specification, makes it difficult to understand what is 
> intended to be available for the user.

A relation like happens-before is "user-facing".  It is how one reasons
about ordering in a multi-threaded execution.  This isn't internal or
for a corner-case like additional-synchronizes-with or one of the
consistency rules.
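
A minimal sketch of that kind of reasoning, assuming a POSIX threads
environment.  The names payload, ready, producer, consumer and
run_message_passing are made up for this illustration and are not taken
from the patch.  The release store and the acquire load that reads from
it synchronize, so the plain write to payload happens-before the plain
read of it:

```c
#include <pthread.h>

static int payload;   /* plain, non-atomic data */
static int ready;     /* flag accessed only through __atomic built-ins */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;                                   /* A: plain write */
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);  /* B: release store */
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;
    /* C: acquire load; spin until it reads the value written by B. */
    while (!__atomic_load_n(&ready, __ATOMIC_ACQUIRE))
        ;
    /* D: A happens-before D (A before B, B synchronizes with C,
       C before D), so this read is race-free and sees A's value. */
    *out = payload;
    return NULL;
}

/* Runs one producer/consumer pair and returns what the consumer read. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    __atomic_store_n(&ready, 0, __ATOMIC_RELAXED);
    pthread_create(&c, NULL, consumer, &seen);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With __ATOMIC_RELAXED on both B and C there would be no such
happens-before edge, and the accesses to payload would race.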

> The GCC manual seems to me to be aimed more at the people who want to use GCC to 
> write code and I don't think that the patch makes much allowance for them. I do think 
> that more precise statements about the relationship to C++11 are useful to have. It's 
> the sort of constraint that ought to be documented somewhere. But it seems to be more 
> of interest to compiler writers or, at least, to users who are as knowledgeable as 
> compiler writers. A document targeting that group, such as the GCC internals or a GCC 
> wiki-page, would seem to be a better place for the information.
> (Another example of the distinction may be the Intel Itanium ABI documentation which 
> has a programmer's description of the synchronization primitives and a separate, 
> formal description of their behaviour.)
> For what it's worth, my view of how C++11, the __atomics and the machine code line up 
> is that each is a distinct layer. Each layer implements the requirements of the 
> higher (more abstract) layer but is otherwise entirely independent. That's why I 
> think that a description of the __atomic built-in, aimed at compiler users rather 
> than writers and that doesn't expect knowledge of C++11 is desirable and possible.
> >> I'm also concerned that the patch, by describing things in terms of formal C++11
> >> concepts, makes it more difficult for people to know what the built-ins can be
> >> expected to do and so make the built-in more difficult to use[..]
> >
> > I hadn't thought about that possible danger, but that would be right.
> > The way I would prefer to counter that is that we add a big fat warning
> > to the __sync built-ins that we don't have a precise specification for
> > them and that there are several corners of hand-waving and potentially
> > further issues, and that this is another reason to prefer the __atomic
> > built-ins.  PR 65697 etc. are enough indication for me that we indeed
> > lack a proper specification.
> Increasing uncertainty about the __sync built-ins wouldn't make people move to 
> equally uncertain __atomic built-ins. There's enough knowledge and use of the __sync 
> builtins to make them a more comfortable choice than the C++11 atomics and in the 
> worst case it would push people to roll their own synchronization functions with 
> assembler or system calls.

I don't buy that.  Sure, some people will be uncomfortable with
anything.  But I don't see how "specified in C++11 and C11" is the same
level of uncertainty as "we don't have a tight specification".  Users
can pick their favorite source of education on the C++11 memory model.
And over time, C++11 / C11 will become commonly known...

> > Well, "just wants to add a memory barrier" is the start of the
> > problem.  In the same way that one needs to understand a hardware
> > memory model to pick the right HW instruction(s), one needs to
> > understand a programming language memory model to pick a fence and
> > understand its semantics.
> Sometimes you just want the hardware instructions and don't care about the 
> programming language semantics.

If you tightly control code generation around those uses, that can work.
But compilers will optimize concurrent code, so in general, one can't
pretend the compiler isn't there or is just a fancy assembler.

I'm stressing this point because I think that it's critical that users
understand that in most cases, they have to consider both the compiler
and the hardware when writing concurrent C/C++ code.
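
A small sketch of why the compiler matters here; wait_plain and
wait_atomic are hypothetical names used only for this illustration.  If
another thread writes the flag concurrently, the plain loop has a data
race, and the compiler may assume the flag never changes, load it once,
and hoist the load out of the loop.  The atomic version requires a fresh
load on every iteration:

```c
/* Plain accesses: if another thread writes *flag concurrently, this is
   a data race, and the compiler may legally read *flag just once. */
void wait_plain(int *flag)
{
    while (!*flag)
        ;
}

/* Atomic accesses: each iteration performs an acquire load, which the
   compiler may not replace with a single load hoisted out of the loop. */
void wait_atomic(int *flag)
{
    while (!__atomic_load_n(flag, __ATOMIC_ACQUIRE))
        ;
}
```

Both functions return immediately when the flag is already set; they
differ only in what the compiler is allowed to assume while waiting.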

> > If you haven't, please just look at
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
> > and try to specify the constraints just for the valid optimizations
> > described there.  This should be a good indication of why I think
> > specifying the reordering / behavior constraints is nontrivial.
> I don't remember seeing this; it's an interesting paper. It deals with things that 
> matter to compiler writers though so I don't think it's relevant to the point I was 
> trying to make (that there's a distinction between documentation for compiler users 
> and compiler writers).

Well, it's optimizations that the compiler is allowed to do.  So if we
give imprecise definitions of the semantics of __atomic* to users, it
can lead to differences between what users think __atomic* provides and
what it actually does.

I pointed to this paper because it shows examples of optimizations that
would be disallowed by simple definitions of __atomic but are allowed by
the standard.
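
A sketch in the spirit of such optimizations, not a claim about what GCC
currently does; add_twice and add_twice_fused are names invented for
this illustration.  A simple definition such as "every built-in call is
one atomic hardware operation" would forbid rewriting the first function
into the second, but the rewrite only removes executions in which
another thread's update lands between the two increments, and a
conforming implementation need not make those executions possible:

```c
/* As written: two separate relaxed read-modify-write operations.
   Returns the value seen by the second one. */
int add_twice(int *counter)
{
    __atomic_fetch_add(counter, 1, __ATOMIC_RELAXED);
    return __atomic_fetch_add(counter, 1, __ATOMIC_RELAXED);
}

/* A transformation the memory model permits: one read-modify-write
   that adds 2.  The "+ 1" reconstructs the value the second increment
   would have seen if nothing intervened between the two. */
int add_twice_fused(int *counter)
{
    return __atomic_fetch_add(counter, 2, __ATOMIC_RELAXED) + 1;
}
```

Called on counters holding the same value, both functions return the
same result and leave the counter increased by 2.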

> >
> >> The requirement that the atomics should support
> >> C++11 could be met by making sure that the description of the expected behaviour is
> >> sufficient for C++11.
> >
> > We don't just want the semantics of __atomic* to be sufficient for C++11
> > but also want them to be *as weak as possible* to be still sufficient
> > for C++11 -- otherwise, we'll make C++11 code less efficient than it can
> > be.
> Yes, that's implied by 'sufficient'.

I disagree.  If condition C1 is sufficient to yield condition C2, then
C1 implies C2 -- but they are not necessarily equal.  "Sufficient and
necessary" would be what we want for the mapping.  (For example, it
would be *sufficient* to make all __atomic* built-ins have seq-cst behavior,
irrespective of the memory order argument -- but it wouldn't be
*necessary*, and it would be less efficient.)
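
To make the distinction concrete, here is a sketch with two hypothetical
wrappers (load_always_seq_cst and load_as_requested are not real GCC
functions).  The first is a sufficient implementation of any requested
load order, because seq-cst is at least as strong as every other order,
but it is stronger than necessary.  The second passes the requested
order through, which lets the compiler emit cheaper code where the
target allows it:

```c
/* Sufficient but not necessary: ignores the requested order and always
   performs a seq-cst load. */
int load_always_seq_cst(const int *p, int order)
{
    (void)order;
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* Honors the requested order.  For a load, the valid orders are
   __ATOMIC_RELAXED, __ATOMIC_CONSUME, __ATOMIC_ACQUIRE and
   __ATOMIC_SEQ_CST. */
int load_as_requested(const int *p, int order)
{
    return __atomic_load_n(p, order);
}
```

Both return the same value in any single-threaded use; the difference is
only in the ordering guarantees they provide and the cost of providing
them.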
