[RFC][PATCH 0/5] arch: atomic rework
Thu Feb 6 23:25:00 GMT 2014
On Thu, 2014-02-06 at 22:13 +0000, Joseph S. Myers wrote:
> On Thu, 6 Feb 2014, Torvald Riegel wrote:
> > > It seems that nobody really
> > > agrees on exactly how the C11 atomics map to real architectural
> > > instructions on anything but the trivial architectures.
> > There's certainly different ways to implement the memory model and those
> > have to be specified elsewhere, but I don't see how this differs much
> > from other things specified in the ABI(s) for each architecture.
> It is not clear to me that there is any good consensus understanding of
> how to specify this in an ABI, or how, given an ABI, to determine whether
> an implementation is valid.
> For ABIs not considering atomics / concurrency, it's well understood, for
> example, that the ABI specifies observable conditions at function call
> boundaries ("these arguments are in these registers", "the stack pointer
> has this alignment", "on return from the function, the values of these
> registers are unchanged from the values they had on entry"). It may
> sometimes specify things at other times (e.g. presence or absence of a red
> zone - whether memory beyond the stack pointer may be overwritten on an
> interrupt). But if it gives a code sequence, it's clear this is just an
> example rather than a requirement for particular code to be generated -
> any code sequence suffices if it meets the observable conditions at the
> points where code generated by one implementation may pass control to code
> generated by another implementation.
> When atomics are involved, you no longer have a limited set of
> well-defined points where control may pass from code generated by one
> implementation to code generated by another - the code generated by the
> two may be running concurrently.
> We know of certain cases
> <http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html> where there are
> choices of the mapping of atomic operations to particular instructions.
> But I'm not sure there's much evidence that these are the only ABI issues
> arising from concurrency - that there aren't any other ways in which an
> implementation may transform code, consistent with the as-if rule of ISO
> C, that may run into incompatibilities of different choices.
I can't think of other incompatibilities with high likelihood --
provided we ignore consume memory order and the handling of dependencies
(see below). But I would doubt there is a high risk of such because (1)
the data race definition should hopefully not cause subtle
incompatibilities and (2) there is clear "hand-off point" from the
compiler to a specific instruction representing an atomic access. For
example, if we have a release store, and we agree on the instructions
used for that, then compilers will have to ensure happens-before for
anything before the release store; for example, as long as stores
sequenced-before the release store are performed, it doesn't matter in
which order that happens. Subsequently, an acquire-load somewhere else
can pick this sequence of events up just by using the agreed-upon
acquire-load; like with the stores, it can order subsequent loads in any
way it sees fit, including different optimizations. That's obviously
not a formal proof, though. But it seems likely to me, at least :)
I'm more concerned about consume and dependencies because as far as I
understand the standard's requirements, dependencies need to be tracked
across function calls. Thus, we might have several compilers involved
in that, and we can't just "condense" things to happens-before, but it's
instead how and that we keep dependencies intact. Because of that, I'm
wondering whether this is actually implementable practically. (See
> And even if
> those are the only issues, it's not clear there are well-understood ways
> to define the mapping from the C11 memory model to the architecture's
> model, which provide a good way to reason about whether a particular
> choice of instructions is valid according to the mapping.
I think that if we have different options, there needs to be agreement
on which to choose across the compilers, at the very least. I don't
quite know how this looks like for GCC and LLVM, for example.
> > Are you familiar with the formalization of the C11/C++11 model by Batty
> > et al.?
> > http://www.cl.cam.ac.uk/~mjb220/popl085ap-sewell.pdf
> > http://www.cl.cam.ac.uk/~mjb220/n3132.pdf
> These discuss, as well as the model itself, proving the validity of a
> particular choice of x86 instructions. I imagine that might be a starting
> point towards an understanding of how to describe the relevant issues in
> an ABI, and how to determine whether a choice of instructions is
> consistent with what an ABI says. But I don't get the impression this is
> yet at the level where people not doing research in the area can routinely
> read and write such ABIs and work out whether a code sequence is
> consistent with them.
It's certainly complicated stuff (IMHO). On the positive side, it's
just a few compilers, the kernel, etc. that have to deal with this, if
most programmers use the languages' memory model.
> (If an ABI says "use instruction X", then you can't use a more efficient
> X' added by a new version of the instruction set. But it can't
> necessarily be as loose as saying "use X and Y, or other instructions that
> achieve semantics when the other thread is using X or Y", because it might
> be the case that Y' interoperates with X, X' interoperates with Y, but X'
> and Y' don't interoperate with each other. I'd envisage something more
> like mapping not to instructions, but to concepts within the
> architecture's own memory model - but that requires the architecture's
> memory model to be as well defined, and probably formalized, as the C11
Yes. The same group of researchers have also worked on formalizing the
Power model, and use this as base for a proof of the correctness of the
proposed mappings: http://www.cl.cam.ac.uk/~pes20/cppppc/
The formal approach to all this might be a complex task, but it is more
confidence-inspiring than making guesses about one standard's prose vs.
another specification's prose.
More information about the Gcc