[RFC][PATCH 0/5] arch: atomic rework

Torvald Riegel triegel@redhat.com
Thu Feb 20 18:12:00 GMT 2014


On Thu, 2014-02-20 at 09:34 -0800, Linus Torvalds wrote:
> On Thu, Feb 20, 2014 at 9:14 AM, Torvald Riegel <triegel@redhat.com> wrote:
> >>
> >> So the clarification is basically to the statement that the "if
> >> (consume(p)) a" version *would* have an ordering guarantee between the
> >> read of "p" and "a", but the "consume(p) ? a : b" would *not* have
> >> such an ordering guarantee. Yes?
> >
> > Not as I understand it.  If my reply above wasn't clear, let me know and
> > I'll try to rephrase it into something that is.
> 
> Yeah, so you and Paul agree. And as I mentioned in the email that
> crossed with yours, I think that means that the standard is overly
> complex, hard to understand, fragile, and all *pointlessly* so.

Let's step back a little here and distinguish different things:

1) AFAICT, mo_acquire provides all the ordering guarantees you want.
Thus, I suggest focusing on mo_acquire for now, especially when reading
the model.  Are you fine with the mo_acquire semantics?

2) mo_acquire is independent of carries-a-dependency and similar
definitions/relations, so you can ignore all those to reason about
mo_acquire.

3) Do you have concerns over the runtime costs of barriers with
mo_acquire semantics?  If so, that's a valid discussion point, and we
can certainly dig deeper into this topic to see how we can possibly use
weaker HW barriers by exploiting things the compiler sees and can
potentially preserve (e.g., control dependencies).  There might be some
stuff the compiler can do without needing further input from the
programmer.

4) mo_consume is kind of the difficult special-case variant of
mo_acquire.  We should discuss it (including whether it's helpful)
separately from the memory model, because it's not essential.
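
As a minimal sketch of the mo_acquire case in point 1 (assuming C++11
atomics; the names payload, ready, producer, consumer and
run_message_passing are hypothetical and do not come from the kernel or
the standard), a release store paired with an acquire load orders the
plain write before the plain read, without any reference to
carries-a-dependency:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
static int payload = 0;                 // plain, non-atomic data
static std::atomic<bool> ready{false};  // publication flag

void producer() {
    payload = 42;                                  // (1) write the data
    ready.store(true, std::memory_order_release);  // (2) publish it
}

int consumer() {
    // Spin until the release store in (2) has been observed.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    // The acquire load synchronizes with the release store, so (1)
    // happens before this read. It does not matter whether the read is
    // reached through a data dependency or a control dependency.
    return payload;
}

int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread writer(producer);
    std::thread reader([&seen] { seen = consumer(); });
    writer.join();
    reader.join();
    assert(seen == 42);  // guaranteed by the release/acquire pairing
    return seen;
}
```

Whether the acquire load needs a full hardware barrier on a given
architecture, or something weaker, is the cost question from point 3;
the sketch only illustrates the guarantee itself.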

> Btw, there are many ways that "use a consume as an input to a
> conditional" can happen. In particular, even if the result is actually
> *used* like a pointer as far as the programmer is concerned, tricks
> like pointer compression etc can well mean that the "pointer" is
> actually at least partly implemented using conditionals, so that some
> paths end up being only dependent through a comparison of the pointer
> value.

AFAIU, this is similar to my concerns about how a compiler can
reasonably implement the ordering guarantees: The mo_consume value may
be used like a pointer in source code, but how this looks in the
generated code may be different after reasonable transformations (including
transformations to control dependencies, so the compiler would have to
avoid those).  Or did I misunderstand?
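
A small sketch of that concern, assuming C++11 atomics and a
hypothetical sentinel-based list (Node, head_node, cursor and
read_value are made-up names, not kernel code): the loaded value is
used like a pointer in the source, but on the sentinel path the only
link back to the load is the comparison.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical list node and sentinel, for illustration only.
struct Node {
    int value;
    Node* next;
};

static Node head_node{-1, nullptr};            // sentinel ("HEAD") node
static std::atomic<Node*> cursor{&head_node};  // published pointer

int read_value() {
    Node* p = cursor.load(std::memory_order_consume);
    if (p == &head_node) {
        // On this path the address is already known from the compare,
        // so the access below no longer uses the loaded value; only a
        // control dependency on the load remains.  A compiler could
        // turn "p->value" into this form on its own.
        return head_node.value;
    }
    // On this path the access carries a data dependency on the load.
    return p->value;
}
```

In practice, the compilers I am aware of treat a consume load as an
acquire load, so this code gets acquire ordering on both paths. The
sketch is only meant to show where the dependency-based reasoning
becomes fragile.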

> So I very much did *not* want to complicate the "litmus test" code
> snippet when Paul tried to make it more complex, but I do think that
> there are cases where code that "looks" like pure pointer chasing
> actually is not for some cases, and then can become basically that
> litmus test for some path.
> 
> Just to give you an example: in simple list handling it is not at all
> unusual to have a special node that is discovered by comparing the
> address, not by just loading from the pointer and following the list
> itself. Examples of that would be a HEAD node in a doubly linked list
> (Linux uses this concept quite widely, it's our default list
> implementation), or it could be a place-marker ("cursor" entry) in the
> list for safe traversal in the presence of concurrent deletes etc.
> 
> And obviously there is the already much earlier mentioned
> compiler-induced compare, due to value speculation, that can basically
> create such sequences even where they did not originally exist in the
> source code itself.
> 
> So even if you work with "pointer dereferences", and thus match that
> particular consume pattern, I really don't see why anybody would think
> that "hey, we can ignore any control dependencies" is a good idea.
> It's a *TERRIBLE* idea.
> 
> And as mentioned, it's a terrible idea with no upsides. It doesn't
> help compiler optimizations for the case it's *intended* to help,
> since those optimizations can still be done without the horribly
> broken semantics. It doesn't help compiler writers, it just confuses
> them.

I'm worried about how compilers can implement mo_consume without
prohibiting lots of optimizations on the code.  Your thoughts seem to
point in a similar direction.

I think we should continue by discussing mo_acquire first.  It has
easier semantics and allows a relatively simple implementation in
compilers (although there might be not-quite-so-simple optimizations).  

It's unfortunate we started the discussion with the tricky special case
first; maybe that's what contributed to the confusion.


