This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC][PATCH 0/5] arch: atomic rework
- From: Linus Torvalds <torvalds at linux-foundation dot org>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Paul McKenney <paulmck at linux dot vnet dot ibm dot com>, Will Deacon <will dot deacon at arm dot com>, Peter Zijlstra <peterz at infradead dot org>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>, David Howells <dhowells at redhat dot com>, "linux-arch at vger dot kernel dot org" <linux-arch at vger dot kernel dot org>, "linux-kernel at vger dot kernel dot org" <linux-kernel at vger dot kernel dot org>, "akpm at linux-foundation dot org" <akpm at linux-foundation dot org>, "mingo at kernel dot org" <mingo at kernel dot org>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Mon, 10 Feb 2014 11:09:24 -0800
- Subject: Re: [RFC][PATCH 0/5] arch: atomic rework
- Authentication-results: sourceware.org; auth=none
- References: <52F3DA85 dot 1060209 at arm dot com> <20140206185910 dot GE27276 at mudshark dot cambridge dot arm dot com> <20140206192743 dot GH4250 at linux dot vnet dot ibm dot com> <1391721423 dot 23421 dot 3898 dot camel at triegel dot csb> <20140206221117 dot GJ4250 at linux dot vnet dot ibm dot com> <1391730288 dot 23421 dot 4102 dot camel at triegel dot csb> <20140207042051 dot GL4250 at linux dot vnet dot ibm dot com> <20140207074405 dot GM5002 at laptop dot programming dot kicks-ass dot net> <20140207165028 dot GO4250 at linux dot vnet dot ibm dot com> <20140207165548 dot GR5976 at mudshark dot cambridge dot arm dot com> <20140207180216 dot GP4250 at linux dot vnet dot ibm dot com> <1391992071 dot 18779 dot 99 dot camel at triegel dot csb>
On Sun, Feb 9, 2014 at 4:27 PM, Torvald Riegel <triegel@redhat.com> wrote:
>
> Intuitively, this is wrong because this let's the program take a step
> the abstract machine wouldn't do. This is different to the sequential
> code that Peter posted because it uses atomics, and thus one can't
> easily assume that the difference is not observable.
Btw, what is the definition of "observable" for the atomics?
Because I'm hoping that it's not the same as for volatiles, where
"observable" is about the virtual machine itself, and as such volatile
accesses cannot be combined or optimized at all.
Now, I claim that atomic accesses cannot be done speculatively for
writes, and not re-done for reads (because the value could change),
but *combining* them would be possible and good.
For example, we often have multiple independent atomic accesses that
could certainly be combined: testing the individual bits of an atomic
value with helper functions, causing things like "load atomic, test
bit, load same atomic, test another bit". The two atomic loads could
be done as a single load without possibly changing semantics on a real
machine, but if "visibility" is defined in the same way it is for
"volatile", that wouldn't be a valid transformation. Right now we use
"volatile" semantics for these kinds of things, and they really can
hurt.
Same goes for multiple writes (possibly due to setting bits):
combining multiple accesses into a single one is generally fine, it's
*adding* write accesses speculatively that is broken by design..
At the same time, you can't combine atomic loads or stores infinitely
- "visibility" on a real machine definitely is about timeliness.
Removing all but the last write when there are multiple consecutive
writes is generally fine, even if you unroll a loop to generate those
writes. But if what remains is a loop, it might be a busy-loop
basically waiting for something, so it would be wrong ("untimely") to
hoist a store in a loop entirely past the end of the loop, or hoist a
load in a loop to before the loop.
Does the standard allow for that kind of behavior?
Linus