This is the mail archive of the
mailing list for the GCC project.
Re: Compilers and RCU readers: Once more unto the breach!
- From: Will Deacon <will dot deacon at arm dot com>
- To: "Paul E. McKenney" <paulmck at linux dot vnet dot ibm dot com>
- Cc: Linus Torvalds <torvalds at linux-foundation dot org>, Linux Kernel Mailing List <linux-kernel at vger dot kernel dot org>, "c++std-parallel at accu dot org" <c++std-parallel at accu dot org>, "linux-arch at vger dot kernel dot org" <linux-arch at vger dot kernel dot org>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, p796231 <Peter dot Sewell at cl dot cam dot ac dot uk>, "mark dot batty at cl dot cam dot ac dot uk" <Mark dot Batty at cl dot cam dot ac dot uk>, Peter Zijlstra <peterz at infradead dot org>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>, David Howells <dhowells at redhat dot com>, Andrew Morton <akpm at linux-foundation dot org>, Ingo Molnar <mingo at kernel dot org>, "michaelw at ca dot ibm dot com" <michaelw at ca dot ibm dot com>
- Date: Fri, 22 May 2015 18:30:29 +0100
- Subject: Re: Compilers and RCU readers: Once more unto the breach!
- References: <20150520005510 dot GA23559 at linux dot vnet dot ibm dot com> <CA+55aFy_8V-rbE9FQMHx6tXjj8HHKZuKSJvnRPVYvpk46EQA1g at mail dot gmail dot com> <CA+55aFxOtcB8AYCpLQBGSXK=8_Vh4uDs5HEpzGpPy+hgz542ag at mail dot gmail dot com> <20150520024148 dot GD6776 at linux dot vnet dot ibm dot com> <20150520114745 dot GC11498 at arm dot com> <20150520121522 dot GH6776 at linux dot vnet dot ibm dot com> <20150520154617 dot GE11498 at arm dot com> <20150520181606 dot GT6776 at linux dot vnet dot ibm dot com> <20150521192422 dot GC19204 at arm dot com> <20150521200212 dot GW6776 at linux dot vnet dot ibm dot com>
On Thu, May 21, 2015 at 09:02:12PM +0100, Paul E. McKenney wrote:
> On Thu, May 21, 2015 at 08:24:22PM +0100, Will Deacon wrote:
> > On Wed, May 20, 2015 at 07:16:06PM +0100, Paul E. McKenney wrote:
> > > On to #5:
> > >
> > > r1 = atomic_load_explicit(&x, memory_order_consume);
> > > if (r1 == 42)
> > >     atomic_store_explicit(&y, r1, memory_order_relaxed);
> > > ----------------------------------------------------
> > > r2 = atomic_load_explicit(&y, memory_order_consume);
> > > if (r2 == 42)
> > >     atomic_store_explicit(&x, 42, memory_order_relaxed);
> > >
> > > The first thread's accesses are dependency ordered. The second thread's
> > > ordering is in a corner case that memory-barriers.txt does not cover.
> > > You are supposed to start control dependencies with READ_ONCE_CTRL(), not
> > > a memory_order_consume load (AKA rcu_dereference and friends). However,
> > > Alpha would have a full barrier as part of the memory_order_consume load,
> > > and the rest of the processors would (one way or another) respect the
> > > control dependency. And the compiler would have some fun trying to
> > > break it.
> >
> > But this is interesting because the first thread is ordered whilst the
> > second is not, so doesn't that effectively forbid the compiler from
> > constant-folding values if it can't prove that there is no dependency
> > chain?
>
> You lost me on this one. Are you suggesting that the compiler
> speculate the second thread's atomic store? That would be very
> bad regardless of dependency chains.
> So what constant-folding optimization are you thinking of here?
> If the above example is not amenable to such an optimization, could
> you please give me an example where constant folding would apply
> in a way that is sensitive to dependency chains?

Unless I'm missing something, I can't see what would prevent a compiler
from looking at the code in thread 1 and transforming it into the form
used in thread 2 (i.e. constant-folding r1 to 42 in the store, given that
the taken branch implies r1 == 42). However, such an optimisation breaks
the dependency chain, which means that before folding, a compiler needs to
walk backwards from r1 to see whether a dependency chain extends to it.
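
To make the transformation concrete, here is a minimal sketch of thread 1
before and after the folding I have in mind. The function names and the
static x and y are illustrative assumptions, not part of the original
litmus test, and the "after" version is only what a compiler might
plausibly emit, not output observed from any particular compiler.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared locations from the litmus test; zero-initialised. */
static atomic_int x, y;

/*
 * Thread 1 as written: the value stored to y is r1 itself, so the
 * store carries a dependency from the memory_order_consume load.
 */
static int thread1_as_written(void)
{
	int r1 = atomic_load_explicit(&x, memory_order_consume);

	if (r1 == 42)
		atomic_store_explicit(&y, r1, memory_order_relaxed);
	return r1;
}

/*
 * Hypothetical result of constant folding: inside the taken branch
 * r1 is known to be 42, so the stored value becomes a constant.
 * The value written is the same, but the store no longer depends
 * on r1; only the control dependency remains, as in thread 2.
 */
static int thread1_after_folding(void)
{
	int r1 = atomic_load_explicit(&x, memory_order_consume);

	if (r1 == 42)
		atomic_store_explicit(&y, 42, memory_order_relaxed);
	return r1;
}
```

Run single-threaded, the two functions are indistinguishable: both leave
y == 42 when x == 42 and leave y untouched otherwise. The difference only
shows up in the ordering guarantees available to a concurrent observer.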
> > > So the current Linux memory model would allow (r1 == 42 && r2 == 42),
> > > but I don't know of any hardware/compiler combination that would
> > > allow it. And no, I am -not- going to update memory-barriers.txt for
> > > this litmus test, its theoretical interest notwithstanding! ;-)

Of course, I'm not asking for that at all! I'm just trying to see how
your proposal holds up with the example.