This is the mail archive of the
mailing list for the GCC project.
Re: [powerpc64le] seq_cst memory order possibly not honored
- From: Andrey Semashev <andrey dot semashev at gmail dot com>
- To: Jonathan Wakely <jwakely dot gcc at gmail dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Fri, 14 Aug 2015 15:20:45 +0300
- Subject: Re: [powerpc64le] seq_cst memory order possibly not honored
- Authentication-results: sourceware.org; auth=none
- References: <55CD3833 dot 2080002 at gmail dot com> <CAH6eHdQ5_-gnt=AYAyaDjqdUnkPJNjnx4z321JUvGcYYk-Qfgg at mail dot gmail dot com> <55CDBAC4 dot 6090303 at gmail dot com> <CAH6eHdRPV74zuYwDoBoMKdk=tiVY0nFh2ePFtKfm4NGLef3vKA at mail dot gmail dot com>
On 14.08.2015 13:19, Jonathan Wakely wrote:
On 14 August 2015 at 10:54, Andrey Semashev <firstname.lastname@example.org> wrote:
Otherwise I cannot see how (x==0 && y==0) could happen. The last load in
each thread is sequenced after the first seq_cst store and both stores are
ordered with respect to each other, so one of the threads should produce 1.
The tool evaluates the possible executions according to a formal model
of the C++ memory model, so it is invaluable for answering questions like
this.
It shows that there is no synchronizes-with (shown as "sw")
relationship between the seq_cst store and the relaxed load for each
atomic object. There is a total order of sequentially consistent
operations, but the loads are not sequentially consistent and do not
synchronize with the stores.
Thank you Jonathan, you are correct. I've changed the test to use
seq_cst on loads as well and also removed the first load as it doesn't
really matter for the test. I'll see if it helps the tester.
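
For reference, the shape of the test under discussion can be sketched roughly as below. This is a minimal sketch, not the actual test: the helper names (`both_zero_once`, `count_both_zero`), the atomics `a` and `b`, and the result variables `x` and `y` are assumptions made for illustration. The load order is a template parameter, so the same code covers both the relaxed-load variant (where the formal model permits x==0 && y==0) and the seq_cst-load variant (where it does not).

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs one round of the two-thread test and reports
// whether both loads returned 0. Names are illustrative only.
template <std::memory_order LoadOrder>
bool both_zero_once() {
    std::atomic<int> a{0}, b{0};
    int x = -1, y = -1;  // results, written by one thread each

    std::thread t1([&] {
        a.store(1, std::memory_order_seq_cst);
        x = b.load(LoadOrder);
    });
    std::thread t2([&] {
        b.store(1, std::memory_order_seq_cst);
        y = a.load(LoadOrder);
    });
    t1.join();
    t2.join();

    return x == 0 && y == 0;
}

// Hypothetical helper: counts the rounds in which both threads read 0.
template <std::memory_order LoadOrder>
int count_both_zero(int rounds) {
    int hits = 0;
    for (int i = 0; i < rounds; ++i)
        hits += both_zero_once<LoadOrder>() ? 1 : 0;
    return hits;
}
```

With `std::memory_order_seq_cst` as the load order, all four operations take part in the single total order, so the count must be 0. With `std::memory_order_relaxed` a non-zero count is permitted but not guaranteed to show up on any given run or target.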
I'm still not entirely sure that omitting the 'sync' instruction results
in, let's say, desirable code from a practical point of view. I understand
that the seq_cst load will generate an extra 'sync' which will ensure
the stored 1 is visible to the other thread. However, if there is no
second instruction, i.e. thread 1 performs a store(seq_cst) and thread 2
performs a load(seq_cst) of the same atomic variable, the second thread
may not observe the stored value until thread 1 performs another
instruction involving 'sync' (or the CPU flushes the cache line for some
reason). This increases latencies of inter-thread communication.
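
The single-variable case described above can be sketched as follows. This is only an illustration under assumptions: the function name `store_then_poll` and the variable `flag` are hypothetical, and the sketch does not measure latency or show when the hardware makes the store visible. It only shows the code shape in question, one seq_cst store in one thread and seq_cst loads of the same variable in another.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: thread 1 performs a single seq_cst store, thread 2
// polls the same atomic with seq_cst loads. Returns the value the reader
// finally observes.
int store_then_poll() {
    std::atomic<int> flag{0};
    int seen = 0;

    std::thread writer([&] {
        flag.store(1, std::memory_order_seq_cst);
    });
    std::thread reader([&] {
        // Keep loading until the stored value becomes visible.
        while ((seen = flag.load(std::memory_order_seq_cst)) == 0)
            std::this_thread::yield();
    });

    writer.join();
    reader.join();
    return seen;
}
```

How long the reader spins before it sees the store is exactly the latency question raised here; the sketch leaves that unanswered.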