This is the mail archive of the
mailing list for the GCC project.
Re: Implementing atomic load as compare-and-swap for read-only memory
- From: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>
- To: Jakub Jelinek <jakub at redhat dot com>, Torvald Riegel <triegel at redhat dot com>
- Cc: gcc at gcc dot gnu dot org, Richard Henderson <rth at redhat dot com>
- Date: Fri, 03 Jun 2016 13:46:23 +0100
- Subject: Re: Implementing atomic load as compare-and-swap for read-only memory
- Authentication-results: sourceware.org; auth=none
- References: <57514F17 dot 2090007 at foss dot arm dot com> <20160603100300 dot GE7387 at tucnak dot redhat dot com> <1464956769 dot 17104 dot 295 dot camel at localhost dot localdomain> <20160603123226 dot GH7387 at tucnak dot redhat dot com>
Hi Jakub, Torvald,
On 03/06/16 13:32, Jakub Jelinek wrote:
On Fri, Jun 03, 2016 at 02:26:09PM +0200, Torvald Riegel wrote:
And that would be fine, IMO. If you can't even load atomically, doing
something useful with this type will be hard except in special cases.
Also, doing a CAS (compare-and-swap) and thus potentially bringing in
the cache line in exclusive mode can be a lot more costly than what
users might expect from a load. A short critical section might not be
If you only have a CAS as base of the atomic operations on a type, then
a CAS operation exposed to the user will still be a just a single HW
CAS. But any other operation besides the CAS and a load will need *two*
CAS operations; even an atomic store has to be implemented as a CAS
Would we just stop expanding all those __atomic_*/__sync_* builtins inline
then (which would IMHO break tons of stuff), or just some predicate that
atomic.h/atomic headers use?
But doesn't that mean you should fall back to locked operation also for any
other atomic operation on such types, because otherwise if you atomic_store
or any other kind of atomic operation, it wouldn't use the locking, while
for atomic load it would?
I suppose you mean that one must fall back to using locking for all
operations? If load isn't atomic, then it can't be made atomic using
external locks if the other operations don't use the locks.
That would be an ABI change and quite significant
pessimization in many cases.
A change from wide CAS to locking would be an ABI change I suppose, but
it could also be considered a necessary bugfix if we don't want to write
to read-only memory. Does this affect anything but i686?
Also x86_64 (for 128-bit atomics), clearly also either arm or aarch64
(judging from who initiated this thread), I bet there are many others.
I'm looking at pre-LPAE ARMv7-A targets for which the
ARM Architecture Reference Manual (rev C.c) section A3.5.3 recommends:
"The way to atomically load two 32-bit quantities is to perform a
LDREXD/STREXD sequence, reading and writing the same value, for which the
STREXD succeeds, and use the read values."
Currently we emit just a single load-doubleword-exclusive which, according to the above,
would not be enough on such targets.
On aarch64 doubleword (128 bit) atomic loads are done through locks (PR 70814).