This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH] Fix __atomic to not implement atomic loads with CAS.

From: Jakub Jelinek <jakub at redhat dot com>
To: Ramana Radhakrishnan <ramana dot radhakrishnan at foss dot arm dot com>
Cc: Torvald Riegel <triegel at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Henderson <rth at redhat dot com>, Szabolcs Nagy <szabolcs dot nagy at arm dot com>, Will Deacon <Will dot Deacon at arm dot com>
Date: Thu, 2 Feb 2017 15:52:37 +0100
Subject: Re: [PATCH] Fix __atomic to not implement atomic loads with CAS.
Authentication-results: sourceware.org; auth=none
References: <1485802440.16721.118.camel@redhat.com> <a382eff6-1465-e2f6-0cdb-dcd6c74fee45@foss.arm.com>
Reply-to: Jakub Jelinek <jakub at redhat dot com>

On Thu, Feb 02, 2017 at 02:48:42PM +0000, Ramana Radhakrishnan wrote:
> On 30/01/17 18:54, Torvald Riegel wrote:
> > This patch fixes the __atomic builtins to not implement supposedly
> > lock-free atomic loads based on just a compare-and-swap operation.
> > 
> > If there is no hardware-backed atomic load for a certain memory
> > location, the current implementation can implement the load with a CAS
> > while claiming that the access is lock-free.  This is a bug in the cases
> > of volatile atomic loads and atomic loads to read-only-mapped memory; it
> > also creates a lot of contention in case of concurrent atomic loads,
> > which results in at least counter-intuitive performance because most
> > users probably understand "lock-free" to mean hardware-backed (and thus
> > "fast") instead of just in the progress-criteria sense.
> > 
> > This patch implements option 3b of the choices described here:
> > https://gcc.gnu.org/ml/gcc/2017-01/msg00167.html
> 
> 
> Will Deacon pointed me at this thread asking if something similar could be
> done on ARM.
> 
> On armv8-a we can implement an atomic load of 16 bytes using an LDXP / STXP
> loop, as a 16-byte load isn't single-copy atomic. On armv8.1-a we do have a
> CAS on 16 bytes.

If the AArch64 ISA guarantees LDXP is atomic, then yes, you can do that.
The problem we have on x86_64 is that, I think, neither Intel nor AMD has
guaranteed that aligned SSE or AVX loads are atomic.

	Jakub
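
To illustrate the problem described in the quoted patch summary, here is a
minimal sketch of what "implementing a load with a CAS" means. It is not the
code GCC generates. The helper names load_via_cas and load_direct are
hypothetical, and a 4-byte type stands in for the 16-byte case so the example
builds without libatomic. The point is that the CAS-based variant needs a
writable location (on x86, for instance, a locked cmpxchg performs a write
cycle even when the comparison fails), while a real atomic load does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: emulate an atomic load with a compare-and-swap.
   The pointer cannot be const, because the CAS is a read-modify-write
   operation and may store to *p. */
uint32_t load_via_cas(uint32_t *p)
{
    uint32_t expected = 0;
    /* If *p == 0, the CAS succeeds and stores 0 back (no visible change).
       Otherwise it fails and copies the current value of *p into expected.
       Either way, expected ends up holding the value that was in *p. */
    __atomic_compare_exchange_n(p, &expected, 0, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/* Hypothetical helper: a plain atomic load. It takes a pointer to const
   and never writes to the location. */
uint32_t load_direct(const uint32_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* Compile-time query: does the target always provide lock-free 16-byte
   atomics? The answer depends on the target and on compiler flags. */
bool sixteen_bytes_always_lock_free(void)
{
    return __atomic_always_lock_free(16, 0);
}
```

Calling load_via_cas on an object in read-only-mapped memory would fault on
targets where the CAS always writes, which is the bug for read-only mappings
mentioned above. Concurrent callers of load_via_cas also contend for exclusive
ownership of the cache line, unlike concurrent callers of load_direct.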
