This is the mail archive of the mailing list for the GCC project.

Re: GCC libatomic ABI specification draft

On Fri, 2016-12-02 at 12:13 +0100, Gabriel Paubert wrote:
> On Thu, Dec 01, 2016 at 11:13:37AM -0800, Bin Fan at Work wrote:
> > Hi Szabolcs,
> > 
> > > On Nov 29, 2016, at 3:11 AM, Szabolcs Nagy wrote:
> > > 
> > > On 17/11/16 20:12, Bin Fan wrote:
> > >> 
> > >> Although this ABI specification specifies that 16-byte properly aligned atomics are inlineable on platforms
> > >> supporting cmpxchg16b, we document the caveats here for further discussion. If we decide to change the
> > >> inlineable attribute for those atomics, then this ABI, the compiler and the runtime implementation should be
> > >> updated together at the same time.
> > >> 
> > >> 
> > >> The compiler and runtime need to check the availability of cmpxchg16b to implement this ABI specification.
> > >> Here is how it would work: The compiler can get the information either from the compiler flags or by
> > >> inquiring the hardware capabilities. When the information is not available, the compiler should assume that
> > >> cmpxchg16b instruction is not supported. The runtime library implementation can also query the hardware
> > >> compatibility and choose the implementation at runtime. Assuming the user provides correct compiler options
> > > 
> > > with this abi the runtime implementation *must* query the hardware
> > > (because there might be inlined cmpxchg16b in use in another module
> > > on a hardware that supports it and the runtime must be able to sync
> > > with it).
> > 
> > Thanks for the comment. Yes, the ABI requires that libatomic query the hardware. This is
> > necessary if we want the compiler to generate inlined code for 16-byte atomics. Note that
> > this particular issue only affects x86.
> Why? Power (at least recent ones) has 128 bit atomic instructions
> (lqarx/stqcx.) and Z has 128 bit compare and swap. 

That's not the only factor affecting whether cmpxchg16b or a similar
instruction is used for atomics.  If the HW offers only a wide CAS but no
wide atomic load, then even an atomic load is not truly just a load,
because it has to be implemented with the CAS, which is a write access.
That breaks (1) atomic loads on read-only mapped memory and (2) volatile
atomic loads (unless we claim that an idempotent store is like a load,
which is quite a stretch for volatile I think).
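
To illustrate why a CAS-only "load" is not truly a load, here is a minimal
sketch.  It is an illustration under stated assumptions, not what libatomic
or GCC actually emits: it uses a 64-bit C11 atomic as a stand-in for the
16-byte case so that it builds without -mcx16 or libatomic, and the helper
name load_via_cas is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: emulate an atomic load using only compare-and-swap.
   uint64_t stands in for a 16-byte type; the same pattern would apply to a
   wide CAS such as cmpxchg16b on hardware with no wide atomic load. */
static uint64_t load_via_cas(_Atomic uint64_t *p)
{
    uint64_t expected = 0;

    /* If *p == 0, the CAS succeeds and stores 0 back (an idempotent store).
       Otherwise it fails and copies the current value into 'expected'.
       Either way the access is a read-modify-write, not a plain read. */
    atomic_compare_exchange_strong(p, &expected, 0);

    return expected;
}
```

The returned value is the same as for a real load, but the object is
accessed through a read-modify-write operation.  That is why such a "load"
can fault on read-only mapped memory, and why it is hard to call it a load
for volatile objects.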