This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Fwd: Re: GCC libatomic questions


Hi,

I have a revised version of the libatomic ABI draft which tries to accommodate Richard's comments. The new version is attached. The diff is also appended.

Thanks,
- Bin

diff ABI.txt ABI-1.1.txt
28a29,30
> - The versioning of the library external symbols
>
47a50,57
> Note
>
> Some 64-bit x86 ISA does not support the cmpxchg16b instruction, for
> example, some early AMD64 processors and later Intel Xeon Phi co-
> processor. Whether cmpxchg16b is supported may affect the ABI
> specification for certain atomic types. We will discuss the detail
> where it has an impact.
>
101c111,112
< _Atomic __int128 16 16 N not applicable
---
> _Atomic __int128 (with at16) 16 16 Y not applicable
> _Atomic __int128 (w/o at16) 16 16 N not applicable
105c116,117
< _Atomic long double 16 16 N 12 4 N
---
> _Atomic long double (with at16) 16 16 Y 12 4 N
> _Atomic long double (w/o at16) 16 16 N 12 4 N
106a119,120
> _Atomic double _Complex 16 16(8) Y 16 16(8) N
>                     (with at16)
107a122
>                     (w/o at16)
110a126,127
> _Atomic long double _Imaginary 16 16 Y 12 4 N
>                     (with at16)
111a129
>                     (w/o at16)
146a165,167
> with at16 means the ISA supports cmpxchg16b, w/o at16 means the ISA
> does not support cmpxchg16b.
>
191a213,214
> _Atomic struct {char a[16];} 16 16(1) Y 16 16(1) N
>                     (with at16)
192a216
>                     (w/o at16)
208a233,235
> with at16 means the ISA supports cmpxchg16b, w/o at16 means the ISA
> does not support cmpxchg16b.
>
246a274,276
> On the 64-bit x86 platform which supports the cmpxchg16b instruction,
> 16-byte atomic types whose alignment matches the size is inlineable.
>
303,306c333,338
< CMPXCHG16B is not always available on 64-bit x86 platforms, so 16-byte
< naturally aligned atomics are not inlineable. The support functions for
< such atomics are free to use lock-free implementation if the instruction
< is available on specific platforms.
---
> "Inlineability" is a compile time property, which in most cases depends
> only on the type. In a few cases it also depends on whether the target
> ISA supports the cmpxchg16b instruction. A compiler may get the ISA
> information by either compilation flags or inquiring the hardware
> capabilities. When the hardware capabilities information is not available,
> the compiler should assume the cmpxchg16b instruction is not supported.
665a698,705
>     The function takes the size of an object and an address which
>     is one of the following three cases
>     - the address of the object
>     - a faked address that solely indicates the alignment of the
>       object's address
>     - NULL, which means that the alignment of the object matches size
>     and returns whether the object is lock-free.
>
711c751
< 5. Libatomic Assumption on Non-blocking Memory Instructions
---
> 5. Libatomic symbol versioning
712a753,868
> Here is the mapfile for symbol versioning of the libatomic library
> specified by this ABI specification
>
> LIBATOMIC_1.0 {
>   global:
>     __atomic_load;
>     __atomic_store;
>     __atomic_exchange;
>     __atomic_compare_exchange;
>     __atomic_is_lock_free;
>
>     __atomic_add_fetch_1;
>     __atomic_add_fetch_2;
>     __atomic_add_fetch_4;
>     __atomic_add_fetch_8;
>     __atomic_add_fetch_16;
>     __atomic_and_fetch_1;
>     __atomic_and_fetch_2;
>     __atomic_and_fetch_4;
>     __atomic_and_fetch_8;
>     __atomic_and_fetch_16;
>     __atomic_compare_exchange_1;
>     __atomic_compare_exchange_2;
>     __atomic_compare_exchange_4;
>     __atomic_compare_exchange_8;
>     __atomic_compare_exchange_16;
>     __atomic_exchange_1;
>     __atomic_exchange_2;
>     __atomic_exchange_4;
>     __atomic_exchange_8;
>     __atomic_exchange_16;
>     __atomic_fetch_add_1;
>     __atomic_fetch_add_2;
>     __atomic_fetch_add_4;
>     __atomic_fetch_add_8;
>     __atomic_fetch_add_16;
>     __atomic_fetch_and_1;
>     __atomic_fetch_and_2;
>     __atomic_fetch_and_4;
>     __atomic_fetch_and_8;
>     __atomic_fetch_and_16;
>     __atomic_fetch_nand_1;
>     __atomic_fetch_nand_2;
>     __atomic_fetch_nand_4;
>     __atomic_fetch_nand_8;
>     __atomic_fetch_nand_16;
>     __atomic_fetch_or_1;
>     __atomic_fetch_or_2;
>     __atomic_fetch_or_4;
>     __atomic_fetch_or_8;
>     __atomic_fetch_or_16;
>     __atomic_fetch_sub_1;
>     __atomic_fetch_sub_2;
>     __atomic_fetch_sub_4;
>     __atomic_fetch_sub_8;
>     __atomic_fetch_sub_16;
>     __atomic_fetch_xor_1;
>     __atomic_fetch_xor_2;
>     __atomic_fetch_xor_4;
>     __atomic_fetch_xor_8;
>     __atomic_fetch_xor_16;
>     __atomic_load_1;
>     __atomic_load_2;
>     __atomic_load_4;
>     __atomic_load_8;
>     __atomic_load_16;
>     __atomic_nand_fetch_1;
>     __atomic_nand_fetch_2;
>     __atomic_nand_fetch_4;
>     __atomic_nand_fetch_8;
>     __atomic_nand_fetch_16;
>     __atomic_or_fetch_1;
>     __atomic_or_fetch_2;
>     __atomic_or_fetch_4;
>     __atomic_or_fetch_8;
>     __atomic_or_fetch_16;
>     __atomic_store_1;
>     __atomic_store_2;
>     __atomic_store_4;
>     __atomic_store_8;
>     __atomic_store_16;
>     __atomic_sub_fetch_1;
>     __atomic_sub_fetch_2;
>     __atomic_sub_fetch_4;
>     __atomic_sub_fetch_8;
>     __atomic_sub_fetch_16;
>     __atomic_test_and_set_1;
>     __atomic_test_and_set_2;
>     __atomic_test_and_set_4;
>     __atomic_test_and_set_8;
>     __atomic_test_and_set_16;
>     __atomic_xor_fetch_1;
>     __atomic_xor_fetch_2;
>     __atomic_xor_fetch_4;
>     __atomic_xor_fetch_8;
>     __atomic_xor_fetch_16;
>
>   local:
>     *;
> };
> LIBATOMIC_1.1 {
>   global:
>     __atomic_feraiseexcept;
> } LIBATOMIC_1.0;
> LIBATOMIC_1.2 {
>   global:
>     atomic_thread_fence;
>     atomic_signal_fence;
>     atomic_flag_test_and_set;
>     atomic_flag_test_and_set_explicit;
>     atomic_flag_clear;
>     atomic_flag_clear_explicit;
> } LIBATOMIC_1.1;
>
> 6. Libatomic Assumption on Non-blocking Memory Instructions
>
752,753c908,910
< So such compiler change must be accompanied by a library change, and
< the ABI must be updated as well.
---
> In such case, the libatomic library and the compiler should be upgraded
> in lock-step, and the inlineable property for certain atomic types
> will be changed from false to true.


On 7/6/2016 12:41 PM, Richard Henderson wrote:
> CMPXCHG16B is not always available on 64-bit x86 platforms, so 16-byte
> naturally aligned atomics are not inlineable. The support functions for
> such atomics are free to use lock-free implementation if the instruction
> is available on specific platforms.

Except that it is available on almost all 64-bit x86 platforms. As far as I know, only 2004-era AMD processors didn't have it; all Intel 64-bit CPUs have supported it.

Further, gcc will most certainly make use of it when one specifies any command-line option that enables it, such as -march=native.

Therefore we must specify that for x86_64, 16-byte objects are non-locking on cpus that support cmpxchg16b.
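
To illustrate the compile-time side of this, here is a minimal sketch in C. It assumes a GCC-compatible compiler, which predefines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 when an option such as -mcx16 (or a -march= value that implies it) is in effect. The helper names cas16_enabled_at_compile_time and is_inlineable_16 are hypothetical and not part of libatomic or the ABI draft. The policy they encode (natural alignment plus cmpxchg16b) is only one reading of the rule discussed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when this translation unit was compiled with
   a 16-byte compare-and-swap available (e.g. -mcx16 on x86_64). */
static bool cas16_enabled_at_compile_time(void)
{
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
    return true;
#else
    return false;
#endif
}

/* Hypothetical policy: a 16-byte atomic is treated as inlineable only if
   it is naturally aligned and the compile-time check above passes. */
static bool is_inlineable_16(size_t alignment)
{
    /* An alignment is always a nonzero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return alignment == 16 && cas16_enabled_at_compile_time();
}
```

Built without -mcx16, is_inlineable_16(16) is false. Built with it, the same call is true, while is_inlineable_16(8) stays false either way.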

> However, if a compiler inlines an atomic operation on an _Atomic long
> double object and uses the new lock-free instructions, it could break
> the compatibility if the library implementation is still non-lock-free.
> So such compiler change must be accompanied by a library change, and
> the ABI must be updated as well.

The tie between gcc version and libgcc.so version is tight; I see no reason that the libatomic.so version should not also be tight with the compiler version.

It is sufficient that libatomic use atomic instructions when they are available. If a new processor comes out with new capabilities, the compiler and runtime are upgraded in lock-step.

How that is selected is beyond the ABI but possible solutions are

(1) ld.so search path, based on processor capabilities,
(2) ifunc (or workalike) where the function is selected at startup,
(3) explicit runtime test within the relevant functions.

All solutions expose the same function interface so the function call ABI is not affected.
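
As a rough sketch of option (3), an explicit runtime test could look like the following C. It assumes a GCC-compatible compiler providing <cpuid.h> and its __get_cpuid helper. The enum and function names, and the CPUID_ECX_CMPXCHG16B constant, are hypothetical names for this example (CPUID leaf 1 reports cmpxchg16b in ECX bit 13). This is not how libatomic is necessarily written, only an illustration of the idea.

```c
#include <assert.h>
#include <stdbool.h>
#if defined(__x86_64__)
#include <cpuid.h>
#endif

/* CPUID leaf 1, ECX bit 13: cmpxchg16b support. */
#define CPUID_ECX_CMPXCHG16B (1u << 13)

/* Hypothetical tags for the two possible 16-byte implementations. */
enum cas16_impl { CAS16_LOCKED = 0, CAS16_LOCK_FREE = 1 };

/* Explicit runtime test for the instruction. */
static bool cpu_has_cmpxchg16b(void)
{
#if defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;            /* leaf 1 unavailable: assume unsupported */
    return (ecx & CPUID_ECX_CMPXCHG16B) != 0;
#else
    return false;                /* not x86_64: the question does not apply */
#endif
}

/* Pick the implementation once and cache it. The cache is not
   thread-safe; a real library would probe at startup instead. */
static enum cas16_impl select_cas16_impl(void)
{
    static int cached = -1;      /* -1 means "not probed yet" */
    if (cached < 0)
        cached = cpu_has_cmpxchg16b() ? CAS16_LOCK_FREE : CAS16_LOCKED;
    assert(cached == CAS16_LOCKED || cached == CAS16_LOCK_FREE);
    return (enum cas16_impl)cached;
}
```

Callers see the same entry point whichever branch is taken, which is why the function call ABI is unaffected by the choice.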

> _Bool __atomic_is_lock_free (size_t size, void *object);
>
>     Returns whether the object pointed to by object is lock-free.
>     The function assumes that the size of the object is size. If object
>     is NULL then the function assumes that object is aligned on a
>     size-byte address.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033

The actual code change is completely within libstdc++, but it affects the description of the libatomic function.

C++ requires that is_lock_free return the same result for all objects of a given type, whereas __atomic_is_lock_free, with a non-null object, determines whether we will implement lock free for a *specific* object, using the specific object's alignment.

Rather than break the ABI and add a different function that passes the type alignment, the solution we hit upon was to pass a "fake", minimally aligned pointer as the object parameter: (void *)(uintptr_t)-__alignof(type).
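
To see why that expression works, here is a small sketch in C. Negating a power-of-two alignment yields an address whose lowest set bit is exactly that alignment, so the pointer carries the alignment and nothing else. The helper names fake_aligned_ptr and alignment_of_address are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the "fake" address for a given alignment: the highest address
   that is aligned to `alignment` and to no larger power of two. */
static void *fake_aligned_ptr(size_t alignment)
{
    /* An alignment is always a nonzero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (void *)(uintptr_t)-(uintptr_t)alignment;
}

/* Recover the alignment implied by an address: the largest power of two
   that divides it, i.e. its lowest set bit. */
static size_t alignment_of_address(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return (size_t)(a & (~a + 1u));
}
```

A caller would pass fake_aligned_ptr(_Alignof(T)) as the object argument of __atomic_is_lock_free together with sizeof(T). The pointer is never dereferenced.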


The final component of the ABI that you've forgotten to specify, if you want full compatibility of linked binaries, is symbol versioning.

We have had two ABI additions since the original release.  See

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libatomic/libatomic.map;h=39e7c2c6b9a70121b5f4031da346a27ae6c1be98;hb=HEAD


r~

Attachment: ABI-1.1.txt
Description: Text document

