The following builtins are intended to be compatible with those described in the Intel Itanium Processor-specific Application Binary Interface, section 7.4. As such, they depart from the normal GCC practice of using the “__builtin_” prefix, and further they are overloaded so that they work on multiple types.
The definition given in the Intel documentation allows only for the use of the types int, long, long long as well as their unsigned counterparts. GCC will allow any integral scalar or pointer type that is 1, 2, 4 or 8 bytes in length.
Not all operations are supported by all target processors. If a particular operation cannot be implemented on the target processor, a warning will be generated and a call to an external function will be generated. The external function will carry the same name as the builtin, with an additional suffix ‘_n’ where n is the size of the data type.
In most cases, these builtins are considered a full barrier. That is, no memory operand will be moved across the operation, either forward or backward. Further, instructions will be issued as necessary to prevent the processor from speculating loads across the operation and from queuing stores after the operation.
All of the routines are described in the Intel documentation to take “an optional list of variables protected by the memory barrier”. It's not clear what is meant by that; it could mean that only the following variables are protected, or it could mean that these variables should in addition be protected. At present GCC ignores this list and protects all variables which are globally accessible. If in the future we make some use of this list, an empty list will continue to mean all globally accessible variables.
type __sync_fetch_and_add (type *ptr, type value, ...)
type __sync_fetch_and_sub (type *ptr, type value, ...)
type __sync_fetch_and_or (type *ptr, type value, ...)
type __sync_fetch_and_and (type *ptr, type value, ...)
type __sync_fetch_and_xor (type *ptr, type value, ...)
type __sync_fetch_and_nand (type *ptr, type value, ...)
These builtins perform the operation suggested by the name, and return the value that had previously been in memory. That is,
{ tmp = *ptr; *ptr op= value; return tmp; }
{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; }   // nand
Note: GCC 4.4 and later implement the __sync_fetch_and_nand builtin as *ptr = ~(tmp & value) instead of *ptr = ~tmp & value.
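For illustration only, the following sketch uses __sync_fetch_and_add to hand out ticket numbers from a shared counter; the old value returned by the builtin serves as the ticket. The names counter and take_ticket are hypothetical and not part of any specification.

    #include <stdio.h>

    /* Any integral scalar type of 1, 2, 4 or 8 bytes may be used. */
    static volatile long counter = 0;

    long take_ticket (void)
    {
      /* Atomically add 1 and return the value held before the addition. */
      return __sync_fetch_and_add (&counter, 1);
    }

    int main (void)
    {
      printf ("first ticket: %ld\n", take_ticket ()); /* prints 0 */
      printf ("counter now:  %ld\n", counter);        /* prints 1 */
      return 0;
    }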
type __sync_add_and_fetch (type *ptr, type value, ...)
type __sync_sub_and_fetch (type *ptr, type value, ...)
type __sync_or_and_fetch (type *ptr, type value, ...)
type __sync_and_and_fetch (type *ptr, type value, ...)
type __sync_xor_and_fetch (type *ptr, type value, ...)
type __sync_nand_and_fetch (type *ptr, type value, ...)
These builtins perform the operation suggested by the name, and return the new value. That is,
{ *ptr op= value; return *ptr; }
{ *ptr = ~(*ptr & value); return *ptr; }   // nand
Note: GCC 4.4 and later implement the __sync_nand_and_fetch builtin as *ptr = ~(*ptr & value) instead of *ptr = ~*ptr & value.
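As a sketch of where the _and_fetch forms are the natural choice, the hypothetical object_release below decrements a reference count and frees the object when the new value reaches zero; the struct and function names are made up for this example.

    #include <stdlib.h>

    struct object
    {
      int refcount;
      /* ... payload ... */
    };

    void object_release (struct object *obj)
    {
      /* Atomically decrement and test the value *after* the decrement. */
      if (__sync_sub_and_fetch (&obj->refcount, 1) == 0)
        free (obj);
    }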
bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)
These builtins perform an atomic compare and swap. That is, if the current value of *ptr is oldval, then write newval into *ptr.
The “bool” version returns true if the comparison is successful and newval was written. The “val” version returns the contents of *ptr before the operation.
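The compare-and-swap builtins are typically used in a retry loop to build operations not covered by the other builtins. The following hypothetical atomic_store_max is one such sketch; it retries whenever another thread changes *ptr between the read and the swap attempt.

    static void
    atomic_store_max (int *ptr, int value)
    {
      int old = *ptr;
      while (old < value)
        {
          /* Returns the contents of *ptr before the attempt; if that
             still equals OLD the swap succeeded and we are done.  */
          int seen = __sync_val_compare_and_swap (ptr, old, value);
          if (seen == old)
            break;
          old = seen;
        }
    }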
void __sync_synchronize (...)
This builtin issues a full memory barrier.
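As a rough sketch of the full barrier in use (hypothetical producer/consumer code, not taken from the specification), the barrier keeps the store to data from being reordered after the store to the ready flag, and keeps the consumer's load of data from being hoisted before its load of the flag.

    static int data;
    static volatile int ready;

    void producer (void)
    {
      data = 42;
      __sync_synchronize ();   /* full barrier */
      ready = 1;
    }

    int consumer (void)
    {
      while (!ready)
        ;                      /* spin until the flag is set */
      __sync_synchronize ();   /* full barrier */
      return data;
    }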
type __sync_lock_test_and_set (type *ptr, type value, ...)
This builtin, as described by Intel, is not a traditional test-and-set operation, but rather an atomic exchange operation. It writes value into *ptr, and returns the previous contents of *ptr.
Many targets have only minimal support for such locks, and do not support
a full exchange operation. In this case, a target may support reduced
functionality here by which the only valid value to store is the
immediate constant 1. The exact value actually stored in *ptr is implementation defined.
This builtin is not a full barrier, but rather an acquire barrier.
This means that references after the builtin cannot move to (or be
speculated to) before the builtin, but previous memory stores may not
be globally visible yet, and previous memory loads may not yet be
satisfied.
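A conventional use of this builtin, sketched below with a hypothetical spinlock_t type, is a spin-lock acquire: the loop keeps storing 1 until the previous contents were 0, and it works even with the reduced functionality described above because 1 is the only value ever stored.

    typedef volatile int spinlock_t;

    void spin_acquire (spinlock_t *lock)
    {
      /* Retry the exchange until the previous value was 0 (lock free);
         between attempts, spin with plain reads to reduce bus traffic.  */
      while (__sync_lock_test_and_set (lock, 1))
        while (*lock)
          ;
    }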
void __sync_lock_release (type *ptr, ...)
This builtin releases the lock acquired by __sync_lock_test_and_set. Normally this means writing the constant 0 to *ptr.
This builtin is not a full barrier, but rather a release barrier. This means that all previous memory stores are globally visible, and all previous memory loads have been satisfied, but following memory reads are not prevented from being speculated to before the barrier.
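The counterpart to the hypothetical spin_acquire sketch above releases the same lock word; the release barrier ensures the critical section's stores are visible before the word is cleared, normally to 0.

    typedef volatile int spinlock_t;

    void spin_release (spinlock_t *lock)
    {
      __sync_lock_release (lock);   /* release barrier, then write 0 */
    }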