From http://code.google.com/p/google-perftools/source/browse/trunk/src/base/atomicops-internals-x86.h

bool has_amd_lock_mb_bug; // Processor has AMD memory-barrier bug; do lfence
                          // after acquire compare-and-swap.
...
inline Atomic32 NoBarrier_CompareAndSwap(volatile Atomic32* ptr,
                                         Atomic32 old_value,
                                         Atomic32 new_value) {
  Atomic32 prev;
  __asm__ __volatile__("lock; cmpxchgl %1,%2"
                       : "=a" (prev)
                       : "q" (new_value), "m" (*ptr), "0" (old_value)
                       : "memory");
  return prev;
}

inline Atomic32 Acquire_CompareAndSwap(volatile Atomic32* ptr,
                                       Atomic32 old_value,
                                       Atomic32 new_value) {
  Atomic32 x = NoBarrier_CompareAndSwap(ptr, old_value, new_value);
  if (AtomicOps_Internalx86CPUFeatures.has_amd_lock_mb_bug) {
    __asm__ __volatile__("lfence" : : : "memory");
  }
  return x;
}

inline Atomic32 Release_CompareAndSwap(volatile Atomic32* ptr,
                                       Atomic32 old_value,
                                       Atomic32 new_value) {
  return NoBarrier_CompareAndSwap(ptr, old_value, new_value);
}

This bug can also affect lock+xadd combinations.
More info from Solaris bug # 6323525:

if (number_of_cores() < 2) then don't have bug
if (family == 0xf && Model < 0x40) then have bug
if (rdmsr(MSR_BU_CFG/*0xC0011023*/) & 2) then bug is masked

See also http://bugzilla.kernel.org/show_bug.cgi?id=11305
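The three checks quoted from the Solaris bug can be combined into a single predicate. The following is only a sketch under stated assumptions: the function name and parameter names are hypothetical, the caller is assumed to have already obtained the core count, the CPUID family and model, and the MSR_BU_CFG value (reading an MSR needs kernel privileges, so it is passed in rather than read here), and "Model" is taken at face value as whatever model number the Solaris check compares against 0x40.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR address as quoted in the Solaris bug text. */
#define MSR_BU_CFG 0xC0011023u

/* Hypothetical helper: returns true when the quoted conditions say the
 * AMD memory-barrier bug is present and not masked.
 *   cores   - number of cores in the system
 *   family  - CPU family (assumed to come from CPUID)
 *   model   - CPU model (assumed to be the value the Solaris check uses)
 *   bu_cfg  - contents of MSR_BU_CFG, read by privileged code elsewhere
 */
static bool amd_lock_mb_bug_present(unsigned cores, unsigned family,
                                    unsigned model, uint64_t bu_cfg)
{
    if (cores < 2)
        return false;               /* single core: don't have bug */
    if (!(family == 0xf && model < 0x40))
        return false;               /* not an affected family/model */
    if (bu_cfg & 2)
        return false;               /* bit 1 set: bug is masked */
    return true;
}
```

For example, a two-core family 0xf, model 0x20 part with MSR_BU_CFG bit 1 clear would be reported as affected by this sketch, while the same part with bit 1 set would not.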
I cannot see any better alternative, but I'll point out that saying "wrong-code" is not exact.
Not an openmp bug; we've got to make the change for all code gcc generates.
Not working on it any longer.
According to erratum #147 in https://www.amd.com/system/files/TechDocs/25759.pdf, the problem only occurs when the LOCK prefix is absent:

"The erratum will not occur if there is a LOCK prefix on the read-modify-write instruction."

GCC never emits a cmpxchg without a lock prefix, so there is nothing to be done.
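For illustration, here is a minimal sketch of an acquire compare-and-swap written with GCC's __atomic builtins instead of inline asm. The function name is hypothetical and is not taken from perftools. On x86 this builtin is expected to compile to a lock-prefixed cmpxchg, which is the case the erratum text says is unaffected; the generated code can be checked with gcc -S.

```c
#include <stdint.h>

/* Hypothetical wrapper: atomically replaces *ptr with new_value if it
 * currently equals old_value, and returns the value that was observed
 * in *ptr (the same contract as the CompareAndSwap functions quoted
 * from perftools earlier in this thread). */
static int32_t acquire_compare_and_swap(volatile int32_t *ptr,
                                        int32_t old_value,
                                        int32_t new_value)
{
    int32_t expected = old_value;
    /* Strong CAS (weak = 0) with acquire ordering on both the success
     * and failure paths. On failure the builtin stores the current
     * contents of *ptr into 'expected'; on success 'expected' still
     * holds old_value, which is what *ptr contained. */
    __atomic_compare_exchange_n(ptr, &expected, new_value, 0,
                                __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
    return expected;
}
```

Calling acquire_compare_and_swap(&v, 5, 7) on a variable holding 5 returns 5 and leaves 7 in v; a second call with old_value 5 returns 7 and leaves v unchanged.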