As I understand __sync_synchronize, the intent is to emit a memory barrier instruction, at least on multiprocessor systems. Currently on x86 __sync_synchronize inhibits explicit code motion across the builtin function call but not processor reordering of memory operations across the builtin function call. I think mfence is the right instruction on x86-64, and this or a similar pattern should be added to sync.md: (define_insn "memory_barrier" [(unspec_volatile [(const_int 0)] UNSPEC_MFENCE)] "TARGET_SSE2" "mfence")
Just to confirm the bug: the gcc doc says it follows the Intel itanium binary interface. The Intel documentation says « Associated with each instrinsic are certain memory barrier properties that restrict the movement of memory references to visible data across the intrinsic operation (by either the compiler or the processor). » Not including the mfence instruction would let the processor move references across the instruction, so it is mandatory. And that is not only for x86_64, but also x86, on which you can use e.g. a locked nop if you don't know the arch, or a mfence when using -march= (IIRC it appeared with SSE2)
H.J. can probably confirm this.
__sync_synchronize isn't specified for IA32/Intel64. You can check out Intel Memory Ordering White Paper: www.intel.com/products/processor/manuals/318147.pdf to see what is the most appropriate.
The Intel Memory Ordering White Paper is at http://www.intel.com/products/processor/manuals/318147.pdf
We do already know which x86 memory barrier instruction we need, that's not the problem, no need to give us pointers to documentations. The problem is that we'd like to not use explicit x86 instructions but rather rely on the "portability" of gcc's __sync_synchronize. If your answer is "sync_foobar is not specified for baz", all these __sync_foobar become useless, since the user can't assume what their are supposed to achieve. From what the documentation says, I had assumed that gcc would implement for other archs what Intel documents for ia64, but it happens not to be the case at least for sync_synchronize. So in short, if that's your answer to the bug, then I'll have to tell people _not_ to use gcc's __sync_* at all except on ia64, since that would be the only arch on which we would have any semantic guarantee... (and then the documentation needs to be fixed). Of course, that's not an option I'd like, and actually I believe there may be SMP bugs in libgc & such that use it...
I think it is a bug.
Patch that implements "memory_barrier" for x86 at [1]. [1] http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01181.html
Ah, well, by "nop", I was thinking about things like what Linux does: lock; addl $0,0(%%esp)
Subject: Bug 36793 Author: uros Date: Mon Nov 24 16:55:49 2008 New Revision: 142160 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=142160 Log: * config/i386/i386.md (UNSPECV_CMPXCHG): Rename from UNSPECV_CMPXCHG_[12]. * config/i386/sync.md: Use UNSPECV_CMPXCHG instead of UNSPECV_CMPXCHG_[12]. PR target/36793 * config/i386/sync.md (memory_barrier): New expander. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.md trunk/gcc/config/i386/sync.md
Fixed.
Should we fix __sync_synchronize in 4.3 too?