This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Re: full memory barrier?


Hei Chan <structurechart@yahoo.com> writes:

> I am a little bit confused about what asm volatile ("" : : : "memory") does.
>
> I searched online; many people said that it creates the "full memory barrier".
>
> I have some test code:
> int main() {
>         bool bar;
>         asm volatile ("" : : : "memory");
>         bar = true;
>         return 1;
> }
>
> Running g++ -c -g -Wa,-a,-ad foo.cpp gives me:
>
>    2:foo.cpp       ****         bool bar;
>    3:foo.cpp       ****         asm volatile ("" : : : "memory");
>   22                            .loc 1 3 0
>    4:foo.cpp       ****         bar = true;
>   23                            .loc 1 4 0
>
> It doesn't involve any fence instruction.
>
> Maybe I completely misunderstand the idea of "full memory barrier".

The term "memory barrier" is ambiguous when applied to code written in
a high-level language: it can mean a barrier for the compiler or a
barrier for the hardware.

The statement "asm volatile ("" : : : "memory");" is a compiler
scheduling barrier for all expressions that load from or store to
memory.  That means something like a pointer dereference, an array
index, or an access to a volatile variable.  It may or may not include a
reference to a local variable, as a local variable need not be in
memory.  The asm template is empty, so the statement emits no
instruction; that is why your listing shows no fence.
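
As a rough sketch of what that means in practice (the names
compiler_barrier, payload, ready and publish are made up for
illustration, and this assumes gcc or a compatible compiler):

```cpp
// Hypothetical helper: a compiler-only scheduling barrier.
// It emits no machine instruction at all.
static inline void compiler_barrier() {
    asm volatile ("" : : : "memory");
}

// Globals live in memory, so accesses to them are covered by the barrier.
int payload = 0;
int ready = 0;

// Hypothetical example: the compiler may not move the store to payload
// below the barrier, nor the store to ready above it.  The hardware is
// still free to reorder the two stores as far as its memory model allows.
int publish(int value) {
    payload = value;
    compiler_barrier();
    ready = 1;
    return ready;
}
```

Calling publish(42) leaves payload == 42 and ready == 1; the barrier only
constrains the order in which the compiler emits the two stores.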

This kind of compiler scheduling barrier can be used in conjunction with
a hardware memory barrier.  The compiler doesn't know that a hardware
memory barrier instruction is special, and on its own it will happily
move memory access instructions across the hardware barrier.  Therefore,
if you want to use a hardware memory barrier in compiled code, you must
use it along with a compiler scheduling barrier.
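
One way to combine the two, as a sketch only: put the fence instruction
in the asm template and keep the "memory" clobber.  The helper name
full_barrier and the variables are hypothetical, and the choice of
mfence on x86-64 and "dmb ish" on AArch64 is my assumption about what
you want on those targets; anything else falls back to the gcc builtin.

```cpp
// Hypothetical helper: hardware fence plus compiler scheduling barrier.
static inline void full_barrier() {
#if defined(__x86_64__)
    // The "memory" clobber is the compiler barrier; mfence is the
    // hardware barrier.
    asm volatile ("mfence" : : : "memory");
#elif defined(__aarch64__)
    asm volatile ("dmb ish" : : : "memory");
#else
    // Fallback: let gcc pick the instruction for the target.
    __sync_synchronize();
#endif
}

int shared_data = 0;
int shared_flag = 0;

// Hypothetical example: the store to shared_data is ordered before the
// load of shared_flag, both by the compiler and by the processor.
int store_then_load(int value) {
    shared_data = value;
    full_barrier();
    return shared_flag;
}
```

Without the "memory" clobber the fence instruction would still be
emitted, but the compiler could move the store or the load to the other
side of it.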

On the other hand, a compiler scheduling barrier can be useful even
without a hardware memory barrier.  For example, in a coroutine-based
system with multiple light-weight threads running on a single processor,
you need a compiler scheduling barrier, but you do not need a hardware
memory barrier.

gcc will generate a hardware memory barrier if you use the
__sync_synchronize builtin function.  That function acts as both a
hardware memory barrier and a compiler scheduling barrier.
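
A minimal sketch of the builtin in use (the names message,
message_ready and post_message are invented for the example):

```cpp
int message = 0;
int message_ready = 0;

// Hypothetical example: __sync_synchronize() keeps the compiler from
// reordering the two stores and also emits a hardware barrier
// instruction appropriate for the target.
void post_message(int value) {
    message = value;
    __sync_synchronize();
    message_ready = 1;
}
```

If you compile this with -S you should see a fence instruction between
the two stores, unlike with the empty asm statement.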

Ian

