This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: full memory barrier?
- From: Ian Lance Taylor <iant at google dot com>
- To: Hei Chan <structurechart at yahoo dot com>
- Cc: gcc-help at gcc dot gnu dot org
- Date: Mon, 11 Apr 2011 14:42:07 -0700
- Subject: Re: full memory barrier?
- References: <583819.17047.qm@web36506.mail.mud.yahoo.com>
Hei Chan <structurechart@yahoo.com> writes:
> I am a little bit confused what asm volatile ("" : : : "memory") does.
>
> I searched online; many people said that it creates the "full memory barrier".
>
> I have a test code:
> int main() {
>   bool bar;
>   asm volatile ("" : : : "memory");
>   bar = true;
>   return 1;
> }
>
> Running g++ -c -g -Wa,-a,-ad foo.cpp gives me:
>
> 2:foo.cpp **** bool bar;
> 3:foo.cpp **** asm volatile ("" : : : "memory");
> 22 .loc 1 3 0
> 4:foo.cpp **** bar = true;
> 23 .loc 1 4 0
>
> It doesn't involve any fence instruction.
>
> Maybe I completely misunderstand the idea of "full memory barrier".

The term "memory barrier" is ambiguous when applied to code written in
a high-level language: it can mean a barrier for the compiler or a
barrier for the hardware.

The statement "asm volatile ("" : : : "memory");" is a compiler
scheduling barrier for all expressions that load values from or store
values to memory. It emits no instruction of its own, which is why no
fence appears in your assembler listing. Expressions that access memory
include a pointer dereference, an array index, or an access to a
volatile variable. The barrier may or may not apply to a reference to a
local variable, as a local variable need not be in memory.
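
As a rough sketch of the point above (the variable and function names
here are made up for illustration and are not from the original
question), the barrier constrains accesses to objects that live in
memory, such as globals, while a purely local variable may stay in a
register:

```cpp
// Hypothetical names, for illustration only.
int shared_value = 0;   // global: lives in memory, so the barrier applies
int shared_ready = 0;

// Compiler-only scheduling barrier: the asm template is empty,
// so no machine instruction is emitted for it.
static inline void compiler_barrier() {
    asm volatile("" : : : "memory");
}

void store_in_order(int v) {
    shared_value = v;      // the compiler may not sink this store below the barrier
    compiler_barrier();
    shared_ready = 1;      // nor hoist this store above it
}

int local_only() {
    int tmp = 1;           // address never taken: may be kept in a register,
    compiler_barrier();    // in which case the barrier need not constrain it
    tmp += 1;
    return tmp;
}
```

Compiling this with optimization and inspecting the assembler output
should show the two stores in store_in_order in source order, with no
fence instruction between them.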

This kind of compiler scheduling barrier can be used in conjunction with
a hardware memory barrier. The compiler doesn't know that a hardware
memory barrier instruction is special, and it will happily move memory
accesses across it. Therefore, if you want to use a hardware memory
barrier in compiled code, you must use it along with a compiler
scheduling barrier.
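
One way to combine the two, sketched here under the assumption of an
x86-64 target (the mfence instruction and all names below are
illustrative choices, not something taken from the original question),
is to put the hardware barrier instruction in the asm template and keep
the "memory" clobber on the same statement:

```cpp
// Hypothetical names, for illustration only.
int msg_payload = 0;
int msg_flag = 0;

static inline void full_fence() {
#if defined(__x86_64__)
    // Assumption: x86-64. "mfence" is the hardware barrier; the "memory"
    // clobber makes the same statement a compiler scheduling barrier.
    asm volatile("mfence" : : : "memory");
#else
    // Other targets: fall back to the GCC builtin, which provides both.
    __sync_synchronize();
#endif
}

void publish(int v) {
    msg_payload = v;   // must be visible before the flag is set
    full_fence();
    msg_flag = 1;
}
```

Without the "memory" clobber, the compiler would be free to move the
store to msg_flag above the fence, which would defeat its purpose.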

On the other hand, a compiler scheduling barrier can be useful even
without a hardware memory barrier. For example, in a coroutine-based
system with multiple lightweight threads running on a single processor,
you need a compiler scheduling barrier, but you do not need a hardware
memory barrier.

gcc will generate a hardware memory barrier if you use the
__sync_synchronize builtin function. That function acts as both a
hardware memory barrier and a compiler scheduling barrier.
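
A minimal sketch of how the builtin might be used (the names are
hypothetical, and the code is shown single-threaded for brevity; in a
real program the reader would run on another thread):

```cpp
// Hypothetical names, for illustration only.
int sync_data = 0;
int sync_done = 0;

void write_then_signal(int v) {
    sync_data = v;
    __sync_synchronize();   // hardware barrier + compiler scheduling barrier
    sync_done = 1;
}

int read_if_signaled(int fallback) {
    if (sync_done) {
        __sync_synchronize();   // keep the data load after the flag load
        return sync_data;
    }
    return fallback;
}
```

Compiling with g++ -S should show a barrier instruction at each call
site; which instruction is chosen depends on the target and the
compiler version.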

Ian