full memory barrier?
Hei Chan
structurechart@yahoo.com
Mon Apr 11 22:11:00 GMT 2011
Thanks for your reply.
You mentioned the statement "is a compiler scheduling barrier for all
expressions that load from or store values to memory". Does "memory" mean the
main memory? Or does it include the CPU cache?
----- Original Message ----
From: Ian Lance Taylor <iant@google.com>
To: Hei Chan <structurechart@yahoo.com>
Cc: gcc-help@gcc.gnu.org
Sent: Mon, April 11, 2011 2:42:07 PM
Subject: Re: full memory barrier?
Hei Chan <structurechart@yahoo.com> writes:
> I am a little bit confused what asm volatile ("" : : : "memory") does.
>
> I searched online; many people said that it creates the "full memory barrier".
>
> I have a test code:
> int main() {
>     bool bar;
>     asm volatile ("" : : : "memory");
>     bar = true;
>     return 1;
> }
>
> Running g++ -c -g -Wa,-a,-ad foo.cpp gives me:
>
> 2:foo.cpp **** bool bar;
> 3:foo.cpp **** asm volatile ("" : : : "memory");
> 22 .loc 1 3 0
> 4:foo.cpp **** bar = true;
> 23 .loc 1 4 0
>
> It doesn't involve any fence instruction.
>
> Maybe I completely misunderstand the idea of "full memory barrier".
The definition of "memory barrier" is ambiguous when looking at code
written in a high-level language.
The statement "asm volatile ("" : : : "memory");" is a compiler
scheduling barrier for all expressions that load from or store values to
memory. That means something like a pointer dereference, an array
index, or an access to a volatile variable. It may or may not include a
reference to a local variable, as a local variable need not be in
memory.
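
As a minimal sketch of what the compiler barrier constrains (the function names compiler_barrier and read_twice are made up for illustration, and the inline asm syntax assumes GCC or a compatible compiler):

```cpp
// Hypothetical helper: a pure compiler scheduling barrier.
// It emits no machine instruction at all.
static inline void compiler_barrier() {
    asm volatile ("" : : : "memory");
}

// *p is a pointer dereference, so it is a memory access the barrier covers.
// Without the barrier the compiler may load *p once and return 2 * a.
// With the barrier it has to assume memory changed and load *p again.
int read_twice(int *p) {
    int a = *p;
    compiler_barrier();
    int b = *p;   // reloaded from memory, not reused from 'a'
    return a + b;
}
```

The locals a and b, by contrast, can stay in registers the whole time, so the barrier says nothing about them. That is also why the test program in the question shows no visible effect: bar is a local whose address is never taken.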
This kind of compiler scheduling barrier can be used in conjunction with
a hardware memory barrier. The compiler doesn't know that a hardware
memory barrier is special, and it will happily move memory access
instructions across the hardware barrier. Therefore, if you want to use
a hardware memory barrier in compiled code, you must use it along with a
compiler scheduling barrier.
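
A sketch of that combination, assuming an x86-64 target where mfence is the hardware barrier instruction; the names full_barrier, data_word, ready_flag and publish are hypothetical, and other targets would need their own instruction in place of mfence:

```cpp
// Hypothetical shared variables, globals so that they live in memory.
int data_word = 0;
int ready_flag = 0;

static inline void full_barrier() {
#if defined(__x86_64__)
    // "mfence" is the hardware barrier; the "memory" clobber is the
    // compiler scheduling barrier. Both are needed.
    asm volatile ("mfence" : : : "memory");
#else
    // Assumption: on other targets this sketch degrades to a compiler
    // barrier only. The target's own fence instruction would go here.
    asm volatile ("" : : : "memory");
#endif
}

void publish(int value) {
    data_word = value;
    full_barrier();    // neither compiler nor CPU moves the stores across
    ready_flag = 1;
}
```

If the "memory" clobber were left off, the mfence would still be emitted, but the compiler would be free to move the two stores to either side of it.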
On the other hand, a compiler scheduling barrier can be useful even
without a hardware memory barrier. For example, in a coroutine-based
system with multiple light-weight threads running on a single processor,
you need a compiler scheduling barrier, but you do not need a hardware
memory barrier.
gcc will generate a hardware memory barrier if you use the
__sync_synchronize builtin function. That function acts as both a
hardware memory barrier and a compiler scheduling barrier.
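
A small usage sketch of the builtin follows. The names payload, payload_ready, producer and consumer are invented for the example, and it only shows where the barrier goes, not a complete synchronization scheme:

```cpp
// Hypothetical shared state.
int payload = 0;
int payload_ready = 0;

// Writer side: store the data, then the barrier, then set the flag.
void producer(int value) {
    payload = value;
    __sync_synchronize();   // hardware barrier + compiler barrier
    payload_ready = 1;
}

// Reader side: check the flag, then the barrier, then read the data.
// Returns false if nothing has been published yet.
bool consumer(int *out) {
    if (!payload_ready)
        return false;
    __sync_synchronize();
    *out = payload;
    return true;
}
```

Compiling this with g++ -S should show a fence instruction (mfence on x86-64, for instance) at each call to the builtin, unlike the empty asm statement in the original test.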
Ian