This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
volatile access optimization (C++ / x86_64)
- From: Matt Godbolt <matt at godbolt dot org>
- To: GCC Development <gcc at gcc dot gnu dot org>
- Date: Fri, 26 Dec 2014 14:32:23 -0600
- Subject: volatile access optimization (C++ / x86_64)
Hi all,
I'm investigating ways to have single-threaded writers write to memory
areas which are then (very infrequently) read from another thread for
monitoring purposes. Things like "number of units of work done".
I initially modeled this with relaxed atomic operations. This
generates a "lock xadd" style instruction, as I can't convey that
there are no other writers.
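
A minimal sketch of that setup, assuming a hypothetical metric named
units_done and hypothetical helpers record_unit, read_units and run_demo
(none of these names appear in the linked example):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical metric: incremented by exactly one thread, read by a monitor.
std::atomic<int> units_done(0);

// Writer side: a relaxed read-modify-write. The type system has no way to
// say "this is the only writer", so this is still a full atomic RMW.
void record_unit() {
    units_done.fetch_add(1, std::memory_order_relaxed);
}

// Monitor side: an infrequent relaxed read of the current count.
int read_units() {
    return units_done.load(std::memory_order_relaxed);
}

// Runs one writer thread for n units while the calling thread takes a
// single monitoring sample, then returns the final count.
int run_demo(int n) {
    units_done.store(0, std::memory_order_relaxed);
    std::thread writer([n] {
        for (int i = 0; i < n; ++i) {
            record_unit();
        }
    });
    int sample = read_units();  // any value in [0, n] is possible here
    writer.join();              // join makes all of the writer's updates visible
    assert(sample >= 0 && sample <= n);
    return read_units();
}
```

After the writer has been joined, run_demo(n) returns n. The fetch_add in
record_unit is the operation that ends up as a lock-prefixed instruction
on x86_64.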
As best I can tell, there's no memory order I can use to explain my
usage characteristics. Giving up on the atomics, I tried volatiles.
These are less than ideal as they are less expressive, but in my
instance I am not trying to fight the ISA's reordering; just prevent
the compiler from eliding updates to my shared metrics.
GCC's code generation uses a "load; add; store" for volatiles, instead
of a single "add 1, [metric]".
http://goo.gl/dVzRSq has the example (which is also at the bottom of my email).
Is there a reason why (in principle) the volatile increment can't be
made into a single add? Clang and ICC both emit the same code for the
volatile and non-volatile case.
Thanks in advance for any thoughts on the matter,
Matt
--- example code ---
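#include <atomic>

std::atomic<int> a(0);

// Default (sequentially consistent) atomic increment.
void base_case() {
    a++;
}

// Relaxed atomic read-modify-write.
void relaxed() {
    a.fetch_add(1, std::memory_order_relaxed);
}

// Separate relaxed load and store; not a single atomic RMW.
void load_and_store_relaxed() {
    a.store(a.load(std::memory_order_relaxed) + 1, std::memory_order_relaxed);
}

// Plain (non-atomic, non-volatile) increment through a cast.
void cast_as_int_ptr() {
    (*(int*)&a)++;
}

// Volatile increment through a cast.
void cast_as_volatile_int_ptr() {
    (*(volatile int*)&a)++;
}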
--- example output (gcc490) ---
base_case():
        lock addl $1, a(%rip)
        ret
relaxed():
        lock addl $1, a(%rip)
        ret
load_and_store_relaxed():
        movl a(%rip), %eax
        addl $1, %eax
        movl %eax, a(%rip)
        ret
cast_as_int_ptr():
        addl $1, a(%rip)
        ret
cast_as_volatile_int_ptr():
        movl a(%rip), %eax
        addl $1, %eax
        movl %eax, a(%rip)
        ret