volatile access optimization (C++ / x86_64)

Andrew Haley aph@redhat.com
Sat Dec 27 17:45:00 GMT 2014

On 27/12/14 16:02, Paul_Koning@Dell.com wrote:
>> On Dec 26, 2014, at 6:19 PM, Andrew Haley <aph@redhat.com> wrote:
>> On 26/12/14 22:49, Matt Godbolt wrote:
>>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley <aph@redhat.com> wrote:
>>>> On 26/12/14 20:32, Matt Godbolt wrote:
>>>>> Is there a reason why (in principle) the volatile increment can't be
>>>>> made into a single add? Clang and ICC both emit the same code for the
>>>>> volatile and non-volatile case.
>>>> Yes.  Volatiles use the "as if" rule, where every memory access is as
>>>> written.  A volatile increment is defined as a load, an increment, and
>>>> a store.
>>> That makes sense to me from a logical point of view. My
>>> understanding though is the volatile keyword was mainly used when
>>> working with memory-mapped devices, where memory loads and stores
>>> could not be elided. A single-instruction load-modify-write like
>>> "increment [addr]" adheres to these constraints even though it is a
>>> single instruction.  I realise my understanding could be wrong here!
>>> If not though, both clang and icc are taking a short-cut that may
>>> put them into a non-compliant state.
>> It's hard to be certain.  The language used by the standard is very
>> unhelpful: it requires all accesses to be as written, but does not
>> define exactly what constitutes an access.
> I would look at this sort of thing with the mindset of a network
> protocol designer.  If the externally visible actions are correct,
> the implementation is correct.  Details not visible at the external
> reference interface are irrelevant.
> In the case of volatile variables, the external interface in
> question is the one at the point where that address is implemented —
> a memory cell, or memory mapped I/O device on a bus.  So the
> required behavior is that load and store operations (read and write
> transactions at that interface) occur as written.

I believe this is incorrect.  On most architectures, making accesses
reach memory in program order would require volatile memory references
to emit memory barriers, and the C committee decided not to require
that.

> If a processor has add instructions that support memory references
> (as in x86 and vax, but not mips), such an instruction will perform
> a read cycle followed by a write cycle.  So as seen at the critical
> interface, the behavior is the same as if you were to do an explicit
> load, register add, store sequence.  Therefore the use of a single
> add-to-memory is a valid implementation.

I agree.
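
For concreteness, here is a minimal sketch of the kind of increment
being discussed.  The names `counter`, `bump` and `read_counter` are
hypothetical, chosen only for illustration, and the instruction
sequences in the comments are approximate descriptions of what the
thread reports, not verified compiler output.

```cpp
#include <cstdint>

// Hypothetical volatile object standing in for a memory cell or a
// memory-mapped device register (an assumption for illustration).
volatile std::uint32_t counter = 0;

// A volatile increment: one read of `counter` followed by one write.
//
// As described in the thread, GCC emits roughly a separate load,
// register add, and store, while Clang and ICC emit roughly a single
// add-to-memory instruction.  Either way the location sees one read
// cycle followed by one write cycle.
void bump() {
    counter = counter + 1;
}

// Reads the current value with a single volatile load.
std::uint32_t read_counter() {
    return counter;
}
```

Neither form is atomic with respect to other threads or devices, since
no lock prefix or barrier is involved; the point is only that both
perform the same read and write at the memory interface.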

