volatile access optimization (C++ / x86_64)

Matt Godbolt matt@godbolt.org
Sat Dec 27 18:04:00 GMT 2014


On Sat, Dec 27, 2014 at 11:57 AM, Andrew Haley <aph@redhat.com> wrote:
> On 27/12/14 00:02, Matt Godbolt wrote:
>> On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley <aph@redhat.com> wrote:
>>> On 26/12/14 22:49, Matt Godbolt wrote:
>>>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley <aph@redhat.com> wrote:
>>> Why?
>>
>> Performance.
>
> Okay, but that's not what I was trying to ask: if you don't need an
> atomic access, why do you care that it uses a read-modify-write
> instruction instead of three instructions?  Is it faster?  Have you
> measured it?  Is it so much faster that it's critical for your
> application?

Good point. No; I've yet to measure it, but I will. I'll be honest: my
instinct is that it won't make a measurable difference. From a
microarchitectural point of view it devolves to almost exactly the
same set of micro-operations (barring the duplicate memory address
calculation). It does encode to a longer instruction stream (15 bytes
vs 7 bytes), so there's an argument that it puts more pressure than
needed on the i-cache. But honestly, it's more from an aesthetic point
of view that I prefer the increment. (The locked version *is*
measurably slower.)
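For concreteness, here's a minimal sketch of the kind of code under
discussion; the assembly in the comments reflects the two codegen
options being compared (the exact instruction selection and byte counts
will depend on the compiler version and flags):

```cpp
#include <cstdint>

// A volatile counter, e.g. one sampled by a signal handler or a
// memory-mapped observer. No atomicity is required here.
volatile uint64_t counter = 0;

void tick() {
    // GCC emits a separate load/modify/store triple for a volatile
    // increment, roughly:
    //
    //   mov  rax, QWORD PTR counter[rip]   ; load
    //   add  rax, 1                        ; modify
    //   mov  QWORD PTR counter[rip], rax   ; store
    //
    // rather than the shorter memory-destination read-modify-write form:
    //
    //   add  QWORD PTR counter[rip], 1
    //
    // Neither form is atomic; only "lock add" would be, at a real cost.
    counter = counter + 1;
}
```

Either instruction sequence performs exactly one read and one write of
`counter`, so both satisfy the volatile access rules; the difference is
purely in encoding length and instruction count.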

Also, it's always nice to understand why particular optimisations
aren't performed by the compiler from a correctness point of view! :)

Thanks all for your fascinating insights :)

-matt


