Failure to optimize?

Florian Weimer fweimer@redhat.com
Tue Jan 12 14:22:46 GMT 2021


* Jonathan Wakely:

> On Tue, 12 Jan 2021 at 13:37, Florian Weimer via Gcc-help
> <gcc-help@gcc.gnu.org> wrote:
>>
>> * ☂Josh Chia (謝任中) via Gcc-help:
>>
>> > I have a code snippet, and I'm wondering why GCC didn't optimize it the
>> > way I think it should:
>> > https://godbolt.org/z/1qKvax
>> >
>> > bar2() is a variant of bar1() that has been manually tweaked to avoid
>> > branches. I haven't done any benchmarks, but I would expect the
>> > branchless bar2() to perform better than bar1(). However, GCC does not
>> > automatically optimize bar1() to be like bar2(): the generated code for
>> > the two functions differs, and the code for bar1() contains a branch.
>>
>> The optimization is probably valid for C99, but not for C11, where the
>> memory model prevents the compiler from introducing spurious writes:
>> Another thread may modify the variable concurrently, and if this happens
>> only if foo returns NULL, the original bar1 function does not contain a
>> data race, but the branchless version would.
>
> I'm not sure about the rules for C, but in C++ the compiler can assume
> there is no race, because the increment is not atomic.

The problem is that the store is conditional as written.  The compiler
cannot turn it into an unconditional store, because that would add a
write on the path where the source performs none.
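
To make the difference concrete, here is a minimal sketch of the two
shapes.  It is not the code from the godbolt link: counter, foo_result
and the body of foo() are hypothetical stand-ins, and I am assuming
bar1() increments a shared variable only when foo() returns non-NULL.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the code from the godbolt link. */
static int counter;       /* shared variable, assumed visible to other threads */
static int *foo_result;   /* value the stub foo() below will return */

static int *foo(void) { return foo_result; }

/* Conditional store: counter is written only on the non-NULL path. */
void bar1(void) {
    if (foo() != NULL)
        counter++;
}

/* Branchless: counter is read and written on every call,
   including the calls where foo() returns NULL. */
void bar2(void) {
    counter += (foo() != NULL);
}
```

In a single thread both functions leave counter with the same value.
The difference only shows if another thread writes counter while foo()
returns NULL: bar1() never touches counter on that path, but bar2()
stores to it, so that store would be one the source never asked for.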

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


