[PATCH][AArch64] Emit tighter strong atomic compare-exchange loop when comparing against zero

Kyrill Tkachov kyrylo.tkachov@foss.arm.com
Fri Jun 2 10:38:00 GMT 2017


Ping.

Thanks,
Kyrill

On 08/05/17 12:00, Kyrill Tkachov wrote:
> Ping.
>
> Thanks,
> Kyrill
>
> On 24/04/17 10:38, Kyrill Tkachov wrote:
>> Pinging this back into context so that I don't forget about it...
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00376.html
>>
>> Thanks,
>> Kyrill
>>
>> On 08/03/17 16:35, Kyrill Tkachov wrote:
>>> Hi all,
>>>
>>> For the testcase in this patch, where the expected value x is zero, we currently generate:
>>> foo:
>>>         mov     w1, 4
>>> .L2:
>>>         ldaxr   w2, [x0]
>>>         cmp     w2, 0
>>>         bne     .L3
>>>         stxr    w3, w1, [x0]
>>>         cbnz    w3, .L2
>>> .L3:
>>>         cset    w0, eq
>>>         ret
>>>
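For readers without the patch to hand, here is a minimal sketch of the kind of C source that leads to the sequence above. The testcase body is not reproduced in this mail, so treat this as an assumption: the variable name x, the desired value 4 (taken from the "mov w1, 4") and the acquire/relaxed memory orders (inferred from the ldaxr/stxr pair) are my reconstruction, not a copy of the test.

```c
/* Hypothetical reconstruction; the real testcase body is not shown
   in this mail.  */
int
foo (int *a)
{
  int x = 0;  /* Expected value: zero.  */

  /* Strong compare-exchange (weak argument is 0): store 4 into *a only
     if *a is currently 0.  Acquire ordering on success and relaxed on
     failure are assumptions chosen to match the ldaxr/stxr listing.  */
  return __atomic_compare_exchange_n (a, &x, 4, 0,
                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}
```

The return value is the success flag of the exchange, which is what the final cset materialises from the condition flags.
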
>>> We currently cannot merge the cmp and b.ne inside the loop into a cbnz because we need
>>> the condition flags set for the return value of the function (i.e. the cset at the end).
>>> But if we re-jig the sequence in that case we can generate a tighter loop:
>>> foo:
>>>         mov     w1, 4
>>> .L2:
>>>         ldaxr   w2, [x0]
>>>         cbnz    w2, .L3
>>>         stxr    w3, w1, [x0]
>>>         cbnz    w3, .L2
>>> .L3:
>>>         cmp     w2, 0
>>>         cset    w0, eq
>>>         ret
>>>
>>> So we add an explicit compare after the loop, and inside the loop we use the fact that
>>> we're comparing against zero to emit a CBNZ. This means we end up doing the comparison twice
>>> (once inside the CBNZ, once at the CMP at the end), but there is now less code inside the loop.
>>>
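The control flow of the tighter loop can be modelled in C as a rough sketch. This is an illustration only, not the code the splitter emits: the function name cas_zero_model is hypothetical, and a weak compare-exchange stands in for a single ldaxr/stxr attempt, which is an assumption made so the model compiles on any target.

```c
/* Illustrative model only; not what GCC generates.  */
int
cas_zero_model (int *p, int desired)
{
  int observed;

  for (;;)
    {
      observed = 0;
      /* One weak attempt stands in for one ldaxr/stxr pair.  On
         failure, observed receives the value that was loaded.  */
      if (__atomic_compare_exchange_n (p, &observed, desired, 1,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        break;          /* Store succeeded; observed is still 0.  */
      if (observed != 0)
        break;          /* Models "cbnz w2, .L3".  */
      /* Loaded 0 but the store failed: models "cbnz w3, .L2".  */
    }

  /* Models "cmp w2, 0" plus "cset w0, eq" placed after the loop.  */
  return observed == 0;
}
```

The point of the model is that the result depends only on the last loaded value, so the flag-setting compare can sit after the loop without changing the answer.
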
>>> I've seen this sequence appear in glibc locking code so maybe it's worth adding the extra bit
>>> of complexity to the compare-exchange splitter to catch this case.
>>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu. Earlier iterations of the patch, where
>>> I had gotten some logic wrong, miscompiled libgomp and caused timeouts in its testsuite,
>>> but this version passes everything cleanly.
>>>
>>> Ok for GCC 8? (I know it's early, but might as well get it out in case someone wants to try it out)
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2017-03-08  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>     * config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
>>>     Emit CBNZ inside loop when doing a strong exchange and comparing
>>>     against zero.  Generate the CC flags after the loop.
>>>
>>> 2017-03-08  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>     * gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: New test.
>>
>


