[PATCH][AArch64] Emit tighter strong atomic compare-exchange loop when comparing against zero

Kyrill Tkachov kyrylo.tkachov@foss.arm.com
Mon Apr 24 09:54:00 GMT 2017


Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00376.html

Thanks,
Kyrill

On 08/03/17 16:35, Kyrill Tkachov wrote:
> Hi all,
>
> For the testcase in this patch, where the value of x is zero, we currently generate:
> foo:
>         mov     w1, 4
> .L2:
>         ldaxr   w2, [x0]
>         cmp     w2, 0
>         bne     .L3
>         stxr    w3, w1, [x0]
>         cbnz    w3, .L2
> .L3:
>         cset    w0, eq
>         ret
>
> We currently cannot merge the cmp and bne inside the loop into a cbnz because we need
> the condition flags set for the return value of the function (i.e. the cset at the end).
> But if we re-jig the sequence in that case we can generate a tighter loop:
> foo:
>         mov     w1, 4
> .L2:
>         ldaxr   w2, [x0]
>         cbnz    w2, .L3
>         stxr    w3, w1, [x0]
>         cbnz    w3, .L2
> .L3:
>         cmp     w2, 0
>         cset    w0, eq
>         ret
>
> So we add an explicit compare after the loop, and inside the loop we use the fact that
> we're comparing against zero to emit a CBNZ. This means we may perform the comparison twice
> (once inside the CBNZ, once at the CMP at the end), but there is now less code inside the loop.
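
For readers without the patch to hand, here is a rough sketch of the kind of C source that could produce the listings above. It is inferred from the assembly, not copied from the patch: the name foo and the stored value 4 come from the listings, while the use of __atomic_compare_exchange_n, the acquire/relaxed memory orders (suggested by ldaxr paired with a plain stxr) and the helper demo are assumptions made for illustration.

```c
#include <assert.h>

/* Strong compare-exchange against zero: if *a == 0, store 4 and
   return nonzero; otherwise leave *a alone and return 0.
   Assumed shape of the testcase, reconstructed from the assembly.  */
int
foo (int *a)
{
  int x = 0;                      /* expected value: zero */
  return __atomic_compare_exchange_n (a, &x, 4,
                                      0,  /* weak = 0, i.e. strong */
                                      __ATOMIC_ACQUIRE,
                                      __ATOMIC_RELAXED);
}

/* Hypothetical usage check, not part of the patch.  */
void
demo (void)
{
  int v = 0;
  assert (foo (&v) == 1);         /* v was zero, so the exchange succeeds */
  assert (v == 4);

  v = 7;
  assert (foo (&v) == 0);         /* v is nonzero, so the exchange fails */
  assert (v == 7);
}
```

Because the return value is the success flag, the compiler has to materialise "loaded value == 0" after the loop, which is why the cmp cannot simply be dropped and is moved to .L3 instead.
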
>
> I've seen this sequence appear in glibc locking code so maybe it's worth adding the extra bit
> of complexity to the compare-exchange splitter to catch this case.
>
> Bootstrapped and tested on aarch64-none-linux-gnu. Previous iterations of the patch, where
> I had gotten some logic wrong, miscompiled libgomp and caused timeouts in its testsuite,
> but this version passes everything cleanly.
>
> Ok for GCC 8? (I know it's early, but might as well get it out in case someone wants to try it out)
>
> Thanks,
> Kyrill
>
> 2017-03-08  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
>     Emit CBNZ inside loop when doing a strong exchange and comparing
>     against zero.  Generate the CC flags after the loop.
>
> 2017-03-08  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: New test.


