__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 not defined on aarch64
Toebs Douglass
toby@winterflaw.net
Thu Jun 29 08:57:00 GMT 2017
On 29/06/17 09:47, Andrew Haley wrote:
> Exactly. The fact that people can mess up is no excuse for GCC not
> providing an intrinsic for double-word CAS.
So, I was going to write originally that DWCAS *is* provided, and that
what is missing is *only* the compiler predefine for _SYNC_16.
However, I've just been playing around with a bit of test code and
looking at the assembly output, and I am now utterly confused.
This is a good thing, because before I was totally confused *but didn't
know it* :-)
It looks like libatomic, when asked for a DWCAS, is emitting a non-atomic DWCAS.
(In which case, if in fact libatomic does not support DWCAS, then the
lack of _SYNC_16 looks correct!)
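For anyone wanting to check the same thing on their own target, here is a minimal sketch of how the predefine and the compiler's lock-free claim could be queried. The function names are made up for illustration; only the macro and the __atomic_always_lock_free builtin come from GCC. The result will of course vary by target and compiler flags.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: 1 if the compiler predefines the 16-byte
   __sync CAS macro for this target, 0 otherwise. */
int have_sync_cas_16(void)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the compiler claims 16-byte atomic
   objects are always lock-free on this target.  The builtin is
   resolved at compile time, so no libatomic call is involved. */
int always_lock_free_16(void)
{
    return __atomic_always_lock_free(16, 0) ? 1 : 0;
}

/* Hypothetical helper: print both answers to the given stream. */
void print_16_byte_cas_support(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16: %s\n",
            have_sync_cas_16() ? "defined" : "not defined");
    fprintf(out, "__atomic_always_lock_free(16, 0): %d\n",
            always_lock_free_16());
}
```

Calling print_16_byte_cas_support(stdout) from main reports what the compiler believes about 16-byte CAS, independent of what libatomic does at run time.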
I'm testing on an aarch64, and this is the test code (where I vary the
type of the variables to perform different lengths of CAS);
#include <stdio.h>
#include <stdlib.h>

int main( void );

int main()
{
    __int128 __attribute__ ( (aligned(16)) )
        target = 1,
        compare = 1,
        exchange = 2;

    __atomic_compare_exchange_n( &target, &compare, exchange, 0,
                                 __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST );

    printf( "%d\n", (int) target );

    return( EXIT_SUCCESS );
}
This is how I dump assembly;
objdump -d -M -S a.out
Now, I don't know ARM assembly, so it may well be that I'm just not
understanding what I'm looking at; please keep that in mind and be
forgiving!
So, starting with the OS provided GCC 4.9.2;
1. calling GCC without -latomic fails with
test.c:(.text+0x50): undefined reference to `__atomic_compare_exchange_16'
2. calling GCC with -latomic compiles and gives this in the disassembly;
0000000000400500 <__atomic_compare_exchange_16@plt-0x20>:
400500: a9bf7bf0 stp x16, x30, [sp,#-16]!
400504: 90000090 adrp x16, 410000 <__FRAME_END__+0xf840>
400508: f944f211 ldr x17, [x16,#2528]
40050c: 91278210 add x16, x16, #0x9e0
400510: d61f0220 br x17
400514: d503201f nop
400518: d503201f nop
40051c: d503201f nop
and the following in main;
400700: 97ffff88 bl 400520 <__atomic_compare_exchange_16@plt>
(With I believe the usual initial PLT fix-up occurring, which is why the
main listing above is at -0x20.)
As far as I can tell, nowhere in the entire disassembly (and certainly
not in the function call above) is there any use of the "X" type
load/store instructions (the exclusive loads and stores), which are the
type used for load-linked/store-conditional.
In other words, it seems to be a non-atomic DWCAS and indeed *nothing*
atomic is going on, anywhere (but there is here a very high chance I've
just *missed* it, since I don't know ARM).
3. changing the code to use int long long unsigned (64-bit), staying
with 4.9.2, I can link and compile without -latomic. This gives the
following disassembly which is in main proper and looks right;
400650: c85ffc41 ldaxr x1, [x2]
400654: eb03003f cmp x1, x3
400658: 54000061 b.ne 400664 <main+0x44>
40065c: c805fc44 stlxr w5, x4, [x2]
400660: 35ffff85 cbnz w5, 400650 <main+0x30>
4. I have the same outcome with my hand-built GCC 7.1.0.
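For reference, the 64-bit variant from step 3 can be wrapped up as a small function. This is only a sketch: cas_u64 is a hypothetical name, and the builtin call is the same one used in the test code above, just with a 64-bit type. With the compilers discussed here it needs no -latomic, and I would expect it to compile to an inline exclusive load/store loop like the one in the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: strong 64-bit compare-and-swap with
   sequentially consistent ordering.  Returns 1 if *target held
   `expected` and was replaced by `desired`, 0 otherwise.  On failure
   the builtin writes the observed value into the local copy of
   `expected`, which is simply discarded here. */
int cas_u64(unsigned long long *target,
            unsigned long long expected,
            unsigned long long desired)
{
    assert(target != NULL);
    return __atomic_compare_exchange_n(target, &expected, desired,
                                       0, /* strong, not weak */
                                       __ATOMIC_SEQ_CST,
                                       __ATOMIC_SEQ_CST) ? 1 : 0;
}
```

With a target holding 1, cas_u64(&target, 1, 2) returns 1 and leaves 2 in target; a second call with the same arguments returns 0 and leaves target unchanged.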