This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.
Current __exchange_and_add on ia64 (was: Memory barriers..)
Hi all, hi Peter, hi Alexander,
>> I filed other/24757, about this.
>
> Have you tried calling _S_initialize before spawning the threads? If
> the problem disappears, then _S_initialize isn't thread safe. If it
> persists... I'm out of ideas :-)
I'm trying to figure out whether there is really something wrong in the
assembly that mainline gcc is producing for a simple
__sync_fetch_and_add of an int, on ia64. I'm compiling this:
#include <ia64intrin.h>
int
__attribute__ ((__unused__))
__exchange_and_add(volatile int* __mem, int __val)
{ return __sync_fetch_and_add(__mem, __val); }
In fact, I'm seeing something different in mainline vs 4_0-branch (I
think we agreed, some months ago, that the assembly produced by 4_0 was
fine). At -O2:
4_0-branch
---
0000000000000000 <__exchange_and_add>:
0: 19 00 00 00 22 00 [MMB] mf
6: 80 00 80 60 21 00 ld4.acq r8=[r32]
c: 00 00 00 20 nop.b 0x0;;
10: 09 70 20 00 08 20 [MMI] addp4 r14=r8,r0
16: f0 00 20 00 42 00 mov r15=r8
1c: 81 08 01 80 add r8=r8,r33;;
20: 0b 00 38 40 2a 04 [MMI] mov.m ar.ccv=r14;;
26: 80 40 80 22 20 00 cmpxchg4.acq r8=[r32],r8,ar.ccv
2c: 00 00 04 00 nop.i 0x0;;
30: 10 00 00 00 01 00 [MIB] nop.m 0x0
36: 70 78 20 0c 71 03 cmp4.eq p7,p6=r15,r8
3c: e0 ff ff 4a (p06) br.cond.dptk.few 10 <__exchange_and_add+0x10>
40: 17 00 00 00 00 08 [BBB] nop.b 0x0
46: 00 00 00 00 10 80 nop.b 0x0
4c: 08 00 84 00 br.ret.sptk.many b0;;
mainline
---
0000000000000000 <__exchange_and_add>:
0: 09 78 00 40 b0 10 [MMI] ld4.acq r15=[r32]
6: 00 00 00 02 00 00 nop.m 0x0
c: 00 00 04 00 nop.i 0x0;;
10: 09 00 3c 40 2a 04 [MMI] mov.m ar.ccv=r15
16: e0 00 3c 00 42 e0 mov r14=r15
1c: f1 08 01 80 add r15=r15,r33;;
20: 09 40 00 40 22 04 [MMI] mov.m r8=ar.ccv
26: f0 78 80 62 20 00 cmpxchg4.rel r15=[r32],r15,ar.ccv
2c: 00 00 04 00 nop.i 0x0;;
30: 13 30 38 1e 07 b8 [MBB] cmp.eq p6,p7=r14,r15
36: 01 f0 ff ff 25 80 (p06) br.cond.dpnt.few 10 <__exchange_and_add+0x10>
3c: 08 00 84 00 br.ret.sptk.many b0;;
As you can see, mainline doesn't emit any 'mf'. Another difference is that
mainline uses 'cmpxchg4.rel' instead of 'cmpxchg4.acq'. Now, if I
correctly remember an old message from Alexander, an 'mf' must be present
in either case: before a 'cmpxchg4.acq' or after a 'cmpxchg4.rel'...
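
To make the question concrete, here is a rough C sketch of the
load / add / compare-and-swap retry loop that both listings implement,
with the barrier written out explicitly. The function name
exchange_and_add_sketch is made up for illustration, and placing the
barrier before the loop simply mirrors the 4_0-branch output; I'm not
claiming this is what the builtin expands to, nor that it is the right
fix. It only uses the generic __sync_synchronize and
__sync_val_compare_and_swap builtins, so it should build on any target
that supports them.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only: returns the old value of
   *mem and atomically stores old + val.  */
int
exchange_and_add_sketch(volatile int *mem, int val)
{
  int old, seen;

  assert(mem != 0);

  /* Full memory barrier before the loop.  Assumption: on ia64 this is
     where the leading 'mf' of the 4_0-branch listing would sit.  */
  __sync_synchronize();

  do
    {
      /* Snapshot the current value (the ld4 in the listings).  */
      old = *mem;

      /* Store old + val only if *mem still equals old; the builtin
         returns the value it actually found in memory.  */
      seen = __sync_val_compare_and_swap(mem, old, old + val);
    }
  while (seen != old);  /* somebody else got in between: retry */

  return old;
}
```

Called as exchange_and_add_sketch(&counter, 1), it hands back the value
the counter had before the increment, like __exchange_and_add above.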
Help much appreciated...
Thanks in advance,
Paolo.