This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: Fw: [patch] Make std::tr1::shared_ptr thread-safe.
Alexander Terekhov wrote:
>Itanic stuff that was posted here contained nothing having anything to do
>with memory synchronization AFAICS. The rest of exchange_and_add()
>stuff was totally busted (WRT reference counting and msync) as well.
>
>
OK, Peter is right: I forgot the relevant include, *sorry*. (Still, I'm
really surprised that no warnings are emitted at compile time. Luckily I
don't seriously program with that stuff very often ;). Below is the
assembly for
#include <ia64intrin.h>
int
__attribute__ ((__unused__))
__exchange_and_add(volatile int* __mem, int __val)
{ return __sync_fetch_and_add(__mem, __val); }
(-O0):
0000000000000000 <__exchange_and_add>:
0: 03 10 00 18 00 21 [MII] mov r2=r12
6: c0 00 33 7e 46 c0 adds r12=-32,r12;;
c: 01 16 fc 8c adds r14=-32,r2;;
10: 0d 00 80 1c 98 11 [MFI] st8 [r14]=r32
16: 00 00 00 02 00 c0 nop.f 0x0
1c: 81 16 fc 8c adds r14=-24,r2;;
20: 01 00 84 1c 90 11 [MII] st4 [r14]=r33
26: 00 41 0b 7e 46 e0 adds r16=-24,r2
2c: 01 17 fc 8c adds r15=-16,r2;;
30: 0d 00 40 1e 98 11 [MFI] st8 [r15]=r16
36: 00 00 00 02 00 c0 nop.f 0x0
3c: 01 16 fc 8c adds r14=-32,r2;;
40: 0d 78 00 1c 18 10 [MFI] ld8 r15=[r14]
46: 00 00 00 02 00 c0 nop.f 0x0
4c: 81 17 fc 8c adds r14=-8,r2;;
50: 09 00 3c 1c 98 11 [MMI] st8 [r14]=r15
56: 00 00 00 44 00 00 mf
5c: 82 17 fc 8c adds r16=-8,r2;;
60: 0b 80 00 20 18 10 [MMI] ld8 r16=[r16];;
66: 00 01 40 60 21 00 ld4.acq r16=[r16]
6c: 00 00 04 00 nop.i 0x0;;
70: 1c 00 40 04 90 11 [MFB] st4 [r2]=r16
76: 00 00 00 02 00 00 nop.f 0x0
7c: 00 00 00 20 nop.b 0x0
80: 0d 78 00 04 10 10 [MFI] ld4 r15=[r2]
86: 00 00 00 02 00 c0 nop.f 0x0
8c: c1 16 fc 8c adds r14=-20,r2;;
90: 19 00 3c 1c 90 11 [MMB] st4 [r14]=r15
96: e0 00 08 20 20 00 ld4 r14=[r2]
9c: 00 00 00 20 nop.b 0x0;;
a0: 0d 00 38 40 2a 04 [MFI] mov.m ar.ccv=r14
a6: 00 00 00 02 00 00 nop.f 0x0
ac: 02 17 fc 8c adds r16=-16,r2;;
b0: 0a 80 00 20 18 10 [MMI] ld8 r16=[r16];;
b6: e0 00 40 20 20 00 ld4 r14=[r16]
bc: 00 00 04 00 nop.i 0x0
c0: 0b 78 00 04 10 10 [MMI] ld4 r15=[r2];;
c6: f0 78 38 00 40 00 add r15=r15,r14
cc: 00 00 04 00 nop.i 0x0;;
d0: 0d 00 3c 04 90 11 [MFI] st4 [r2]=r15
d6: 00 00 00 02 00 00 nop.f 0x0
dc: 82 17 fc 8c adds r16=-8,r2;;
e0: 19 70 00 20 18 10 [MMB] ld8 r14=[r16]
e6: f0 00 08 20 20 00 ld4 r15=[r2]
ec: 00 00 00 20 nop.b 0x0;;
f0: 0a 78 3c 1c 11 10 [MMI] cmpxchg4.acq r15=[r14],r15,ar.ccv;;
f6: 00 78 08 20 23 00 st4 [r2]=r15
fc: 00 00 04 00 nop.i 0x0
100: 0d 70 00 04 10 10 [MFI] ld4 r14=[r2]
106: 00 00 00 02 00 00 nop.f 0x0
10c: c2 16 fc 8c adds r16=-20,r2;;
110: 0a 80 00 20 10 10 [MMI] ld4 r16=[r16];;
116: 70 80 38 0c 71 00 cmp4.eq p7,p6=r16,r14
11c: 00 00 04 00 nop.i 0x0
120: 1c 00 00 00 01 00 [MFB] nop.m 0x0
126: 00 00 00 02 00 03 nop.f 0x0
12c: 60 ff ff 4a (p06) br.cond.dptk.few 80 <__exchange_and_add+0x80>
130: 0b 78 b0 05 3f 23 [MMI] adds r15=-20,r2;;
136: e0 00 3c 20 20 00 ld4 r14=[r15]
13c: 00 00 04 00 nop.i 0x0;;
140: 11 40 00 1c 00 21 [MIB] mov r8=r14
146: c0 00 08 00 42 80 mov r12=r2
14c: 08 00 84 00 br.ret.sptk.many b0;;
(-O2):
0000000000000000 <__exchange_and_add>:
0: 19 00 00 00 22 00 [MMB] mf
6: 80 00 80 60 21 00 ld4.acq r8=[r32]
c: 00 00 00 20 nop.b 0x0;;
10: 09 70 20 00 08 20 [MMI] addp4 r14=r8,r0
16: f0 00 20 00 42 00 mov r15=r8
1c: 81 08 01 80 add r8=r8,r33;;
20: 0b 00 38 40 2a 04 [MMI] mov.m ar.ccv=r14;;
26: 80 40 80 22 20 00 cmpxchg4.acq r8=[r32],r8,ar.ccv
2c: 00 00 04 00 nop.i 0x0;;
30: 10 00 00 00 01 00 [MIB] nop.m 0x0
36: 70 78 20 0c 71 03 cmp4.eq p7,p6=r15,r8
3c: e0 ff ff 4a (p06) br.cond.dptk.few 10 <__exchange_and_add+0x10>
40: 17 00 00 00 00 08 [BBB] nop.b 0x0
46: 00 00 00 00 10 80 nop.b 0x0
4c: 08 00 84 00 br.ret.sptk.many b0;;
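For anyone who does not read ia64 assembly: the -O2 listing boils down to a
compare-and-swap retry loop. Below is a rough C++ sketch of the same shape.
The name exchange_and_add_cas is made up for illustration, and the sketch
assumes a compiler that provides the GCC __sync_val_compare_and_swap builtin.
It is not the libstdc++ source, and it says nothing by itself about whether
the fencing in the listing is sufficient.

```cpp
#include <cassert>

// Hypothetical helper, for illustration only: the same retry-loop shape as
// the -O2 listing, written with the GCC compare-and-swap builtin.
int exchange_and_add_cas(volatile int* mem, int val)
{
    assert(mem != 0);

    // Initial read of the counter (the mf / ld4.acq pair in the listing).
    int expected = *mem;

    for (;;) {
        // Try to replace 'expected' with 'expected + val' (cmpxchg4.acq).
        // The builtin returns the value that was actually in memory.
        int seen = __sync_val_compare_and_swap(mem, expected, expected + val);

        // cmp4.eq: if memory still held 'expected', the swap happened and
        // the old value is the result.
        if (seen == expected)
            return expected;

        // Otherwise another thread got in first. Retry with the value just
        // observed, as the branch back to +0x10 does with r8.
        expected = seen;
    }
}
```

Like __exchange_and_add, this returns the value the counter held before the
addition.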
>So reg or not reg, your testing is a bit premature I'm afraid.
>
>
But, as Peter also pointed out, for now we do *not* want to rely on a
correct __exchange_and_add: that is a project for 4.0.1, not for 4.0.0.
I thought that Jonathan was already using good old locking in all the
important places, plus minimal safe use of exchange_and_add?
Thus, my question: lacking a correct, fully fenced exchange_and_add, is
a correct (albeit not optimally performing) implementation possible?
I think this very basic question has not received a
clear-cut answer yet. It would not be a catastrophe if, for 4.0.0, shared_ptr
worked in MT only on a subset of the targets (only x86, x86_64, ia64, for
instance), but we *must* know as soon as possible.
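To make the "plain locking" alternative concrete, here is a minimal sketch of
a lock-protected counter update. Everything in it is an assumption for
illustration: the names refcount_mutex and locked_exchange_and_add are
invented, std::mutex stands in for whatever lock the library would really
use, and this is not what Jonathan's patch does. It only shows the idea that
the update can be made correct without an atomic primitive, provided every
access to the counter goes through the same lock.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical global lock guarding all reference-count updates.
// A single lock is simple but serializes every update, which is the
// "not optimally performing" part.
std::mutex refcount_mutex;

// Hypothetical fallback: same contract as exchange_and_add (returns the
// value held before the addition), but relies on the mutex for both
// atomicity and memory ordering instead of a fenced atomic instruction.
int locked_exchange_and_add(int* mem, int val)
{
    assert(mem != 0);
    std::lock_guard<std::mutex> guard(refcount_mutex);
    int old_value = *mem;
    *mem = old_value + val;
    return old_value;
}
```

A release path would then destroy the managed object when
locked_exchange_and_add(&count, -1) returns 1.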
Paolo.
P.S. Can we possibly avoid the not-so-funny-(anymore) "itanic" jargon?