Current __exchange_and_add on ia64 (was: Memory barriers..)

Paolo Carlini pcarlini@suse.de
Fri Nov 18 01:17:00 GMT 2005


Hi all, hi Peter, hi Alexander,

>> I filed other/24757, about this.
>
> Have you tried calling _S_initialize before spawning the threads? If
> the problem disappears, then _S_initialize isn't thread safe. If it
> persists... I'm out of ideas :-)

I'm trying to figure out whether there is really something wrong in the
assembly that mainline gcc is producing for a simple
__sync_fetch_and_add of an int, on ia64. I'm compiling this:

#include <ia64intrin.h>

int
__attribute__ ((__unused__))
__exchange_and_add(volatile int* __mem, int __val)
{ return __sync_fetch_and_add(__mem, __val); }

In fact, I'm seeing something different in mainline vs 4_0-branch (I
think we agreed, some months ago, that the assembly produced by 4_0 was
fine). At -O2:

4_0-branch
---
0000000000000000 <__exchange_and_add>:
   0:   19 00 00 00 22 00       [MMB]       mf
   6:   80 00 80 60 21 00                   ld4.acq r8=[r32]
   c:   00 00 00 20                         nop.b 0x0;;
  10:   09 70 20 00 08 20       [MMI]       addp4 r14=r8,r0
  16:   f0 00 20 00 42 00                   mov r15=r8
  1c:   81 08 01 80                         add r8=r8,r33;;
  20:   0b 00 38 40 2a 04       [MMI]       mov.m ar.ccv=r14;;
  26:   80 40 80 22 20 00                   cmpxchg4.acq r8=[r32],r8,ar.ccv
  2c:   00 00 04 00                         nop.i 0x0;;
  30:   10 00 00 00 01 00       [MIB]       nop.m 0x0
  36:   70 78 20 0c 71 03                   cmp4.eq p7,p6=r15,r8
  3c:   e0 ff ff 4a                   (p06) br.cond.dptk.few 10 <__exchange_and_add+0x10>
  40:   17 00 00 00 00 08       [BBB]       nop.b 0x0
  46:   00 00 00 00 10 80                   nop.b 0x0
  4c:   08 00 84 00                         br.ret.sptk.many b0;;

mainline
---
0000000000000000 <__exchange_and_add>:
   0:   09 78 00 40 b0 10       [MMI]       ld4.acq r15=[r32]
   6:   00 00 00 02 00 00                   nop.m 0x0
   c:   00 00 04 00                         nop.i 0x0;;
  10:   09 00 3c 40 2a 04       [MMI]       mov.m ar.ccv=r15
  16:   e0 00 3c 00 42 e0                   mov r14=r15
  1c:   f1 08 01 80                         add r15=r15,r33;;
  20:   09 40 00 40 22 04       [MMI]       mov.m r8=ar.ccv
  26:   f0 78 80 62 20 00                   cmpxchg4.rel r15=[r32],r15,ar.ccv
  2c:   00 00 04 00                         nop.i 0x0;;
  30:   13 30 38 1e 07 b8       [MBB]       cmp.eq p6,p7=r14,r15
  36:   01 f0 ff ff 25 80             (p06) br.cond.dpnt.few 10 <__exchange_and_add+0x10>
  3c:   08 00 84 00                         br.ret.sptk.many b0;;

As you can see, mainline doesn't emit any 'mf'. Another difference is
that mainline uses 'cmpxchg4.rel' instead of 'cmpxchg4.acq'. Now, if I
remember correctly an old message from Alexander, 'mf' can be emitted
either before 'cmpxchg4.acq' or after 'cmpxchg4.rel', but in either
case it must be present...
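
For experimenting with the observation above, here is a minimal sketch of
a variant that requests a full barrier explicitly on both sides of the
atomic add. It assumes GCC's __sync_synchronize builtin is available and
expands to a full fence ('mf' on ia64); that is an assumption to verify
in the disassembly, not something established in this thread. The name
exchange_and_add_fenced is hypothetical, and fencing both sides is
deliberately more conservative than the "either before or after" rule
recalled above.

```c
/* Hypothetical diagnostic variant of __exchange_and_add: not the
   libstdc++ implementation, just a sketch for comparing assembly. */
int
exchange_and_add_fenced(volatile int* mem, int val)
{
  int old;

  /* Request a full memory barrier before the atomic operation
     (assumed to become 'mf' on ia64). */
  __sync_synchronize();

  /* Atomically add val to *mem and fetch the previous value. */
  old = __sync_fetch_and_add(mem, val);

  /* Request a full memory barrier after the atomic operation. */
  __sync_synchronize();

  return old;
}
```

If the original failure disappears with this variant, that would point
at the missing 'mf' rather than at _S_initialize.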

Help much appreciated...

Thanks in advance,
Paolo.


