This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Current __exchange_and_add on ia64 (was: Memory barriers..)


Hi all, hi Peter, hi Alexander,

>> I filed other/24757, about this.
>
> Have you tried calling _S_initialize before spawning the threads? If
> the problem disappears, then _S_initialize isn't thread safe. If it
> persists... I'm out of ideas :-)

I'm trying to figure out whether there is really something wrong in the
assembly that mainline gcc is producing for a simple
__sync_fetch_and_add of an int, on ia64. I'm compiling this:

#include <ia64intrin.h>

int
__attribute__ ((__unused__))
__exchange_and_add(volatile int* __mem, int __val)
{ return __sync_fetch_and_add(__mem, __val); }

In fact, I'm seeing something different in mainline vs 4_0-branch (I
think we agreed, some months ago, that the assembly produced by 4_0 was
fine). At -O2:

4_0-branch
---
0000000000000000 <__exchange_and_add>:
   0:   19 00 00 00 22 00       [MMB]       mf
   6:   80 00 80 60 21 00                   ld4.acq r8=[r32]
   c:   00 00 00 20                         nop.b 0x0;;
  10:   09 70 20 00 08 20       [MMI]       addp4 r14=r8,r0
  16:   f0 00 20 00 42 00                   mov r15=r8
  1c:   81 08 01 80                         add r8=r8,r33;;
  20:   0b 00 38 40 2a 04       [MMI]       mov.m ar.ccv=r14;;
  26:   80 40 80 22 20 00                   cmpxchg4.acq r8=[r32],r8,ar.ccv
  2c:   00 00 04 00                         nop.i 0x0;;
  30:   10 00 00 00 01 00       [MIB]       nop.m 0x0
  36:   70 78 20 0c 71 03                   cmp4.eq p7,p6=r15,r8
  3c:   e0 ff ff 4a                   (p06) br.cond.dptk.few 10 <__exchange_and_add+0x10>
  40:   17 00 00 00 00 08       [BBB]       nop.b 0x0
  46:   00 00 00 00 10 80                   nop.b 0x0
  4c:   08 00 84 00                         br.ret.sptk.many b0;;

mainline
---
0000000000000000 <__exchange_and_add>:
   0:   09 78 00 40 b0 10       [MMI]       ld4.acq r15=[r32]
   6:   00 00 00 02 00 00                   nop.m 0x0
   c:   00 00 04 00                         nop.i 0x0;;
  10:   09 00 3c 40 2a 04       [MMI]       mov.m ar.ccv=r15
  16:   e0 00 3c 00 42 e0                   mov r14=r15
  1c:   f1 08 01 80                         add r15=r15,r33;;
  20:   09 40 00 40 22 04       [MMI]       mov.m r8=ar.ccv
  26:   f0 78 80 62 20 00                   cmpxchg4.rel r15=[r32],r15,ar.ccv
  2c:   00 00 04 00                         nop.i 0x0;;
  30:   13 30 38 1e 07 b8       [MBB]       cmp.eq p6,p7=r14,r15
  36:   01 f0 ff ff 25 80             (p06) br.cond.dpnt.few 10 <__exchange_and_add+0x10>
  3c:   08 00 84 00                         br.ret.sptk.many b0;;

As you can see, mainline doesn't emit any 'mf'. Another difference is
that mainline uses 'cmpxchg4.rel' instead of 'cmpxchg4.acq'. Now, if I
correctly remember an old message from Alexander, an 'mf' must be
present in either case: before 'cmpxchg4.acq', or after 'cmpxchg4.rel'...
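
For reference, here is a minimal sketch of the load / add / compare-and-swap
retry loop that both listings implement, written with GCC's __sync builtins.
The name exchange_and_add_cas is hypothetical and exists only for
illustration. The explicit __sync_synchronize() marks where the 4_0-branch
code has its 'mf'; that it expands to 'mf' on ia64 is an assumption here, and
since the GCC documentation already describes the __sync builtins as full
barriers in most cases, the call is redundant at the C level. It is not a
proposed fix.

```c
/* Hypothetical helper: fetch-and-add written as an explicit CAS loop,
   mirroring the structure of the disassembly (load once, then retry
   with the value returned by the failed compare-and-swap). */
static int
exchange_and_add_cas(volatile int *mem, int val)
{
  int expected, seen;

  /* Full memory barrier; assumed to correspond to 'mf' on ia64. */
  __sync_synchronize();

  /* Initial load, done once before the loop (ld4.acq in the listings). */
  seen = *mem;

  do
    {
      expected = seen;
      /* Stores expected + val only if *mem still equals expected;
         always returns the value that was in *mem before the attempt. */
      seen = __sync_val_compare_and_swap(mem, expected, expected + val);
    }
  while (seen != expected);

  /* The value *mem held just before the successful update. */
  return expected;
}
```

Like __exchange_and_add above, it returns the old value: with *mem == 5,
a call with __val == 3 returns 5 and leaves 8 in *mem.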

Help much appreciated...

Thanks in advance,
Paolo.

