Atomic operations on the ARM

Daniel Jacobowitz drow@mvista.com
Fri Oct 4 12:08:00 GMT 2002


On Fri, Oct 04, 2002 at 11:09:05AM +0100, Richard Earnshaw wrote:
> > On Thu, Oct 03, 2002 at 11:31:15AM -0500, Benjamin Kosnik wrote:
> > > 
> > > > Sure, but if they don't work then there's not much point in them being 
> > > > 'light-weight'!
> > > 
> > > If they don't work, they shouldn't exist, and the generic routines
> > > (which do nothing, and are not atomic) should be used, period.
> > > 
> > > Perhaps the arm configuration should just use the generics, hmm?
> > 
> > This doesn't make sense.  We have generic threading locks; yes, it's
> > heavyweight; but it works.  Isn't correctness important?
> 
> I would have thought that correctness came before efficiency...
> 
> > 
> > Richard, does the problem with using swp in this context also affect
> > ARM/Linux?  Glibc appears to use just about the same code for
> > atomic operations.
> 
> I haven't seen the glibc code, but if it was trying to do an atomic add 
> using the SWP based sequence that was in atomicity.h, then yes, it's 
> broken.

Well, here's exchange_and_add:
  __asm__ ("\n"
           "0:\tldr\t%0,[%3]\n\t"
           "add\t%1,%0,%4\n\t"
           "swp\t%2,%1,[%3]\n\t"
           "cmp\t%0,%2\n\t"
           "swpne\t%1,%2,[%3]\n\t"
           "bne\t0b"

The others are pretty similar.  So it looks like the same thing to me. 
Ugh!

> The problem was a race condition if the store back failed to find the 
> original value stored in memory after the update:
> 
> 1 0:
> 2	ldr     %0, [%3]
> 3	add     %1, %0, %4
> 4	swp     %2, %1, [%3]  <-- if this fails
> 5	cmp     %0, %2        <-- in this test
> 6	swpne   %1, %2, [%3]  <-- try to restore other change
> 7	bne     0b
> 
> that is, the instruction at line 6 tries to put back the value that was in 
> memory at line 4 (before we tried to store our own update).  But what if 
> another thread also tries to update at that point? the value it inserts in 
> between the two swap operations will be lost.  The attachment in PR 3584 
> illustrates the problem excellently.
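
To convince myself, here is a rough single-threaded replay of that
interleaving in plain C.  It is only a sketch: swp() is a stand-in that
models the instruction, and the helper names, the starting value of 10
and the particular interleaving are my own assumptions, not the test
case from the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-C model of SWP: store new_val at *addr, return the old contents. */
static int32_t swp(int32_t new_val, int32_t *addr)
{
    int32_t old = *addr;
    *addr = new_val;
    return old;
}

/* The whole sequence run without interruption (some "other" thread). */
static void add_uninterrupted(int32_t *mem, int32_t delta)
{
    for (;;) {
        int32_t r0 = *mem;            /* line 2: ldr   */
        int32_t r1 = r0 + delta;      /* line 3: add   */
        int32_t r2 = swp(r1, mem);    /* line 4: swp   */
        if (r0 == r2)                 /* line 5: cmp   */
            return;
        (void)swp(r2, mem);           /* line 6: swpne */
    }                                 /* line 7: bne   */
}

/* Replays one bad schedule by hand: three increments of a counter that
   starts at 10, with thread A interrupted twice. */
static int32_t replay_lost_update(void)
{
    int32_t mem = 10;

    int32_t r0 = mem;                 /* A, line 2: reads 10 */
    int32_t r1 = r0 + 1;              /* A, line 3: wants to store 11 */

    add_uninterrupted(&mem, 1);       /* B runs to completion: mem = 11 */

    int32_t r2 = swp(r1, &mem);       /* A, line 4: mem = 11, r2 = 11 */
    assert(r0 != r2);                 /* A, line 5: the compare fails */

    add_uninterrupted(&mem, 1);       /* C runs to completion: mem = 12 */

    (void)swp(r2, &mem);              /* A, line 6: mem = 11, C's update is gone */

    add_uninterrupted(&mem, 1);       /* A, line 7: retry, undisturbed: mem = 12 */
    return mem;
}
```

Three increments were issued, but the counter ends up only two higher
than where it started, because the restore at line 6 overwrites
whatever was stored between the two swaps.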

I'm not very familiar with ARM, so I'm just going by a quick reference
guide here... but it appears that SWP is the only one of the normal
atomic primitives available.  So we have test-and-set/xchg, but that's
about it.

You can do a 24-bit atomic counter (which is all at least one other
platform offers) using swpb: one byte of the _Atomic_word serves as a
spinlock and the other three hold the value.  That does still have some
issues, but it should be possible without an ABI change, since the
size/alignment of the _Atomic_word don't actually change.  Does that
sound worthwhile?
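
Roughly what I have in mind, as a portable C sketch rather than ARM
assembly.  Everything here is an assumption for illustration: the
atomic24_t layout, the function names, and the use of C11
atomic_exchange on a byte as a stand-in for swpb.  It also assumes the
atomic byte type occupies exactly one byte, which the _Static_assert
checks.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: one lock byte plus three value bytes. */
typedef struct {
    atomic_uchar lock;      /* 0 = free, 1 = held; swapped like swpb would */
    unsigned char val[3];   /* 24-bit two's-complement value, low byte first */
} atomic24_t;

_Static_assert(sizeof(atomic24_t) == 4, "must stay the size of one word");

/* Assemble the 24-bit value from its three bytes. */
static uint32_t load24(const atomic24_t *w)
{
    return (uint32_t)w->val[0]
         | ((uint32_t)w->val[1] << 8)
         | ((uint32_t)w->val[2] << 16);
}

/* Split a 24-bit value back into its three bytes. */
static void store24(atomic24_t *w, uint32_t v)
{
    w->val[0] = (unsigned char)(v & 0xFFu);
    w->val[1] = (unsigned char)((v >> 8) & 0xFFu);
    w->val[2] = (unsigned char)((v >> 16) & 0xFFu);
}

/* Add delta under the byte lock and return the previous value,
   sign-extended from 24 bits. */
static int32_t atomic24_exchange_and_add(atomic24_t *w, int32_t delta)
{
    /* Spin until the byte we swapped out was 0, i.e. we took the lock. */
    while (atomic_exchange_explicit(&w->lock, 1, memory_order_acquire) != 0)
        ;

    uint32_t old = load24(w);
    store24(w, (old + (uint32_t)delta) & 0xFFFFFFu);

    atomic_store_explicit(&w->lock, 0, memory_order_release);

    return (int32_t)(old ^ 0x800000u) - 0x800000;
}
```

Among the issues: waiters spin, and anything that reads the word
directly would see the lock byte too, so reads have to go through an
accessor as well.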

> We could get clever and use a single bit in _Atomic_word as a mutex 
> bit, making it effectively a bit-field:
> 
> typedef struct
> {
>   int mutex:1;
>   signed int val:31;
> } _Atomic_word;
> 
> But that would involve fixing the source code that directly accesses this 
> type and changing it to use set and read macros.

Why, fancy that.  It looks awfully familiar... :)  See above for why I
think it's advantageous, though.  Also, I don't think you could
implement a single-bit mutex safely with only swap-word and swap-byte;
you need a whole byte.

> __gthread_mutex_t _Atomic_add_mutex __attribute__ ((weak));

That would work too.  It's a global spinlock but things done under it
are so short...
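
Something like the following, as a minimal sketch.  Note that I have
swapped in a C11 atomic_flag spinlock where the proposal uses a
__gthread_mutex_t, since test-and-set is the one primitive SWP gives
us; atomic_word_t, atomic_add_lock and the function names are
hypothetical, not the actual atomicity.h interface.

```c
#include <stdatomic.h>

typedef int atomic_word_t;   /* hypothetical stand-in for _Atomic_word */

/* One global lock shared by every atomic word in the program. */
static atomic_flag atomic_add_lock = ATOMIC_FLAG_INIT;

/* Add val to *mem under the global lock and return the previous value. */
static atomic_word_t exchange_and_add(volatile atomic_word_t *mem, int val)
{
    /* test-and-set returns the old flag; loop while it was already set. */
    while (atomic_flag_test_and_set_explicit(&atomic_add_lock,
                                             memory_order_acquire))
        ;

    atomic_word_t old = *mem;
    *mem = old + val;

    atomic_flag_clear_explicit(&atomic_add_lock, memory_order_release);
    return old;
}

/* Same operation when the caller does not need the old value. */
static void atomic_add(volatile atomic_word_t *mem, int val)
{
    (void)exchange_and_add(mem, val);
}
```

The critical section is a load, an add and a store, so contention on
the single lock should stay brief, and _Atomic_word keeps its full
width.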

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


