Memory barriers vs lock/unlock
Peter Dimov
pdimov@mmltd.net
Tue Nov 8 19:21:00 GMT 2005
Paolo Carlini wrote:
> Hi,
>
> in our simple port of boost_shared_ptr we have some naively puzzling
> things like:
>
>   void
>   release() // nothrow
>   {
>     if (__gnu_cxx::__exchange_and_add(&_M_use_count, -1) == 1)
>       {
>         dispose();
>         __glibcxx_mutex_lock(_M_mutex);
>         __glibcxx_mutex_unlock(_M_mutex);
>         weak_release();
>       }
>   }
>
>   void
>   weak_release() // nothrow
>   {
>     if (__gnu_cxx::__exchange_and_add(&_M_weak_count, -1) == 1)
>       {
>         __glibcxx_mutex_lock(_M_mutex);
>         __glibcxx_mutex_unlock(_M_mutex);
>         destroy();
>       }
>   }
>
> I'm currently investigating that, and I'm not an expert of this area,
> and I'd like to have some help. If I remember correctly some old
> exchanges those "weird" empty lock/unlock stem from the need to add
> memory barriers, nothing more.
Correct. The sequence is thread A invoking release(), dropping the last
strong reference, then thread B invoking weak_release(), dropping the last
weak reference. Thread B's destroy() needs to observe the effects of thread
A's dispose().
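
To make the ordering requirement concrete, here is a minimal sketch in present-day C++ (std::atomic postdates this message, so this is an analogy rather than the libstdc++ code). The type counted_base_sketch, its members, and the helper run_last_references_on_two_threads are all hypothetical names. The acquire/release decrements play the role that the empty lock/unlock pair plays above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the shared_ptr control block; not libstdc++ code.
struct counted_base_sketch
{
  std::atomic<int> use_count{1};
  // One weak_ptr, plus one count held collectively by the strong references.
  std::atomic<int> weak_count{2};
  bool disposed = false;                // plain state written by dispose()
  bool destroyed_after_dispose = false; // what destroy() observed

  void dispose() { disposed = true; }
  void destroy() { destroyed_after_dispose = disposed; }

  void weak_release()
  {
    // acq_rel: the release half publishes thread A's dispose(); the acquire
    // half lets the thread that drops the last count observe it.
    if (weak_count.fetch_sub(1, std::memory_order_acq_rel) == 1)
      destroy();
  }

  void release()
  {
    if (use_count.fetch_sub(1, std::memory_order_acq_rel) == 1)
      {
        dispose();
        weak_release();
      }
  }
};

// Thread A drops the last strong reference, thread B the last weak one.
bool run_last_references_on_two_threads()
{
  counted_base_sketch cb;
  std::thread a([&cb] { cb.release(); });
  std::thread b([&cb] { cb.weak_release(); });
  a.join();
  b.join();
  return cb.destroyed_after_dispose;
}
```

Whichever thread performs the final decrement runs destroy(), and in both interleavings it sees disposed == true, so the helper returns true.
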
It's possible to optimize release a bit by inlining weak_release by hand:

  void
  release() // nothrow
  {
    if (__gnu_cxx::__exchange_and_add(&_M_use_count, -1) == 1)
      {
        dispose();
        __glibcxx_mutex_lock(_M_mutex);
        __glibcxx_mutex_unlock(_M_mutex);
        if (__gnu_cxx::__exchange_and_add(&_M_weak_count, -1) == 1)
          {
            destroy();
          }
      }
  }

There's no need to lock the mutex twice.
> Therefore, I'm wondering whether we
> wouldn't be best off using right away _GLIBCXX_READ_MEM_BARRIER and
> _GLIBCXX_WRITE_MEM_BARRIER like, for instance, libsupc++/guard.cc is
> already doing.
_GLIBCXX_READ_MEM_BARRIER is a #LoadLoad barrier; _GLIBCXX_WRITE_MEM_BARRIER
is #StoreStore. Neither is a full barrier. However, since __exchange_and_add
is a read-modify-write operation (both a load and a store), a combination of
_GLIBCXX_READ_MEM_BARRIER and _GLIBCXX_WRITE_MEM_BARRIER can be used instead
of the lock/unlock pair. I think. :-)
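
As a rough illustration of where such barriers would sit, here is a sketch using std::atomic_thread_fence from later C++ standards. This is an assumption-laden analogy: the C++ release and acquire fences are stronger than pure #StoreStore and #LoadLoad barriers, and counts_sketch and drop_weak are hypothetical names, not the _GLIBCXX_* macros.

```cpp
#include <atomic>

// Hypothetical control-block fragment; not the libstdc++ implementation.
struct counts_sketch
{
  std::atomic<int> weak{1};
};

// Returns true if the caller dropped the last weak count and may destroy().
inline bool drop_weak(counts_sketch& c)
{
  // Barrier before the decrement: earlier writes (such as those made by
  // dispose()) must not be reordered past the store half of the RMW.
  std::atomic_thread_fence(std::memory_order_release);

  // The decrement itself carries no ordering of its own here.
  bool last = c.weak.fetch_sub(1, std::memory_order_relaxed) == 1;

  // Barrier after the decrement: later reads (such as those in destroy())
  // must not be reordered before the load half of the RMW.
  if (last)
    std::atomic_thread_fence(std::memory_order_acquire);

  return last;
}
```

Only the thread that sees the count reach zero needs the trailing barrier, since it is the only one that goes on to read the object's state.
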
When __exchange_and_add is implemented in terms of __sync_fetch_and_add,
which seems to guarantee full ordering, there'll be no need for lock+unlock
or explicit barriers.
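
A minimal sketch of what that could look like, assuming a GCC-compatible compiler that provides the __sync_fetch_and_add builtin. The wrapper exchange_and_add and the type sync_counted_base are hypothetical names for illustration; the real __gnu_cxx::__exchange_and_add has its own signature.

```cpp
#include <cassert>

// Hypothetical wrapper, not the libstdc++ function.
inline int exchange_and_add(volatile int* mem, int val)
{
  // GCC builtin: atomically adds val to *mem and returns the previous value.
  // GCC documents the __sync builtins as full barriers.
  return __sync_fetch_and_add(mem, val);
}

// Hypothetical control block with the empty lock/unlock pairs removed.
struct sync_counted_base
{
  volatile int use_count = 1;
  volatile int weak_count = 1;
  bool disposed = false;
  bool destroyed = false;

  void dispose() { disposed = true; }

  void destroy()
  {
    assert(disposed); // destroy() must observe the effects of dispose()
    destroyed = true;
  }

  void release()
  {
    if (exchange_and_add(&use_count, -1) == 1)
      {
        dispose();
        if (exchange_and_add(&weak_count, -1) == 1)
          destroy();
      }
  }
};
```

With a fully ordered read-modify-write, the decrement itself supplies the ordering between dispose() and destroy(), so release() reduces to the two counter updates.
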