This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RE: [PATCH][RFC] Remove volatile from data members in libstdc++


On Thu, 13 Jul 2006, Boehm, Hans wrote:

> I think we are all in violent agreement that volatile isn't guaranteed
> to solve the problem.
> 
> But I still claim it's closer than other viable mechanisms we currently
> have.  Thus it's the best available stop-gap workaround until the
> underlying issues get fixed.

I still claim that there are no underlying issues to fix (other than
optimizing rope to not use a mutex).  And I still claim that if there
were underlying issues to fix, volatile would help nothing - in fact it
would likely make the problem worse, as it tends to widen possible race
windows due to extra loads and worse optimized code.

But it's the libstdc++ maintainers' call...

So, please take the bunch of patches I submitted and decide what to do
about them for -v3/-v7.

> The other alternatives that have been suggested are:
> 
> 1) The atomicity.h functionality in libstdc++.
> 
> 2) The __sync_ gcc intrinsics.
> 
> 3) volatile assembly code.

The only thing that should happen here wrt 1, 2 and 3 is that v7 should
overhaul its atomicity.h primitives to use the gcc intrinsics where
available and otherwise fall back to assembly code (for most of the
interesting architectures you can steal from the linux kernel code
in its include/asm-$arch/atomic.h headers).
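To make the suggested overhaul concrete, here is a rough sketch of what such a primitive could look like - the wrapper name `exchange_and_add` and the fallback path are hypothetical (the real libstdc++ symbol is `__gnu_cxx::__exchange_and_add`), and only the `__sync_fetch_and_add` builtin is assumed from GCC:

```cpp
// Sketch: dispatch to the GCC __sync builtins where available,
// otherwise fall back to (here: placeholder) assembly code.
typedef int _Atomic_word;

static inline _Atomic_word
exchange_and_add(volatile _Atomic_word* __mem, int __val)
{
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 1))
  // __sync_fetch_and_add is a full barrier and returns the old value.
  return __sync_fetch_and_add(__mem, __val);
#else
  // Fallback: architecture-specific inline assembly would go here,
  // e.g. adapted from the kernel's include/asm-$arch/atomic.h.
  _Atomic_word __old = *__mem;
  *__mem += __val;          // NOT atomic; placeholder only.
  return __old;
#endif
}
```

The point is that the intrinsic path covers most targets for free, and only the leftover architectures need hand-written assembly.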

> I think neither (1) nor (2) really apply here.  They should be (and soon
> will be, I hope) used for atomic read-modify-write operations.  But they
> don't support plain atomic loads and stores.  For rope at least, it's
> the reference count loads (for copy-avoidance) that are the issue.

There is no such thing as an atomic load primitive.  Even the kernel
relies on gcc to load properly aligned words from memory in one piece.
Usually the hardware also relies on proper alignment here (the largest
"atomic" load being an aligned L1 cacheline).
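To illustrate the point: on common targets a properly aligned word load is already a single load instruction, so the only thing one might add is a compiler barrier to force a re-read from memory.  A minimal sketch (the helper name `load_word` is made up for illustration):

```cpp
// Sketch: a "plain atomic load" of an aligned word is just an ordinary
// load; the empty asm is a compiler-only barrier (no CPU fence) that
// prevents gcc from caching the value in a register across it.
static inline int
load_word(const volatile int* p)
{
  int v = *p;                             // single aligned load
  __asm__ __volatile__("" ::: "memory");  // compiler barrier only
  return v;
}
```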

> (3) is probably correct, but I don't think it's practical to put
> machine-specific in-line assembly code in such library code.
> 
> One could argue that the right solution is to add loads and stores to
> (2).  But that quickly runs into another major weakness of the current
> primitives: They don't allow the right kind of control over memory
> ordering/visibility.  If you tried to follow the current design and
> include a full memory fence everywhere, the resulting atomic loads would
> be completely unusable, at least on machines like Pentium 4s that have
> slow fences.

?  You are mixing two issues here: first, compiler optimization barriers,
and second, hardware load/store barriers.  Of course the gcc sync
builtins already deal with both.  They have to - otherwise they would
not work at all.
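As a quick sketch of that claim, the `__sync` builtins GCC documents as full barriers cover both concerns at once (the wrapper name below is made up; the builtins themselves are real):

```cpp
// Sketch: __sync_fetch_and_add is documented as a full barrier - it is
// both a compiler reordering barrier and a hardware memory fence - and
// it returns the previous value.
static inline int
atomic_inc_return_old(int* counter)
{
  return __sync_fetch_and_add(counter, 1);
}

// __sync_synchronize() issues a standalone full fence, and
// __sync_bool_compare_and_swap() is likewise a full barrier.
```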

Again, the linux kernel is a good place to look at for what hoops one
needs to jump through on which architectures to get proper optimization
and CPU barriers and atomicity.

Richard.

--
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs

