This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: ARM: enable interpreter, add locks code


On Wed, Jul 11, 2007 at 10:10:19PM -0000, Boehm, Hans wrote:
> The problem of course is that this works perfectly for some use cases,
> e.g. when it's used to implement test-and-set for a spin-lock.  And it
> probably almost never fails on a uniprocessor.  I'd expect it to do
> fairly badly on a multicore processor for use cases for which the
> atomicity really matters.

Yes, that's right.  There has only been ARM multi-core support in the
Linux kernel for a brief while (the last year or so); over the ten
year life of the ARM LinuxThreads port I believe this issue was
discovered by inspection twice, but encountered in practice never.  I
don't even have reports of inexplicable hangs that I can recall.  Of
course that doesn't mean it never happened, but it would require
astoundingly antagonistic context switching.
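
For reference, a minimal sketch of the test-and-set spin-lock case
mentioned in the quoted text.  The names demo_spinlock_t, demo_trylock,
demo_lock and demo_unlock are hypothetical, and GCC's
__sync_lock_test_and_set builtin stands in for the atomic exchange
that swp performs, so that the sketch is portable.  It is not the code
from the patch under discussion.

```c
/* Hypothetical lock type for illustration: 0 = free, 1 = held. */
typedef volatile int demo_spinlock_t;

/* Try to take the lock once.  The builtin atomically stores 1 and
   returns the previous value; a previous value of 0 means the lock
   was free and now belongs to the caller.  */
static int demo_trylock(demo_spinlock_t *lock)
{
    return __sync_lock_test_and_set(lock, 1) == 0;
}

/* Spin until the lock is acquired. */
static void demo_lock(demo_spinlock_t *lock)
{
    while (!demo_trylock(lock))
        ;  /* busy-wait */
}

/* Release the lock by storing 0 with release semantics. */
static void demo_unlock(demo_spinlock_t *lock)
{
    __sync_lock_release(lock);
}
```

A single unconditional exchange is all that a spin lock needs, which
is why this use case is unaffected.  Operations that must compare the
old value before storing need more than one exchange can give them.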

> My impression is that there are some recent ARM processors that no
> longer implement swp, at least not in hardware.  (It's been deprecated
> since V6).  I'd expect it to do badly on those, too.

It's deprecated but still implemented, to the best of my knowledge.  I
doubt it will go away until v8.

The only ways that I can see to do it are:

  - Assume a kernel sufficiently new to provide the atomic ops
  helpers.

  or

  - Build an SMP-supporting version of the library as a shared library
  and find it at runtime using the glibc hwcap mechanism.  Any
  system that might support SMP will have both the kernel helpers and
  the v6 load locked / store conditional operations.
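
As a rough sketch of the first option: the ARM Linux kernel exposes a
compare-and-exchange helper at a fixed address (0xffff0fc0), which
returns zero when the store happened.  The wrapper names
demo_kuser_cmpxchg and demo_atomic_add below are hypothetical, and the
non-ARM branch is only a stand-in built on GCC's __sync builtin so
that the example compiles elsewhere.  A real library would also need
to confirm that the kernel is new enough to provide the helper.

```c
#if defined(__arm__) && defined(__linux__)
/* Kernel-provided helper: returns 0 if *ptr was changed from oldval
   to newval, non-zero if no exchange took place.  */
typedef int (demo_kuser_cmpxchg_fn)(int oldval, int newval,
                                    volatile int *ptr);
#define demo_kuser_cmpxchg (*(demo_kuser_cmpxchg_fn *)0xffff0fc0)
#else
/* Stand-in with the same return convention, for other targets. */
static int demo_kuser_cmpxchg(int oldval, int newval, volatile int *ptr)
{
    return !__sync_bool_compare_and_swap(ptr, oldval, newval);
}
#endif

/* Atomically add delta to *ptr and return the new value, retrying
   until the compare-and-exchange succeeds.  */
static int demo_atomic_add(volatile int *ptr, int delta)
{
    int old, updated;

    do {
        old = *ptr;
        updated = old + delta;
    } while (demo_kuser_cmpxchg(old, updated, ptr) != 0);

    return updated;
}
```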

NPTL, where I last needed this, fortunately gets to assume a
sufficiently recent kernel for other reasons.

I've gotten some feedback in the past that checking at runtime in the
atomic ops is not acceptable.  The load of a global mask value hurts
performance too much.  I have not measured it myself, though.
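
To make the cost concrete, here is a sketch of what such a runtime
check might look like.  Everything in it is hypothetical:
demo_cpu_mask, DEMO_HAS_LLSC and the two implementations are invented
for illustration, both paths use the same GCC builtin as a stand-in,
and the call counters exist only so the chosen path can be observed.

```c
/* Hypothetical capability bit; a real library would set the mask
   once at startup from whatever the system reports.  */
#define DEMO_HAS_LLSC 0x1u

static unsigned int demo_cpu_mask;
static unsigned int demo_fast_calls;      /* illustration only */
static unsigned int demo_fallback_calls;  /* illustration only */

/* Stand-in for a load locked / store conditional implementation. */
static int demo_cas_fast(volatile int *ptr, int oldval, int newval)
{
    demo_fast_calls++;
    return __sync_bool_compare_and_swap(ptr, oldval, newval);
}

/* Stand-in for the fallback used on older processors. */
static int demo_cas_fallback(volatile int *ptr, int oldval, int newval)
{
    demo_fallback_calls++;
    return __sync_bool_compare_and_swap(ptr, oldval, newval);
}

/* Every call pays for a load of the global mask and a branch. */
static int demo_cas(volatile int *ptr, int oldval, int newval)
{
    if (demo_cpu_mask & DEMO_HAS_LLSC)
        return demo_cas_fast(ptr, oldval, newval);
    return demo_cas_fallback(ptr, oldval, newval);
}
```

The load and the branch in demo_cas are the overhead the feedback
refers to; whether it is measurable depends on the core.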

> After startup overhead, it's presumably a load of a global followed by a
> predictable branch, right?

I believe a number of extant ARM cores have very limited branch
prediction.  Paul or Richard will presumably chime in if I'm wrong :-)

-- 
Daniel Jacobowitz
CodeSourcery

