This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: gcc 3.2.3 config/os/bits/i386/atomicity.h broken



Loren James Rittle wrote:
> 
> In article <3EA93D8A dot C9A4440B at OARcorp dot com> Joel Sherrill writes:
> 
> Joel, I don't know if you've been reading the list but you are a
> godsend (maybe ;-).

I don't read this list very often.  Too much on my plate already.
Angelo has been fighting this for a few months and has been stuck
at gcc 3.2.  I just got frustrated enough to pester him into debugging
it to the offending instruction and then homed in on that.  

> > This area of libstdc++-v3 appears to have had significant
> > work after the 3.2 branch was cut so I don't know if this
> > bug is still there.
> 
> > This file uses instructions which are not available on a
> > plain i386.
> 
> True, this bug was not fixed on the 3.2 branch (timing of ABI change).
> Your analysis of prohibited instructions on i386 is all right AFAIK.

That's good to hear.  I had trouble finding clear documentation.
I really would like to be more confident that EVERY CPU model i486 and
above has xaddl support.

> > static inline _Atomic_word
> > __attribute__ ((__unused__))
> > __exchange_and_add (volatile _Atomic_word *__mem, int __val)
> > {
> >  register _Atomic_word __result, __tmp;
> >  static volatile _Atomic_word __lock = 1;
> >
> > /* obtain the atomic exchange/add lock */
> >  do {
> >    __tmp = 0;
> >    /* swap 0 into __lock; __tmp receives the previous lock value */
> >    __asm__ __volatile__("xchgl %0,%1"
> >             :"+r" (__tmp), "+m" (__lock)
> >             :
> >             :"memory");
> >  } while ( __tmp == 0 );
> >
> >    __result = *__mem;
> >    *__mem += __val;
> >
> >  /* release spin lock */
> >  __lock = 1;
> >
> >  return __result;
> > }
> 
> > As best I can tell, in gcc 3.3 and later, there is no support
> > for atomicity.h on the basic i386 at all.  Is this right?
> > If this implementation works, could it be conditionally used
> > in the x86 atomicity.h for the vanilla i386 and included
> > in the single file?
> 
> Does the above implementation work if the processor is: i486, i586,
> i686, Athlon, etc? 

First, I have no real easy way to ensure that the above implementation
works at all.  By that I would want to see one thread spin until the
other finished this operation. :)

But it should work on any x86 derived CPU since AFAIK all should 
have the xchg.
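
A minimal sketch of the kind of check I mean: several threads hammer
one counter through the spin-lock version, and the final count tells
you whether any update was lost.  Everything here is an assumption for
illustration: the names (atomic_word, spin_exchange_and_add,
run_stress_test), the use of POSIX threads, and the GCC
__atomic_exchange_n builtin standing in for the xchgl inline asm (the
builtin needs a much newer gcc than 3.2).

```c
#include <pthread.h>
#include <stddef.h>

typedef int atomic_word;                 /* hypothetical stand-in for _Atomic_word */

static volatile atomic_word spin_lock = 1;   /* 1 = free, 0 = held */
static volatile atomic_word counter;

/* Same algorithm as the quoted code: take the lock by swapping in 0,
   do the read-modify-write, then store 1 to release. */
static atomic_word spin_exchange_and_add(volatile atomic_word *mem, int val)
{
    atomic_word result;

    /* Reading back 0 means another thread holds the lock, so spin. */
    while (__atomic_exchange_n(&spin_lock, 0, __ATOMIC_ACQUIRE) == 0)
        ;

    result = *mem;
    *mem += val;

    __atomic_store_n(&spin_lock, 1, __ATOMIC_RELEASE);
    return result;
}

enum { THREADS = 4, ITERATIONS = 50000 };

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        spin_exchange_and_add(&counter, 1);
    return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be started.
   THREADS * ITERATIONS means no increment was lost in this run. */
static long run_stress_test(void)
{
    pthread_t tid[THREADS];
    int started = 0;
    int failed = 0;

    counter = 0;
    for (int i = 0; i < THREADS; i++) {
        if (pthread_create(&tid[i], NULL, worker, NULL) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return failed ? -1 : (long)counter;
}
```

A run that comes back with the full count does not prove the code
correct, it only fails to show a lost update.  It is most telling on a
real SMP box, where the threads actually contend for the lock.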

But I have no objection to having the lock;xaddl implementation selected
if the appropriate CPU model cpp predefine is defined.  RTEMS uses
multilibs on the i386 target, so each variant could get the best
algorithm, not just an algorithm.
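
Roughly what that selection could look like, as a sketch only.  The
HAVE_XADDL macro and the atomic_word typedef are names made up for the
example, the list of CPU predefines is illustrative and not exhaustive
(newer -march values define their own macros), and the fallback uses
the GCC __atomic_exchange_n builtin as a stand-in for the xchgl asm so
that the sketch also compiles on non-x86 hosts.

```c
typedef int atomic_word;   /* hypothetical stand-in for _Atomic_word */

/* i486 and above (and x86-64) have xadd; a plain i386 does not. */
#if defined(__x86_64__) || defined(__i486__) || defined(__i586__) || \
    defined(__i686__) || defined(__k6__) || defined(__athlon__)
#define HAVE_XADDL 1
#else
#define HAVE_XADDL 0
#endif

static inline atomic_word exchange_and_add(volatile atomic_word *mem, int val)
{
#if HAVE_XADDL
    /* xadd leaves the old memory value in the register operand and
       stores the sum in memory; the lock prefix makes it atomic on SMP. */
    atomic_word result = val;
    __asm__ __volatile__("lock; xaddl %0,%1"
                         : "+r" (result), "+m" (*mem)
                         :
                         : "memory");
    return result;
#else
    /* Plain i386 (or a non-x86 host): guard an ordinary read-modify-write
       with an exchange-based spin lock. */
    static volatile atomic_word lock = 1;    /* 1 = free, 0 = held */
    atomic_word result;

    while (__atomic_exchange_n(&lock, 0, __ATOMIC_ACQUIRE) == 0)
        ;                                    /* spin until the lock is free */
    result = *mem;
    *mem += val;
    __atomic_store_n(&lock, 1, __ATOMIC_RELEASE);
    return result;
#endif
}
```

Either branch returns the value the word held before the add, so the
callers do not need to know which one was compiled in.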

> Does it work in SMP configurations of the higher processors? 

I think it would work better in SMP configurations.  The thread that
spins on the lock depends on the 1st thread getting to run again so
that the lock is released.  With threads on different CPUs, this is easy.
With threads on the same CPU, you have to consider thread scheduling 
algorithms and preemption.
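
One common way to cope with the same-CPU case, sketched under
assumptions: spin a bounded number of times, then give up the CPU so
the lock holder can run.  The names (yield_lock_*, SPIN_LIMIT) are
invented for the example, sched_yield() assumes a POSIX system, and
the GCC __atomic builtins stand in for the xchgl asm.

```c
#include <sched.h>

#define SPIN_LIMIT 100               /* arbitrary bound chosen for illustration */

static volatile int yield_lock_word = 1;    /* 1 = free, 0 = held */

/* One attempt: returns 1 if the lock was taken, 0 if it was already held. */
static int yield_lock_try(void)
{
    return __atomic_exchange_n(&yield_lock_word, 0, __ATOMIC_ACQUIRE) != 0;
}

/* Spin briefly, then yield the CPU so a preempted holder can finish. */
static void yield_lock_acquire(void)
{
    int spins = 0;

    while (!yield_lock_try()) {
        if (++spins >= SPIN_LIMIT) {
            sched_yield();
            spins = 0;
        }
    }
}

static void yield_lock_release(void)
{
    __atomic_store_n(&yield_lock_word, 1, __ATOMIC_RELEASE);
}
```

Yielding only helps when the scheduler will actually pick the holder.
Under strict priority scheduling a higher-priority spinner can still
starve a lower-priority holder, so an RTOS may prefer to disable
preemption around the critical section instead.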

> If you think that the answer to both questions is an
> unqualified "yes", then it could be used as a replacement for the old
> bad code in i386/atomicity.h.  If it only works for single-processor
> i386 systems then the generic code must remain as various UNIX systems
> configure as i386-*-* even when they have high-end CPUs.  I have
> googled on "xchgl avoid" and read the results.  Looks promising; as I
> have desired to not ship generic code for i386 in any release.

Not shipping code that runs on a generic i386 breaks C++ for many
embedded targets. There are quite a few developers out there using CPUs
like the i386 or other embedded i386 variants.

> This strongly implied that any architecture, SMP or single processor,
> in the IA32 family would handle your code correctly:
> 
>   http://www.wlug.org.nz/HowToParallelProcessingHOWTO

I see no reason why it wouldn't work as long as that static "lock
variable" is in shared memory.  If not, all bets are off.

> However, the patch within implies to me that plain xchgl will fail on
> x86-64 (would a compiler configured as i386-*-* be expected to support
> x86-64? was the patch pessimistic?):
> 
>   http://www.x86-64.org/lists/discuss/msg01969.html

I don't know which part of that patch you are referring to.  I see
ifdef's on the CMPXCHG instruction which (I think) is Pentium and
above.  So that is an optimization.  What about that post bothers you?

The comment in semaphore.c appears to be about some other problem.

> (The reason, I'm questioning above, is because I don't know.  I recall
> that the 68K weakened the strength of TAS between 000 and 0[12]0 when
> they added support for MP and a new bus locking protocol.  But that
> was a lifetime ago.)

Yes it was. :)  I remember using the TAS on the 68020 and you had
to be VERY careful when using it with VMEbus shared memory.  Some
boards didn't get it right.

> The regression in performance on configured-as i386-* but really
> i686-* machines has been measurable on both 3.3 and mainline.

This is why RTEMS went to multilibs on the i386 a LONG time ago.
We multilib these variants:

./athlon/libc.a
./k6/libc.a
./m486/soft-float/libc.a
./m486/libc.a
./mpentium/libc.a
./mpentiumpro/libc.a
./soft-float/nofp/libc.a
./soft-float/libc.a
./libc.a

But RTEMS is an embedded operating system and we can't tolerate 
trapping for unsupported instructions.  We want the best libraries
possible for the target.  This also pushes gcc a bit harder as
it ensures that gcc can compile more code with more CPU options.

> This does not fix the ABI issue raised in my response to Matthias.  In
> fact, we need to be careful not to remove the link error, when we
> apply your change.

This one I do not know about.

> Joel, are you willing and able to post a complete updated version of
> config/cpu/i386/atomicity.h?  If the code was reviewed by enough
> people, I'd really want to see it get into 3.3.

Sure.  Would you want it to be straight the generic version, or ifdef'ed
so that when compiled for the higher level CPU models it uses the
lock;xaddl?

Does this have a PR?  

> Regards,
> Loren

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel at OARcorp dot com                 On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985

