[Bug target/93177] PPC: Missing many useful platform intrinsics
memmerto at ca dot ibm.com
gcc-bugzilla@gcc.gnu.org
Mon Jan 13 15:18:00 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93177
--- Comment #9 from Matt Emmerton <memmerto at ca dot ibm.com> ---
(In reply to Segher Boessenkool from comment #6)
> (In reply to Matt Emmerton from comment #4)
> > The intrinsics that we would find useful, having used them as provided by
> > the IBM XL C/C++ compiler, are the following:
> >
> > __sync()
> > __isync()
> > __lwsync()
>
> The sync intrinsics need to be tied to some other code. A volatile asm with
> a "memory" clobber is not good enough, in many cases.
We use these in our internal mutex and atomic implementations, and the
resulting sequences are carefully scrutinized.
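
For reference, a minimal sketch of the kind of wrapper in question, assuming
GCC-style inline asm. The names barrier_lwsync(), barrier_sync() and publish()
are hypothetical, not the XL intrinsics themselves, and the non-PowerPC branch
is only a stand-in so the code builds elsewhere. Note this is exactly the
"volatile asm with a memory clobber" form that comment #6 says is not
sufficient in many cases, so the generated code still has to be inspected.

```c
#include <stdint.h>

/* Hypothetical helper: lightweight sync. On non-PowerPC targets we fall
 * back to a compiler/hardware fence purely so the sketch compiles. */
static inline void barrier_lwsync(void)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ __volatile__("lwsync" ::: "memory");
#else
    __atomic_thread_fence(__ATOMIC_ACQ_REL);
#endif
}

/* Hypothetical helper: heavyweight sync. */
static inline void barrier_sync(void)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ __volatile__("sync" ::: "memory");
#else
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Store a payload, then raise a flag; the barrier keeps the payload
 * store ahead of the flag store. */
static inline void publish(uint32_t *data, uint32_t *flag, uint32_t value)
{
    *data = value;
    barrier_lwsync();
    __atomic_store_n(flag, 1u, __ATOMIC_RELAXED);
}
```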
> > __lwarx()
> > __ldarx()
> > __stwcx()
> > __stdcx()
>
> The compiler can always insert memory accesses in between those two, if you
> have them as separate intrinsics (and it will, simply stack accesses for
> temporaries will do, already). If those accesses hit the same reservation
> granule as the larx/stcx. uses, you lose.
>
> You need to write the whole sequence in one piece of assembler code.
I would argue that the compiler should be smart enough to realize that these
are part of a decomposed atomic operation, and avoid arbitrary instruction
injection.
As per my previous update, we use these primitives to implement things that the
builtin __atomic_* functions do not implement.
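
To illustrate the "whole sequence in one piece of assembler code" approach from
comment #6, here is a minimal sketch of a relaxed fetch-and-add with the
lwarx/stwcx. pair kept inside a single asm statement, so the compiler cannot
place a memory access between them. The function name fetch_add_u32() is
hypothetical, and the non-PowerPC branch is only a fallback so the sketch
builds on other targets.

```c
#include <stdint.h>

/* Hypothetical helper: atomically add delta to *p and return the old
 * value, with no ordering guarantees (relaxed). */
static inline uint32_t fetch_add_u32(uint32_t *p, uint32_t delta)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    uint32_t old, tmp;
    __asm__ __volatile__(
        "1: lwarx  %0,0,%3\n"   /* load word and take the reservation    */
        "   add    %1,%0,%4\n"  /* compute the new value                 */
        "   stwcx. %1,0,%3\n"   /* store if the reservation still holds  */
        "   bne-   1b\n"        /* CR0.EQ clear: reservation lost, retry */
        : "=&r"(old), "=&r"(tmp), "+m"(*p)
        : "r"(p), "r"(delta)
        : "cr0");
    return old;
#else
    /* Stand-in for targets without lwarx/stwcx. */
    return __atomic_fetch_add(p, delta, __ATOMIC_RELAXED);
#endif
}
```

Operations that have no __atomic_* builtin would follow the same shape, with
the "add" replaced by whatever computation is needed between the load and the
conditional store.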
> > __protected_stream_set()
> > __protected_stream_count()
> > __protected_stream_count_depth() // currently not implemented in gcc
> > __protected_stream_go()
>
> Those are pretty specific to CBE I think?
No. They are implemented on POWER5 and above (ISA 2.02), and are useful in
managing cache prefetch behaviour.
> > The implementation of stwcx() and stdcx() needs revision on PPC.
> > As I understand it, there is no need for the mfocrf instruction nor the
> > mask-and-shift on the result.
>
> How else would you output the CR0.EQ bit?
There is no need to copy CR0 to a GPR - branch instructions such as BNE can
operate on CR0 directly.
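
As a sketch of the usage pattern I have in mind: when the success/failure
result of the conditional store is consumed only by a branch, there is nothing
that needs to live in a GPR. The example below uses the existing
__atomic_compare_exchange_n builtin in a retry loop; the helper name
atomic_max_u32() is hypothetical. I would expect a PowerPC compiler to be able
to branch on CR0 directly for a loop of this shape, although that is an
assumption about code generation and not something this snippet proves.

```c
#include <stdint.h>

/* Hypothetical helper: raise *p to at least v. Returns the value that was
 * observed in *p before any update made by this call. */
static inline uint32_t atomic_max_u32(uint32_t *p, uint32_t v)
{
    uint32_t cur = __atomic_load_n(p, __ATOMIC_RELAXED);

    /* The builtin's boolean result feeds only the loop condition. On
     * failure, cur is refreshed with the value currently in *p. */
    while (cur < v &&
           !__atomic_compare_exchange_n(p, &cur, v, 1 /* weak */,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
        /* retry with the refreshed cur */
    }
    return cur;
}
```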