This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/71191] aarch64 and others: __atomic_load;arithmetic;__atomic_compare_exchange loops should be able to generate better code with LL/SC-type constructs than a CAS loop


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71191

--- Comment #5 from dhowells at redhat dot com <dhowells at redhat dot com> ---
(In reply to Ramana Radhakrishnan from comment #4)
> (In reply to dhowells@redhat.com from comment #0)
> > ...
> > If the CPU has LL/SC constructs available, something like this is probably
> > better implemented using that than a CMPXCHG loop - _provided_ the bit
> > between the __atomic_load and the __atomic_compare_exchange doesn't resolve
> > to more than a few instructions
> 
> Making the compiler deal with "doesn't resolve to more than a few
> instructions" and dealing with architecture restrictions will be the fun
> part here.
> 
> It's architecture specific as to what can or cannot go into the loop. On
> AArch64 there are restrictions on what kind of instructions can go into
> these LL/SC loops using the exclusive instructions i.e. the LDAXR / STLXR
> instructions.  For e.g. these loops cannot contain other loads and stores
> and we work pretty hard to make sure that there is no accidental spilling
> inside these loops. See PR69904 for an example of where an optimization was
> skirting on the edges of the architecture - aarch32 is quite similar to
> aarch64 in this respect. 

I agree that this won't be easy.  I'm quite okay with it being limited to only
arithmetic instructions and branches out of the protected section (and
definitely no memory accesses).  Further, I can see that defining what is
meant by a "few instructions" is also tricky.
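
To illustrate the restricted shape I mean, here is a minimal sketch only.  The
helper name fetch_max_sketch and the choice of memory orders are my own
assumptions for the example, not anything from the kernel or from GCC.  The
body between the load and the compare-exchange is just a compare and a branch
out, with no other memory access:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: raise *p to at least val and return the value
 * that was seen before the update.  Written with the generic __atomic
 * builtins; whether the compiler emits a CAS loop or an LL/SC sequence
 * for this is up to the target backend.
 */
static inline int fetch_max_sketch(int *p, int val)
{
    int old = __atomic_load_n(p, __ATOMIC_RELAXED);

    do {
        if (old >= val)
            break;  /* branch out of the protected section, no store */
        /* On failure the builtin writes the current value back to old. */
    } while (!__atomic_compare_exchange_n(p, &old, val, false,
                                          __ATOMIC_SEQ_CST,
                                          __ATOMIC_RELAXED));
    return old;
}
```

Nothing in the loop body touches memory other than *p, which is the property
an LL/SC expansion would need.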

> This is probably a bit harder to do in a generic manner rather than just
> adding combine patterns as we will be doing that in the backend till kingdom
> come.

Adding some set patterns might well suffice.  Something approximating
add-unless would be really good as the kernel uses that a lot.
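
For reference, the add-unless shape written with the generic builtins looks
roughly like the following.  This is a sketch under my own assumptions: the
name add_unless_sketch, the plain int type and the memory orders are chosen
for illustration and are not the kernel's actual implementation:

```c
#include <stdbool.h>

/*
 * Hypothetical helper approximating add-unless: add a to *v unless *v
 * currently equals u.  Returns true if the addition was performed.
 */
static inline bool add_unless_sketch(int *v, int a, int u)
{
    int old = __atomic_load_n(v, __ATOMIC_RELAXED);

    do {
        if (old == u)
            return false;  /* leave without storing anything */
        /* On failure old is refreshed with the current value of *v. */
    } while (!__atomic_compare_exchange_n(v, &old, old + a, false,
                                          __ATOMIC_SEQ_CST,
                                          __ATOMIC_RELAXED));
    return true;
}
```

The part between the load and the compare-exchange is one compare, one
conditional branch and one add, which seems a plausible candidate for a
recognised pattern.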

> I'm leaving this as a target bug for now and marking it as relevant to both
> the backends but I suspect this affects other architectures too that have
> LL/SC style atomics.

powerpc64 definitely.
