This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On 10/30/18 8:32 PM, James Greenhalgh wrote: > On Tue, Oct 02, 2018 at 11:19:09AM -0500, Richard Henderson wrote: >> When the result of an operation is not used, we can ignore the >> result by storing to XZR. For two of the memory models, using >> XZR with LD<op> has a preferred assembler alias, ST<op>. > > ST<op> has different semantics to LD<op>, in particular, ST<op> is not > ordered by a DMB LD; so this could weaken the LDADD and break C11 semantics. > > The relevant Arm Arm text is: > > If the destination register is not one of WZR or XZR, LDADDA and > LDADDAL load from memory with acquire semantics > > LDADDL and LDADDAL store to memory with release semantics. > > LDADD has no memory ordering requirements. > > I'm taking this to mean that even if the result is unused, using XZR is not > a valid transformation; it weakens the expected acquire semantics to > unordered. > > The example I have from Will Deacon on an internal bug database is: > > P0 (atomic_int* y,atomic_int* x) { > atomic_store_explicit(x,1,memory_order_relaxed); > atomic_thread_fence(memory_order_release); > atomic_store_explicit(y,1,memory_order_relaxed); > } > > P1 (atomic_int* y,atomic_int* x) { > int r0 = atomic_fetch_add_explicit(y,1,memory_order_relaxed); > atomic_thread_fence(memory_order_acquire); > int r1 = atomic_load_explicit(x,memory_order_relaxed); > } > > The outcome where y == 2 and P1 has r0 = 1 and r1 = 0 is illegal. > > This example comes from a while back in my memory; so copying Will for > any more detailed questions. > > My impression is that this transformation is not safe, and so the patch is > not OK. Here's a revised patch. Use ST<op> for relaxed and release orderings, retain the (non-xzr) scratch register for other orderings. But the scratch need not be early-clobber, since there's no mid-point of half-consumed inputs. r~
Attachment:
z
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |