This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)


On 01/15/2015 12:13 AM, Jakub Jelinek wrote:
> On Thu, Jan 15, 2015 at 07:46:18AM +0100, Richard Biener wrote:
>>>> That last line means the compiler is free to delete a non-volatile
>>>> asm with a memory clobber if that asm is not needed for dataflow.  Or
>>>> that is how I read it; it is trying to indicate that if you want to
>>>> prevent the memory clobber from being deleted (together with the rest
>>>> of the asm), you need to make the asm volatile.
>>>>
>>>> So as far as I can see the compiler can CSE two identical
>>> non-volatile
>>>> asms with memory clobber just fine.  Older GCC (I tried 4.7.2) does
>>> do
>>>> this; current mainline doesn't.  I think it should.
>>> No, it should not CSE those two cases.  That's simply wrong and if an 
>>> older version did that optimization, that's a bug.
>>
>> I think segher has a point here.  If the asm with memory clobber would store to random memory and the point would be to preserve that then the whole distinction with volatile doesn't make much sense (after all without volatile we happily DCE such asm if the regular outputs are not needed).
>>
>> This doesn't mean 'memory' is a well-designed thing, of course. Just its
>> effects are effectively limited to reads without volatile(?)
> 
> Segher's mails talk about "memory" being a write but not read.
> If we even can't agree on what non-volatile "memory" means, I think
> we should treat it more conservatively, because every user (and there are
> lots of people using non-volatile asm with "memory" in the wild) treats it
> differently.  Just trying to grep for a few:
> glibc:
> ./sysdeps/alpha/bits/atomic.h:# define atomic_full_barrier()	__asm ("mb" : : : "memory");
> ./sysdeps/alpha/bits/atomic.h:# define atomic_read_barrier()	__asm ("mb" : : : "memory");
> ./sysdeps/alpha/bits/atomic.h:# define atomic_write_barrier()	__asm ("wmb" : : : "memory");
> ./sysdeps/sparc/sparc32/bits/atomic.h:# define atomic_full_barrier() __asm ("" ::: "memory")
> ./sysdeps/powerpc/powerpc32/bits/atomic.h:# define atomic_read_barrier()	__asm ("lwsync" ::: "memory")
> ./sysdeps/powerpc/powerpc32/bits/atomic.h:# define atomic_read_barrier()	__asm ("sync" ::: "memory")
> ./sysdeps/powerpc/powerpc64/bits/atomic.h:#define atomic_read_barrier()	__asm ("lwsync" ::: "memory")
> ./sysdeps/powerpc/bits/atomic.h:#define atomic_full_barrier()	__asm ("sync" ::: "memory")
> ./sysdeps/powerpc/bits/atomic.h:#define atomic_write_barrier()	__asm ("eieio" ::: "memory")
> ./sysdeps/generic/malloc-machine.h:# define atomic_full_barrier() __asm ("" ::: "memory")

I think that it's uses like these -- which may well have been written
by folks that also work on gcc -- that are proof that we have at least
intended to support a memory clobber to be a full read+write barrier,
and thus we must consider a memory clobber to be both a read and write.

(The fact that all of these are automatically volatile and would never be CSEd
is beside the point.  If the semantics of a memory clobber differ based on the
volatile flag on the asm, I think that would be too ill-defined to actually
support.)

In the interest of progressing wrt the current regression, I think that
Jakub's patch should go in as-is for now, and then we iterate on how we
think the memory cse ought (or ought not) to occur.

As for my own thoughts on whether two non-volatile asms with memory clobbers
should be CSE'd, in absence of other stores to memory in between, are
complicated and probably not well-formed.  I'll think about it some more.


r~


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]