[PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

Wed Jul 26 17:35:00 GMT 2017

On 07/26/2017 11:19 AM, Alexander Monakov wrote:
> On Wed, 26 Jul 2017, Jeff Law wrote:
>> So I think this is up to the target maintainers.  I have no concerns
>> with enabling use of expand_asm_memory_barrier to be used outside of
>> optabs.  So if the s390/x86 maintainers want to go forward, the optabs
>> changes are pre-approved.
> 
> Please see the alternative middle-end patch:
>  https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00549.html
I almost made the comment that we should look if we could do this at a
higher level.  But then figured that the expander could be called from
multiple locations, thus requiring that all those points be fixed or at
least routed to a common function.

But it looks like all the fence stuff is routed into
expand_mem_thread_fence, so indeed that's a better place.

[ And if there were only a good way to enforce that we can't
  directly call the backend  expander from anywhere other than
  the bless location.  ]

> and the recently reported related bug:
>  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81316>
> Since we got it wrong for both fences and loads/stores on two different targets,
> perhaps fixing it once and for all in the middle-end is more appropriate.  I
> hope extraneous compiler barriers can be tolerated.
Agreed on middle-end being more appropriate.

I'm not sure what you mean by extraneous compiler barriers -- isn't the
worst case scenario here that the target emits them as well?  So there
would be an extraneous one in that case, but that ought to be a "don't
care".

In the middle end patch, do we need a barrier before the fence as well?
The post-fence barrier prevents reordering the fence with anything which
follows the fence.  But do we have to also prevent reordering the fence
with prior instructions with any of the memory models?  This isn't my
area of expertise, so if it's dumb question, don't hesitate to let me
know :-)

> 
> Yep - the same applies to atomic loads/stores too.  They can be expanded
> as volatile accesses, but we need compiler barrier(s) anyway (except for
> those with relaxed memory model).
Right.
jeff