This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: should sync builtins be full optimization barriers?

From: Ken Raeburn <raeburn at raeburn dot org>
To: Andrew MacLeod <amacleod at redhat dot com>
Cc: GCC Mailing List <gcc at gcc dot gnu dot org>
Date: Mon, 12 Sep 2011 20:30:37 -0400
Subject: Re: should sync builtins be full optimization barriers?
References: <4E69C942.3090808@gnu.org> <20110909081705.GT2687@tyan-ft48-01.lab.bos.redhat.com> <5F13A1A0-79E5-4733-B543-4A6F6311A247@adacore.com> <4E6CC1E4.5000000@redhat.com> <93C7346D-DC47-4C6B-9755-EF438D82DDEA@adacore.com> <4E6DAE90.3070202@gnu.org> <E4CDF223-D7BA-4450-A429-8C5F39AE2B97@adacore.com> <4E6E9374.3020601@redhat.com>

On Sep 12, 2011, at 19:19, Andrew MacLeod wrote:
> Let's say the order of the writes turns out to be 2,4... Is it possible for both writes to be travelling around some bus and have thread 4 actually read the second one first, followed by the first one? It would imply a lack of memory coherency in the system, wouldn't it? My simple understanding is that the hardware gives us this sort of minimum guarantee on all shared memory, which means we should never see that happen.

According to section 8.2.3.5, "Intra-Processor Forwarding Is Allowed", of the "Intel 64 and IA-32 Architectures Software Developer's Manual", volume 3A (December 2009), a processor can see its own store as happening before another processor's, though the manual's example works on two different memory locations. If at least one of the threads reading the values is on the same processor as one of the writing threads, perhaps it could see the locally issued store first, unless thread switching is presumed to include a memory fence. Consistency of order is guaranteed only *from the point of view of other processors* (8.2.3.7), which is not necessarily the case here. A total order across all processors is imposed for locked instructions (8.2.3.8), but I'm not sure whether their use is assumed here. I'm still reading up on caching protocols, write-back memory, etc., and I'm still not sure either way whether the original example can work...
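
For concreteness, here is a rough C11 sketch of the two-location pattern that section 8.2.3.5 describes (each processor stores to one location, reads it back, then reads the other location). It is not the example from earlier in this thread. The names x, y, r1..r4, cpu0, cpu1 and run_forwarding_litmus are my own, and I'm assuming pthreads and <stdatomic.h> are available. With memory_order_relaxed the compiler is also free to reorder, so this only illustrates the allowed outcomes; it does not isolate what the hardware does.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Two shared locations, both initially 0. */
static atomic_int x, y;

/* Values observed by each thread (hypothetical names). */
struct litmus_result { int r1, r2, r3, r4; };
static struct litmus_result result;

/* "Processor 0": store x, read x back, then read y. */
static void *cpu0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    result.r1 = atomic_load_explicit(&x, memory_order_relaxed);
    result.r2 = atomic_load_explicit(&y, memory_order_relaxed);
    return NULL;
}

/* "Processor 1": store y, read y back, then read x. */
static void *cpu1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    result.r3 = atomic_load_explicit(&y, memory_order_relaxed);
    result.r4 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Run both threads once and return what each one observed. */
struct litmus_result run_forwarding_litmus(void)
{
    pthread_t t0, t1;

    atomic_store(&x, 0);
    atomic_store(&y, 0);
    pthread_create(&t0, NULL, cpu0, NULL);
    pthread_create(&t1, NULL, cpu1, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return result;
}
```

Each thread always reads back its own store (r1 == 1 and r3 == 1), but r2 == 0 together with r4 == 0 is a permitted result: each thread sees its own store before it sees the other's. If all four accesses used memory_order_seq_cst instead, that particular result would be ruled out, which is roughly where the locked instructions of 8.2.3.8 would come in.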

Ken

Follow-Ups:
- Re: should sync builtins be full optimization barriers?
  - From: Andy Lutomirski

References:
- should sync builtins be full optimization barriers?
  - From: Paolo Bonzini
- Re: should sync builtins be full optimization barriers?
  - From: Jakub Jelinek
- Re: should sync builtins be full optimization barriers?
  - From: Geert Bosch
- Re: should sync builtins be full optimization barriers?
  - From: Andrew MacLeod
- Re: should sync builtins be full optimization barriers?
  - From: Geert Bosch
- Re: should sync builtins be full optimization barriers?
  - From: Paolo Bonzini
- Re: should sync builtins be full optimization barriers?
  - From: Geert Bosch
- Re: should sync builtins be full optimization barriers?
  - From: Andrew MacLeod
