[PATCH] ARM: Weaker memory barriers
Will Deacon
will.deacon@arm.com
Tue Mar 11 10:48:00 GMT 2014
Hi John,
On Tue, Mar 11, 2014 at 02:54:18AM +0000, John Carr wrote:
> A comment in arm/sync.md notes "We should consider issuing a inner
> shareability zone barrier here instead." Here is my first attempt
> at a patch to emit weaker memory barriers. Three instructions seem
> to be relevant for user mode code on my Cortex A9 Linux box:
>
> dmb ishst, dmb ish, dmb sy
>
> I believe these correspond to a release barrier, a full barrier
> with respect to other CPUs, and a full barrier that also orders
> relative to I/O.
Not quite; DMB ISHST only orders writes with other writes, so loads can move
across it in both directions. That means it's not sufficient for releasing a
lock, for example.
<shameless plug>
I gave a presentation at ELCE about the various ARMv7 barrier options (from
a kernel perspective):
https://www.youtube.com/watch?v=6ORn6_35kKo
</shameless plug>
> Consider this a request for comments on whether the approach is correct.
> I haven't done any testing yet (beyond eyeballing the assembly output).
You'll probably find that a lot of ARM micro-architectures treat them
equivalently, which makes this a good source of potentially latent bugs if
you get the wrong options. When I added these options to the kernel, I
tested with a big/little A7/A15 setup using a CCI400 (the cache-coherent
interconnect) to try and tickle things a bit better, but it's by no means
definitive.
> (define_insn "*memory_barrier"
> [(set (match_operand:BLK 0 "" "")
> - (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
> - "TARGET_HAVE_MEMORY_BARRIER"
> + (unspec:BLK
> + [(match_dup 0) (match_operand:SI 1 "const_int_operand")]
> + UNSPEC_MEMORY_BARRIER))]
> + "TARGET_HAVE_DMB || TARGET_HAVE_MEMORY_BARRIER"
> {
> if (TARGET_HAVE_DMB)
> {
> - /* Note we issue a system level barrier. We should consider issuing
> - a inner shareabilty zone barrier here instead, ie. "DMB ISH". */
> - /* ??? Differentiate based on SEQ_CST vs less strict? */
> - return "dmb\tsy";
> + switch (INTVAL(operands[1]))
> + {
> + case MEMMODEL_RELEASE:
> + return "dmb\tishst";
As stated above, this isn't correct.
> + case MEMMODEL_CONSUME:
You might be able to build this one out of <control dependency> + ISB (in
lieu of a true address dependency). However, given the recent monolithic C11
thread that took over this list and others, let's play it safe for now and
stick with DMB ISH.
> + case MEMMODEL_ACQUIRE:
> + case MEMMODEL_ACQ_REL:
> + return "dmb\tish";
I think you probably *always* want ISH for GCC. You assume normal, cacheable
, coherent memory right? That also mirrors the first comment you delete
above.
> + case MEMMODEL_SEQ_CST:
> + return "dmb\tsy";
Again, ISH here should be fine. SY is for things like non-coherent devices,
which I don't think GCC has a concept of (the second comment you delete
doesn't make any sense and has probably tricked you).
Hope that helps,
Will
More information about the Gcc-patches
mailing list