[PATCH] ARM: Weaker memory barriers

Will Deacon will.deacon@arm.com
Tue Mar 11 10:48:00 GMT 2014


Hi John,

On Tue, Mar 11, 2014 at 02:54:18AM +0000, John Carr wrote:
> A comment in arm/sync.md notes "We should consider issuing a inner
> shareability zone barrier here instead."  Here is my first attempt
> at a patch to emit weaker memory barriers.  Three instructions seem
> to be relevant for user mode code on my Cortex A9 Linux box:
> 
> dmb ishst, dmb ish, dmb sy
> 
> I believe these correspond to a release barrier, a full barrier
> with respect to other CPUs, and a full barrier that also orders
> relative to I/O.

Not quite; DMB ISHST only orders writes with other writes, so loads can move
across it in both directions. That means it's not sufficient for releasing a
lock, for example.

<shameless plug>

I gave a presentation at ELCE about the various ARMv7 barrier options (from
a kernel perspective):

  https://www.youtube.com/watch?v=6ORn6_35kKo

</shameless plug>

> Consider this a request for comments on whether the approach is correct.
> I haven't done any testing yet (beyond eyeballing the assembly output).

You'll probably find that a lot of ARM micro-architectures treat them
equivalently, which makes this a good source of potentially latent bugs if
you get the wrong options. When I added these options to the kernel, I
tested with a big/little A7/A15 setup using a CCI400 (the cache-coherent
interconnect) to try and tickle things a bit better, but it's by no means
definitive.

>  (define_insn "*memory_barrier"
>    [(set (match_operand:BLK 0 "" "")
> -	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
> -  "TARGET_HAVE_MEMORY_BARRIER"
> +	(unspec:BLK
> +	 [(match_dup 0) (match_operand:SI 1 "const_int_operand")]
> +	 UNSPEC_MEMORY_BARRIER))]
> +  "TARGET_HAVE_DMB || TARGET_HAVE_MEMORY_BARRIER"
>    {
>      if (TARGET_HAVE_DMB)
>        {
> -	/* Note we issue a system level barrier. We should consider issuing
> -	   a inner shareabilty zone barrier here instead, ie. "DMB ISH".  */
> -	/* ??? Differentiate based on SEQ_CST vs less strict?  */
> -	return "dmb\tsy";
> +        switch (INTVAL(operands[1]))
> +	{
> +	case MEMMODEL_RELEASE:
> +          return "dmb\tishst";

As stated above, this isn't correct.

> +	case MEMMODEL_CONSUME:

You might be able to build this one out of <control dependency> + ISB (in
lieu of a true address dependency). However, given the recent monolithic C11
thread that took over this list and others, let's play it safe for now and
stick with DMB ISH.

> +	case MEMMODEL_ACQUIRE:
> +	case MEMMODEL_ACQ_REL:
> +	  return "dmb\tish";

I think you probably *always* want ISH for GCC. You assume normal, cacheable
, coherent memory right? That also mirrors the first comment you delete
above.

> +	case MEMMODEL_SEQ_CST:
> +	  return "dmb\tsy";

Again, ISH here should be fine. SY is for things like non-coherent devices,
which I don't think GCC has a concept of (the second comment you delete
doesn't make any sense and has probably tricked you).

Hope that helps,

Will



More information about the Gcc-patches mailing list