This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[AArch64][PATCH 1/5] Use atomic instructions for swap and fetch-update operations.


Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch series adds the
instructions to GCC, making them available with -march=armv8.1-a or
-march=armv8+lse, and uses them to implement the __sync and __atomic
builtins.

The ARMv8.1 swap instruction swaps the value in a register with a value
in memory. The load-operate instructions load a value from memory,
update it with the result of an operation and store the result in
memory.

This series uses the swap instruction to implement the atomic_exchange
patterns and the load-operate instructions to implement the
atomic_fetch_<op> and atomic_<op>_fetch patterns. For the
atomic_<op>_fetch patterns, the value returned as the result of the
operation has to be recalculated from the loaded data. The ARMv8 BIC
instruction is added so that it can be used for this recalculation.

The patches in this series
- add and use the atomic swap instruction.
- add the Aarch64 BIC instruction,
- add the ARMv8.1 load-operate instructions,
- use the load-operate instructions to implement the atomic_fetch_<op>
  patterns,
- use the load-operate instructions to implement the patterns
  atomic_<op>_fetch patterns,

The code-generation changes in this series are based around a new
function, aarch64_gen_atomic_ldop, which takes the operation to be
implemented and emits the appropriate code, making use of the atomic
instruction. This follows the existing uses aarch64_split_atomic_op for
the same purpose when atomic instructions aren't available.

This patch adds the ARMv8.1 SWAP instruction and function
aarch64_gen_atomic_ldop and changes the implementation of the
atomic_exchange pattern to the atomic instruction when it is available.

The general form of the code generated for an atomic_exchange, with
destination D, source S, memory address A and memory order MO is:

   swp<mo><sz> S, D, [A]

where
   <mo> is one of {'', 'a', 'l', 'al'} depending on memory order MO.
   <sz> is one of {'', 'b', 'h'} depending on the data size.

This patch also adds tests for the changes. These reuse the support code
introduced for the atomic CAS tests, adding macros to test functions
taking one memory ordering argument. These are used to iteratively
define functions using the __atomic_exchange builtins, which should be
implemented using the atomic swap.

Tested the series for aarch64-none-linux-gnu with native bootstrap and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

gcc/
2015-09-17  Matthew Wahab  <matthew.wahab@arm.com>

	* config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
	Declare.
	* config/aarch64/aarch64.c (aarch64_emit_atomic_swp): New.
	(aarch64_gen_atomic_ldop): New.
	(aarch64_split_atomic_op): Fix whitespace and add a comment.
	* config/aarch64/atomics.md (UNSPECV_ATOMIC_SWP): New.
	(atomic_compare_and_swap<mode>_lse): Remove comments and fix
	whitespace.
	(atomic_exchange<mode>): Replace with an expander.
	(aarch64_atomic_exchange<mode>): New.
	(aarch64_atomic_exchange<mode>_lse): New.
	(aarch64_atomic_<atomic_optab><mode>): Fix some whitespace.
	(aarch64_atomic_swp<mode>): New.


gcc/testsuite/
2015-09-17  Matthew Wahab  <matthew.wahab@arm.com>

	* gcc.target/aarch64/atomic-inst-ops.inc: (TEST_MODEL): New.
	(TEST_ONE): New.
         * gcc.target/aarch64/atomic-inst-swap.c: New.

>From 425fa05a5e3656c8d6d0d085628424b4c846cd49 Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 7 Aug 2015 17:18:37 +0100
Subject: [PATCH 1/5] Add atomic SWP instruction

Change-Id: I87bf48526cb11e65edd15691f5eab20446e418c9
---
 gcc/config/aarch64/aarch64-protos.h                |  1 +
 gcc/config/aarch64/aarch64.c                       | 46 ++++++++++-
 gcc/config/aarch64/atomics.md                      | 92 +++++++++++++++++++---
 .../gcc.target/aarch64/atomic-inst-ops.inc         | 13 +++
 gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c | 44 +++++++++++
 5 files changed, 183 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index ff19851..eba4c76 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -378,6 +378,7 @@ rtx aarch64_load_tp (rtx);
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
 void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
+void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_gen_adjusted_ldpstp (rtx *, bool, enum machine_mode, RTX_CODE);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4d2126b..dc05c6e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11185,11 +11185,54 @@ aarch64_split_compare_and_swap (rtx operands[])
     aarch64_emit_post_barrier (model);
 }
 
+/* Emit an atomic swap.  */
+
+static void
+aarch64_emit_atomic_swap (machine_mode mode, rtx dst, rtx value,
+			  rtx mem, rtx model)
+{
+  rtx (*gen) (rtx, rtx, rtx, rtx);
+
+  switch (mode)
+    {
+    case QImode: gen = gen_aarch64_atomic_swpqi; break;
+    case HImode: gen = gen_aarch64_atomic_swphi; break;
+    case SImode: gen = gen_aarch64_atomic_swpsi; break;
+    case DImode: gen = gen_aarch64_atomic_swpdi; break;
+    default:
+      gcc_unreachable ();
+    }
+
+  emit_insn (gen (dst, mem, value, model));
+}
+
+/* Emit an atomic operation where the architecture supports it.  */
+
+void
+aarch64_gen_atomic_ldop (enum rtx_code code, rtx out_data,
+			 rtx mem, rtx value, rtx model_rtx)
+{
+  machine_mode mode = GET_MODE (mem);
+
+  out_data = gen_lowpart (mode, out_data);
+
+  switch (code)
+    {
+    case SET:
+      aarch64_emit_atomic_swap (mode, out_data, value, mem, model_rtx);
+      return;
+
+    default:
+      /* The operation can't be done with atomic instructions.  */
+      gcc_unreachable ();
+    }
+}
+
 /* Split an atomic operation.  */
 
 void
 aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
-		     rtx value, rtx model_rtx, rtx cond)
+			 rtx value, rtx model_rtx, rtx cond)
 {
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
@@ -11198,6 +11241,7 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   rtx_code_label *label;
   rtx x;
 
+  /* Split the atomic operation into a sequence.  */
   label = gen_label_rtx ();
   emit_label (label);
 
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 65d2cc9..0e71002 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -27,6 +27,7 @@
     UNSPECV_ATOMIC_CMPSW		; Represent an atomic compare swap.
     UNSPECV_ATOMIC_EXCHG		; Represent an atomic exchange.
     UNSPECV_ATOMIC_CAS			; Represent an atomic CAS.
+    UNSPECV_ATOMIC_SWP			; Represent an atomic SWP.
     UNSPECV_ATOMIC_OP			; Represent an atomic operation.
 ])
 
@@ -122,19 +123,19 @@
 )
 
 (define_insn_and_split "aarch64_compare_and_swap<mode>_lse"
-  [(set (reg:CC CC_REGNUM)					;; bool out
+  [(set (reg:CC CC_REGNUM)
     (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
-   (set (match_operand:GPI 0 "register_operand" "=&r")		;; val out
-    (match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))   ;; memory
+   (set (match_operand:GPI 0 "register_operand" "=&r")
+    (match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))
    (set (match_dup 1)
     (unspec_volatile:GPI
-      [(match_operand:GPI 2 "aarch64_plus_operand" "rI")	;; expect
-       (match_operand:GPI 3 "register_operand" "r")		;; desired
-       (match_operand:SI 4 "const_int_operand")			;; is_weak
-       (match_operand:SI 5 "const_int_operand")			;; mod_s
-       (match_operand:SI 6 "const_int_operand")]		;; mod_f
+      [(match_operand:GPI 2 "aarch64_plus_operand" "rI")
+       (match_operand:GPI 3 "register_operand" "r")
+       (match_operand:SI 4 "const_int_operand")
+       (match_operand:SI 5 "const_int_operand")
+       (match_operand:SI 6 "const_int_operand")]
       UNSPECV_ATOMIC_CMPSW))]
-  "TARGET_LSE "
+  "TARGET_LSE"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -146,7 +147,32 @@
   }
 )
 
-(define_insn_and_split "atomic_exchange<mode>"
+(define_expand "atomic_exchange<mode>"
+ [(parallel
+   [(set (match_operand:ALLI 0 "register_operand" "")
+     (match_operand:ALLI 1 "aarch64_sync_memory_operand" ""))
+    (set (match_dup 1)
+     (unspec_volatile:ALLI
+      [(match_operand:ALLI 2 "register_operand" "")
+       (match_operand:SI 3 "const_int_operand" "")]
+      UNSPECV_ATOMIC_EXCHG))])]
+  ""
+  {
+    rtx (*gen) (rtx, rtx, rtx, rtx);
+
+    /* Use an atomic SWP when available.  */
+    if (TARGET_LSE)
+      gen = gen_aarch64_atomic_exchange<mode>_lse;
+    else
+      gen = gen_aarch64_atomic_exchange<mode>;
+
+    emit_insn (gen (operands[0], operands[1], operands[2], operands[3]));
+
+    DONE;
+  }
+)
+
+(define_insn_and_split "aarch64_atomic_exchange<mode>"
   [(set (match_operand:ALLI 0 "register_operand" "=&r")		;; output
     (match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
    (set (match_dup 1)
@@ -162,7 +188,27 @@
   [(const_int 0)]
   {
     aarch64_split_atomic_op (SET, operands[0], NULL, operands[1],
-			    operands[2], operands[3], operands[4]);
+			     operands[2], operands[3], operands[4]);
+    DONE;
+  }
+)
+
+(define_insn_and_split "aarch64_atomic_exchange<mode>_lse"
+  [(set (match_operand:ALLI 0 "register_operand" "=&r")
+    (match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q"))
+   (set (match_dup 1)
+    (unspec_volatile:ALLI
+      [(match_operand:ALLI 2 "register_operand" "r")
+       (match_operand:SI 3 "const_int_operand" "")]
+      UNSPECV_ATOMIC_EXCHG))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_LSE"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+  {
+    aarch64_gen_atomic_ldop (SET, operands[0], operands[1],
+			     operands[2], operands[3]);
     DONE;
   }
 )
@@ -183,7 +229,7 @@
   [(const_int 0)]
   {
     aarch64_split_atomic_op (<CODE>, NULL, operands[3], operands[0],
-			    operands[1], operands[2], operands[4]);
+			     operands[1], operands[2], operands[4]);
     DONE;
   }
 )
@@ -425,6 +471,28 @@
 
 ;; ARMv8.1 LSE instructions.
 
+;; Atomic swap with memory.
+(define_insn "aarch64_atomic_swp<mode>"
+ [(set (match_operand:ALLI 0 "register_operand" "+&r")
+   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q"))
+  (set (match_dup 1)
+   (unspec_volatile:ALLI
+    [(match_operand:ALLI 2 "register_operand" "r")
+     (match_operand:SI 3 "const_int_operand" "")]
+    UNSPECV_ATOMIC_SWP))]
+  "TARGET_LSE && reload_completed"
+  {
+    enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+    if (is_mm_relaxed (model))
+      return "swp<atomic_sfx>\t%<w>2, %<w>0, %1";
+    else if (is_mm_acquire (model) || is_mm_consume (model))
+      return "swpa<atomic_sfx>\t%<w>2, %<w>0, %1";
+    else if (is_mm_release (model))
+      return "swpl<atomic_sfx>\t%<w>2, %<w>0, %1";
+    else
+      return "swpal<atomic_sfx>\t%<w>2, %<w>0, %1";
+  })
+
 ;; Atomic compare-and-swap: HI and smaller modes.
 
 (define_insn "aarch64_atomic_cas<mode>"
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc
index 72c7e5c..c2fdcba 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ops.inc
@@ -32,6 +32,15 @@ typedef __uint128_t uint128;
   TEST_M##N (NAME, FN, int128, MODEL1, MODEL2)		\
   TEST_M##N (NAME, FN, uint128, MODEL1, MODEL2)
 
+/* Models to test.  */
+#define TEST_MODEL(NAME, FN, N)					\
+  TEST_TY (NAME##_relaxed, FN, N, __ATOMIC_RELAXED, DUMMY)	\
+  TEST_TY (NAME##_consume, FN, N, __ATOMIC_CONSUME, DUMMY)	\
+  TEST_TY (NAME##_acquire, FN, N, __ATOMIC_ACQUIRE, DUMMY)	\
+  TEST_TY (NAME##_release, FN, N, __ATOMIC_RELEASE, DUMMY)	\
+  TEST_TY (NAME##_acq_rel, FN, N, __ATOMIC_ACQ_REL, DUMMY)	\
+  TEST_TY (NAME##_seq_cst, FN, N, __ATOMIC_SEQ_CST, DUMMY)	\
+
 /* Cross-product of models to test.  */
 #define TEST_MODEL_M1(NAME, FN, N, M)			\
   TEST_TY (NAME##_relaxed, FN, N, M, __ATOMIC_RELAXED)	\
@@ -51,3 +60,7 @@ typedef __uint128_t uint128;
 
 /* Expand functions for a cross-product of memory models and types.  */
 #define TEST_TWO(NAME, FN) TEST_MODEL_M2 (NAME, FN)
+
+/* Expand functions for a set of memory models and types.  */
+#define TEST_ONE(NAME, FN) TEST_MODEL (NAME, FN, 1)
+
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c b/gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c
new file mode 100644
index 0000000..dabc9b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv8-a+lse" } */
+
+/* Test ARMv8.1-A SWP instruction.  */
+
+#include "atomic-inst-ops.inc"
+
+#define TEST TEST_ONE
+
+#define SWAP_ATOMIC(FN, TY, MODEL)					\
+  TY FNNAME (FN, TY) (TY* val, TY foo)					\
+  {									\
+    return __atomic_exchange_n (val, foo, MODEL);			\
+  }
+
+#define SWAP_ATOMIC_NORETURN(FN, TY, MODEL)				\
+  void FNNAME (FN, TY) (TY* val, TY* foo, TY* bar)			\
+  {									\
+    __atomic_exchange (val, foo, bar, MODEL);				\
+  }
+
+
+TEST (swap_atomic, SWAP_ATOMIC)
+TEST (swap_atomic_noreturn, SWAP_ATOMIC_NORETURN)
+
+
+/* { dg-final { scan-assembler-times "swpb\t" 4} } */
+/* { dg-final { scan-assembler-times "swpab\t" 8} } */
+/* { dg-final { scan-assembler-times "swplb\t" 4} } */
+/* { dg-final { scan-assembler-times "swpalb\t" 8} } */
+
+/* { dg-final { scan-assembler-times "swph\t" 4} } */
+/* { dg-final { scan-assembler-times "swpah\t" 8} } */
+/* { dg-final { scan-assembler-times "swplh\t" 4} } */
+/* { dg-final { scan-assembler-times "swpalh\t" 8} } */
+
+/* { dg-final { scan-assembler-times "swp\t" 8} } */
+/* { dg-final { scan-assembler-times "swpa\t" 16} } */
+/* { dg-final { scan-assembler-times "swpl\t" 8} } */
+/* { dg-final { scan-assembler-times "swpal\t" 16} } */
+
+/* { dg-final { scan-assembler-not "ldaxr\t" } } */
+/* { dg-final { scan-assembler-not "stlxr\t" } } */
+/* { dg-final { scan-assembler-not "dmb" } } */
-- 
2.1.4


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]