This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][ARM] Tie operand 1 to operand 0 in AESMC pattern when fusing AES/AESMC



On 20/05/16 11:04, Kyrill Tkachov wrote:
Hi all,

The recent -frename-registers change exposed a deficiency in the way we fuse AESE/AESMC instruction
pairs in arm.

Basically we want to enforce:
    AESE Vn, _
    AESMC Vn, Vn

to enable the fusion, but regrename comes along and renames the output Vn register in AESMC to something
else, killing the fusion in the hardware.

The solution in this patch is to add an alternative that ties the input and output registers in the AESMC pattern
and enable that alternative when the fusion is enabled.

With this patch I've confirmed that the above preferred register sequence is kept even with -frename-registers
when tuning for a cpu that enables the fusion and that the chain is broken by regrename otherwise and have
seen the appropriate improvement in a proprietary benchmark (that I cannot name) that exercises this sequence.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?


Following James's feedback on the AArch64 version, this slightly modified version uses the enum type for the argument of the new function.
Is this ok instead?

Thanks,
Kyrill

2016-05-27  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

    * config/arm/arm.c (arm_fusion_enabled_p): New function.
    * config/arm/arm-protos.h (arm_fusion_enabled_p): Declare prototype.
    * config/arm/crypto.md (crypto_<crypto_pattern>, CRYPTO_UNARY):
    Add "=w,0" alternative.  Enable it when AES/AESMC fusion is enabled.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index cf221d6793eaf0959f2713fe0903a5d8602ec2f4..12a781de13f2f7816cc2b16b04835d87c83f7abb 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -320,6 +320,7 @@ extern int vfp3_const_double_for_bits (rtx);
 
 extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx,
 					   rtx);
+extern bool arm_fusion_enabled_p (tune_params::fuse_ops);
 extern bool arm_valid_symbolic_address_p (rtx);
 extern bool arm_validize_comparison (rtx *, rtx *, rtx *);
 #endif /* RTX_CODE */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5110d9e989d605a9e2c262e6007b89a1c7dc7080..39a24c06c123b86883134368ef39794abf11898b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -29704,6 +29704,13 @@ aarch_macro_fusion_pair_p (rtx_insn* prev, rtx_insn* curr)
   return false;
 }
 
+/* Return true iff the instruction fusion described by OP is enabled.  */
+bool
+arm_fusion_enabled_p (tune_params::fuse_ops op)
+{
+  return current_tune->fusible_ops & op;
+}
+
 /* Implement the TARGET_ASAN_SHADOW_OFFSET hook.  */
 
 static unsigned HOST_WIDE_INT
diff --git a/gcc/config/arm/crypto.md b/gcc/config/arm/crypto.md
index c6f17270b1dbaf6dc43eb1e9b8a182dbb0f5a1e1..0f510f069408471fcbf6751f161e984f39929813 100644
--- a/gcc/config/arm/crypto.md
+++ b/gcc/config/arm/crypto.md
@@ -18,14 +18,27 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
+
+;; When AES/AESMC fusion is enabled we want the register allocation to
+;; look like:
+;;    AESE Vn, _
+;;    AESMC Vn, Vn
+;; So prefer to tie operand 1 to operand 0 when fusing.
+
 (define_insn "crypto_<crypto_pattern>"
-  [(set (match_operand:<crypto_mode> 0 "register_operand" "=w")
+  [(set (match_operand:<crypto_mode> 0 "register_operand" "=w,w")
         (unspec:<crypto_mode> [(match_operand:<crypto_mode> 1
-                       "register_operand" "w")]
+                       "register_operand" "0,w")]
          CRYPTO_UNARY))]
   "TARGET_CRYPTO"
   "<crypto_pattern>.<crypto_size_sfx>\\t%q0, %q1"
-  [(set_attr "type" "<crypto_type>")]
+  [(set_attr "type" "<crypto_type>")
+   (set_attr_alternative "enabled"
+     [(if_then_else (match_test
+		       "arm_fusion_enabled_p (tune_params::FUSE_AES_AESMC)")
+		     (const_string "yes" )
+		     (const_string "no"))
+      (const_string "yes")])]
 )
 
 (define_insn "crypto_<crypto_pattern>"

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]