This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[Patch/ARM] Cortex-M4 core pipeline patch to tune LDR/STR pairs


Hello,

The attached pipeline patch is intended to change the following generated code

ldr r5, [r4, #12]
adds r2, r2, #16
str r5, [r3, #8]

to

ldr r5, [r4, #12]
str r5, [r3, #8]
adds r2, r2, #16

The reason is that the STR can start in the second cycle of the preceding
LDR, which takes 2 cycles, as long as the result of the LDR is not used as
the memory address of the STR.
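
As a rough illustration of the rule above, here is a minimal Python sketch of
the latency the scheduler would see between an LDR and an instruction that
depends on it. It is not GCC code: the Insn class, its field names and the
load_to_consumer_latency function are hypothetical, and the only numbers used
are the 2-cycle LDR and the 1-cycle bypass taken from this mail and the patch.

```python
from dataclasses import dataclass
from typing import Optional

LOAD_LATENCY = 2    # LDR takes 2 cycles (as stated in this mail)
BYPASS_LATENCY = 1  # latency given by the new define_bypass


@dataclass(frozen=True)
class Insn:
    """Hypothetical, simplified instruction record (not a GCC structure)."""
    kind: str                        # "load", "store" or "alu"
    dest: Optional[str] = None       # register written, if any
    data_src: Optional[str] = None   # register holding the value to store/use
    addr_src: Optional[str] = None   # base register of the memory address


def load_to_consumer_latency(load: Insn, consumer: Insn) -> int:
    """Latency from a load to an instruction that depends on its result.

    Assumes the caller has already established that `consumer` reads
    `load.dest`; only the bypass condition is modelled here.
    """
    if load.kind != "load":
        raise ValueError("producer must be a load")
    is_store = consumer.kind == "store"
    # Bypass applies only when the loaded value is not the store's address.
    addr_depends_on_load = consumer.addr_src == load.dest
    if is_store and not addr_depends_on_load:
        return BYPASS_LATENCY
    return LOAD_LATENCY


ldr = Insn("load", dest="r5", addr_src="r4")                # ldr r5, [r4, #12]
str_data = Insn("store", data_src="r5", addr_src="r3")      # str r5, [r3, #8]
str_addr = Insn("store", data_src="r2", addr_src="r5")      # str r2, [r5, #8]
```

With these definitions, the LDR/STR pair from the example gets latency 1, so
the scheduler has no reason to place the ADDS between them, while a store
whose address comes from the load keeps the full load latency. The sketch
ignores the extra cycle that the existing bypasses in cortex-m4.md add for
address dependencies.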

Tested with various benchmarks on Cortex-M4 MPS. Apart from one regression
caused by register allocation, the benchmarks show either a performance
improvement or no change.

Is it OK for trunk?

BR,
Terry

2013-03-29  Terry Guo  <terry.guo@arm.com>

	* gcc/config/arm/cortex-m4.md: New bypass to tune LDR/STR pairs.

From 19dd8bdc9a03f78690700ded911e0cee66328c01 Mon Sep 17 00:00:00 2001
From: Terry Guo <terry.guo@arm.com>
Date: Wed, 27 Mar 2013 17:23:09 +0800
Subject: [PATCH] improve m4 pipeline description

---
 gcc/config/arm/cortex-m4.md |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/arm/cortex-m4.md b/gcc/config/arm/cortex-m4.md
index 187867b..47b0364 100644
--- a/gcc/config/arm/cortex-m4.md
+++ b/gcc/config/arm/cortex-m4.md
@@ -84,6 +84,10 @@
        (eq_attr "type" "store4"))
   "cortex_m4_ex*5")
 
+(define_bypass 1 "cortex_m4_load1"
+                 "cortex_m4_store1_1,cortex_m4_store1_2"
+                 "arm_no_early_store_addr_dep")
+
 ;; If the address of load or store depends on the result of the preceding
 ;; instruction, the latency is increased by one.
 
-- 
1.7.9.5
