This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[AArch64] Peepholes to generate ldp and stp instructions
- From: "Hurugalawadi, Naveen" <Naveen dot Hurugalawadi at caviumnetworks dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 26 Mar 2013 10:27:13 +0000
- Subject: [AArch64] Peepholes to generate ldp and stp instructions
Hi,
Please find attached a patch that implements load pair (ldp) and store
pair (stp) peepholes for the aarch64 target.
Please review it and let me know if it's okay.
Built and tested on aarch64-thunder-elf (using Cavium's internal
simulator). No new regressions.
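As an illustration only (this is not part of the patch or its testsuite), the kind of source the peepholes are aimed at might look like the sketch below. The function names `sum_adjacent` and `store_adjacent` are hypothetical. The accesses use non-zero offsets because the peephole conditions require a base-plus-constant address, and the conditions also require `optimize_size`, so any pairing would be expected only at -Os. Whether a given compiler actually emits ldp/stp here depends on register allocation and instruction scheduling.

```c
#include <stdint.h>

/* Two 64-bit loads from adjacent slots at non-zero offsets of the same
   base pointer.  On AArch64 these are candidates for a single ldp.  */
int64_t sum_adjacent(const int64_t *p)
{
    int64_t a = p[1];   /* load from p + 8  */
    int64_t b = p[2];   /* load from p + 16 */
    return a + b;
}

/* Two 64-bit stores to adjacent slots at non-zero offsets of the same
   base pointer.  On AArch64 these are candidates for a single stp.  */
void store_adjacent(int64_t *p, int64_t a, int64_t b)
{
    p[1] = a;           /* store to p + 8  */
    p[2] = b;           /* store to p + 16 */
}
```

Compiling such a file for an AArch64 target with -Os -S and inspecting the assembly is one way to see whether the two accesses were merged.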
Thanks,
Naveen
gcc/
2013-03-26 Naveen H.S <Naveen.Hurugalawadi@caviumnetworks.com>
* config/aarch64/aarch64.md (peephole2s to generate ldp
instruction for 2 consecutive loads from memory in integer mode): New.
(peephole2s to generate stp instruction for 2 consecutive
stores to memory in integer mode): New.
(peephole2s to generate ldp instruction for 2 consecutive
loads from memory in floating point mode): New.
(peephole2s to generate stp instruction for 2 consecutive
stores to memory in floating point mode): New.
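What "2 consecutive" loads or stores means in the entries above can be modelled in plain C. This is a simplified sketch, not GCC code: `struct mem_access` and `can_pair` are hypothetical names, and the model leaves out the register-class comparison and the `optimize_size` test that the patch conditions also apply.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one register<->memory move.  */
struct mem_access {
    unsigned reg;       /* data register being loaded or stored */
    unsigned base_reg;  /* base address register */
    long offset;        /* constant offset; 0 when the address is a bare register */
    bool has_offset;    /* true when the address has the form base + constant */
};

/* Return true when two accesses of MODE_SIZE bytes each could be paired
   under the address rules sketched here:
   - the first address must be base + constant,
   - the two data registers must differ,
   - the second address must equal the first address plus MODE_SIZE.  */
bool can_pair(const struct mem_access *first,
              const struct mem_access *second,
              long mode_size)
{
    if (!first->has_offset)
        return false;                 /* bare-register first address is rejected */
    if (first->reg == second->reg)
        return false;                 /* data registers must be distinct */
    return second->base_reg == first->base_reg
           && second->offset == first->offset + mode_size;
}
```

For example, 8-byte accesses at base+8 and base+16 satisfy the model, while base+8 followed by base+24 does not.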
--- gcc/config/aarch64/aarch64.md 2013-03-14 16:04:19.705897493 +0530
+++ gcc/config/aarch64/aarch64.md 2013-03-19 15:45:49.808730935 +0530
@@ -1013,6 +1013,26 @@
(set_attr "mode" "<MODE>")]
)
+(define_peephole2
+ [(set (match_operand:GPI 0 "register_operand")
+ (match_operand:GPI 1 "aarch64_mem_pair_operand"))
+ (set (match_operand:GPI 2 "register_operand")
+ (match_operand:GPI 3 "memory_operand"))]
+ "GET_CODE (operands[1]) == MEM
+ && GET_CODE (XEXP (operands[1], 0)) == PLUS
+ && GET_CODE (XEXP (XEXP (operands[1], 0), 0)) == REG
+ && GET_CODE (XEXP (XEXP (operands[1], 0), 1)) == CONST_INT
+ && REGNO (operands[0]) != REGNO (operands[2])
+ && REGNO_REG_CLASS (REGNO (operands[0]))
+ == REGNO_REG_CLASS (REGNO (operands[2]))
+ && rtx_equal_p (XEXP (operands[3], 0),
+ plus_constant (Pmode, XEXP (operands[1], 0),
+ GET_MODE_SIZE (<MODE>mode)))
+ && optimize_size"
+ [(parallel [(set (match_dup 0) (match_dup 1))
+ (set (match_dup 2) (match_dup 3))])]
+)
+
;; Operands 0 and 2 are tied together by the final condition; so we allow
;; fairly lax checking on the second memory operation.
(define_insn "store_pair<mode>"
@@ -1029,6 +1049,26 @@
(set_attr "mode" "<MODE>")]
)
+(define_peephole2
+ [(set (match_operand:GPI 0 "aarch64_mem_pair_operand")
+ (match_operand:GPI 1 "register_operand"))
+ (set (match_operand:GPI 2 "memory_operand")
+ (match_operand:GPI 3 "register_operand"))]
+ "GET_CODE (operands[0]) == MEM
+ && GET_CODE (XEXP (operands[0], 0)) == PLUS
+ && GET_CODE (XEXP (XEXP (operands[0], 0), 0)) == REG
+ && GET_CODE (XEXP (XEXP (operands[0], 0), 1)) == CONST_INT
+ && REGNO (operands[1]) != REGNO (operands[3])
+ && REGNO_REG_CLASS (REGNO (operands[1]))
+ == REGNO_REG_CLASS (REGNO (operands[3]))
+ && rtx_equal_p (XEXP (operands[2], 0),
+ plus_constant (Pmode, XEXP (operands[0], 0),
+ GET_MODE_SIZE (<MODE>mode)))
+ && optimize_size"
+ [(parallel [(set (match_dup 0) (match_dup 1))
+ (set (match_dup 2) (match_dup 3))])]
+)
+
;; Operands 1 and 3 are tied together by the final condition; so we allow
;; fairly lax checking on the second memory operation.
(define_insn "load_pair<mode>"
@@ -1045,6 +1085,27 @@
(set_attr "mode" "<MODE>")]
)
+(define_peephole2
+ [(set (match_operand:GPF 0 "register_operand")
+ (match_operand:GPF 1 "aarch64_mem_pair_operand"))
+ (set (match_operand:GPF 2 "register_operand")
+ (match_operand:GPF 3 "memory_operand"))]
+ "GET_CODE (operands[1]) == MEM
+ && GET_CODE (XEXP (operands[1], 0)) == PLUS
+ && GET_CODE (XEXP (XEXP (operands[1], 0), 0)) == REG
+ && GET_CODE (XEXP (XEXP (operands[1], 0), 1)) == CONST_INT
+ && REGNO (operands[0]) != REGNO (operands[2])
+ && REGNO (operands[0]) >= 32 && REGNO (operands[2]) >= 32
+ && REGNO_REG_CLASS (REGNO (operands[0]))
+ == REGNO_REG_CLASS (REGNO (operands[2]))
+ && rtx_equal_p (XEXP (operands[3], 0),
+ plus_constant (Pmode, XEXP (operands[1], 0),
+ GET_MODE_SIZE (<MODE>mode)))
+ && optimize_size"
+ [(parallel [(set (match_dup 0) (match_dup 1))
+ (set (match_dup 2) (match_dup 3))])]
+)
+
;; Operands 0 and 2 are tied together by the final condition; so we allow
;; fairly lax checking on the second memory operation.
(define_insn "store_pair<mode>"
@@ -1061,6 +1122,27 @@
(set_attr "mode" "<MODE>")]
)
+(define_peephole2
+ [(set (match_operand:GPF 0 "aarch64_mem_pair_operand")
+ (match_operand:GPF 1 "register_operand"))
+ (set (match_operand:GPF 2 "memory_operand")
+ (match_operand:GPF 3 "register_operand"))]
+ "GET_CODE (operands[0]) == MEM
+ && GET_CODE (XEXP (operands[0], 0)) == PLUS
+ && GET_CODE (XEXP (XEXP (operands[0], 0), 0)) == REG
+ && GET_CODE (XEXP (XEXP (operands[0], 0), 1)) == CONST_INT
+ && REGNO (operands[1]) != REGNO (operands[3])
+ && REGNO (operands[1]) >= 32 && REGNO (operands[3]) >= 32
+ && REGNO_REG_CLASS (REGNO (operands[1]))
+ == REGNO_REG_CLASS (REGNO (operands[3]))
+ && rtx_equal_p (XEXP (operands[2], 0),
+ plus_constant (Pmode, XEXP (operands[0], 0),
+ GET_MODE_SIZE (<MODE>mode)))
+ && optimize_size"
+ [(parallel [(set (match_dup 0) (match_dup 1))
+ (set (match_dup 2) (match_dup 3))])]
+)
+
;; Load pair with writeback. This is primarily used in function epilogues
;; when restoring [fp,lr]
(define_insn "loadwb_pair<GPI:mode>_<PTR:mode>"