This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH][AArch64] vec_pack_trunc_<mode> should split after register allocator
- From: "Hurugalawadi, Naveen" <Naveen dot Hurugalawadi at cavium dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: James Greenhalgh <james dot greenhalgh at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, Marcus Shawcroft <marcus dot shawcroft at arm dot com>
- Date: Thu, 27 Apr 2017 05:08:38 +0000
- Subject: [PATCH][AArch64] vec_pack_trunc_<mode> should split after register allocator
Hi,
The instruction "vec_pack_trunc_<mode>" should be split after register
allocation for scheduling reasons. Currently the instruction is marked as type
"multiple", which means it is scheduled as a single-issue instruction. As a
result, nothing can be scheduled alongside either the xtn or the xtn2, which is
a problem in some cases.
The patch splits the instruction after reload into separate xtn and xtn2
instructions, which fixes the issue.
Please review the patch and let me know if it's okay.
Bootstrapped and regression tested on aarch64-thunder-linux.
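For illustration only, the operation being split can be modelled in scalar C.
This is a sketch and not part of the patch: the function names are
hypothetical, and it assumes the V4SI to V8HI case with little-endian lane
order, where operand 1 supplies the low half and operand 2 the high half.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of xtn: truncate the four 32-bit lanes of SRC
   into the low four 16-bit lanes of DST.  */
void
model_xtn (uint16_t dst[8], const uint32_t src[4])
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < 4; i++)
    dst[i] = (uint16_t) src[i];
}

/* Hypothetical scalar model of xtn2: truncate the four 32-bit lanes of SRC
   into the high four 16-bit lanes of DST, leaving the low lanes untouched.  */
void
model_xtn2 (uint16_t dst[8], const uint32_t src[4])
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < 4; i++)
    dst[4 + i] = (uint16_t) src[i];
}

/* Hypothetical model of the quad vec_pack_trunc (little-endian assumed):
   one xtn followed by one xtn2 on the same destination.  */
void
model_vec_pack_trunc (uint16_t dst[8], const uint32_t op1[4],
                      const uint32_t op2[4])
{
  model_xtn (dst, op1);
  model_xtn2 (dst, op2);
}
```

In this model the only link between the two steps is that the second one
reads and writes the destination produced by the first, which corresponds to
the tied operand 3 in the new xtn2 pattern.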
2017-04-27 Naveen H.S <Naveen.Hurugalawadi@cavium.com>
* config/aarch64/aarch64-simd.md
(aarch64_simd_vec_pack_trunc_hi_<mode>): New pattern.
(vec_pack_trunc_<mode>): Split the instruction pattern.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index c462164..9b5135c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1278,6 +1278,18 @@
[(set_attr "type" "neon_shift_imm_narrow_q")]
)
+(define_insn "aarch64_simd_vec_pack_trunc_hi_<mode>"
+ [(set (match_operand:<VNARROWQ2> 0 "register_operand" "=w")
+ (vec_concat:<VNARROWQ2>
+ (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand" "w"))
+ (vec_select:<VNARROWQ>
+ (match_operand:<VNARROWQ2> 3 "register_operand" "0")
+ (match_operand:<VNARROWQ2> 2 "vect_par_cnst_hi_half" ""))))]
+ "TARGET_SIMD"
+ "xtn2\\t%0.<V2ntype>, %1.<Vtype>"
+ [(set_attr "type" "neon_shift_imm_narrow_q")]
+)
+
(define_expand "vec_pack_trunc_<mode>"
[(match_operand:<VNARROWD> 0 "register_operand" "")
(match_operand:VDN 1 "register_operand" "")
@@ -1296,17 +1308,41 @@
;; For quads.
-(define_insn "vec_pack_trunc_<mode>"
+(define_insn_and_split "vec_pack_trunc_<mode>"
[(set (match_operand:<VNARROWQ2> 0 "register_operand" "=&w")
(vec_concat:<VNARROWQ2>
(truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand" "w"))
(truncate:<VNARROWQ> (match_operand:VQN 2 "register_operand" "w"))))]
"TARGET_SIMD"
+ "#"
+ "&& reload_completed"
+ [(const_int 0)]
{
if (BYTES_BIG_ENDIAN)
- return "xtn\\t%0.<Vntype>, %2.<Vtype>\;xtn2\\t%0.<V2ntype>, %1.<Vtype>";
+ {
+ rtx low_part = gen_lowpart (<VNARROWQ>mode, operands[0]);
+ emit_insn (gen_aarch64_simd_vec_pack_trunc_<mode> (low_part,
+ operands[2]));
+ rtx high_part = aarch64_simd_vect_par_cnst_half (<VNARROWQ2>mode,
+ true);
+ emit_insn (gen_aarch64_simd_vec_pack_trunc_hi_<mode> (operands[0],
+ operands[1],
+ high_part,
+ operands[0]));
+ }
else
- return "xtn\\t%0.<Vntype>, %1.<Vtype>\;xtn2\\t%0.<V2ntype>, %2.<Vtype>";
+ {
+ rtx low_part = gen_lowpart (<VNARROWQ>mode, operands[0]);
+ emit_insn (gen_aarch64_simd_vec_pack_trunc_<mode> (low_part,
+ operands[1]));
+ rtx high_part = aarch64_simd_vect_par_cnst_half (<VNARROWQ2>mode,
+ true);
+ emit_insn (gen_aarch64_simd_vec_pack_trunc_hi_<mode> (operands[0],
+ operands[2],
+ high_part,
+ operands[0]));
+ }
+ DONE;
}
[(set_attr "type" "multiple")
(set_attr "length" "8")]