This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[ARM] [Neon types 5/10] Update Cortex-A8 pipeline model
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: marcus dot shawcroft at arm dot com, ramana dot radhakrishnan at arm dot com, richard dot earnshaw at arm dot com
- Date: Tue, 15 Oct 2013 12:23:41 +0100
- Subject: [ARM] [Neon types 5/10] Update Cortex-A8 pipeline model
- Authentication-results: sourceware.org; auth=none
- References: <1381836226-430-1-git-send-email-james dot greenhalgh at arm dot com>
Hi,
This patch updates the Cortex-A8 pipeline model to handle the new
type classifications. We do that by defining cortex_a8_neon_type,
which simply regroups the new classifications into the old groups.
After regrouping, we can remove three categories:
cortex_a8_neon_vshl_ddd: This instruction is an outlier in the
instruction tables and would represent the only single-cycle Neon
instruction. I am inclined to believe this is a typo.
cortex_a8_neon_vst3_vst4: These instructions collapse in with
vst2-4regs; we could be more accurate if we had alignment
information, but we don't.
cortex_a8_neon_vld3_vld4_all_lanes: These fall in with
vld1_one_lane.
And we add one new category:
cortex_a8_neon_bit_ops_q: Which covers the bit operations VCLZ, VCNT,
VCLS, VBIT, VBSL and VBIF.
Because of this, we must update the bypasses to reflect the new
and deleted categories.
Sanity checked to ensure similar Neon schedules are generated before
and after the patch series.
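For anyone skimming the diff below, the mechanism boils down to one
tune-specific attribute plus reservations that match on it. This is a
cut-down sketch of that pairing (reduced from the full patch for
illustration; only a couple of the many type mappings are shown):

```lisp
;; A tune-specific attribute maps the fine-grained "type" values back
;; onto the coarse groups the Cortex-A8 pipeline model already knows.
(define_attr "cortex_a8_neon_type"
  "neon_int_1,neon_bit_ops_q,unknown"
  (cond [(eq_attr "type" "neon_add, neon_add_q, neon_logic, neon_logic_q")
           (const_string "neon_int_1")
         (eq_attr "type" "neon_bsl_q, neon_cls_q, neon_cnt_q")
           (const_string "neon_bit_ops_q")]
        (const_string "unknown")))

;; Reservations then test the coarse attribute instead of "type", so
;; the scheduling model itself is unchanged by the reclassification.
(define_insn_reservation "cortex_a8_neon_int_1" 3
  (and (eq_attr "tune" "cortexa8")
       (eq_attr "cortex_a8_neon_type" "neon_int_1"))
  "cortex_a8_neon_dp")
```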
OK?
Thanks,
James
---
2013-10-15 James Greenhalgh <james.greenhalgh@arm.com>
* config/arm/cortex-a8-neon.md (cortex_a8_neon_type): New.
(cortex_a8_neon_vshl_ddd): Remove.
(cortex_a8_neon_vst3_vst4): Likewise.
(cortex_a8_neon_vld3_vld4_all_lanes): Likewise.
(cortex_a8_neon_bit_ops_q): New.
(cortex_a8_neon_int_1): Use cortex_a8_neon_type.
(cortex_a8_neon_int_2): Likewise.
(cortex_a8_neon_int_3): Likewise.
(cortex_a8_neon_int_5): Likewise.
(cortex_a8_neon_vqneg_vqabs): Likewise.
(cortex_a8_neon_int_4): Likewise.
(cortex_a8_neon_vaba): Likewise.
(cortex_a8_neon_vaba_qqq): Likewise.
(cortex_a8_neon_shift_1): Likewise.
(cortex_a8_neon_shift_2): Likewise.
(cortex_a8_neon_shift_3): Likewise.
(cortex_a8_neon_vqshl_vrshl_vqrshl_qqq): Likewise.
(cortex_a8_neon_vsra_vrsra): Likewise.
(cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long): Likewise.
(cortex_a8_neon_mul_qqq_8_16_32_ddd_32): Likewise.
(cortex_a8_neon_mul_qdd_64_32_long_qqd_16_ddd_32_scalar_64_32_long_scalar):
Likewise.
(cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long): Likewise.
(cortex_a8_neon_mla_qqq_8_16): Likewise.
(cortex_a8_neon_mla_ddd_32_qqd_16_ddd_32_scalar_qdd_64_32_long_scalar_qdd_64_32_long):
Likewise.
(cortex_a8_neon_mla_qqq_32_qqd_32_scalar): Likewise.
(cortex_a8_neon_mul_ddd_16_scalar_32_16_long_scalar): Likewise.
(cortex_a8_neon_mul_qqd_32_scalar): Likewise.
(cortex_a8_neon_mla_ddd_16_scalar_qdd_32_16_long_scalar): Likewise.
(cortex_a8_neon_fp_vadd_ddd_vabs_dd): Likewise.
(cortex_a8_neon_fp_vadd_qqq_vabs_qq): Likewise.
(cortex_a8_neon_fp_vsum): Likewise.
(cortex_a8_neon_fp_vmul_ddd): Likewise.
(cortex_a8_neon_fp_vmul_qqd): Likewise.
(cortex_a8_neon_fp_vmla_ddd): Likewise.
(cortex_a8_neon_fp_vmla_qqq): Likewise.
(cortex_a8_neon_fp_vmla_ddd_scalar): Likewise.
(cortex_a8_neon_fp_vmla_qqq_scalar): Likewise.
(cortex_a8_neon_fp_vrecps_vrsqrts_ddd): Likewise.
(cortex_a8_neon_fp_vrecps_vrsqrts_qqq): Likewise.
(cortex_a8_neon_bp_simple): Likewise.
(cortex_a8_neon_bp_2cycle): Likewise.
(cortex_a8_neon_bp_3cycle): Likewise.
(cortex_a8_neon_ldr): Likewise.
(cortex_a8_neon_str): Likewise.
(cortex_a8_neon_vld1_1_2_regs): Likewise.
(cortex_a8_neon_vld1_3_4_regs): Likewise.
(cortex_a8_neon_vld2_2_regs_vld1_vld2_all_lanes): Likewise.
(cortex_a8_neon_vld2_4_regs): Likewise.
(cortex_a8_neon_vld3_vld4): Likewise.
(cortex_a8_neon_vld1_vld2_lane): Likewise.
(cortex_a8_neon_vld3_vld4_lane): Likewise.
(cortex_a8_neon_vst1_1_2_regs_vst2_2_regs): Likewise.
(cortex_a8_neon_vst1_3_4_regs): Likewise.
(cortex_a8_neon_vst2_4_regs_vst3_vst4): Likewise.
(cortex_a8_neon_vst1_vst2_lane): Likewise.
(cortex_a8_neon_vst3_vst4_lane): Likewise.
(cortex_a8_neon_mcr): Likewise.
(cortex_a8_neon_mcr_2_mcrr): Likewise.
(cortex_a8_neon_mrc): Likewise.
(cortex_a8_neon_mrrc): Likewise.
diff --git a/gcc/config/arm/cortex-a8-neon.md b/gcc/config/arm/cortex-a8-neon.md
index b7773891669061d3f08b4a74de9881db982aa246..6adfd1365693c55550230b8b97f5419043144f8c 100644
--- a/gcc/config/arm/cortex-a8-neon.md
+++ b/gcc/config/arm/cortex-a8-neon.md
@@ -18,6 +18,221 @@
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
+(define_attr "cortex_a8_neon_type"
+ "neon_int_1,neon_int_2,neon_int_3,neon_int_4,neon_int_5,neon_vqneg_vqabs,
+ neon_bit_ops_q,
+ neon_vaba,neon_vaba_qqq, neon_vmov,
+ neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,neon_mul_qqq_8_16_32_ddd_32,
+ neon_mul_qdd_64_32_long_qqd_16_ddd_32_scalar_64_32_long_scalar,
+ neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,neon_mla_qqq_8_16,
+ neon_mla_ddd_32_qqd_16_ddd_32_scalar_qdd_64_32_long_scalar_qdd_64_32_long,
+ neon_mla_qqq_32_qqd_32_scalar,neon_mul_ddd_16_scalar_32_16_long_scalar,
+ neon_mul_qqd_32_scalar,neon_mla_ddd_16_scalar_qdd_32_16_long_scalar,
+ neon_shift_1,neon_shift_2,neon_shift_3,
+ neon_vqshl_vrshl_vqrshl_qqq,neon_vsra_vrsra,neon_fp_vadd_ddd_vabs_dd,
+ neon_fp_vadd_qqq_vabs_qq,neon_fp_vsum,neon_fp_vmul_ddd,neon_fp_vmul_qqd,
+ neon_fp_vmla_ddd,neon_fp_vmla_qqq,neon_fp_vmla_ddd_scalar,
+ neon_fp_vmla_qqq_scalar,neon_fp_vrecps_vrsqrts_ddd,
+ neon_fp_vrecps_vrsqrts_qqq,neon_bp_simple,neon_bp_2cycle,neon_bp_3cycle,
+ neon_ldr,neon_str,neon_vld1_1_2_regs,neon_vld1_3_4_regs,
+ neon_vld2_2_regs_vld1_vld2_all_lanes,neon_vld2_4_regs,neon_vld3_vld4,
+ neon_vst1_1_2_regs_vst2_2_regs,neon_vst1_3_4_regs,
+ neon_vst2_4_regs_vst3_vst4,neon_vld1_vld2_lane,
+ neon_vld3_vld4_lane,neon_vst1_vst2_lane,neon_vst3_vst4_lane,
+ neon_vld3_vld4_all_lanes,neon_mcr,neon_mcr_2_mcrr,neon_mrc,neon_mrrc,
+ neon_ldm_2,neon_stm_2,none,unknown"
+ (cond [
+ (eq_attr "type" "neon_logic, neon_logic_q,\
+ neon_bsl, neon_cls, neon_cnt,\
+ neon_add, neon_add_q")
+ (const_string "neon_int_1")
+ (eq_attr "type" "neon_add_widen, neon_sub_widen,\
+ neon_sub, neon_sub_q")
+ (const_string "neon_int_2")
+ (eq_attr "type" "neon_neg, neon_neg_q,\
+ neon_reduc_add, neon_reduc_add_q,\
+ neon_reduc_add_long,\
+ neon_add_long, neon_sub_long")
+ (const_string "neon_int_3")
+ (eq_attr "type" "neon_abs, neon_abs_q,
+ neon_compare_zero, neon_compare_zero_q,\
+ neon_add_halve_narrow_q,\
+ neon_sub_halve_narrow_q,\
+ neon_add_halve, neon_add_halve_q,\
+ neon_qadd, neon_qadd_q,\
+ neon_tst, neon_tst_q")
+ (const_string "neon_int_4")
+ (eq_attr "type" "neon_abd_long, neon_sub_halve, neon_sub_halve_q,\
+ neon_qsub, neon_qsub_q,\
+ neon_abd, neon_abd_q,\
+ neon_compare, neon_compare_q,\
+ neon_minmax, neon_minmax_q, neon_reduc_minmax,\
+ neon_reduc_minmax_q")
+ (const_string "neon_int_5")
+ (eq_attr "type" "neon_qneg, neon_qneg_q, neon_qabs, neon_qabs_q")
+ (const_string "neon_vqneg_vqabs")
+ (eq_attr "type" "neon_move, neon_move_q")
+ (const_string "neon_vmov")
+ (eq_attr "type" "neon_bsl_q, neon_cls_q, neon_cnt_q")
+ (const_string "neon_bit_ops_q")
+ (eq_attr "type" "neon_arith_acc, neon_reduc_add_acc")
+ (const_string "neon_vaba")
+ (eq_attr "type" "neon_arith_acc_q")
+ (const_string "neon_vaba_qqq")
+ (eq_attr "type" "neon_shift_imm, neon_shift_imm_q,\
+ neon_shift_imm_long, neon_shift_imm_narrow_q,\
+ neon_shift_reg")
+ (const_string "neon_shift_1")
+ (eq_attr "type" "neon_sat_shift_imm, neon_sat_shift_imm_q,
+ neon_sat_shift_imm_narrow_q,\
+ neon_sat_shift_reg")
+ (const_string "neon_shift_2")
+ (eq_attr "type" "neon_shift_reg_q")
+ (const_string "neon_shift_3")
+ (eq_attr "type" "neon_sat_shift_reg_q")
+ (const_string "neon_vqshl_vrshl_vqrshl_qqq")
+ (eq_attr "type" "neon_shift_acc, neon_shift_acc_q")
+ (const_string "neon_vsra_vrsra")
+ (eq_attr "type" "neon_mul_b, neon_mul_h,\
+ neon_mul_b_long, neon_mul_h_long,\
+ neon_sat_mul_b, neon_sat_mul_h,\
+ neon_sat_mul_b_long, neon_sat_mul_h_long")
+ (const_string
+ "neon_mul_ddd_8_16_qdd_16_8_long_32_16_long")
+ (eq_attr "type" "neon_mul_b_q, neon_mul_h_q,\
+ neon_sat_mul_b_q, neon_sat_mul_h_q")
+ (const_string "neon_mul_qqq_8_16_32_ddd_32")
+ (eq_attr "type" "neon_mul_s, neon_mul_s_long,\
+ neon_sat_mul_s, neon_sat_mul_s_long,\
+ neon_mul_h_scalar_q, neon_sat_mul_h_scalar_q,\
+ neon_mul_s_scalar, neon_sat_mul_s_scalar,\
+ neon_mul_s_scalar_long,\
+ neon_sat_mul_s_scalar_long")
+ (const_string
+ "neon_mul_qdd_64_32_long_qqd_16_ddd_32_scalar_64_32_long_scalar")
+ (eq_attr "type" "neon_mla_b, neon_mla_h,\
+ neon_mla_b_long, neon_mla_h_long,\
+ neon_sat_mla_b_long, neon_sat_mla_h_long,\
+ neon_sat_mla_h_scalar_long")
+ (const_string
+ "neon_mla_ddd_8_16_qdd_16_8_long_32_16_long")
+ (eq_attr "type" "neon_mla_b_q, neon_mla_h_q")
+ (const_string "neon_mla_qqq_8_16")
+ (eq_attr "type" "neon_mla_s, neon_mla_s_long,\
+ neon_sat_mla_s_long,\
+ neon_mla_h_scalar_q, neon_mla_s_scalar,\
+ neon_mla_s_scalar_long,\
+ neon_sat_mla_s_scalar_long")
+ (const_string
+ "neon_mla_ddd_32_qqd_16_ddd_32_scalar_qdd_64_32_long_scalar_qdd_64_32_long")
+ (eq_attr "type" "neon_mla_s_q, neon_mla_s_scalar_q")
+ (const_string "neon_mla_qqq_32_qqd_32_scalar")
+ (eq_attr "type" "neon_mul_h_scalar, neon_sat_mul_h_scalar,\
+ neon_mul_h_scalar_long,\
+ neon_sat_mul_h_scalar_long")
+ (const_string
+ "neon_mul_ddd_16_scalar_32_16_long_scalar")
+ (eq_attr "type" "neon_mul_s_q, neon_sat_mul_s_q,\
+ neon_mul_s_scalar_q")
+ (const_string "neon_mul_qqd_32_scalar")
+ (eq_attr "type" "neon_mla_h_scalar, neon_mla_h_scalar_long")
+ (const_string
+ "neon_mla_ddd_16_scalar_qdd_32_16_long_scalar")
+ (eq_attr "type" "neon_fp_abd_s, neon_fp_abs_s, neon_fp_neg_s,\
+ neon_fp_addsub_s, neon_fp_compare_s,\
+ neon_fp_minmax_s, neon_fp_mul_s,\
+ neon_fp_recpe_s, neon_fp_rsqrte_s,\
+ neon_fp_to_int_s, neon_int_to_fp_s")
+ (const_string "neon_fp_vadd_ddd_vabs_dd")
+ (eq_attr "type" "neon_fp_abd_s_q, neon_fp_abs_s_q,\
+ neon_fp_neg_s_q,\
+ neon_fp_addsub_s_q, neon_fp_compare_s_q,\
+ neon_fp_minmax_s_q, neon_fp_mul_s_q,\
+ neon_fp_recpe_s_q, neon_fp_rsqrte_s_q,\
+ neon_fp_to_int_s_q, neon_int_to_fp_s_q")
+ (const_string "neon_fp_vadd_qqq_vabs_qq")
+ (eq_attr "type" "neon_fp_reduc_add_s, neon_fp_reduc_minmax_s,\
+ neon_fp_reduc_add_s_q, neon_fp_reduc_minmax_s_q")
+ (const_string "neon_fp_vsum")
+ (eq_attr "type" "neon_fp_mul_s_scalar")
+ (const_string "neon_fp_vmul_ddd")
+ (eq_attr "type" "neon_fp_mul_s_scalar_q")
+ (const_string "neon_fp_vmul_qqd")
+ (eq_attr "type" "neon_fp_mla_s")
+ (const_string "neon_fp_vmla_ddd")
+ (eq_attr "type" "neon_fp_mla_s_q")
+ (const_string "neon_fp_vmla_qqq")
+ (eq_attr "type" "neon_fp_mla_s_scalar")
+ (const_string "neon_fp_vmla_ddd_scalar")
+ (eq_attr "type" "neon_fp_mla_s_scalar_q")
+ (const_string "neon_fp_vmla_qqq_scalar")
+ (eq_attr "type" "neon_fp_recps_s, neon_fp_rsqrts_s")
+ (const_string "neon_fp_vrecps_vrsqrts_ddd")
+ (eq_attr "type" "neon_fp_recps_s_q, neon_fp_rsqrts_s_q")
+ (const_string "neon_fp_vrecps_vrsqrts_qqq")
+ (eq_attr "type" "neon_move_narrow_q, neon_dup,\
+ neon_dup_q, neon_permute, neon_zip,\
+ neon_ext, neon_rev, neon_rev_q")
+ (const_string "neon_bp_simple")
+ (eq_attr "type" "neon_permute_q, neon_ext_q, neon_tbl1, neon_tbl2")
+ (const_string "neon_bp_2cycle")
+ (eq_attr "type" "neon_zip_q, neon_tbl3, neon_tbl4")
+ (const_string "neon_bp_3cycle")
+ (eq_attr "type" "neon_ldr")
+ (const_string "neon_ldr")
+ (eq_attr "type" "neon_str")
+ (const_string "neon_str")
+ (eq_attr "type" "neon_load1_1reg, neon_load1_1reg_q,\
+ neon_load1_2reg, neon_load1_2reg_q,\
+ neon_load2_2reg, neon_load2_2reg_q")
+ (const_string "neon_vld1_1_2_regs")
+ (eq_attr "type" "neon_load1_3reg, neon_load1_3reg_q,\
+ neon_load1_4reg, neon_load1_4reg_q")
+ (const_string "neon_vld1_3_4_regs")
+ (eq_attr "type" "neon_load1_all_lanes, neon_load1_all_lanes_q,\
+ neon_load2_all_lanes, neon_load2_all_lanes_q")
+ (const_string
+ "neon_vld2_2_regs_vld1_vld2_all_lanes")
+ (eq_attr "type" "neon_load3_all_lanes, neon_load3_all_lanes_q,\
+ neon_load4_all_lanes, neon_load4_all_lanes_q,\
+ neon_load2_4reg, neon_load2_4reg_q")
+ (const_string "neon_vld2_4_regs")
+ (eq_attr "type" "neon_load3_3reg, neon_load3_3reg_q,\
+ neon_load4_4reg, neon_load4_4reg_q")
+ (const_string "neon_vld3_vld4")
+ (eq_attr "type" "f_loads, f_loadd, f_stores, f_stored,\
+ neon_load1_one_lane, neon_load1_one_lane_q,\
+ neon_load2_one_lane, neon_load2_one_lane_q")
+ (const_string "neon_vld1_vld2_lane")
+ (eq_attr "type" "neon_load3_one_lane, neon_load3_one_lane_q,\
+ neon_load4_one_lane, neon_load4_one_lane_q")
+ (const_string "neon_vld3_vld4_lane")
+ (eq_attr "type" "neon_store1_1reg, neon_store1_1reg_q,\
+ neon_store1_2reg, neon_store1_2reg_q,\
+ neon_store2_2reg, neon_store2_2reg_q")
+ (const_string "neon_vst1_1_2_regs_vst2_2_regs")
+ (eq_attr "type" "neon_store1_3reg, neon_store1_3reg_q,\
+ neon_store1_4reg, neon_store1_4reg_q")
+ (const_string "neon_vst1_3_4_regs")
+ (eq_attr "type" "neon_store2_4reg, neon_store2_4reg_q,\
+ neon_store3_3reg, neon_store3_3reg_q,\
+ neon_store4_4reg, neon_store4_4reg_q")
+ (const_string "neon_vst2_4_regs_vst3_vst4")
+ (eq_attr "type" "neon_store1_one_lane, neon_store1_one_lane_q,\
+ neon_store2_one_lane, neon_store2_one_lane_q")
+ (const_string "neon_vst1_vst2_lane")
+ (eq_attr "type" "neon_store3_one_lane, neon_store3_one_lane_q,\
+ neon_store4_one_lane, neon_store4_one_lane_q")
+ (const_string "neon_vst3_vst4_lane")
+ (eq_attr "type" "neon_from_gp, f_mcr")
+ (const_string "neon_mcr")
+ (eq_attr "type" "neon_from_gp_q, f_mcrr")
+ (const_string "neon_mcr_2_mcrr")
+ (eq_attr "type" "neon_to_gp, f_mrc")
+ (const_string "neon_mrc")
+ (eq_attr "type" "neon_to_gp_q, f_mrrc")
+ (const_string "neon_mrrc")]
+ (const_string "unknown")))
(define_automaton "cortex_a8_neon")
@@ -184,62 +399,62 @@ (define_insn_reservation "cortex_a8_vfp_
(define_insn_reservation "cortex_a8_neon_mrc" 20
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mrc"))
+ (eq_attr "cortex_a8_neon_type" "neon_mrc"))
"cortex_a8_neon_ls")
(define_insn_reservation "cortex_a8_neon_mrrc" 21
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mrrc"))
+ (eq_attr "cortex_a8_neon_type" "neon_mrrc"))
"cortex_a8_neon_ls_2")
-;; The remainder of this file is auto-generated by neon-schedgen.
+;; Arithmetic Operations
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N3.
(define_insn_reservation "cortex_a8_neon_int_1" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_int_1"))
+ (eq_attr "cortex_a8_neon_type" "neon_int_1"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their (D|Q)m operands at N1,
;; their (D|Q)n operands at N2, and produce a result at N3.
(define_insn_reservation "cortex_a8_neon_int_2" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_int_2"))
+ (eq_attr "cortex_a8_neon_type" "neon_int_2"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N3.
(define_insn_reservation "cortex_a8_neon_int_3" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_int_3"))
+ (eq_attr "cortex_a8_neon_type" "neon_int_3"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N4.
(define_insn_reservation "cortex_a8_neon_int_4" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_int_4"))
+ (eq_attr "cortex_a8_neon_type" "neon_int_4"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their (D|Q)m operands at N1,
;; their (D|Q)n operands at N2, and produce a result at N4.
(define_insn_reservation "cortex_a8_neon_int_5" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_int_5"))
+ (eq_attr "cortex_a8_neon_type" "neon_int_5"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N4.
(define_insn_reservation "cortex_a8_neon_vqneg_vqabs" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vqneg_vqabs"))
+ (eq_attr "cortex_a8_neon_type" "neon_vqneg_vqabs"))
"cortex_a8_neon_dp")
;; Instructions using this reservation produce a result at N3.
(define_insn_reservation "cortex_a8_neon_vmov" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vmov"))
+ (eq_attr "cortex_a8_neon_type" "neon_vmov"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -247,7 +462,7 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N6.
(define_insn_reservation "cortex_a8_neon_vaba" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vaba"))
+ (eq_attr "cortex_a8_neon_type" "neon_vaba"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -255,35 +470,39 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N6 on cycle 2.
(define_insn_reservation "cortex_a8_neon_vaba_qqq" 7
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vaba_qqq"))
+ (eq_attr "cortex_a8_neon_type" "neon_vaba_qqq"))
"cortex_a8_neon_dp_2")
-;; Instructions using this reservation read their (D|Q)m operands at N1,
-;; their (D|Q)d operands at N3, and produce a result at N6.
-(define_insn_reservation "cortex_a8_neon_vsma" 6
+;; Instructions using this reservation read their source operands at N2, and
+;; produce a result at N3 on cycle 2.
+(define_insn_reservation "cortex_a8_neon_bit_ops_q" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vsma"))
- "cortex_a8_neon_dp")
+ (eq_attr "cortex_a8_neon_type" "neon_bit_ops_q"))
+ "cortex_a8_neon_dp_2")
+
+;; Integer Multiply/Accumulate Operations
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N6.
(define_insn_reservation "cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mul_ddd_8_16_qdd_16_8_long_32_16_long"))
+ (eq_attr "cortex_a8_neon_type"
+ "neon_mul_ddd_8_16_qdd_16_8_long_32_16_long"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N6 on cycle 2.
(define_insn_reservation "cortex_a8_neon_mul_qqq_8_16_32_ddd_32" 7
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mul_qqq_8_16_32_ddd_32"))
+ (eq_attr "cortex_a8_neon_type" "neon_mul_qqq_8_16_32_ddd_32"))
"cortex_a8_neon_dp_2")
;; Instructions using this reservation read their (D|Q)n operands at N2,
;; their (D|Q)m operands at N1, and produce a result at N6 on cycle 2.
(define_insn_reservation "cortex_a8_neon_mul_qdd_64_32_long_qqd_16_ddd_32_scalar_64_32_long_scalar" 7
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mul_qdd_64_32_long_qqd_16_ddd_32_scalar_64_32_long_scalar"))
+ (eq_attr "cortex_a8_neon_type"
+ "neon_mul_qdd_64_32_long_qqd_16_ddd_32_scalar_64_32_long_scalar"))
"cortex_a8_neon_dp_2")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -291,7 +510,8 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N6.
(define_insn_reservation "cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mla_ddd_8_16_qdd_16_8_long_32_16_long"))
+ (eq_attr "cortex_a8_neon_type"
+ "neon_mla_ddd_8_16_qdd_16_8_long_32_16_long"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -299,7 +519,7 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N6 on cycle 2.
(define_insn_reservation "cortex_a8_neon_mla_qqq_8_16" 7
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mla_qqq_8_16"))
+ (eq_attr "cortex_a8_neon_type" "neon_mla_qqq_8_16"))
"cortex_a8_neon_dp_2")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -307,7 +527,8 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N6 on cycle 2.
(define_insn_reservation "cortex_a8_neon_mla_ddd_32_qqd_16_ddd_32_scalar_qdd_64_32_long_scalar_qdd_64_32_long" 7
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mla_ddd_32_qqd_16_ddd_32_scalar_qdd_64_32_long_scalar_qdd_64_32_long"))
+ (eq_attr "cortex_a8_neon_type"
+ "neon_mla_ddd_32_qqd_16_ddd_32_scalar_qdd_64_32_long_scalar_qdd_64_32_long"))
"cortex_a8_neon_dp_2")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -315,21 +536,22 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N6 on cycle 4.
(define_insn_reservation "cortex_a8_neon_mla_qqq_32_qqd_32_scalar" 9
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mla_qqq_32_qqd_32_scalar"))
+ (eq_attr "cortex_a8_neon_type" "neon_mla_qqq_32_qqd_32_scalar"))
"cortex_a8_neon_dp_4")
;; Instructions using this reservation read their (D|Q)n operands at N2,
;; their (D|Q)m operands at N1, and produce a result at N6.
(define_insn_reservation "cortex_a8_neon_mul_ddd_16_scalar_32_16_long_scalar" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mul_ddd_16_scalar_32_16_long_scalar"))
+ (eq_attr "cortex_a8_neon_type"
+ "neon_mul_ddd_16_scalar_32_16_long_scalar"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their (D|Q)n operands at N2,
;; their (D|Q)m operands at N1, and produce a result at N6 on cycle 4.
(define_insn_reservation "cortex_a8_neon_mul_qqd_32_scalar" 9
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mul_qqd_32_scalar"))
+ (eq_attr "cortex_a8_neon_type" "neon_mul_qqd_32_scalar"))
"cortex_a8_neon_dp_4")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -337,84 +559,82 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N6.
(define_insn_reservation "cortex_a8_neon_mla_ddd_16_scalar_qdd_32_16_long_scalar" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mla_ddd_16_scalar_qdd_32_16_long_scalar"))
+ (eq_attr "cortex_a8_neon_type"
+ "neon_mla_ddd_16_scalar_qdd_32_16_long_scalar"))
"cortex_a8_neon_dp")
+;; Shift Operations
+
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N3.
(define_insn_reservation "cortex_a8_neon_shift_1" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_shift_1"))
+ (eq_attr "cortex_a8_neon_type" "neon_shift_1"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N4.
(define_insn_reservation "cortex_a8_neon_shift_2" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_shift_2"))
+ (eq_attr "cortex_a8_neon_type" "neon_shift_2"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N3 on cycle 2.
(define_insn_reservation "cortex_a8_neon_shift_3" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_shift_3"))
+ (eq_attr "cortex_a8_neon_type" "neon_shift_3"))
"cortex_a8_neon_dp_2")
;; Instructions using this reservation read their source operands at N1, and
-;; produce a result at N1.
-(define_insn_reservation "cortex_a8_neon_vshl_ddd" 1
- (and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vshl_ddd"))
- "cortex_a8_neon_dp")
-
-;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N4 on cycle 2.
(define_insn_reservation "cortex_a8_neon_vqshl_vrshl_vqrshl_qqq" 5
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vqshl_vrshl_vqrshl_qqq"))
+ (eq_attr "cortex_a8_neon_type" "neon_vqshl_vrshl_vqrshl_qqq"))
"cortex_a8_neon_dp_2")
;; Instructions using this reservation read their (D|Q)m operands at N1,
;; their (D|Q)d operands at N3, and produce a result at N6.
(define_insn_reservation "cortex_a8_neon_vsra_vrsra" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vsra_vrsra"))
+ (eq_attr "cortex_a8_neon_type" "neon_vsra_vrsra"))
"cortex_a8_neon_dp")
+;; Floating point Operations
+
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N5.
(define_insn_reservation "cortex_a8_neon_fp_vadd_ddd_vabs_dd" 5
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vadd_ddd_vabs_dd"))
- "cortex_a8_neon_fadd")
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vadd_ddd_vabs_dd"))
+ "cortex_a8_neon_fadd")
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N5 on cycle 2.
(define_insn_reservation "cortex_a8_neon_fp_vadd_qqq_vabs_qq" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vadd_qqq_vabs_qq"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vadd_qqq_vabs_qq"))
"cortex_a8_neon_fadd_2")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N5.
(define_insn_reservation "cortex_a8_neon_fp_vsum" 5
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vsum"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vsum"))
"cortex_a8_neon_fadd")
;; Instructions using this reservation read their (D|Q)n operands at N2,
;; their (D|Q)m operands at N1, and produce a result at N5.
(define_insn_reservation "cortex_a8_neon_fp_vmul_ddd" 5
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vmul_ddd"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vmul_ddd"))
"cortex_a8_neon_dp")
;; Instructions using this reservation read their (D|Q)n operands at N2,
;; their (D|Q)m operands at N1, and produce a result at N5 on cycle 2.
(define_insn_reservation "cortex_a8_neon_fp_vmul_qqd" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vmul_qqd"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vmul_qqd"))
"cortex_a8_neon_dp_2")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -422,7 +642,7 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N9.
(define_insn_reservation "cortex_a8_neon_fp_vmla_ddd" 9
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vmla_ddd"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vmla_ddd"))
"cortex_a8_neon_fmul_then_fadd")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -430,7 +650,7 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N9 on cycle 2.
(define_insn_reservation "cortex_a8_neon_fp_vmla_qqq" 10
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vmla_qqq"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vmla_qqq"))
"cortex_a8_neon_fmul_then_fadd_2")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -438,7 +658,7 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N9.
(define_insn_reservation "cortex_a8_neon_fp_vmla_ddd_scalar" 9
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vmla_ddd_scalar"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vmla_ddd_scalar"))
"cortex_a8_neon_fmul_then_fadd")
;; Instructions using this reservation read their (D|Q)n operands at N2,
@@ -446,152 +666,148 @@ (define_insn_reservation "cortex_a8_neon
;; produce a result at N9 on cycle 2.
(define_insn_reservation "cortex_a8_neon_fp_vmla_qqq_scalar" 10
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vmla_qqq_scalar"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vmla_qqq_scalar"))
"cortex_a8_neon_fmul_then_fadd_2")
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N9.
(define_insn_reservation "cortex_a8_neon_fp_vrecps_vrsqrts_ddd" 9
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vrecps_vrsqrts_ddd"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vrecps_vrsqrts_ddd"))
"cortex_a8_neon_fmul_then_fadd")
;; Instructions using this reservation read their source operands at N2, and
;; produce a result at N9 on cycle 2.
(define_insn_reservation "cortex_a8_neon_fp_vrecps_vrsqrts_qqq" 10
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_fp_vrecps_vrsqrts_qqq"))
+ (eq_attr "cortex_a8_neon_type" "neon_fp_vrecps_vrsqrts_qqq"))
"cortex_a8_neon_fmul_then_fadd_2")
+;; Permute operations.
+
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N2.
(define_insn_reservation "cortex_a8_neon_bp_simple" 2
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_bp_simple"))
+ (eq_attr "cortex_a8_neon_type" "neon_bp_simple"))
"cortex_a8_neon_perm")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N2 on cycle 2.
(define_insn_reservation "cortex_a8_neon_bp_2cycle" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_bp_2cycle"))
+ (eq_attr "cortex_a8_neon_type" "neon_bp_2cycle"))
"cortex_a8_neon_perm_2")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N2 on cycle 3.
(define_insn_reservation "cortex_a8_neon_bp_3cycle" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_bp_3cycle"))
+ (eq_attr "cortex_a8_neon_type" "neon_bp_3cycle"))
"cortex_a8_neon_perm_3")
+;; Load Operations.
+
;; Instructions using this reservation produce a result at N1.
(define_insn_reservation "cortex_a8_neon_ldr" 1
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_ldr"))
+ (eq_attr "cortex_a8_neon_type" "neon_ldr"))
"cortex_a8_neon_ls")
;; Instructions using this reservation read their source operands at N1.
(define_insn_reservation "cortex_a8_neon_str" 0
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_str"))
+ (eq_attr "cortex_a8_neon_type" "neon_str"))
"cortex_a8_neon_ls")
;; Instructions using this reservation produce a result at N1 on cycle 2.
(define_insn_reservation "cortex_a8_neon_vld1_1_2_regs" 2
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld1_1_2_regs"))
+ (eq_attr "cortex_a8_neon_type" "neon_vld1_1_2_regs"))
"cortex_a8_neon_ls_2")
;; Instructions using this reservation produce a result at N1 on cycle 3.
(define_insn_reservation "cortex_a8_neon_vld1_3_4_regs" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld1_3_4_regs"))
+ (eq_attr "cortex_a8_neon_type" "neon_vld1_3_4_regs"))
"cortex_a8_neon_ls_3")
;; Instructions using this reservation produce a result at N2 on cycle 2.
(define_insn_reservation "cortex_a8_neon_vld2_2_regs_vld1_vld2_all_lanes" 3
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld2_2_regs_vld1_vld2_all_lanes"))
+ (eq_attr "cortex_a8_neon_type" "neon_vld2_2_regs_vld1_vld2_all_lanes"))
"cortex_a8_neon_ls_2")
;; Instructions using this reservation produce a result at N2 on cycle 3.
(define_insn_reservation "cortex_a8_neon_vld2_4_regs" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld2_4_regs"))
+ (eq_attr "cortex_a8_neon_type" "neon_vld2_4_regs"))
"cortex_a8_neon_ls_3")
;; Instructions using this reservation produce a result at N2 on cycle 4.
(define_insn_reservation "cortex_a8_neon_vld3_vld4" 5
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld3_vld4"))
+ (eq_attr "cortex_a8_neon_type" "neon_vld3_vld4"))
"cortex_a8_neon_ls_4")
+;; Store operations.
+
;; Instructions using this reservation read their source operands at N1.
(define_insn_reservation "cortex_a8_neon_vst1_1_2_regs_vst2_2_regs" 0
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vst1_1_2_regs_vst2_2_regs"))
+ (eq_attr "cortex_a8_neon_type" "neon_vst1_1_2_regs_vst2_2_regs"))
"cortex_a8_neon_ls_2")
;; Instructions using this reservation read their source operands at N1.
(define_insn_reservation "cortex_a8_neon_vst1_3_4_regs" 0
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vst1_3_4_regs"))
+ (eq_attr "cortex_a8_neon_type" "neon_vst1_3_4_regs"))
"cortex_a8_neon_ls_3")
;; Instructions using this reservation read their source operands at N1.
(define_insn_reservation "cortex_a8_neon_vst2_4_regs_vst3_vst4" 0
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vst2_4_regs_vst3_vst4"))
- "cortex_a8_neon_ls_4")
-
-;; Instructions using this reservation read their source operands at N1.
-(define_insn_reservation "cortex_a8_neon_vst3_vst4" 0
- (and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vst3_vst4"))
+ (eq_attr "cortex_a8_neon_type" "neon_vst2_4_regs_vst3_vst4"))
"cortex_a8_neon_ls_4")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N2 on cycle 3.
(define_insn_reservation "cortex_a8_neon_vld1_vld2_lane" 4
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld1_vld2_lane"))
+ (eq_attr "cortex_a8_neon_type" "neon_vld1_vld2_lane"))
"cortex_a8_neon_ls_3")
;; Instructions using this reservation read their source operands at N1, and
;; produce a result at N2 on cycle 5.
(define_insn_reservation "cortex_a8_neon_vld3_vld4_lane" 6
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld3_vld4_lane"))
+ (eq_attr "cortex_a8_neon_type" "neon_vld3_vld4_lane"))
"cortex_a8_neon_ls_5")
;; Instructions using this reservation read their source operands at N1.
(define_insn_reservation "cortex_a8_neon_vst1_vst2_lane" 0
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vst1_vst2_lane"))
+ (eq_attr "cortex_a8_neon_type" "neon_vst1_vst2_lane"))
"cortex_a8_neon_ls_2")
;; Instructions using this reservation read their source operands at N1.
(define_insn_reservation "cortex_a8_neon_vst3_vst4_lane" 0
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vst3_vst4_lane"))
+ (eq_attr "cortex_a8_neon_type" "neon_vst3_vst4_lane"))
"cortex_a8_neon_ls_3")
-;; Instructions using this reservation produce a result at N2 on cycle 2.
-(define_insn_reservation "cortex_a8_neon_vld3_vld4_all_lanes" 3
- (and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_vld3_vld4_all_lanes"))
- "cortex_a8_neon_ls_3")
+;; Register Transfer Operations

;; Instructions using this reservation produce a result at N2.
(define_insn_reservation "cortex_a8_neon_mcr" 2
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mcr"))
+ (eq_attr "cortex_a8_neon_type" "neon_mcr"))
"cortex_a8_neon_perm")
;; Instructions using this reservation produce a result at N2.
(define_insn_reservation "cortex_a8_neon_mcr_2_mcrr" 2
(and (eq_attr "tune" "cortexa8")
- (eq_attr "type" "neon_mcr_2_mcrr"))
+ (eq_attr "cortex_a8_neon_type" "neon_mcr_2_mcrr"))
"cortex_a8_neon_perm_2")
;; Exceptions to the default latencies.
@@ -599,6 +815,7 @@ (define_insn_reservation "cortex_a8_neon
(define_bypass 1 "cortex_a8_neon_mcr_2_mcrr"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -613,20 +830,7 @@ (define_bypass 1 "cortex_a8_neon_mcr_2_m
(define_bypass 1 "cortex_a8_neon_mcr"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
- cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
- cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
- cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
- cortex_a8_neon_mla_qqq_8_16,\
- cortex_a8_neon_fp_vadd_ddd_vabs_dd,\
- cortex_a8_neon_fp_vadd_qqq_vabs_qq,\
- cortex_a8_neon_fp_vmla_ddd,\
- cortex_a8_neon_fp_vmla_qqq,\
- cortex_a8_neon_fp_vrecps_vrsqrts_ddd,\
- cortex_a8_neon_fp_vrecps_vrsqrts_qqq")
-
-(define_bypass 2 "cortex_a8_neon_vld3_vld4_all_lanes"
- "cortex_a8_neon_int_1,\
- cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -641,6 +845,7 @@ (define_bypass 2 "cortex_a8_neon_vld3_vl
(define_bypass 5 "cortex_a8_neon_vld3_vld4_lane"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -655,6 +860,7 @@ (define_bypass 5 "cortex_a8_neon_vld3_vl
(define_bypass 3 "cortex_a8_neon_vld1_vld2_lane"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -669,6 +875,7 @@ (define_bypass 3 "cortex_a8_neon_vld1_vl
(define_bypass 4 "cortex_a8_neon_vld3_vld4"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -683,6 +890,7 @@ (define_bypass 4 "cortex_a8_neon_vld3_vl
(define_bypass 3 "cortex_a8_neon_vld2_4_regs"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -697,6 +905,7 @@ (define_bypass 3 "cortex_a8_neon_vld2_4_
(define_bypass 2 "cortex_a8_neon_vld2_2_regs_vld1_vld2_all_lanes"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -711,6 +920,7 @@ (define_bypass 2 "cortex_a8_neon_vld2_2_
(define_bypass 2 "cortex_a8_neon_vld1_3_4_regs"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -725,6 +935,7 @@ (define_bypass 2 "cortex_a8_neon_vld1_3_
(define_bypass 1 "cortex_a8_neon_vld1_1_2_regs"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -739,6 +950,7 @@ (define_bypass 1 "cortex_a8_neon_vld1_1_
(define_bypass 0 "cortex_a8_neon_ldr"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -753,6 +965,7 @@ (define_bypass 0 "cortex_a8_neon_ldr"
(define_bypass 3 "cortex_a8_neon_bp_3cycle"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -767,6 +980,7 @@ (define_bypass 3 "cortex_a8_neon_bp_3cyc
(define_bypass 2 "cortex_a8_neon_bp_2cycle"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -781,6 +995,7 @@ (define_bypass 2 "cortex_a8_neon_bp_2cyc
(define_bypass 1 "cortex_a8_neon_bp_simple"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -795,6 +1010,7 @@ (define_bypass 1 "cortex_a8_neon_bp_simp
(define_bypass 9 "cortex_a8_neon_fp_vrecps_vrsqrts_qqq"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -809,6 +1025,7 @@ (define_bypass 9 "cortex_a8_neon_fp_vrec
(define_bypass 8 "cortex_a8_neon_fp_vrecps_vrsqrts_ddd"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -823,6 +1040,7 @@ (define_bypass 8 "cortex_a8_neon_fp_vrec
(define_bypass 9 "cortex_a8_neon_fp_vmla_qqq_scalar"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -837,6 +1055,7 @@ (define_bypass 9 "cortex_a8_neon_fp_vmla
(define_bypass 8 "cortex_a8_neon_fp_vmla_ddd_scalar"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -851,6 +1070,7 @@ (define_bypass 8 "cortex_a8_neon_fp_vmla
(define_bypass 9 "cortex_a8_neon_fp_vmla_qqq"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -865,6 +1085,7 @@ (define_bypass 9 "cortex_a8_neon_fp_vmla
(define_bypass 8 "cortex_a8_neon_fp_vmla_ddd"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -879,6 +1100,7 @@ (define_bypass 8 "cortex_a8_neon_fp_vmla
(define_bypass 5 "cortex_a8_neon_fp_vmul_qqd"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -893,6 +1115,7 @@ (define_bypass 5 "cortex_a8_neon_fp_vmul
(define_bypass 4 "cortex_a8_neon_fp_vmul_ddd"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -907,6 +1130,7 @@ (define_bypass 4 "cortex_a8_neon_fp_vmul
(define_bypass 4 "cortex_a8_neon_fp_vsum"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -921,6 +1145,7 @@ (define_bypass 4 "cortex_a8_neon_fp_vsum
(define_bypass 5 "cortex_a8_neon_fp_vadd_qqq_vabs_qq"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -935,6 +1160,7 @@ (define_bypass 5 "cortex_a8_neon_fp_vadd
(define_bypass 4 "cortex_a8_neon_fp_vadd_ddd_vabs_dd"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -949,6 +1175,7 @@ (define_bypass 4 "cortex_a8_neon_fp_vadd
(define_bypass 5 "cortex_a8_neon_vsra_vrsra"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -963,20 +1190,7 @@ (define_bypass 5 "cortex_a8_neon_vsra_vr
(define_bypass 4 "cortex_a8_neon_vqshl_vrshl_vqrshl_qqq"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
- cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
- cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
- cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
- cortex_a8_neon_mla_qqq_8_16,\
- cortex_a8_neon_fp_vadd_ddd_vabs_dd,\
- cortex_a8_neon_fp_vadd_qqq_vabs_qq,\
- cortex_a8_neon_fp_vmla_ddd,\
- cortex_a8_neon_fp_vmla_qqq,\
- cortex_a8_neon_fp_vrecps_vrsqrts_ddd,\
- cortex_a8_neon_fp_vrecps_vrsqrts_qqq")
-
-(define_bypass 0 "cortex_a8_neon_vshl_ddd"
- "cortex_a8_neon_int_1,\
- cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -991,6 +1205,7 @@ (define_bypass 0 "cortex_a8_neon_vshl_dd
(define_bypass 3 "cortex_a8_neon_shift_3"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1005,6 +1220,7 @@ (define_bypass 3 "cortex_a8_neon_shift_3
(define_bypass 3 "cortex_a8_neon_shift_2"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1019,6 +1235,7 @@ (define_bypass 3 "cortex_a8_neon_shift_2
(define_bypass 2 "cortex_a8_neon_shift_1"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1033,6 +1250,7 @@ (define_bypass 2 "cortex_a8_neon_shift_1
(define_bypass 5 "cortex_a8_neon_mla_ddd_16_scalar_qdd_32_16_long_scalar"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1047,6 +1265,7 @@ (define_bypass 5 "cortex_a8_neon_mla_ddd
(define_bypass 8 "cortex_a8_neon_mul_qqd_32_scalar"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1061,6 +1280,7 @@ (define_bypass 8 "cortex_a8_neon_mul_qqd
(define_bypass 5 "cortex_a8_neon_mul_ddd_16_scalar_32_16_long_scalar"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1075,6 +1295,7 @@ (define_bypass 5 "cortex_a8_neon_mul_ddd
(define_bypass 8 "cortex_a8_neon_mla_qqq_32_qqd_32_scalar"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1089,6 +1310,7 @@ (define_bypass 8 "cortex_a8_neon_mla_qqq
(define_bypass 6 "cortex_a8_neon_mla_ddd_32_qqd_16_ddd_32_scalar_qdd_64_32_long_scalar_qdd_64_32_long"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1103,6 +1325,7 @@ (define_bypass 6 "cortex_a8_neon_mla_ddd
(define_bypass 6 "cortex_a8_neon_mla_qqq_8_16"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1117,6 +1340,7 @@ (define_bypass 6 "cortex_a8_neon_mla_qqq
(define_bypass 5 "cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1131,6 +1355,7 @@ (define_bypass 5 "cortex_a8_neon_mla_ddd
(define_bypass 6 "cortex_a8_neon_mul_qdd_64_32_long_qqd_16_ddd_32_scalar_64_32_long_scalar"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1145,6 +1370,7 @@ (define_bypass 6 "cortex_a8_neon_mul_qdd
(define_bypass 6 "cortex_a8_neon_mul_qqq_8_16_32_ddd_32"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1159,20 +1385,7 @@ (define_bypass 6 "cortex_a8_neon_mul_qqq
(define_bypass 5 "cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
- cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
- cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
- cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
- cortex_a8_neon_mla_qqq_8_16,\
- cortex_a8_neon_fp_vadd_ddd_vabs_dd,\
- cortex_a8_neon_fp_vadd_qqq_vabs_qq,\
- cortex_a8_neon_fp_vmla_ddd,\
- cortex_a8_neon_fp_vmla_qqq,\
- cortex_a8_neon_fp_vrecps_vrsqrts_ddd,\
- cortex_a8_neon_fp_vrecps_vrsqrts_qqq")
-
-(define_bypass 5 "cortex_a8_neon_vsma"
- "cortex_a8_neon_int_1,\
- cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1187,6 +1400,7 @@ (define_bypass 5 "cortex_a8_neon_vsma"
(define_bypass 6 "cortex_a8_neon_vaba_qqq"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1201,6 +1415,7 @@ (define_bypass 6 "cortex_a8_neon_vaba_qq
(define_bypass 5 "cortex_a8_neon_vaba"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1212,9 +1427,10 @@ (define_bypass 5 "cortex_a8_neon_vaba"
cortex_a8_neon_fp_vrecps_vrsqrts_ddd,\
cortex_a8_neon_fp_vrecps_vrsqrts_qqq")
-(define_bypass 2 "cortex_a8_neon_vmov"
+(define_bypass 3 "cortex_a8_neon_bit_ops_q"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1229,6 +1445,7 @@ (define_bypass 2 "cortex_a8_neon_vmov"
(define_bypass 3 "cortex_a8_neon_vqneg_vqabs"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1243,6 +1460,7 @@ (define_bypass 3 "cortex_a8_neon_vqneg_v
(define_bypass 3 "cortex_a8_neon_int_5"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1257,6 +1475,7 @@ (define_bypass 3 "cortex_a8_neon_int_5"
(define_bypass 3 "cortex_a8_neon_int_4"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1271,6 +1490,7 @@ (define_bypass 3 "cortex_a8_neon_int_4"
(define_bypass 2 "cortex_a8_neon_int_3"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1285,6 +1505,7 @@ (define_bypass 2 "cortex_a8_neon_int_3"
(define_bypass 2 "cortex_a8_neon_int_2"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\
@@ -1299,6 +1520,7 @@ (define_bypass 2 "cortex_a8_neon_int_2"
(define_bypass 2 "cortex_a8_neon_int_1"
"cortex_a8_neon_int_1,\
cortex_a8_neon_int_4,\
+ cortex_a8_neon_bit_ops_q,\
cortex_a8_neon_mul_ddd_8_16_qdd_16_8_long_32_16_long,\
cortex_a8_neon_mul_qqq_8_16_32_ddd_32,\
cortex_a8_neon_mla_ddd_8_16_qdd_16_8_long_32_16_long,\