This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH/AARCH64] Add scheduler for Thunderx2t99


On Thu, Feb 02, 2017 at 05:21:05AM +0000, Hurugalawadi, Naveen wrote:
> Hi James,
> 
> Thanks for reviewing the patch and comments.
> 
> >> I wonder whether the current modeling of:
> >> (define_insn_reservation "thunderx2t99_asimd_load4_elts" 6
> >> Actually benefits the schedule in a meaningful way, or if it just increases
> 
> Done. Removed the scheduler modeling for thunderx2t99_asimd_load*_mult and
> thunderx2t99_asimd_load*_elts for ld3/ld4 and st3/st4 which are rarely used.
> 
> The automaton size has come down drastically without that and hopefully
> should be okay.
> ============================================================
> Automaton `thunderx2t99'
>       184 NDFA states,            838 NDFA arcs
>       184 DFA states,             838 DFA arcs
>       184 minimal DFA states,     838 minimal DFA arcs
>       360 all insns          8 insn equivalence classes
>     0 locked states
>  1016 transition comb vector els,  1472 trans table els: use simple vect
>  1472 min delay table els, compression factor 4
> 
> Automaton `thunderx2t99_advsimd'
>       453 NDFA states,           1966 NDFA arcs
>       453 DFA states,            1966 DFA arcs
>       351 minimal DFA states,    1562 minimal DFA arcs
>       360 all insns          7 insn equivalence classes
>     0 locked states
>  1901 transition comb vector els,  2457 trans table els: use simple vect
>  2457 min delay table els, compression factor 2
> 
> Automaton `thunderx2t99_ldst'
>        41 NDFA states,            163 NDFA arcs
>        41 DFA states,             163 DFA arcs
>        14 minimal DFA states,      78 minimal DFA arcs
>       360 all insns          8 insn equivalence classes
>     0 locked states
>    83 transition comb vector els,   112 trans table els: use simple vect
>   112 min delay table els, compression factor 4
> 
> Automaton `thunderx2t99_mult'
>         2 NDFA states,              5 NDFA arcs
>         2 DFA states,               5 DFA arcs
>         2 minimal DFA states,       5 minimal DFA arcs
>       360 all insns          3 insn equivalence classes
>     0 locked states
>     6 transition comb vector els,     6 trans table els: use simple vect
>     6 min delay table els, compression factor 8
> ============================================================
> 
> >> You'll want to update this to use your new scheduling model :-).
> 
> Done. I had overlooked it :-).
> 
> >> you should be changing vulcan to use the new thunderx2t99 model. 
> 
> Done. Using the new thunderx2t99 model.

That looks much better.

I'm assuming you've tested this as appropriate for the subtargets you're
modifying and are comfortable with the level of risk in taking the patch at
this stage. As it only changes behaviour for the thunderx2t99 and vulcan
targets, I'd be happy to take the patch now, though please give
Richard/Marcus 24 hours to object.

OK if no objections from others in the next 24 hours.

Thanks,
James
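
For anyone comparing automaton sizes between revisions of the patch, the
statistics quoted above can be totalled with a few lines of Python. This is
only a sketch: the helper names (parse_automata, total_states) are made up
for illustration, and the input is assumed to be the statistics text exactly
as pasted in this message, with or without the "> " quote prefix.

```python
import re

# State counts copied from the automaton statistics quoted in this message.
STATS = """\
Automaton `thunderx2t99'
      184 NDFA states,            838 NDFA arcs
      184 DFA states,             838 DFA arcs
      184 minimal DFA states,     838 minimal DFA arcs
Automaton `thunderx2t99_advsimd'
      453 NDFA states,           1966 NDFA arcs
      453 DFA states,            1966 DFA arcs
      351 minimal DFA states,    1562 minimal DFA arcs
Automaton `thunderx2t99_ldst'
       41 NDFA states,            163 NDFA arcs
       41 DFA states,             163 DFA arcs
       14 minimal DFA states,      78 minimal DFA arcs
Automaton `thunderx2t99_mult'
        2 NDFA states,              5 NDFA arcs
        2 DFA states,               5 DFA arcs
        2 minimal DFA states,       5 minimal DFA arcs
"""


def parse_automata(text):
    """Return {automaton name: {"NDFA" | "DFA" | "minimal DFA": state count}}."""
    result = {}
    current = None
    for raw in text.splitlines():
        # Drop any email quote prefix and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        header = re.match(r"Automaton `([^']+)'", line)
        if header:
            current = result.setdefault(header.group(1), {})
            continue
        states = re.match(r"(\d+) (NDFA|DFA|minimal DFA) states,", line)
        if states and current is not None:
            current[states.group(2)] = int(states.group(1))
    return result


def total_states(stats, kind):
    """Sum the state count of the given kind over all automata."""
    return sum(counts[kind] for counts in stats.values())


if __name__ == "__main__":
    stats = parse_automata(STATS)
    print(total_states(stats, "DFA"))          # 680
    print(total_states(stats, "minimal DFA"))  # 551
```

Only thunderx2t99_advsimd and thunderx2t99_ldst shrink under minimization
(453 to 351 and 41 to 14 states respectively); the other two automata are
already minimal.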

> Please review the modified patch and let us know your comments on the same.
> 
> Thanks,
> Naveen

> diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
> index a7a4b33..1b958e3 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -74,8 +74,8 @@ AARCH64_CORE("xgene1",      xgene1,    xgene1,    8A,  AARCH64_FL_FOR_ARCH8, xge
>  /* V8.1 Architecture Processors.  */
>  
>  /* Broadcom ('B') cores. */
> -AARCH64_CORE("thunderx2t99",  thunderx2t99, cortexa57, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
> -AARCH64_CORE("vulcan",  vulcan, cortexa57, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
> +AARCH64_CORE("thunderx2t99",  thunderx2t99, thunderx2t99, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
> +AARCH64_CORE("vulcan",  vulcan, thunderx2t99, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
>  
>  /* V8 big.LITTLE implementations.  */
>  
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index a693a3b..7550c3e 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -225,6 +225,7 @@
>  (include "../arm/exynos-m1.md")
>  (include "thunderx.md")
>  (include "../arm/xgene1.md")
> +(include "thunderx2t99.md")
>  
>  ;; -------------------------------------------------------------------
>  ;; Jumps and other miscellaneous insns
> diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
> new file mode 100644
> index 0000000..0dd7199
> --- /dev/null
> +++ b/gcc/config/aarch64/thunderx2t99.md
> @@ -0,0 +1,443 @@
> +;; Cavium ThunderX 2 CN99xx pipeline description
> +;; Copyright (C) 2016-2017 Free Software Foundation, Inc.
> +;;
> +;; Contributed by Cavium, Broadcom and Mentor Embedded.
> +
> +;; This file is part of GCC.
> +
> +;; GCC is free software; you can redistribute it and/or modify
> +;; it under the terms of the GNU General Public License as published by
> +;; the Free Software Foundation; either version 3, or (at your option)
> +;; any later version.
> +
> +;; GCC is distributed in the hope that it will be useful,
> +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;; GNU General Public License for more details.
> +
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +(define_automaton "thunderx2t99, thunderx2t99_advsimd, thunderx2t99_ldst")
> +(define_automaton "thunderx2t99_mult")
> +
> +(define_cpu_unit "thunderx2t99_i0" "thunderx2t99")
> +(define_cpu_unit "thunderx2t99_i1" "thunderx2t99")
> +(define_cpu_unit "thunderx2t99_i2" "thunderx2t99")
> +
> +(define_cpu_unit "thunderx2t99_ls0" "thunderx2t99_ldst")
> +(define_cpu_unit "thunderx2t99_ls1" "thunderx2t99_ldst")
> +(define_cpu_unit "thunderx2t99_sd" "thunderx2t99_ldst")
> +
> +; Pseudo-units for multiply pipeline.
> +
> +(define_cpu_unit "thunderx2t99_i1m1" "thunderx2t99_mult")
> +(define_cpu_unit "thunderx2t99_i1m2" "thunderx2t99_mult")
> +(define_cpu_unit "thunderx2t99_i1m3" "thunderx2t99_mult")
> +
> +; Pseudo-units for load delay (assuming dcache hit).
> +
> +(define_cpu_unit "thunderx2t99_ls0d1" "thunderx2t99_ldst")
> +(define_cpu_unit "thunderx2t99_ls0d2" "thunderx2t99_ldst")
> +(define_cpu_unit "thunderx2t99_ls0d3" "thunderx2t99_ldst")
> +
> +(define_cpu_unit "thunderx2t99_ls1d1" "thunderx2t99_ldst")
> +(define_cpu_unit "thunderx2t99_ls1d2" "thunderx2t99_ldst")
> +(define_cpu_unit "thunderx2t99_ls1d3" "thunderx2t99_ldst")
> +
> +; Make some aliases for f0/f1.
> +(define_cpu_unit "thunderx2t99_f0" "thunderx2t99_advsimd")
> +(define_cpu_unit "thunderx2t99_f1" "thunderx2t99_advsimd")
> +
> +(define_reservation "thunderx2t99_i012" "thunderx2t99_i0|thunderx2t99_i1|thunderx2t99_i2")
> +(define_reservation "thunderx2t99_ls01" "thunderx2t99_ls0|thunderx2t99_ls1")
> +(define_reservation "thunderx2t99_f01" "thunderx2t99_f0|thunderx2t99_f1")
> +
> +(define_reservation "thunderx2t99_ls_both" "thunderx2t99_ls0+thunderx2t99_ls1")
> +
> +; A load with delay in the ls0/ls1 pipes.
> +(define_reservation "thunderx2t99_l0delay" "thunderx2t99_ls0,\
> +				      thunderx2t99_ls0d1,thunderx2t99_ls0d2,\
> +				      thunderx2t99_ls0d3")
> +(define_reservation "thunderx2t99_l1delay" "thunderx2t99_ls1,\
> +				      thunderx2t99_ls1d1,thunderx2t99_ls1d2,\
> +				      thunderx2t99_ls1d3")
> +(define_reservation "thunderx2t99_l01delay" "thunderx2t99_l0delay|thunderx2t99_l1delay")
> +
> +;; Branch and call instructions.
> +
> +(define_insn_reservation "thunderx2t99_branch" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "call,branch"))
> +  "thunderx2t99_i2")
> +
> +;; Integer arithmetic/logic instructions.
> +
> +; Plain register moves are handled by renaming, and don't create any uops.
> +
> +(define_insn_reservation "thunderx2t99_regmove" 0
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "mov_reg"))
> +  "nothing")
> +
> +(define_insn_reservation "thunderx2t99_alu_basic" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "alu_imm,alu_sreg,alus_imm,alus_sreg,\
> +			adc_reg,adc_imm,adcs_reg,adcs_imm,\
> +			logic_reg,logic_imm,logics_reg,logics_imm,\
> +			csel,adr,mov_imm,shift_reg,shift_imm,bfm,\
> +			rbit,rev,extend,rotate_imm"))
> +  "thunderx2t99_i012")
> +
> +(define_insn_reservation "thunderx2t99_alu_shift" 2
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "alu_shift_imm,alu_ext,alu_shift_reg,\
> +			alus_shift_imm,alus_ext,alus_shift_reg,\
> +			logic_shift_imm,logics_shift_reg"))
> +  "thunderx2t99_i012,thunderx2t99_i012")
> +
> +(define_insn_reservation "thunderx2t99_div" 13
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "sdiv,udiv"))
> +  "thunderx2t99_i1*3")
> +
> +(define_insn_reservation "thunderx2t99_madd" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "mla,smlal,umlal"))
> +  "thunderx2t99_i1,thunderx2t99_i1m1,thunderx2t99_i1m2,thunderx2t99_i1m3,\
> +   thunderx2t99_i012")
> +
> +; NOTE: smull, umull are used for "high part" multiplies too.
> +(define_insn_reservation "thunderx2t99_mul" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "mul,smull,umull"))
> +  "thunderx2t99_i1,thunderx2t99_i1m1,thunderx2t99_i1m2,thunderx2t99_i1m3")
> +
> +(define_insn_reservation "thunderx2t99_countbits" 3
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "clz"))
> +  "thunderx2t99_i1")
> +
> +;; Integer loads and stores.
> +
> +(define_insn_reservation "thunderx2t99_load_basic" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "load1"))
> +  "thunderx2t99_ls01")
> +
> +(define_insn_reservation "thunderx2t99_loadpair" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "load2"))
> +  "thunderx2t99_i012,thunderx2t99_ls01")
> +
> +(define_insn_reservation "thunderx2t99_store_basic" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "store1"))
> +  "thunderx2t99_ls01,thunderx2t99_sd")
> +
> +(define_insn_reservation "thunderx2t99_storepair_basic" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "store2"))
> +  "thunderx2t99_ls01,thunderx2t99_sd")
> +
> +;; FP data processing instructions.
> +
> +(define_insn_reservation "thunderx2t99_fp_simple" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "ffariths,ffarithd,f_minmaxs,f_minmaxd"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_fp_addsub" 6
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "fadds,faddd"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_fp_cmp" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "fcmps,fcmpd"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_fp_divsqrt_s" 16
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "fdivs,fsqrts"))
> +  "thunderx2t99_f0*3|thunderx2t99_f1*3")
> +
> +(define_insn_reservation "thunderx2t99_fp_divsqrt_d" 23
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "fdivd,fsqrtd"))
> +  "thunderx2t99_f0*5|thunderx2t99_f1*5")
> +
> +(define_insn_reservation "thunderx2t99_fp_mul_mac" 6
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "fmuls,fmuld,fmacs,fmacd"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_frint" 7
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "f_rints,f_rintd"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_fcsel" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "fcsel"))
> +  "thunderx2t99_f01")
> +
> +;; FP miscellaneous instructions.
> +
> +(define_insn_reservation "thunderx2t99_fp_cvt" 7
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "f_cvtf2i,f_cvt,f_cvti2f"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_fp_mov" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "fconsts,fconstd,fmov,f_mrc"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_fp_mov_to_gen" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "f_mcr"))
> +  "thunderx2t99_f01")
> +
> +;; FP loads and stores.
> +
> +(define_insn_reservation "thunderx2t99_fp_load_basic" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "f_loads,f_loadd"))
> +  "thunderx2t99_ls01")
> +
> +(define_insn_reservation "thunderx2t99_fp_loadpair_basic" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_load1_2reg"))
> +  "thunderx2t99_ls01*2")
> +
> +(define_insn_reservation "thunderx2t99_fp_store_basic" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "f_stores,f_stored"))
> +  "thunderx2t99_ls01,thunderx2t99_sd")
> +
> +(define_insn_reservation "thunderx2t99_fp_storepair_basic" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_store1_2reg"))
> +  "thunderx2t99_ls01,(thunderx2t99_ls01+thunderx2t99_sd),thunderx2t99_sd")
> +
> +;; ASIMD integer instructions.
> +
> +(define_insn_reservation "thunderx2t99_asimd_int" 7
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_abd,neon_abd_q,\
> +			neon_arith_acc,neon_arith_acc_q,\
> +			neon_abs,neon_abs_q,\
> +			neon_add,neon_add_q,\
> +			neon_neg,neon_neg_q,\
> +			neon_add_long,neon_add_widen,\
> +			neon_add_halve,neon_add_halve_q,\
> +			neon_sub_long,neon_sub_widen,\
> +			neon_sub_halve,neon_sub_halve_q,\
> +			neon_add_halve_narrow_q,neon_sub_halve_narrow_q,\
> +			neon_qabs,neon_qabs_q,\
> +			neon_qadd,neon_qadd_q,\
> +			neon_qneg,neon_qneg_q,\
> +			neon_qsub,neon_qsub_q,\
> +			neon_minmax,neon_minmax_q,\
> +			neon_reduc_minmax,neon_reduc_minmax_q,\
> +			neon_mul_b,neon_mul_h,neon_mul_s,\
> +			neon_mul_b_q,neon_mul_h_q,neon_mul_s_q,\
> +			neon_sat_mul_b,neon_sat_mul_h,neon_sat_mul_s,\
> +			neon_sat_mul_b_q,neon_sat_mul_h_q,neon_sat_mul_s_q,\
> +			neon_mla_b,neon_mla_h,neon_mla_s,\
> +			neon_mla_b_q,neon_mla_h_q,neon_mla_s_q,\
> +			neon_mul_b_long,neon_mul_h_long,\
> +			neon_mul_s_long,neon_mul_d_long,\
> +			neon_sat_mul_b_long,neon_sat_mul_h_long,\
> +			neon_sat_mul_s_long,\
> +			neon_mla_b_long,neon_mla_h_long,neon_mla_s_long,\
> +			neon_sat_mla_b_long,neon_sat_mla_h_long,\
> +			neon_sat_mla_s_long,\
> +			neon_shift_acc,neon_shift_acc_q,\
> +			neon_shift_imm,neon_shift_imm_q,\
> +			neon_shift_reg,neon_shift_reg_q,\
> +			neon_shift_imm_long,neon_shift_imm_narrow_q,\
> +			neon_sat_shift_imm,neon_sat_shift_imm_q,\
> +			neon_sat_shift_reg,neon_sat_shift_reg_q,\
> +			neon_sat_shift_imm_narrow_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_reduc_add" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_reduc_add,neon_reduc_add_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_cmp" 7
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_compare,neon_compare_q,neon_compare_zero,\
> +			neon_tst,neon_tst_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_logic" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_logic,neon_logic_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_polynomial" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_mul_d_long"))
> +  "thunderx2t99_f01")
> +
> +;; ASIMD floating-point instructions.
> +
> +(define_insn_reservation "thunderx2t99_asimd_fp_simple" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_fp_abs_s,neon_fp_abs_d,\
> +			neon_fp_abs_s_q,neon_fp_abs_d_q,\
> +			neon_fp_compare_s,neon_fp_compare_d,\
> +			neon_fp_compare_s_q,neon_fp_compare_d_q,\
> +			neon_fp_minmax_s,neon_fp_minmax_d,\
> +			neon_fp_minmax_s_q,neon_fp_minmax_d_q,\
> +			neon_fp_reduc_minmax_s,neon_fp_reduc_minmax_d,\
> +			neon_fp_reduc_minmax_s_q,neon_fp_reduc_minmax_d_q,\
> +			neon_fp_neg_s,neon_fp_neg_d,\
> +			neon_fp_neg_s_q,neon_fp_neg_d_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_fp_arith" 6
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_fp_abd_s,neon_fp_abd_d,\
> +			neon_fp_abd_s_q,neon_fp_abd_d_q,\
> +			neon_fp_addsub_s,neon_fp_addsub_d,\
> +			neon_fp_addsub_s_q,neon_fp_addsub_d_q,\
> +			neon_fp_reduc_add_s,neon_fp_reduc_add_d,\
> +			neon_fp_reduc_add_s_q,neon_fp_reduc_add_d_q,\
> +			neon_fp_mul_s,neon_fp_mul_d,\
> +			neon_fp_mul_s_q,neon_fp_mul_d_q,\
> +			neon_fp_mla_s,neon_fp_mla_d,\
> +			neon_fp_mla_s_q,neon_fp_mla_d_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_fp_conv" 7
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_fp_cvt_widen_s,neon_fp_cvt_narrow_d_q,\
> +			neon_fp_to_int_s,neon_fp_to_int_d,\
> +			neon_fp_to_int_s_q,neon_fp_to_int_d_q,\
> +			neon_fp_round_s,neon_fp_round_d,\
> +			neon_fp_round_s_q,neon_fp_round_d_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_fp_div_s" 16
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_fp_div_s,neon_fp_div_s_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_fp_div_d" 23
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_fp_div_d,neon_fp_div_d_q"))
> +  "thunderx2t99_f01")
> +
> +;; ASIMD miscellaneous instructions.
> +
> +(define_insn_reservation "thunderx2t99_asimd_misc" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_rbit,\
> +			neon_bsl,neon_bsl_q,\
> +			neon_cls,neon_cls_q,\
> +			neon_cnt,neon_cnt_q,\
> +			neon_from_gp,neon_from_gp_q,\
> +			neon_dup,neon_dup_q,\
> +			neon_ext,neon_ext_q,\
> +			neon_ins,neon_ins_q,\
> +			neon_move,neon_move_q,\
> +			neon_fp_recpe_s,neon_fp_recpe_d,\
> +			neon_fp_recpe_s_q,neon_fp_recpe_d_q,\
> +			neon_fp_recpx_s,neon_fp_recpx_d,\
> +			neon_fp_recpx_s_q,neon_fp_recpx_d_q,\
> +			neon_rev,neon_rev_q,\
> +			neon_dup,neon_dup_q,\
> +			neon_permute,neon_permute_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_recip_step" 6
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_fp_recps_s,neon_fp_recps_s_q,\
> +			neon_fp_recps_d,neon_fp_recps_d_q,\
> +			neon_fp_rsqrts_s, neon_fp_rsqrts_s_q,\
> +			neon_fp_rsqrts_d, neon_fp_rsqrts_d_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_lut" 8
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_tbl1,neon_tbl1_q,neon_tbl2_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_elt_to_gr" 6
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_to_gp,neon_to_gp_q"))
> +  "thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_ext" 7
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_shift_imm_narrow_q,neon_sat_shift_imm_narrow_q"))
> +  "thunderx2t99_f01")
> +
> +;; ASIMD load instructions.
> +
> +; NOTE: These reservations attempt to model latency and throughput correctly,
> +; but the cycle timing of unit allocation is not necessarily accurate (because
> +; insns are split into uops, and those may be issued out-of-order).
> +
> +(define_insn_reservation "thunderx2t99_asimd_load1_1_mult" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_load1_1reg,neon_load1_1reg_q"))
> +  "thunderx2t99_ls01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_load1_2_mult" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_load1_2reg,neon_load1_2reg_q"))
> +  "thunderx2t99_ls_both")
> +
> +(define_insn_reservation "thunderx2t99_asimd_load1_onelane" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_load1_one_lane,neon_load1_one_lane_q"))
> +  "thunderx2t99_l01delay,thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_load1_all" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_load1_all_lanes,neon_load1_all_lanes_q"))
> +  "thunderx2t99_l01delay,thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_load2" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_load2_2reg,neon_load2_2reg_q,\
> +			neon_load2_one_lane,neon_load2_one_lane_q,\
> +			neon_load2_all_lanes,neon_load2_all_lanes_q"))
> +  "(thunderx2t99_l0delay,thunderx2t99_f01)|(thunderx2t99_l1delay,\
> +    thunderx2t99_f01)")
> +
> +;; ASIMD store instructions.
> +
> +; Same note applies as for ASIMD load instructions.
> +
> +(define_insn_reservation "thunderx2t99_asimd_store1_1_mult" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_store1_1reg,neon_store1_1reg_q"))
> +  "thunderx2t99_ls01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_store1_2_mult" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_store1_2reg,neon_store1_2reg_q"))
> +  "thunderx2t99_ls_both")
> +
> +(define_insn_reservation "thunderx2t99_asimd_store1_onelane" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_store1_one_lane,neon_store1_one_lane_q"))
> +  "thunderx2t99_ls01,thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_store2_mult" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_store2_2reg,neon_store2_2reg_q"))
> +  "thunderx2t99_ls_both,thunderx2t99_f01")
> +
> +(define_insn_reservation "thunderx2t99_asimd_store2_onelane" 1
> +  (and (eq_attr "tune" "thunderx2t99")
> +       (eq_attr "type" "neon_store2_one_lane,neon_store2_one_lane_q"))
> +  "thunderx2t99_ls01,thunderx2t99_f01")
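
A few "type" values in the patch above are named by more than one
reservation (for example neon_load1_2reg appears in both
thunderx2t99_fp_loadpair_basic and thunderx2t99_asimd_load1_2_mult). The
thread does not discuss whether that is intended. The following Python is a
rough sketch of how such overlaps could be listed; it is not part of the
patch, the function names are hypothetical, and it assumes the .md text as
applied (without the "> +" diff prefixes) and reservations written in the
same (and (eq_attr "tune" ...) (eq_attr "type" ...)) shape used here.

```python
import re
from collections import defaultdict

# Matches a define_insn_reservation whose condition has the shape used in
# thunderx2t99.md; captures the reservation name and the "type" list.
RESERVATION_RE = re.compile(
    r'\(define_insn_reservation\s+"([^"]+)"\s+\d+\s+'
    r'\(and\s+\(eq_attr\s+"tune"\s+"[^"]+"\)\s+'
    r'\(eq_attr\s+"type"\s+"([^"]*)"\)\)',
    re.S,
)


def types_by_reservation(md_text):
    """Return {reservation name: [type, ...]} for every matching reservation."""
    out = {}
    for name, types in RESERVATION_RE.findall(md_text):
        # Join lines that were continued with a trailing backslash.
        joined = types.replace("\\\n", "")
        out[name] = [t.strip() for t in joined.split(",") if t.strip()]
    return out


def overlapping_types(reservations):
    """Return {type: [reservation names]} for types listed more than once."""
    seen = defaultdict(list)
    for name, types in reservations.items():
        for t in types:
            seen[t].append(name)
    return {t: names for t, names in seen.items() if len(names) > 1}


# Two reservations copied from the patch, used here as sample input.
SAMPLE = r'''
(define_insn_reservation "thunderx2t99_fp_loadpair_basic" 4
  (and (eq_attr "tune" "thunderx2t99")
       (eq_attr "type" "neon_load1_2reg"))
  "thunderx2t99_ls01*2")

(define_insn_reservation "thunderx2t99_asimd_load1_2_mult" 4
  (and (eq_attr "tune" "thunderx2t99")
       (eq_attr "type" "neon_load1_2reg,\
			neon_load1_2reg_q"))
  "thunderx2t99_ls_both")
'''

if __name__ == "__main__":
    for type_name, names in overlapping_types(types_by_reservation(SAMPLE)).items():
        print(type_name, "->", ", ".join(names))
```

A type repeated inside a single reservation is reported too, with the same
reservation name listed twice.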

