This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] PowerPC DFA description


David Edelsohn wrote:
> 
> >>>>> Vladimir Makarov writes:
> 
> Vlad> I'll look at this. David.  But at the first look, to have smaller
> Vlad> automata, fpu1 fpu2 should be not in the same automata as fpu1_iter and
> Vlad> fpu2_iter because fpu1/fpu2 is reserved always short time and
> Vlad> fpu1_iter/fpu2_iter is reserved during long time.  The automaton
> Vlad> containing them will have too many combinations of fpu1/fpu2
> Vlad> reservations during maximal time reservations of fpu1_other/fpu2_other
> Vlad> (26 cycles).  To achieve it without the genattrtab complaint, we could
> Vlad> reserve fpu_iter1/fpu_iter2 only starting with the second cycle.  The
> Vlad> description will behave the same way but the automata will be smaller.
> 
>         I tried to preserve fpu1_iter and fpu2_iter in separate automata,
> as in the original description, but genautomata complained:
> 
> ;; Dual floating point units (FPU1 and FPU2)
> (define_cpu_unit "fpu1" "fp_other")
> (define_cpu_unit "fpu2" "fp_other")
> (define_cpu_unit "fpu1_iter" "fpu1_fdiv")
> (define_cpu_unit "fpu2_iter" "fpu2_fdiv")
> 
> ./genattrtab /u/dje/src/GNU/gcc/gcc/config/rs6000/rs6000.md > tmp-attrtab.c
> Check description...done
> Reservation transformation...done
> genattrtab: Units `fpu1' and `fpu2_iter' should be in the same automaton
> genattrtab: Units `fpu1' and `fpu1_iter' should be in the same automaton
> genattrtab: Units `fpu2' and `fpu2_iter' should be in the same automaton
> genattrtab: Units `fpu2' and `fpu1_iter' should be in the same automaton
> All other genattrtab stuff...make: *** [s-attrtab] Error 1
> 
>         Either there is a subtle mistake in the description or genautomata
> will not let me use that type of description.

  Units reserved on the same cycle of an insn reservation must be in the
same automaton if they are not reserved on that cycle by every alternative
of the insn reservation.  That is why genattrtab complains about it.
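
  As a minimal sketch of the rule (the units `a' and `b', the automata
`aut_a' and `aut_b', and the insn type "example" are hypothetical and not
taken from rs6000.md), a description like the following should draw the
same complaint: on the first cycle `a' is reserved only by the first
alternative and `b' only by the second, yet they sit in different automata.

(define_automaton "aut_a,aut_b")
(define_cpu_unit "a" "aut_a")
(define_cpu_unit "b" "aut_b")

;; Hypothetical reservation: the two alternatives use units that
;; belong to different automata on the same cycle.
(define_insn_reservation "example-alt" 1
  (eq_attr "type" "example")
  "a|b")

  Declaring both units in one automaton, for example with
(define_cpu_unit "a,b" "aut_a"), should satisfy the requirement.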

  The old genattrtab had a bug that permitted descriptions violating this
requirement, which resulted in incorrect automata generation (in the ppc
port, only rios2 had an incorrect description).

Here is the output for your original description, obtained on an old iMac.
...
Automaton `fpu2_other'
     8502 NDFA states,          30620 NDFA arcs
     8502 DFA states,           30620 DFA arcs
     6233 minimal DFA states,   22720 minimal DFA arcs
      182 all insns         10 insn equivalence classes
28292 transition comb vector els, 62330 trans table els: use simple vect
28292 state alts comb vector els, 62330 state alts table els: use simple vect
85020 min delay table els, compression factor 1

...

15370 all allocated states,     40550 all allocated arcs
14595 all allocated alternative states
30128 all transition comb vector els, 66979 all trans table els
30128 all state alts comb vector els, 66979 all state alts table els
120164 all min delay table els
    0 locked states num

  transformation: 0.020000, building DFA: 9.520000
  DFA minimization: 0.570000, making insn equivalence: 0.030000
 all automaton generation: 10.520000, output: 38.640000

  From my point of view the automata were not big.  Maybe their size
would be really big for a scanner, but not for a pipeline hazard
recognizer (the states of its automata are actually automata
(reservations) themselves).  It takes about 1 minute to build the automata
and about 60Kb-100Kb for the tables (the state alternative tables are
actually not used unless a macro is defined).  It is not a big deal.

  But it could still be improved, as I wrote in my previous message.  I
placed the modified description in the attachment.  Here is the output for
the modified description:

...

Automaton `fpu2_other'
        9 NDFA states,             30 NDFA arcs
        9 DFA states,              30 DFA arcs
        9 minimal DFA states,      30 minimal DFA arcs
      182 all insns          4 insn equivalence classes
   32 transition comb vector els,    36 trans table els: use simple vect
   32 state alts comb vector els,    36 state alts table els: use simple vect
   36 min delay table els, compression factor 4

Automaton `fpu2_iter'
     1542 NDFA states,           4110 NDFA arcs
     1542 DFA states,            4110 DFA arcs
      726 minimal DFA states,    1958 minimal DFA arcs
      182 all insns          8 insn equivalence classes
 1964 transition comb vector els,  5808 trans table els: use comb vect
 1964 state alts comb vector els,  5808 state alts table els: use comb vect
12336 min delay table els, compression factor 1

...


 7167 all allocated states,     19818 all allocated arcs
12312 all allocated alternative states
 3832 all transition comb vector els, 10493 all trans table els
 3832 all state alts comb vector els, 10493 all state alts table els
47516 all min delay table els
    0 locked states num

  transformation: 0.010000, building DFA: 5.220000
  DFA minimization: 0.090000, making insn equivalence: 0.000000
 all automaton generation: 5.520000, output: 0.660000


  The overall size of the tables is now about 30Kb-40Kb, and building the
automata takes about 10sec.
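
  As a sketch of where the difference comes from, compare two ways of
writing the rios2 divide.  The first string is only an assumption about how
such a reservation could be written with `+' (the original description is
not part of this message); the second is repeated from the attachment.

;; Assumed form: fpuN and fpuN_iter are both reserved on the first
;; cycle, so all four units would have to be in one automaton.
;;   "(fpu1+fpu1_iter*17)|(fpu2+fpu2_iter*17)"

;; Form used in the attachment: fpuN on the first cycle only, then
;; fpuN_iter on the following 16 cycles (1 + 16 = 17, the latency).
(define_insn_reservation "rios2-sdiv" 17
  (and (eq_attr "type" "sdiv,ddiv")
       (eq_attr "cpu" "rios2"))
  "(fpu1,fpu1_iter*16)|(fpu2,fpu2_iter*16)")

  With the second form, the alternatives on any given cycle choose only
between units of a single automaton: fpu1 or fpu2 (`fpu2_other') on the
first cycle, fpu1_iter or fpu2_iter (`fpu2_iter') afterwards.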

Vlad

(define_automaton "other,idiv,fdiv,memory,fp_other")
(define_automaton "iu2_other,iu2_idiv")
(define_automaton "iu3_other")
(define_automaton "fpu2_other,fpu2_iter")
(define_automaton "vec_alu")
(define_automaton "mciu_other,mciu_idiv")
(define_automaton "dispatch")
(define_automaton "vdisp")

;; Integer unit (IU)
(define_cpu_unit "iu" "other")
(define_cpu_unit "iu_iter" "idiv")

;; Dual integer units (IU1 and IU2)
(define_cpu_unit "iu1" "iu2_other")
(define_cpu_unit "iu1_iter" "iu2_idiv")
(define_cpu_unit "iu2" "iu2_other")

;; Triple integer units (IUa, IUb, IUc)
(define_cpu_unit "iua" "iu3_other")
(define_cpu_unit "iub" "iu3_other")
(define_cpu_unit "iuc" "iu3_other")

;; Load/store unit (LSU)
(define_cpu_unit "lsu" "memory")

;; Multicycle integer unit (MCIU)  (for integer multiply/divide)
(define_cpu_unit "mciu" "mciu_other")
(define_cpu_unit "mciu_iter" "mciu_idiv")

;; Floating point unit (FPU)
(define_cpu_unit "fpu" "fp_other")
(define_cpu_unit "fpu_iter" "fdiv")

;; Dual floating point units (FPU1 and FPU2)
(define_cpu_unit "fpu1" "fpu2_other")
(define_cpu_unit "fpu2" "fpu2_other")
(define_cpu_unit "fpu1_iter" "fpu2_iter")
(define_cpu_unit "fpu2_iter" "fpu2_iter")

;; Dual vector ALUs (VEC1 and VEC2)
(define_cpu_unit "vec_simple" "vec_alu")
(define_cpu_unit "vec_complex" "vec_alu")
(define_cpu_unit "vec_float" "vec_alu")
(define_cpu_unit "vec_permute" "vec_alu")

;; Branch prediction unit (BPU)
(define_cpu_unit "bpu" "other")

;; System register unit (SRU)
(define_cpu_unit "sru" "other")

;; Condition register unit (CRU)
(define_cpu_unit "cru" "other")

;; Dispatch unit (DU)
(define_cpu_unit "du1,du2,du3,du4" "dispatch")
(define_cpu_unit "vdu1,vdu2" "vdisp")


;; RIOS1  32-bit IU, FPU, BPU

(define_insn_reservation "rios1-load" 2
  (and (eq_attr "type" "load,fpload")
       (eq_attr "cpu" "rios1"))
  "iu")

(define_insn_reservation "rios1-store" 1
  (and (eq_attr "type" "store")
       (eq_attr "cpu" "rios1"))
  "iu")

;; ????
(define_insn_reservation "rios1-fpstore" 1
  (and (eq_attr "type" "fpstore")
       (eq_attr "cpu" "rios1"))
  "iu+fpu")

(define_insn_reservation "rios1-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "rios1"))
  "iu")

(define_insn_reservation "rios1-imul" 5
  (and (eq_attr "type" "imul")
       (eq_attr "cpu" "rios1"))
  "iu*5")

(define_insn_reservation "rios1-imul2" 4
  (and (eq_attr "type" "imul2")
       (eq_attr "cpu" "rios1"))
  "iu*4")

(define_insn_reservation "rios1-imul3" 3
  (and (eq_attr "type" "imul3")
       (eq_attr "cpu" "rios1"))
  "iu*3")

(define_insn_reservation "rios1-idiv" 19
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "rios1"))
  "iu+iu_iter*19")

; compare executes on integer unit, but feeds insns which
; execute on the branch unit.
(define_insn_reservation "rios1-compare" 4
  (and (eq_attr "type" "compare")
       (eq_attr "cpu" "rios1"))
  "iu")

(define_insn_reservation "rios1-delayed_compare" 5
  (and (eq_attr "type" "delayed_compare")
       (eq_attr "cpu" "rios1"))
  "iu")

(define_insn_reservation "rios1-fpcompare" 9
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "rios1"))
  "fpu")

(define_insn_reservation "rios1-fp" 2
  (and (eq_attr "type" "fp,dmul")
       (eq_attr "cpu" "rios1"))
  "fpu")

(define_insn_reservation "rios1-sdiv" 19
  (and (eq_attr "type" "sdiv,ddiv")
       (eq_attr "cpu" "rios1"))
  "fpu+fpu_iter*19")

(define_insn_reservation "rios1-crlogical" 4
  (and (eq_attr "type" "cr_logical")
       (eq_attr "cpu" "rios1"))
  "bpu")

(define_insn_reservation "rios1-mtjmpr" 5
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "rios1"))
  "bpu")

(define_insn_reservation "rios1-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch")
       (eq_attr "cpu" "rios1"))
  "bpu")


;; RIOS2 32-bit 2xIU, 2xFPU, BPU
;; IU1 can perform all integer operations
;; IU2 can perform all integer operations except imul and idiv

(define_insn_reservation "rios2-load" 2
  (and (eq_attr "type" "load,fpload")
       (eq_attr "cpu" "rios2"))
  "iu1|iu2")

(define_insn_reservation "rios2-store" 1
  (and (eq_attr "type" "store,fpstore")
       (eq_attr "cpu" "rios2"))
  "iu1|iu2")

(define_insn_reservation "rios2-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "rios2"))
  "iu1|iu2")

(define_insn_reservation "rios2-imul" 2
  (and (eq_attr "type" "imul,imul2,imul3")
       (eq_attr "cpu" "rios2"))
  "iu1*2")

(define_insn_reservation "rios2-idiv" 13
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "rios2"))
  "iu1+iu1_iter*13")

; compare executes on integer unit, but feeds insns which
; execute on the branch unit.
(define_insn_reservation "rios2-compare" 3
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "rios2"))
  "iu1|iu2")

(define_insn_reservation "rios2-fp" 2
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "rios2"))
  "fpu1|fpu2")

(define_insn_reservation "rios2-fpcompare" 5
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "rios2"))
  "fpu1|fpu2")

(define_insn_reservation "rios2-dmul" 2
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "rios2"))
  "fpu1|fpu2")

(define_insn_reservation "rios2-sdiv" 17
  (and (eq_attr "type" "sdiv,ddiv")
       (eq_attr "cpu" "rios2"))
  "(fpu1,fpu1_iter*16)|(fpu2,fpu2_iter*16)")

(define_insn_reservation "rios2-ssqrt" 26
  (and (eq_attr "type" "ssqrt,dsqrt")
       (eq_attr "cpu" "rios2"))
  "(fpu1,fpu1_iter*25)|(fpu2,fpu2_iter*25)")

(define_insn_reservation "rios2-crlogical" 4
  (and (eq_attr "type" "cr_logical")
       (eq_attr "cpu" "rios2"))
  "bpu")

(define_insn_reservation "rios2-mtjmpr" 5
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "rios2"))
  "bpu")

(define_insn_reservation "rios2-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch")
       (eq_attr "cpu" "rios2"))
  "bpu")


;; RS64a 64-bit IU, LSU, FPU, BPU

(define_insn_reservation "rs64a-load" 2
  (and (eq_attr "type" "load")
       (eq_attr "cpu" "rs64a"))
  "lsu")

(define_insn_reservation "rs64a-store" 1
  (and (eq_attr "type" "store,fpstore")
       (eq_attr "cpu" "rs64a"))
  "lsu")

(define_insn_reservation "rs64a-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "rs64a"))
  "iu")

(define_insn_reservation "rs64a-imul" 20
  (and (eq_attr "type" "imul")
       (eq_attr "cpu" "rs64a"))
  "iu+iu_iter*14")

(define_insn_reservation "rs64a-imul2" 12
  (and (eq_attr "type" "imul2")
       (eq_attr "cpu" "rs64a"))
  "iu+iu_iter*6")

(define_insn_reservation "rs64a-imul3" 8
  (and (eq_attr "type" "imul3")
       (eq_attr "cpu" "rs64a"))
  "iu+iu_iter*3")

(define_insn_reservation "rs64a-lmul" 34
  (and (eq_attr "type" "lmul")
       (eq_attr "cpu" "rs64a"))
  "iu+iu_iter*34")

(define_insn_reservation "rs64a-idiv" 66
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "rs64a"))
  "iu+iu_iter*66")

(define_insn_reservation "rs64a-ldiv" 66
  (and (eq_attr "type" "ldiv")
       (eq_attr "cpu" "rs64a"))
  "iu+iu_iter*66")

(define_insn_reservation "rs64a-compare" 3
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "rs64a"))
  "iu")

(define_insn_reservation "rs64a-fpload" 3
  (and (eq_attr "type" "fpload")
       (eq_attr "cpu" "rs64a"))
  "lsu")

(define_insn_reservation "rs64a-fpcompare" 5
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "rs64a"))
  "fpu")

(define_insn_reservation "rs64a-fp" 4
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "rs64a"))
  "fpu*2")

(define_insn_reservation "rs64a-dmul" 7
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "rs64a"))
  "fpu*2")

(define_insn_reservation "rs64a-sdiv" 31
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "rs64a"))
  "fpu+fpu_iter*31")

(define_insn_reservation "rs64a-ddiv" 31
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "rs64a"))
  "fpu+fpu_iter*31")

(define_insn_reservation "rs64a-mtjmpr" 5
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "rs64a"))
  "bpu")

(define_insn_reservation "rs64a-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch,cr_logical")
       (eq_attr "cpu" "rs64a"))
  "bpu")


;; PPC401/ PPC403 32-bit integer only  IU BPU
;; Embedded PowerPC controller
;; In-order execution
;; Max issue two insns/cycle (includes one branch)
(define_insn_reservation "ppc403-load" 2
  (and (eq_attr "type" "load")
       (eq_attr "cpu" "ppc403"))
  "iu")

(define_insn_reservation "ppc403-store" 1
  (and (eq_attr "type" "store")
       (eq_attr "cpu" "ppc403"))
  "iu")

(define_insn_reservation "ppc403-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "ppc403"))
  "iu")

(define_insn_reservation "ppc403-compare" 3
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "ppc403"))
  "iu")

(define_insn_reservation "ppc403-imul" 4
  (and (eq_attr "type" "imul,imul2,imul3")
       (eq_attr "cpu" "ppc403"))
  "iu*4")

(define_insn_reservation "ppc403-idiv" 33
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc403"))
  "iu+iu_iter*33")

(define_insn_reservation "ppc403-mtjmpr" 4
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "ppc403"))
  "bpu")

(define_insn_reservation "ppc403-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch,cr_logical")
       (eq_attr "cpu" "ppc403"))
  "bpu")


;; PPC405 32-bit integer only  IU BPU
;; Embedded PowerPC controller
;; In-order execution
;; Max issue two insns/cycle (includes one branch)
(define_insn_reservation "ppc405-load" 2
  (and (eq_attr "type" "load")
       (eq_attr "cpu" "ppc405"))
  "iu")

(define_insn_reservation "ppc405-store" 1
  (and (eq_attr "type" "store")
       (eq_attr "cpu" "ppc405"))
  "iu")

(define_insn_reservation "ppc405-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "ppc405"))
  "iu")

(define_insn_reservation "ppc405-compare" 3
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "ppc405"))
  "iu")

(define_insn_reservation "ppc405-imul" 4
  (and (eq_attr "type" "imul")
       (eq_attr "cpu" "ppc405"))
  "iu*3")

(define_insn_reservation "ppc405-imul2" 3
  (and (eq_attr "type" "imul2,imul3")
       (eq_attr "cpu" "ppc405"))
  "iu*2")

(define_insn_reservation "ppc405-idiv" 33
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc405"))
  "iu+iu_iter*35")

(define_insn_reservation "ppc405-mtjmpr" 4
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "ppc405"))
  "bpu")

(define_insn_reservation "ppc405-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch,cr_logical")
       (eq_attr "cpu" "ppc405"))
  "bpu")


;; MPCCORE 32-bit SCIU, MCIU, LSU, FPU, BPU
;; 505/801/821/823

(define_insn_reservation "mpccore-load" 2
  (and (eq_attr "type" "load")
       (eq_attr "cpu" "mpccore"))
  "lsu")

(define_insn_reservation "mpccore-store" 1
  (and (eq_attr "type" "store,fpstore")
       (eq_attr "cpu" "mpccore"))
  "lsu")

(define_insn_reservation "mpccore-fpload" 2
  (and (eq_attr "type" "fpload")
       (eq_attr "cpu" "mpccore"))
  "lsu")

(define_insn_reservation "mpccore-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "mpccore"))
  "iu")

(define_insn_reservation "mpccore-imul" 2
  (and (eq_attr "type" "imul,imul2,imul3")
       (eq_attr "cpu" "mpccore"))
  "mciu")

; Divide latency varies greatly from 2-11, use 6 as average
(define_insn_reservation "mpccore-idiv" 6
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "mpccore"))
  "mciu*6")

(define_insn_reservation "mpccore-compare" 3
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "mpccore"))
  "iu")

(define_insn_reservation "mpccore-fpcompare" 1
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "mpccore"))
  "fpu")

(define_insn_reservation "mpccore-fp" 4
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "mpccore"))
  "fpu*2")

(define_insn_reservation "mpccore-dmul" 5
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "mpccore"))
  "fpu*5")

(define_insn_reservation "mpccore-sdiv" 10
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "mpccore"))
  "fpu+fpu_iter*10")

(define_insn_reservation "mpccore-ddiv" 17
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "mpccore"))
  "fpu+fpu_iter*17")

(define_insn_reservation "mpccore-mtjmpr" 4
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "mpccore"))
  "bpu")

(define_insn_reservation "mpccore-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch,cr_logical")
       (eq_attr "cpu" "mpccore"))
  "bpu")


;; PPC601  32-bit IU, FPU, BPU

(define_insn_reservation "ppc601-load" 2
  (and (eq_attr "type" "load")
       (eq_attr "cpu" "ppc601"))
  "iu")

(define_insn_reservation "ppc601-store" 1
  (and (eq_attr "type" "store")
       (eq_attr "cpu" "ppc601"))
  "iu")

(define_insn_reservation "ppc601-fpload" 3
  (and (eq_attr "type" "fpload")
       (eq_attr "cpu" "ppc601"))
  "iu")

(define_insn_reservation "ppc601-fpstore" 1
  (and (eq_attr "type" "fpstore")
       (eq_attr "cpu" "ppc601"))
  "iu")

(define_insn_reservation "ppc601-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "ppc601"))
  "iu")

(define_insn_reservation "ppc601-imul" 5
  (and (eq_attr "type" "imul,imul2,imul3")
       (eq_attr "cpu" "ppc601"))
  "iu*5")

(define_insn_reservation "ppc601-idiv" 36
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc601"))
  "iu+iu_iter*36")

; compare executes on integer unit, but feeds insns which
; execute on the branch unit.  Actual cmp latency 1.
(define_insn_reservation "ppc601-compare" 3
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "ppc601"))
  "iu")

; PPC601 fpcompare takes also 2 cycles from the integer unit
(define_insn_reservation "ppc601-fpcompare" 5
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "ppc601"))
  "fpu+iu*2")

(define_insn_reservation "ppc601-fp" 4
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "ppc601"))
  "fpu")

(define_insn_reservation "ppc601-dmul" 5
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "ppc601"))
  "fpu*2")

(define_insn_reservation "ppc601-sdiv" 17
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "ppc601"))
  "fpu+fpu_iter*17")

(define_insn_reservation "ppc601-ddiv" 31
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "ppc601"))
  "fpu+fpu_iter*31")

(define_insn_reservation "ppc601-mtjmpr" 4
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "ppc601"))
  "bpu")

(define_insn_reservation "ppc601-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch,cr_logical")
       (eq_attr "cpu" "ppc601"))
  "bpu")


;; PPC603/PPC603e 32-bit IU, LSU, FPU, BPU, SRU
;; Max issue 3 insns/clock cycle (includes 1 branch)

;; Branches go straight to the BPU.  All other insns are handled
;; by a dispatch unit which can issue a max of 2 insns per cycle.
(define_reservation "ppc603_du" "du1|du2")

;; The PPC603e user's manual recommends that to reduce branch mispredictions,
;; the insn that sets CR bits should be separated from the branch insn
;; that evaluates them; separation by more than 9 insns ensures that the CR
;; bits will be immediately available for execution.
;; This could be artificially achieved by exaggerating the latency of
;; compare insns but at the expense of a poorer schedule.

;; CR insns get executed in the SRU.  Not modelled.

(define_insn_reservation "ppc603-load" 2
  (and (eq_attr "type" "load")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,lsu")

(define_insn_reservation "ppc603-store" 1
  (and (eq_attr "type" "store,fpstore")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,lsu")

(define_insn_reservation "ppc603-fpload" 2
  (and (eq_attr "type" "fpload")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,lsu")

(define_insn_reservation "ppc603-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,iu")

; This takes 2 or 3 cycles
(define_insn_reservation "ppc603-imul" 3
  (and (eq_attr "type" "imul")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,iu*2")

(define_insn_reservation "ppc603-imul2" 2
  (and (eq_attr "type" "imul2,imul3")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,iu*2")

(define_insn_reservation "ppc603-idiv" 37
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,(iu+iu_iter*37)")

(define_insn_reservation "ppc603-compare" 3
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,iu")

(define_insn_reservation "ppc603-fpcompare" 3
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,fpu+iu*2")

(define_insn_reservation "ppc603-fp" 3
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,fpu")

(define_insn_reservation "ppc603-dmul" 4
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,fpu*2")

; Divides are not pipelined
(define_insn_reservation "ppc603-sdiv" 18
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,(fpu*3+fpu_iter*18)")

(define_insn_reservation "ppc603-ddiv" 33
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,(fpu*3+fpu_iter*33)")

(define_insn_reservation "ppc603-crlogical" 3
  (and (eq_attr "type" "cr_logical")
       (eq_attr "cpu" "ppc603"))
  "ppc603_du,sru*2")

(define_insn_reservation "ppc603-mtjmpr" 4
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "ppc603"))
  "nothing,bpu")

(define_insn_reservation "ppc603-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch")
       (eq_attr "cpu" "ppc603"))
  "nothing,bpu")


;; PPC604  32-bit 2xSCIU, MCIU, LSU, FPU, BPU
;; PPC604e  32-bit 2xSCIU, MCIU, LSU, FPU, BPU, CRU
;; MCIU used for imul/idiv and moves from/to spr
;; LSU 2 stage pipelined
;; FPU 3 stage pipelined
;; Max issue 4 insns/clock cycle

;; PPC604e is PPC604 with larger caches and a CRU.  In the 604
;; the CR logical operations are handled in the BPU.
;; In the 604e, the CRU shares bus with BPU so only one condition
;; register or branch insn can be issued per clock.  Not modelled.

;; No following instruction can dispatch in the same cycle as a branch
;; instruction.  Not modelled.  This is no problem if RCSP is not
;; enabled since the scheduler stops a schedule when it gets to a branch.

;; PPC620  64-bit 2xSCIU, MCIU, LSU, FPU, BPU, CRU
;; Max issue 4 insns/clock cycle
;; Out-of-order execution, in-order completion

;; PPC630 64-bit 2xSCIU, MCIU, LSU, 2xFPU, BPU, CRU

;; Four insns can be dispatched per cycle.
(define_reservation "ppc604_du" "du1|du2|du3|du4")

(define_insn_reservation "ppc604-load" 2
  (and (eq_attr "type" "load")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620,ppc630"))
  "ppc604_du,lsu")

(define_insn_reservation "ppc604-fpload" 3
  (and (eq_attr "type" "fpload")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620,ppc630"))
  "ppc604_du,lsu")

(define_insn_reservation "ppc604-store" 1
  (and (eq_attr "type" "store,fpstore")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620,ppc630"))
  "ppc604_du,lsu")

(define_insn_reservation "ppc604-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620,ppc630"))
  "ppc604_du,(iu1|iu2)")

(define_insn_reservation "ppc604-imul" 4
  (and (eq_attr "type" "imul,imul2,imul3")
       (eq_attr "cpu" "ppc604"))
  "ppc604_du,mciu*2")

(define_insn_reservation "ppc604e-imul" 2
  (and (eq_attr "type" "imul,imul2,imul3")
       (eq_attr "cpu" "ppc604e"))
  "ppc604_du,mciu")

(define_insn_reservation "ppc620-imul" 5
  (and (eq_attr "type" "imul")
       (eq_attr "cpu" "ppc620,ppc630"))
  "ppc604_du,mciu*3")

(define_insn_reservation "ppc620-imul2" 4
  (and (eq_attr "type" "imul2")
       (eq_attr "cpu" "ppc620,ppc630"))
  "ppc604_du,mciu*3")

(define_insn_reservation "ppc620-imul3" 3
  (and (eq_attr "type" "imul3")
       (eq_attr "cpu" "ppc620,ppc630"))
  "ppc604_du,mciu*3")

(define_insn_reservation "ppc620-lmul" 7
  (and (eq_attr "type" "lmul")
       (eq_attr "cpu" "ppc620,ppc630"))
  "ppc604_du,mciu*5")

(define_insn_reservation "ppc604-idiv" 20
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc604,ppc604e"))
  "ppc604_du,(mciu+mciu_iter*19)")

(define_insn_reservation "ppc620-idiv" 37
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc620"))
  "ppc604_du,(mciu+mciu_iter*36)")

(define_insn_reservation "ppc630-idiv" 21
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,(mciu+mciu_iter*20)")

(define_insn_reservation "ppc620-ldiv" 37
  (and (eq_attr "type" "ldiv")
       (eq_attr "cpu" "ppc620,ppc630"))
  "ppc604_du,(mciu+mciu_iter*36)")

(define_insn_reservation "ppc604-compare" 1
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620,ppc630"))
  "ppc604_du,(iu1|iu2)")

; FPU PPC604{,e},PPC620
(define_insn_reservation "ppc604-fpcompare" 5
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620"))
  "ppc604_du,fpu")

(define_insn_reservation "ppc604-fp" 3
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620"))
  "ppc604_du,fpu")

(define_insn_reservation "ppc604-dmul" 3
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620"))
  "ppc604_du,fpu")

; Divides are not pipelined
(define_insn_reservation "ppc604-sdiv" 18
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620"))
  "ppc604_du,(fpu*3+fpu_iter*18)")

(define_insn_reservation "ppc604-ddiv" 32
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620"))
  "ppc604_du,(fpu*3+fpu_iter*32)")

(define_insn_reservation "ppc620-ssqrt" 31
  (and (eq_attr "type" "ssqrt")
       (eq_attr "cpu" "ppc620"))
  "ppc604_du,(fpu+fpu_iter*31)")

(define_insn_reservation "ppc620-dsqrt" 31
  (and (eq_attr "type" "dsqrt")
       (eq_attr "cpu" "ppc620"))
  "ppc604_du,(fpu+fpu_iter*31)")


; 2xFPU PPC630
(define_insn_reservation "ppc630-fpcompare" 5
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,(fpu1|fpu2)")

(define_insn_reservation "ppc630-fp" 3
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,(fpu1|fpu2)")

(define_insn_reservation "ppc630-dmul" 3
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,(fpu1|fpu2)")

(define_insn_reservation "ppc630-sdiv" 17
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,((fpu1,fpu1_iter*16)|(fpu2,fpu2_iter*16))")

(define_insn_reservation "ppc630-ddiv" 21
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,((fpu1,fpu1_iter*20)|(fpu2,fpu2_iter*20))")

(define_insn_reservation "ppc630-ssqrt" 18
  (and (eq_attr "type" "ssqrt")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,((fpu1,fpu1_iter*17)|(fpu2,fpu2_iter*17))")

(define_insn_reservation "ppc630-dsqrt" 26
  (and (eq_attr "type" "dsqrt")
       (eq_attr "cpu" "ppc630"))
  "ppc604_du,((fpu1,fpu1_iter*25)|(fpu2,fpu2_iter*25))")

(define_insn_reservation "ppc604-crlogical" 4
  (and (eq_attr "type" "cr_logical")
       (eq_attr "cpu" "ppc604"))
  "ppc604_du,bpu")

(define_insn_reservation "ppc604e-crlogical" 1
  (and (eq_attr "type" "cr_logical")
       (eq_attr "cpu" "ppc604e,ppc620,ppc630"))
  "ppc604_du,cru")

(define_insn_reservation "ppc604-mtjmpr" 4
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620,ppc630"))
  "ppc604_du,bpu")

(define_insn_reservation "ppc604-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch")
       (eq_attr "cpu" "ppc604,ppc604e,ppc620,ppc630"))
  "ppc604_du,bpu")


;; PPC740/PPC750/PPC7400  32-bit 2xIU, LSU, SRU, FPU, BPU
;; IU1 can perform all integer operations
;; IU2 can perform all integer operations except imul and idiv
;; LSU 2 stage pipelined
;; FPU 3 stage pipelined
;; Max issue 3 insns/clock cycle (includes 1 branch)
;; In-order execution


;; The PPC750 user's manual recommends that to reduce branch mispredictions,
;; the insn that sets CR bits should be separated from the branch insn
;; that evaluates them.  There is no advantage to having more than 10 cycles
;; of separation.
;; This could be artificially achieved by exaggerating the latency of
;; compare insns but at the expense of a poorer schedule.

;; Branches go straight to the BPU.  All other insns are handled
;; by a dispatch unit which can issue a max of 2 insns per cycle.
(define_reservation "ppc750_du" "du1|du2")

(define_insn_reservation "ppc750-load" 2
  (and (eq_attr "type" "load,fpload")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,lsu")

(define_insn_reservation "ppc750-store" 1
  (and (eq_attr "type" "store,fpstore")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,lsu")

(define_insn_reservation "ppc750-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,(iu1|iu2)")

(define_insn_reservation "ppc750-imul" 4
  (and (eq_attr "type" "imul")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,iu1*4")

(define_insn_reservation "ppc750-imul2" 3
  (and (eq_attr "type" "imul2")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,iu1*2")

(define_insn_reservation "ppc750-imul3" 2
  (and (eq_attr "type" "imul3")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,iu1")

(define_insn_reservation "ppc750-idiv" 19
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,(iu1+iu1_iter*19)")

(define_insn_reservation "ppc750-compare" 1
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,(iu1|iu2)")

(define_insn_reservation "ppc750-fpcompare" 1
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,fpu")

(define_insn_reservation "ppc750-fp" 3
  (and (eq_attr "type" "fp")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,fpu")

(define_insn_reservation "ppc750-dmul" 4
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "ppc750"))
  "ppc750_du,fpu*2")

(define_insn_reservation "ppc7400-dmul" 3
  (and (eq_attr "type" "dmul")
       (eq_attr "cpu" "ppc7400"))
  "ppc750_du,fpu")

; Divides are not pipelined
(define_insn_reservation "ppc750-sdiv" 17
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,(fpu*3+fpu_iter*17)")

(define_insn_reservation "ppc750-ddiv" 31
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,(fpu*3+fpu_iter*31)")

(define_insn_reservation "ppc750-crlogical" 3
  (and (eq_attr "type" "cr_logical")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "ppc750_du,sru*2")

(define_insn_reservation "ppc750-mtjmpr" 2
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "nothing,sru*2")

(define_insn_reservation "ppc750-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch")
       (eq_attr "cpu" "ppc750,ppc7400"))
  "nothing,bpu")


;; PPC7450  32-bit 3xIU, MCIU, LSU, SRU, FPU, BPU, 4xVEC
;; IUa,IUb,IUc can perform all integer operations
;; MCIU performs imul and idiv, cr logical, SPR moves
;; LSU 2 stage pipelined
;; FPU 3 stage pipelined
;; It also has 4 vector units, one for each type of vector instruction.
;; However, we can only dispatch 2 instructions per cycle. 
;; Max issue 3 insns/clock cycle (includes 1 branch)
;; In-order execution

;; Branches go straight to the BPU.  All other insns are handled
;; by a dispatch unit which can issue a max of 3 insns per cycle.
(define_reservation "ppc7450_du" "du1|du2|du3")
(define_reservation "vec_du" "vdu1|vdu2")

(define_insn_reservation "ppc7450-load" 3
  (and (eq_attr "type" "load,vecload")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,lsu")

(define_insn_reservation "ppc7450-store" 3
  (and (eq_attr "type" "store,vecstore")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,lsu")

(define_insn_reservation "ppc7450-fpload" 4
  (and (eq_attr "type" "fpload")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,lsu")

(define_insn_reservation "ppc7450-fpstore" 3
  (and (eq_attr "type" "fpstore")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,lsu*3")

(define_insn_reservation "ppc7450-integer" 1
  (and (eq_attr "type" "integer")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,(iua|iub|iuc)")

(define_insn_reservation "ppc7450-imul" 4
  (and (eq_attr "type" "imul")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,mciu*2")

(define_insn_reservation "ppc7450-imul2" 3
  (and (eq_attr "type" "imul2,imul3")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,mciu")

(define_insn_reservation "ppc7450-idiv" 23
  (and (eq_attr "type" "idiv")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,(mciu+mciu_iter*23)")

(define_insn_reservation "ppc7450-compare" 1
  (and (eq_attr "type" "compare,delayed_compare")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,(iua|iub|iuc)")

(define_insn_reservation "ppc7450-fpcompare" 3
  (and (eq_attr "type" "fpcompare")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,fpu")

(define_insn_reservation "ppc7450-fp" 5
  (and (eq_attr "type" "fp,dmul")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,fpu")

; Divides are not pipelined
(define_insn_reservation "ppc7450-sdiv" 21
  (and (eq_attr "type" "sdiv")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,(fpu*3+fpu_iter*21)")

(define_insn_reservation "ppc7450-ddiv" 35
  (and (eq_attr "type" "ddiv")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,(fpu*3+fpu_iter*35)")

(define_insn_reservation "ppc7450-crlogical" 1
  (and (eq_attr "type" "cr_logical")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,mciu")

(define_insn_reservation "ppc7450-mtjmpr" 2
  (and (eq_attr "type" "mtjmpr")
       (eq_attr "cpu" "ppc7450"))
  "nothing,mciu*2")

(define_insn_reservation "ppc7450-jmpreg" 1
  (and (eq_attr "type" "jmpreg,branch")
       (eq_attr "cpu" "ppc7450"))
  "nothing,bpu")

(define_insn_reservation "ppc7450-vecsimple" 1
  (and (eq_attr "type" "vecsimple")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,vec_du,vec_simple")

(define_insn_reservation "ppc7450-veccomplex" 4
  (and (eq_attr "type" "veccomplex")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,vec_du,vec_complex")

(define_insn_reservation "ppc7450-veccmp" 2
  (and (eq_attr "type" "veccmp")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,vec_du,vec_complex")

(define_insn_reservation "ppc7450-vecfloat" 4
  (and (eq_attr "type" "vecfloat")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,vec_du,vec_float")

(define_insn_reservation "ppc7450-vecperm" 2
  (and (eq_attr "type" "vecperm")
       (eq_attr "cpu" "ppc7450"))
  "ppc7450_du,vec_du,vec_permute")

