This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[mips] SB-1 DFA bits (for contribution purposes only)


*** This post is for code-contribution purposes only ***

Paul Koning (cc'd on this message) did an initial DFA scheduler
description for the SB-1 CPU core.  My goal is to take that work,
improve it, and sync it up with current GCC development so that it
will eventually be integrated into the master sources.

However, before I do that I want to post his work and have him
formally contribute it.  (I'm posting it because of NDA issues.)
Paul, if you could please follow up to this message and say that
you're contributing this code...  8-)

I'm not proposing that the bits in this post be incorporated into GCC
at this time.

Note: this work is known to be incomplete, known not to work with the
current GCC sources, and (after having worked with it for a little
bit) it seems to produce a fairly inaccurate model of the SB-1.
However, I work better starting from something (rather than from
scratch), so it's a good starting point for me, and Paul deserves the
credit for having done the work.  8-)



chris
--
Chris Demetriou                                            Broadcom Corporation
Principal Design Engineer                     Broadband Processor Business Unit
  Any opinions expressed in this message are mine, not necessarily Broadcom's.
--
;; .........................
;;
;;	Automaton definitions
;;
;; The DFA approach to specifying the pipeline behavior
;; is used for CPU type sb1 (SB-1 core for BCM1250)
;;
;; Every reservation below is conditional on cpu == sb1.  That is
;; not strictly necessary at the moment, but it keeps the
;; definitions valid if someone later decides to add DFA style
;; scheduling rules for other CPU types.
;;
;; .........................

(define_cpu_unit "ls0,ls1,exe0,exe1,fp0,fp1")

;; some more "units" to describe issue restrictions
(define_cpu_unit "mul,div,hi,lo,hilomul")
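
;; A possible refinement, sketched here as an assumption rather than
;; as part of the contributed description: the divide reservations
;; further down hold "div" and "hilomul" for up to 67 cycles, and
;; keeping such long-lived units in the same automaton as everything
;; else can make the generated DFA large.  GCC lets units be assigned
;; to named automata; the names "sb1_main" and "sb1_div" are
;; hypothetical.  These forms would replace the two define_cpu_unit
;; forms above, so they are left commented out:
;;
;; (define_automaton "sb1_main,sb1_div")
;; (define_cpu_unit "ls0,ls1,exe0,exe1,fp0,fp1" "sb1_main")
;; (define_cpu_unit "mul,hi,lo" "sb1_main")
;; (define_cpu_unit "div,hilomul" "sb1_div")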

;; the case of move and arith is split up into two definitions
;; ("aluexe" for exe0/exe1 and "aluls1" for ls1) even though they
;; have the same latency, because the difference matters to other
;; instructions issuing later

(define_insn_reservation "aluexe" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "move,arith,darith"))
  "(exe0 | exe1)")

(define_insn_reservation "aluls1" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "move,arith,darith"))
  "ls1")

(define_insn_reservation "shift" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "shift"))
  "(exe0 | exe1)")

(define_insn_reservation "branch" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "jump,call,branch,trap"))
  "exe0")

;;*** this needs work to distinguish MFHI/LO from MTHI/LO
;; and to handle HI vs LO separately.
(define_insn_reservation "hilo" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "hilo"))
  "exe1")

(define_insn_reservation "coproc" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "xfer"))
  "exe1")

(define_insn_reservation "load" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "load"))
  "(ls0 | ls1)")

(define_insn_reservation "store" 1
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "store"))
  "(ls0 | ls1)")

(define_bypass 0 "load" "aluexe,shift")

;; the load/store units read the register file early in the pipe,
;; and the exe units write their results late in the pipe, so we have
;; some unusual dependency timings that we'll express as "bypass"
;; (even though they are slower rather than faster...)

(define_bypass 4 "load,store,shift,coproc" "load,store")

;; the next one is applied to both exe0/exe1 and ls1 cases for alu
;; instructions.  the sb-1 manual clearly says it applies only to
;; the exe case.  we apply it to both, because there is no way to
;; limit the instruction issuing to one or the other unit.  this way,
;; the scheduler will avoid the result conflict if the alu instruction
;; happened to issue to exe0 or exe1, and it does no harm if the
;; instruction happened to end up in ls1 instead.

(define_bypass 4 "aluexe,aluls1" "load,store")

;;*** currently these rules don't distinguish MUL (to register)
;; from MULT (to hi/lo).  some of the issue restrictions don't
;; apply to MUL...

(define_insn_reservation "muls" 3
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "imul")
	    (eq_attr "mode" "SI")))
  "(exe1 + hilomul + hi + lo + mul),(hi + lo)*2")

(define_insn_reservation "muld" 4
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "imul")
	    (eq_attr "mode" "DI")))
  "(exe1 + hilomul + hi + lo + mul),(hi + lo + mul),(hi + lo),hi")
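
;; As a reading aid (derived from the two reservation strings above,
;; not from the SB-1 manual): "," advances one cycle, "+" reserves
;; units together in the same cycle, and "*N" repeats a step N times.
;; Cycle 0 is the issue cycle.
;;
;;   cycle   muls                       muld
;;   0       exe1 hilomul hi lo mul     exe1 hilomul hi lo mul
;;   1       hi lo                      hi lo mul
;;   2       hi lo                      hi lo
;;   3       -                          hi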

(define_bypass 8 "muls,muld" "load,store")

;; unlike the alu to load/store case above, here we do code the
;; two bypass rules explicitly.  the reason is that in this case,
;; the question of which latency applies is decided when the
;; second (dependent) instruction issues, and the cpu can make
;; that choice.  above, what matters is the unit choice made for
;; the first instruction, and since the cpu is not clairvoyant, we
;; have to code for the worst case there.

(define_bypass 3 "muls,muld" "aluexe")

(define_bypass 8 "muls,muld" "aluls1")

;; divide rules are:
;;   exe1 must be free to issue
;;   can't issue another divide while one is in progress
;;	(until the last cycle, that is)
;;   can't issue multiplies that change hi/lo while divide in progress

(define_insn_reservation "divs" 36
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "idiv")
	    (eq_attr "mode" "SI")))
  "(exe1 + div + hilomul),(div + hilomul)*34,nothing")

(define_insn_reservation "divd" 68
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "idiv")
	    (eq_attr "mode" "DI")))
  "(exe1 + div + hilomul),(div + hilomul)*66,nothing")
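
;; A quick check of the two strings above: each one reserves exe1 for
;; the issue cycle only, keeps div and hilomul busy after that, and
;; ends with one "nothing" cycle, so the reservation length matches
;; the stated latency:
;;
;;   divs:  1 + 34 + 1 = 36 cycles  (div + hilomul busy in cycles 0-34)
;;   divd:  1 + 66 + 1 = 68 cycles  (div + hilomul busy in cycles 0-66)
;;
;; The trailing "nothing" is what lets another divide issue in the
;; last cycle of one still in progress, per the rules listed above.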

(define_insn_reservation "farith" 4
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "fadd,fabs,fneg,fcmp,fcvt"))
  "(fp0 | fp1)")

(define_insn_reservation "fmul" 4
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "fmul"))
  "(fp0 | fp1)")

(define_insn_reservation "fmadd" 8
  (and (eq_attr "cpu" "sb1")
       (eq_attr "type" "fmadd"))
  "(fp0 | fp1)")

(define_insn_reservation "fdiv" 24
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "fdiv")
	    (eq_attr "mode" "SF")))
  "(fp0 | fp1),nothing*5")

(define_insn_reservation "dfdiv" 32
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "fdiv")
	    (eq_attr "mode" "DF")))
  "(fp0 | fp1),nothing*7")

(define_insn_reservation "fsqrt" 28
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "fsqrt")
	    (eq_attr "mode" "SF")))
  "(fp0 | fp1),nothing*6")

(define_insn_reservation "dfsqrt" 40
  (and (eq_attr "cpu" "sb1")
       (and (eq_attr "type" "fsqrt")
	    (eq_attr "mode" "DF")))
  "(fp0 | fp1),nothing*9")

