This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: automaton based scheduler documentation


Joern Rennecke wrote:
> 
> I find the documentation a bit hard to read.  Does the
> appended patch retain the meaning you want to convey?
> 
> --
> --------------------------
> SuperH
> 2430 Aztec West / Almondsbury / BRISTOL / BS32 4AQ
> T:+44 1454 462330
> 
>   ------------------------------------------------------------------------
> Index: md.texi
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
> retrieving revision 1.46
> diff -p -u -r1.46 md.texi
> --- md.texi     3 Aug 2002 23:21:31 -0000       1.46
> +++ md.texi     14 Aug 2002 16:09:01 -0000
> @@ -5246,12 +5246,12 @@ branch is true, we might represent this
>  @cindex RISC
>  @cindex VLIW
> 
> -To achieve better productivity most modern processors
> +To achieve better performance most modern processors
>  (super-pipelined, superscalar @acronym{RISC}, and @acronym{VLIW}
>  processors) have many @dfn{functional units} on which several
>  instructions can be executed simultaneously.  An instruction starts
>  execution if its issue conditions are satisfied.  If not, the
> -instruction is interlocked until its conditions are satisfied.  Such
> +instruction is stalled until its conditions are satisfied.  Such
>  @dfn{interlock (pipeline) delay} causes interruption of the fetching
>  of successor instructions (or demands nop instructions, e.g. for some
>  MIPS processors).
> @@ -5274,24 +5274,24 @@ of delay into account is complex especia
>  processors.
> 
>  The task of exploiting more processor parallelism is solved by an
> -instruction scheduler.  For better solution of this problem, the
> +instruction scheduler.  For a better solution of this problem, the
>  instruction scheduler has to have an adequate description of the
>  processor parallelism (or @dfn{pipeline description}).  Currently GCC
>  has two ways to describe processor parallelism.  The first one is old
> -and originated from instruction scheduler written by Michael Tiemann
> +and originated from the instruction scheduler written by Michael Tiemann
>  and described in the first subsequent section.  The second one was
> -created later.  It is based on description of functional unit
> +created later.  It is based on a description of functional unit
>  reservations by processor instructions with the aid of @dfn{regular
>  expressions}.  This is so called @dfn{automaton based description}.
> 
> -Gcc instruction scheduler uses a @dfn{pipeline hazard recognizer} to
> +The GCC instruction scheduler uses a @dfn{pipeline hazard recognizer} to
>  figure out the possibility of the instruction issue by the processor
> -on given simulated processor cycle.  The pipeline hazard recognizer is
> -a code generated from the processor pipeline description.  The
> +on a given simulated processor cycle.  The pipeline hazard recognizer is
> +automatically generated from the processor pipeline description.  The
>  pipeline hazard recognizer generated from the automaton based
> -description is more sophisticated and based on deterministic finite
> +description is more sophisticated and based on a deterministic finite
>  state automaton (@acronym{DFA}) and therefore faster than one
> -generated from the old description.  Also its speed is not depended on
> +generated from the old description.  Also its speed is not dependent on
>  processor complexity.  The instruction issue is possible if there is
>  a transition from one automaton state to another one.
> 
> @@ -5450,7 +5450,7 @@ in the machine description file is not i
>  The following optional construction describes names of automata
>  generated and used for the pipeline hazards recognition.  Sometimes
>  the generated finite state automaton used by the pipeline hazard
> -recognizer is large.  If we use more one automaton and bind functional
> +recognizer is large.  If we use more than one automaton and bind functional
>  units to the automata, the summary size of the automata usually is
>  less than the size of the single automaton.  If there is no one such
>  construction, only one finite state automaton is generated.
> @@ -5477,7 +5477,7 @@ reservations should be described by the
>  separated by commas.  Don't use name @samp{nothing}, it is reserved
>  for other goals.
> 
> -@var{automaton-name} is a string giving the name of automaton with
> +@var{automaton-name} is a string giving the name of the automaton with
>  which the unit is bound.  The automaton should be described in
>  construction @code{define_automaton}.  You should give
>  @dfn{automaton-name}, if there is a defined automaton.
> @@ -5500,14 +5500,14 @@ templates).
>  @var{unit-names} is a string giving names of the functional units
>  separated by commas.
> 
> -@var{automaton-name} is a string giving name of the automaton with
> +@var{automaton-name} is a string giving the name of the automaton with
>  which the unit is bound.
> 
>  @findex define_insn_reservation
>  @cindex instruction latency time
>  @cindex regular expressions
>  @cindex data bypass
> -The following construction is major one to describe pipeline
> +The following construction is the major one to describe pipeline
>  characteristics of an instruction.
> 
>  @smallexample
> @@ -5519,18 +5519,18 @@ characteristics of an instruction.
>  instruction.  There is an important difference between the old
>  description and the automaton based pipeline description.  The latency
>  time is used for all dependencies when we use the old description.  In
> -the automaton based pipeline description, given latency time is used
> +the automaton based pipeline description, the given latency time is used
>  only for true dependencies.  The cost of anti-dependencies is always
>  zero and the cost of output dependencies is the difference between
>  latency times of the producing and consuming insns (if the difference
>  is negative, the cost is considered to be zero).  You always can
> -change the default costs for any description by using target hook
> +change the default costs for any description by using the target hook
>  @code{TARGET_SCHED_ADJUST_COST} (@pxref{Scheduling}).
> 
> -@var{insn-names} is a string giving internal name of the insn.  The
> +@var{insn-names} is a string giving the internal name of the insn.  The
>  internal names are used in constructions @code{define_bypass} and in
>  the automaton description file generated for debugging.  The internal
> -name has nothing common with the names in @code{define_insn}.  It is a
> +name has nothing in common with the names in @code{define_insn}.  It is a
>  good practice to use insn classes described in the processor manual.
> 
>  @var{condition} defines what RTL insns are described by this
> @@ -5545,7 +5545,7 @@ contain @code{symbol_ref}).  It is also
>  pipeline hazard recognizer work because it would slow down the
>  recognizer considerably.
> 
> -@var{regexp} is a string describing reservation of the cpu functional
> +@var{regexp} is a string describing the reservation of the cpu functional
>  units by the instruction.  The reservations are described by a regular
>  expression according to the following syntax:
> 
> @@ -5631,11 +5631,11 @@ given in string @var{out_insn_names} wil
>  instructions given in string @var{in_insn_names}.  The instructions in
>  the string are separated by commas.
> 
> -@var{guard} is an optional string giving name of a C function which
> +@var{guard} is an optional string giving the name of a C function which
>  defines an additional guard for the bypass.  The function will get the
>  two insns as parameters.  If the function returns zero the bypass will
>  be ignored for this case.  The additional guard is necessary to
> -recognize complicated bypasses, e.g. when consumer is only an address
> +recognize complicated bypasses, e.g. when the consumer is only an address
>  of insn @samp{store} (not a stored value).
> 
>  @findex exclusion_set
> @@ -5680,7 +5680,7 @@ it is symmetric).  For example, it is us
>  @acronym{VLIW} @samp{slot0} can not be reserved after @samp{slot1} or
>  @samp{slot2} reservation.
> 
> -All functional units mentioned in a set should belong the same
> +All functional units mentioned in a set should belong to the same
>  automaton.
> 
>  @findex automata_option
> @@ -5747,7 +5747,7 @@ cycles.  The integer division is not pip
>  integer division insn can not be issued until the current division
>  insn finished.  Floating point insns are fully pipelined and their
>  results are ready in 3 cycles.  There is also additional one cycle
> -delay in the usage by integer insns of result produced by floating
> +delay in the usage by integer insns of results produced by floating
>  point insns.  To describe all of this we could specify
> 
>  @smallexample
> @@ -5765,13 +5765,13 @@ point insns.  To describe all of this we
>  (define_insn_reservation "float" 3 (eq_attr "cpu" "float")
>                           "f_pipeline, nothing, (port_0 | port1))
> 

^^^^^^^^^^^^^^^^^^^^^ The unit names are inconsistent here: use either port_0/port_1 or port0/port1.

> -(define_bypass 4 "float" "simple,mut,div")
> +(define_bypass 4 "float" "simple,mult,div")
>  @end smallexample
> 
>  To simplify the description we could describe the following reservation
> 
>  @smallexample
> -(define_reservation "finish" "port0|port1")
> +(define_reservation "finish" "port_0|port1")
>  @end smallexample
> 
>  and use it in all @code{define_insn_reservation} as in the following
> @@ -5821,17 +5821,18 @@ The interface to the pipeline hazard rec
>  one to the automaton based pipeline recognizer.
> 
>  @item
> -An unnatural description when you write an unit and a condition which
> +An unnatural description when you write a unit and a condition which
>  selects instructions using the unit.  Writing all unit reservations
>  for an instruction (an instruction class) is more natural.
> 
>  @item
> -The recognition of the interlock delays has slow implementation.  GCC
> +The recognition of the interlock delays has a slow implementation.  The GCC
>  scheduler supports structures which describe the unit reservations.
> -The more processor has functional units, the slower pipeline hazard
> -recognizer.  Such implementation would become slower when we enable to
> +The more functional units a processor has, the slower its pipeline hazard
> +recognizer will be.  Such an implementation would become even slower when we
> +allowed to
>  reserve functional units not only at the instruction execution start.
> -The automaton based pipeline hazard recognizer speed is not depended
> +The automaton based pipeline hazard recognizer speed is not dependent
>  on processor complexity.
>  @end itemize
> 

  Joern, thank you for the patch.  It is fine with me.  Only one
comment: I think it is better to use either port_0 and port_1 or port0
and port1 consistently in the documentation (see the comment above).
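
For illustration only, here is a rough sketch of how the affected lines
of the example might read with a single spelling.  It assumes port0 and
port1 (port_0 and port_1 would do equally well).  The define_cpu_unit
line is an assumption, since the quoted hunks do not show how the units
are declared, and the sketch also closes the string quote on the "float"
reservation.

```
;; Assumed declaration; not shown in the quoted hunks.
;; f_pipeline is taken to be declared elsewhere in the description.
(define_cpu_unit "port0, port1")

;; Same reservation as in the example, with one spelling for both ports.
(define_insn_reservation "float" 3 (eq_attr "cpu" "float")
                         "f_pipeline, nothing, (port0 | port1)")

(define_bypass 4 "float" "simple,mult,div")

;; The shorthand reservation then uses the same two names.
(define_reservation "finish" "port0|port1")
```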

Thanks again for the patch.  Please go ahead and commit it.

Vlad

