This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

New version of the patch for automaton based pipeline hazard recognizer


  This is a new version of the patch for automaton based pipeline
hazard recognizer.  It is modified according to all comments which I
got after sending the previous version of the patch.  The patch was
also considerably modified because of latest massive changes of the
scheduler (adding target hooks).

  I'd like to repeat that the patch is safe because to use the new
code you should describe an automaton based pipeline description in
the .md file.  But I've made standard test procedure (the bootstrap
test) for i386 traditional pipeline description and for an automaton
based one.

Vladimir Makarov.

2001-08-26  Vladimir Makarov  <vmakarov@touchme.toronto.redhat.com>

        * rtl.def (DEFINE_CPU_UNIT, DEFINE_QUERY_CPU_UNIT, EXCLUSION_SET,
	PRESENCE_SET, ABSENCE_SET, DEFINE_BYPASS, DEFINE_AUTOMATON,
	AUTOMATA_OPTION, DEFINE_RESERVATION, DEFINE_INSN_RESERVATION): New
	RTL constructions.
	
	* genattr.c (main): New variable num_insn_reservations.  Increase
	it if there is DEFINE_INSN_RESERVATION.  Output automaton based
	pipeline hazard recognizer interface.

	* genattrtab.h: New file.
	
	* genattrtab.c: Include genattrtab.h.
	(attr_printf, check_attr_test, make_internal_attr,
	make_numeric_value): Move protypes into genattrtab.h.  Define them
	as external.
	(num_dfa_decls): New global variable.
	(main): Process DEFINE_CPU_UNIT, DEFINE_QUERY_CPU_UNIT,
	DEFINE_BYPASS, EXCLUSION_SET, PRESENCE_SET, ABSENCE_SET,
	DEFINE_AUTOMATON, AUTOMATA_OPTION, DEFINE_RESERVATION,
	DEFINE_INSN_RESERVATION.  Call expand_automata and write_automata.

	* genautomata.c: New file.

	* rtl.h (LINK_COST_ZERO, LINK_COST_FREE): Remove them.
	
        * sched-int.h: (curr_state): Add the external definition for
	automaton pipeline interface.
	(haifa_insn_data): Add comments for members blockage and units.
	
	* target-def.h (TARGET_SCHED_USE_DFA_PIPELINE_INTERFACE,
	TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN,
	TARGET_SCHED_DFA_PRE_CYCLE_INSN,
	TARGET_SCHED_INIT_DFA_POST_CYCLE_INSN,
	TARGET_SCHED_DFA_POST_CYCLE_INSN,
	TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD,
	TARGET_SCHED_INIT_DFA_BUBBLES, TARGET_SCHED_DFA_BUBBLE): New
	macros.
	(TARGET_SCHED): Use the new macros.

	* target.h (use_dfa_pipeline_interface, init_dfa_pre_cycle_insn,
	dfa_pre_cycle_insn, init_dfa_post_cycle_insn, dfa_post_cycle_insn,
	first_cycle_multipass_dfa_lookahead, init_dfa_bubbles,
	dfa_bubble): New members in gcc_target.sched.
	
        * haifa-sched.c (insert_schedule_bubbles_p): New variable.
	(MAX_INSN_QUEUE_INDEX): New macro for automaton interface.
	(insn_queue): Redefine it as pointer to array.
	(NEXT_Q, NEXT_Q_AFTER): Use MAX_INSN_QUEUE_INDEX instead of
	INSN_QUEUE_SIZE.
	(max_insn_queue_index_macro_value): New variable.
	(curr_state, dfa_state_size, ready_try): New varaibles for
	automaton interface.
	(ready_element, ready_remove, max_issue): New function prototypes
	for automaton interface.
	(choose_ready): New function prototype.
	(insn_unit, blockage_range): Add comments.
	(unit_last_insn, unit_tick, unit_n_insns): Define them for case
	FUNCTION_UNITS_SIZE == 0.
	(insn_issue_delay, actual_hazard_this_instance, schedule_unit,
	actual_hazard, potential_hazard): Add comments.
	(insn_cost): Use cost -1 as undefined value.  Remove
	LINK_COST_ZERO and LINK_COST_FREE.  Add new code for automaton
	pipeline interface.
	(ready_element, ready_remove): New functions for automaton
	interface.
	(schedule_insn): Add new code for automaton pipeline interface.
	(queue_to_ready): Add new code for automaton pipeline interface.
	Use MAX_INSN_QUEUE_INDEX instead of INSN_QUEUE_SIZE.
	(debug_ready_list): Print newline when the queue is empty.
	(max_issue): New function for automaton pipeline interface.
	(choose_ready): New function.
	(schedule_block): Add new code for automaton pipeline interface.
	Print ready list before scheduling each insn.
	(sched_init): Add new code for automaton pipeline interface.
	Initiate insn cost by -1.
	(sched_finish): Free the current automaton state and finalize
	automaton pipeline interface.
	
	* sched-rgn.c: Include target.h.
	(init_ready_list, new_ready, debug_dependencies): Add new code for
	automaton pipeline interface.

	* sched-vis.c: Include target.h.
	(get_visual_tbl_length): Add code for automaton interface.
	(target_units, print_block_visualization):  Add comments.
	
        * Makefile.in (GETRUNTIME, HASHTAB, HOST_GETRUNTIME, HOST_HASHTAB,
	USE_HOST_GETRUNTIME, USE_HOST_HASHTAB, HOST_VARRAY): New variables.
	(sched-rgn.o, sched-vis.o): Add new dependency file target.h.
	(getruntime.o, genautomata.o): New entries.
	(genattrtab.o): Add new dependency file genattrtab.h.
	(genattrtab): Add new dependencies.  Link it with `libm.a'.
	(getruntime.o, hashtab.o): New entries for canadian cross.

	* doc/md.texi: Description of automaton based model.
	
	* doc/tm.texi (TARGET_SCHED_ISSUE_RATE, TARGET_SCHED_ADJUST_COST):
	Add comments.
	(TARGET_SCHED_USE_DFA_PIPELINE_INTERFACE,
	TARGET_SCHED_DFA_PRE_CYCLE_INSN,
	TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN,
	TARGET_SCHED_DFA_POST_CYCLE_INSN,
	TARGET_SCHED_INIT_DFA_POST_CYCLE_INSN,
	TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD,
	TARGET_SCHED_INIT_DFA_BUBBLES, TARGET_SCHED_DFA_BUBBLE): The new
	hook descriptions.
	(TRADITIONAL_PIPELINE_INTERFACE, DFA_PIPELINE_INTERFACE,
	MAX_DFA_ISSUE_RATE): New macro descriptions.
	
	* doc/contrib.texi: Add dfa based scheduler contribution.

	* doc/gcc.texi: Add more information about genattrtab.

-------------- New file genattrtab.h -------------------------------------
/* External definitions of source files of genattrtab.
   Copyright (C)  2001 Free Software Foundation, Inc.

This file is part of GNU CC.

GNU CC is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

GNU CC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GNU CC; see the file COPYING.  If not, write to
the Free Software Foundation, 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.  */

/* Defined in genattrtab.c: */
extern rtx check_attr_test	PARAMS ((rtx, int, int));
extern rtx make_numeric_value	PARAMS ((int));
extern void make_internal_attr	PARAMS ((const char *, rtx, int));
extern char *attr_printf	PARAMS ((int, const char *, ...))
  ATTRIBUTE_PRINTF_2;

extern int num_dfa_decls;

/* Defined in genautomata.c: */
extern void gen_cpu_unit		PARAMS ((rtx));
extern void gen_query_cpu_unit		PARAMS ((rtx));
extern void gen_bypass			PARAMS ((rtx));
extern void gen_excl_set		PARAMS ((rtx));
extern void gen_presence_set		PARAMS ((rtx));
extern void gen_absence_set		PARAMS ((rtx));
extern void gen_automaton		PARAMS ((rtx));
extern void gen_automata_option		PARAMS ((rtx));
extern void gen_reserv   		PARAMS ((rtx));
extern void gen_insn_reserv     	PARAMS ((rtx));
extern void initiate_automaton_gen	PARAMS ((int, char **));
extern void expand_automata             PARAMS ((void));
extern void write_automata              PARAMS ((void));
-------------- End of file genattrtab.h -------------------------------------

Index: rtl.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtl.def,v
retrieving revision 1.46
diff -c -p -r1.46 rtl.def
*** rtl.def	2001/08/22 14:35:35	1.46
--- rtl.def	2001/08/26 21:23:45
*************** DEF_RTL_EXPR(SEQUENCE, "sequence", "E", 
*** 334,339 ****
--- 334,477 ----
  DEF_RTL_EXPR(ADDRESS, "address", "e", 'm')
  
  /* ----------------------------------------------------------------------
+    Constructions for CPU pipeline description described by NDFAs.
+    These do not appear in actual rtl code in the compiler.
+    ---------------------------------------------------------------------- */
+ 
+ /* (define_cpu_unit string [string]) describes cpu functional
+    units (separated by comma).
+ 
+    1st operand: Names of cpu functional units.
+    2nd operand: Name of automaton (see comments for DEFINE_AUTOMATON).
+ 
+    All define_reservations, define_cpu_units, and
+    define_query_cpu_units should have unique names which may not be
+    "nothing".  */
+ DEF_RTL_EXPR(DEFINE_CPU_UNIT, "define_cpu_unit", "sS", 'x')
+ 
+ /* (define_query_cpu_unit string [string]) describes cpu functional
+    units analogously to define_cpu_unit.  If we use automaton without
+    minimization, the reservation of such units can be queried for
+    automaton state.  */
+ DEF_RTL_EXPR(DEFINE_QUERY_CPU_UNIT, "define_query_cpu_unit", "sS", 'x')
+ 
+ /* (exclusion_set string string) means that each CPU functional unit
+    in the first string can not be reserved simultaneously with any
+    unit whose name is in the second string and vise versa.  CPU units
+    in the string are separated by commas.  For example, it is useful
+    for description CPU with fully pipelined floating point functional
+    unit which can execute simultaneously only single floating point
+    insns or only double floating point insns.  */
+ DEF_RTL_EXPR(EXCLUSION_SET, "exclusion_set", "ss", 'x')
+ 
+ /* (presence_set string string) means that each CPU functional unit in
+    the first string can not be reserved unless at least one of units
+    whose names are in the second string is reserved.  This is an
+    asymmetric relation.  CPU units in the string are separated by
+    commas.  For example, it is useful for description that slot1 is
+    reserved after slot0 reservation for VLIW processor.  */
+ DEF_RTL_EXPR(PRESENCE_SET, "presence_set", "ss", 'x')
+ 
+ /* (absence_set string string) means that each CPU functional unit in
+    the first string can not be reserved only if each unit whose name
+    is in the second string is not reserved.  This is an asymmetric
+    relation (actually exclusion set is analogous to this one but it is
+    symmetric).  CPU units in the string are separated by commas.  For
+    example, it is useful for description that slot0 can not be
+    reserved after slot1 or slot2 reservation for VLIW processor.  */
+ DEF_RTL_EXPR(ABSENCE_SET, "absence_set", "ss", 'x')
+ 
+ /* (define_bypass number out_insn_names in_insn_names) names bypass
+    with given latency (the first number) from insns given by the first
+    string (see define_insn_reservation) into insns given by the second
+    string.  Insn names in the strings are separated by commas.  The
+    third operand is optional name of function which is additional
+    guard for the bypass.  The function will get the two insns as
+    parameters.  If the function returns zero the bypass will be
+    ignored for this case.  Additional guard is necessary to recognize
+    complicated bypasses, e.g. when consumer is load address.  */
+ DEF_RTL_EXPR(DEFINE_BYPASS, "define_bypass", "issS", 'x')
+ 
+ /* (define_automaton string) describes names of automata generated and
+    used for pipeline hazards recognition.  The names are separated by
+    comma.  Actually it is possibly to generate the single automaton
+    but unfortunately it can be very large.  If we use more one
+    automata, the summary size of the automata usually is less than the
+    single one.  The automaton name is used in define_cpu_unit and
+    define_query_cpu_unit.  All automata should have unique names.  */
+ DEF_RTL_EXPR(DEFINE_AUTOMATON, "define_automaton", "s", 'x')
+ 
+ /* (automata_option string) describes option for generation of
+    automata.  Currently there are the following options:
+ 
+    o "no-minimization" which makes no minimization of automata.  This
+      is only worth to do when we are going to query CPU functional
+      unit reservations in an automaton state.
+ 
+    o "w" which means generation of file describing the result
+      automaton.  The file can be used for the description verification.
+ 
+    o "ndfa" which makes nondeterministic finite state automata.  */
+ DEF_RTL_EXPR(AUTOMATA_OPTION, "automata_option", "s", 'x')
+ 
+ /* (define_reservation string string) names reservation (the first
+    string) of cpu functional units (the 2nd string).  Sometimes unit
+    reservations for different insns contain common parts.  In such
+    case, you can describe common part and use its name (the 1st
+    parameter) in regular expression in define_insn_reservation.  All
+    define_reservations, define_cpu_units, and define_query_cpu_units
+    should have unique names which may not be "nothing".  */
+ DEF_RTL_EXPR(DEFINE_RESERVATION, "define_reservation", "ss", 'x')
+ 
+ /* (define_insn_reservation name default_latency condition regexpr)
+    describes reservation of cpu functional units (the 3nd operand) for
+    instruction which is selected by the condition (the 2nd parameter).
+    The first parameter is used for output of debugging information.
+    The reservations are described by a regular expression according
+    the following syntax:
+ 
+        regexp = regexp "," oneof
+               | oneof
+ 
+        oneof = oneof "|" allof
+              | allof
+ 
+        allof = allof "+" repeat
+              | repeat
+  
+        repeat = element "*" number
+               | element
+ 
+        element = cpu_function_unit_name
+                | reservation_name
+                | result_name
+                | "nothing"
+                | "(" regexp ")"
+ 
+        1. "," is used for describing start of the next cycle in
+        reservation.
+ 
+        2. "|" is used for describing the reservation described by the
+        first regular expression *or* the reservation described by the
+        second regular expression *or* etc.
+ 
+        3. "+" is used for describing the reservation described by the
+        first regular expression *and* the reservation described by the
+        second regular expression *and* etc.
+ 
+        4. "*" is used for convinience and simply means sequence in
+        which the regular expression are repeated NUMBER times with
+        cycle advancing (see ",").
+ 
+        5. cpu functional unit name which means its reservation.
+ 
+        6. reservation name -- see define_reservation.
+ 
+        7. string "nothing" means no units reservation.  */
+ 
+ DEF_RTL_EXPR(DEFINE_INSN_RESERVATION, "define_insn_reservation", "sies", 'x')
+ 
+ /* ----------------------------------------------------------------------
     Expressions used for insn attributes.  These also do not appear in
     actual rtl code in the compiler.
     ---------------------------------------------------------------------- */
Index: rtl.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtl.h,v
retrieving revision 1.292
diff -c -p -r1.292 rtl.h
*** rtl.h	2001/08/25 21:08:27	1.292
--- rtl.h	2001/08/26 21:23:45
*************** struct rtx_def
*** 110,120 ****
    ENUM_BITFIELD(machine_mode) mode : 8;
  
    /* 1 in an INSN if it can alter flow of control
!      within this function.
!      LINK_COST_ZERO in an INSN_LIST.  */
    unsigned int jump : 1;
!   /* 1 in an INSN if it can call another function.
!      LINK_COST_FREE in an INSN_LIST.  */
    unsigned int call : 1;
    /* 1 in a REG if value of this expression will never change during
       the current function, even though it is not manifestly constant.
--- 110,118 ----
    ENUM_BITFIELD(machine_mode) mode : 8;
  
    /* 1 in an INSN if it can alter flow of control
!      within this function.  */
    unsigned int jump : 1;
!   /* 1 in an INSN if it can call another function.  */
    unsigned int call : 1;
    /* 1 in a REG if value of this expression will never change during
       the current function, even though it is not manifestly constant.
*************** extern unsigned int subreg_regno 	PARAMS
*** 898,913 ****
  /* During sched, for an insn, 1 means that the insn must be scheduled together
     with the preceding insn.  */
  #define SCHED_GROUP_P(INSN) ((INSN)->in_struct)
- 
- /* During sched, for the LOG_LINKS of an insn, these cache the adjusted
-    cost of the dependence link.  The cost of executing an instruction
-    may vary based on how the results are used.  LINK_COST_ZERO is 1 when
-    the cost through the link varies and is unchanged (i.e., the link has
-    zero additional cost).  LINK_COST_FREE is 1 when the cost through the
-    link is zero (i.e., the link makes the cost free).  In other cases,
-    the adjustment to the cost is recomputed each time it is needed.  */
- #define LINK_COST_ZERO(X) ((X)->jump)
- #define LINK_COST_FREE(X) ((X)->call)
  
  /* For a SET rtx, SET_DEST is the place that is set
     and SET_SRC is the value it is set to.  */
--- 896,901 ----
Index: target-def.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/target-def.h,v
retrieving revision 1.11
diff -c -p -r1.11 target-def.h
*** target-def.h	2001/08/18 20:25:50	1.11
--- target-def.h	2001/08/26 21:23:45
*************** Foundation, 59 Temple Place - Suite 330,
*** 93,108 ****
  #define TARGET_SCHED_REORDER 0
  #define TARGET_SCHED_REORDER2 0
  #define TARGET_SCHED_CYCLE_DISPLAY 0
  
! #define TARGET_SCHED	{TARGET_SCHED_ADJUST_COST,	\
! 			 TARGET_SCHED_ADJUST_PRIORITY,	\
! 			 TARGET_SCHED_ISSUE_RATE,	\
! 			 TARGET_SCHED_VARIABLE_ISSUE,	\
! 			 TARGET_SCHED_INIT,		\
! 			 TARGET_SCHED_FINISH,		\
! 			 TARGET_SCHED_REORDER,		\
! 			 TARGET_SCHED_REORDER2,		\
! 			 TARGET_SCHED_CYCLE_DISPLAY}
  
  /* All in tree.c.  */
  #define TARGET_MERGE_DECL_ATTRIBUTES merge_decl_attributes
--- 93,125 ----
  #define TARGET_SCHED_REORDER 0
  #define TARGET_SCHED_REORDER2 0
  #define TARGET_SCHED_CYCLE_DISPLAY 0
+ #define TARGET_SCHED_USE_DFA_PIPELINE_INTERFACE 0
+ #define TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN 0
+ #define TARGET_SCHED_DFA_PRE_CYCLE_INSN 0
+ #define TARGET_SCHED_INIT_DFA_POST_CYCLE_INSN 0
+ #define TARGET_SCHED_DFA_POST_CYCLE_INSN 0
+ #define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD 0
+ #define TARGET_SCHED_INIT_DFA_BUBBLES 0
+ #define TARGET_SCHED_DFA_BUBBLE 0
  
! #define TARGET_SCHED						\
!   {TARGET_SCHED_ADJUST_COST,					\
!    TARGET_SCHED_ADJUST_PRIORITY,				\
!    TARGET_SCHED_ISSUE_RATE,					\
!    TARGET_SCHED_VARIABLE_ISSUE,					\
!    TARGET_SCHED_INIT,						\
!    TARGET_SCHED_FINISH,						\
!    TARGET_SCHED_REORDER,					\
!    TARGET_SCHED_REORDER2,					\
!    TARGET_SCHED_CYCLE_DISPLAY,					\
!    TARGET_SCHED_USE_DFA_PIPELINE_INTERFACE,			\
!    TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN,			\
!    TARGET_SCHED_DFA_PRE_CYCLE_INSN,				\
!    TARGET_SCHED_INIT_DFA_POST_CYCLE_INSN,			\
!    TARGET_SCHED_DFA_POST_CYCLE_INSN,				\
!    TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD,		\
!    TARGET_SCHED_INIT_DFA_BUBBLES,				\
!    TARGET_SCHED_DFA_BUBBLE}
  
  /* All in tree.c.  */
  #define TARGET_MERGE_DECL_ATTRIBUTES merge_decl_attributes
Index: target.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/target.h,v
retrieving revision 1.13
diff -c -p -r1.13 target.h
*** target.h	2001/08/18 20:25:49	1.13
--- target.h	2001/08/26 21:23:45
*************** struct gcc_target
*** 113,118 ****
--- 113,159 ----
         insn in the new chain we're building.  Returns a new LAST.
         The default is to do nothing.  */
      rtx (* cycle_display) PARAMS ((int clock, rtx last));
+     /* The following member value is a pointer to a function returning
+        nonzero if we should use DFA based scheduling.  The default is
+        to use the old pipeline scheduler.  */
+     int (* use_dfa_pipeline_interface) PARAMS ((void));
+     /* The values of all the following members are used only for the
+        DFA based scheduler: */
+     /* The values of the following four members are pointers to
+        functions used to simplify the automaton descriptions.
+        dfa_pre_cycle_insn and dfa_post_cycle_insn give functions
+        returning insns which are used to change the pipeline hazard
+        recognizer state when the new simulated processor cycle
+        correspondingly starts and finishes.  The function defined by
+        init_dfa_pre_cycle_insn and init_dfa_post_cycle_insn are used
+        to initialize the corresponding insns.  The default values of
+        the memebers result in not changing the automaton state when
+        the new simulated processor cycle correspondingly starts and
+        finishes.  */
+     void (* init_dfa_pre_cycle_insn) PARAMS ((void));
+     rtx (* dfa_pre_cycle_insn) PARAMS ((void));
+     void (* init_dfa_post_cycle_insn) PARAMS ((void));
+     rtx (* dfa_post_cycle_insn) PARAMS ((void));
+     /* The following member value is a pointer to a function returning value
+        which defines how many insns in queue `ready' will we try for
+        multi-pass scheduling.  if the member value is nonzero and the
+        function returns positive value, the DFA based scheduler will make
+        multi-pass scheduling for the first cycle.  In other words, we will
+        try to choose ready insn which permits to start maximum number of
+        insns on the same cycle.  */
+     int (* first_cycle_multipass_dfa_lookahead) PARAMS ((void));
+     /* The values of the following members are pointers to functions
+        used to improve the first cycle multipass scheduling by
+        inserting nop insns.  dfa_scheduler_bubble gives a function
+        returning a nop insn with given index.  The indexes start with
+        zero.  The function should return NULL if there are no more nop
+        insns with indexes greater than given index.  To initialize the
+        nop insn the function given by member
+        init_dfa_scheduler_bubbles is used.  The default values of the
+        members result in not inserting nop insns during the multipass
+        scheduling.  */
+     void (* init_dfa_bubbles) PARAMS ((void));
+     rtx (* dfa_bubble) PARAMS ((int));
    } sched;
  
    /* Given two decls, merge their attributes and return the result.  */
Index: genattr.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genattr.c,v
retrieving revision 1.41
diff -c -p -r1.41 genattr.c
*** genattr.c	2001/08/22 14:35:15	1.41
--- genattr.c	2001/08/26 21:23:45
*************** main (argc, argv)
*** 193,198 ****
--- 193,199 ----
    int have_delay = 0;
    int have_annul_true = 0;
    int have_annul_false = 0;
+   int num_insn_reservations = 0;
    int num_units = 0;
    struct range all_simultaneity, all_multiplicity;
    struct range all_ready_cost, all_issue_delay, all_blockage;
*************** main (argc, argv)
*** 308,317 ****
  	  extend_range (&all_issue_delay,
  			unit->issue_delay.min, unit->issue_delay.max);
  	}
      }
  
!   if (num_units > 0)
      {
        /* Compute the range of blockage cost values.  See genattrtab.c
  	 for the derivation.  BLOCKAGE (E,C) when SIMULTANEITY is zero is
  
--- 309,326 ----
  	  extend_range (&all_issue_delay,
  			unit->issue_delay.min, unit->issue_delay.max);
  	}
+       else if (GET_CODE (desc) == DEFINE_INSN_RESERVATION)
+ 	num_insn_reservations++;
      }
  
!   if (num_units > 0 || num_insn_reservations > 0)
      {
+       if (num_units > 0)
+ 	printf ("#define TRADITIONAL_PIPELINE_INTERFACE 1\n");
+ 
+       if (num_insn_reservations > 0)
+ 	printf ("#define DFA_PIPELINE_INTERFACE 1\n");
+ 
        /* Compute the range of blockage cost values.  See genattrtab.c
  	 for the derivation.  BLOCKAGE (E,C) when SIMULTANEITY is zero is
  
*************** main (argc, argv)
*** 348,353 ****
--- 357,452 ----
  
        write_units (num_units, &all_multiplicity, &all_simultaneity,
  		   &all_ready_cost, &all_issue_delay, &all_blockage);
+ 
+       /* Output interface for pipeline hazards recognition based on
+ 	 DFA (deterministic finite state automata.  */
+       printf ("\n/* DFA based pipeline interface.  */");
+       printf ("\n#ifndef AUTOMATON_STATE_ALTS\n");
+       printf ("#define AUTOMATON_STATE_ALTS 0\n");
+       printf ("#endif\n\n");
+       printf ("#ifndef CPU_UNITS_QUERY\n");
+       printf ("#define CPU_UNITS_QUERY 0\n");
+       printf ("#endif\n\n");
+       /* Interface itself: */
+       printf ("extern int max_dfa_issue_rate;\n\n");
+       printf ("/* The following macro value is calculated from the\n");
+       printf ("   automaton based pipeline description and is equal to\n");
+       printf ("   maximal number of all insns described in constructions\n");
+       printf ("   `define_insn_reservation' which can be issued on the\n");
+       printf ("   same processor cycle. */\n");
+       printf ("#define MAX_DFA_ISSUE_RATE max_dfa_issue_rate\n\n");
+       printf ("/* Insn latency time defined in define_insn_reservation. */\n");
+       printf ("extern int insn_default_latency PARAMS ((rtx));\n\n");
+       printf ("/* Return nonzero if there is a bypass for given insn\n");
+       printf ("   which is a data producer.  */\n");
+       printf ("extern int bypass_p PARAMS ((rtx));\n\n");
+       printf ("/* Insn latency time on data consumed by the 2nd insn.\n");
+       printf ("   Use the function if bypass_p returns nonzero for\n");
+       printf ("   the 1st insn. */\n");
+       printf ("extern int insn_latency PARAMS ((rtx, rtx));\n\n");
+       printf ("/* The following function returns number of alternative\n");
+       printf ("   reservations of given insn.  It may be used for better\n");
+       printf ("   insns scheduling heuristics. */\n");
+       printf ("extern int insn_alts PARAMS ((rtx));\n\n");
+       printf ("/* Maximal possible number of insns waiting results being\n");
+       printf ("   produced by insns whose execution is not finished. */\n");
+       printf ("extern int max_insn_queue_index;\n\n");
+       printf ("/* Pointer to data describing current state of DFA.  */\n");
+       printf ("typedef void *state_t;\n\n");
+       printf ("/* Size of the data in bytes.  */\n");
+       printf ("extern int state_size PARAMS ((void));\n\n");
+       printf ("/* Initiate given DFA state, i.e. Set up the state\n");
+       printf ("   as all functional units were not reserved.  */\n");
+       printf ("extern void state_reset PARAMS ((state_t));\n");
+       printf ("/* The following function returns negative value if given\n");
+       printf ("   insn can be issued in processor state described by given\n");
+       printf ("   DFA state.  In this case, the DFA state is changed to\n");
+       printf ("   reflect the current and future reservations by given\n");
+       printf ("   insn.  Otherwise the function returns minimal time\n");
+       printf ("   delay to issue the insn.  This delay may be zero\n");
+       printf ("   for superscalar or VLIW processors.  If the second\n");
+       printf ("   parameter is NULL the function changes given DFA state\n");
+       printf ("   as new processor cycle started.  */\n");
+       printf ("extern int state_transition PARAMS ((state_t, rtx));\n");
+       printf ("\n#if AUTOMATON_STATE_ALTS\n");
+       printf ("/* The following function returns number of possible\n");
+       printf ("   alternative reservations of given insn in given\n");
+       printf ("   DFA state.  It may be used for better insns scheduling\n");
+       printf ("   heuristics.  By default the function is defined if\n");
+       printf ("   macro AUTOMATON_STATE_ALTS is defined because its\n");
+       printf ("   implementation may require much memory.  */\n");
+       printf ("extern int state_alts PARAMS ((state_t, rtx));\n");
+       printf ("#endif\n\n");
+       printf ("extern int min_issue_delay PARAMS ((state_t, rtx));\n");
+       printf ("/* The following function returns nonzero if no one insn\n");
+       printf ("   can be issued in current DFA state. */\n");
+       printf ("extern int state_dead_lock_p PARAMS ((state_t));\n");
+       printf ("/* The function returns minimal delay of issue of the 2nd\n");
+       printf ("   insn after issuing the 1st insn in given DFA state.\n");
+       printf ("   The 1st insn should be issued in given state (i.e.\n");
+       printf ("    state_transition should return negative value for\n");
+       printf ("    the insn and the state).  Data dependencies between\n");
+       printf ("    the insns are ignored by the function.  */\n");
+       printf
+ 	("extern int min_insn_conflict_delay PARAMS ((state_t, rtx, rtx));\n");
+       printf ("/* The following function outputs reservations for given\n");
+       printf ("   insn as they are described in the corresponding\n");
+       printf ("   define_insn_reservation.  */\n");
+       printf ("extern void print_reservation PARAMS ((FILE *, rtx));\n");
+       printf ("\n#if CPU_UNITS_QUERY\n");
+       printf ("/* The following function returns code of functional unit\n");
+       printf ("   with given name (see define_cpu_unit). */\n");
+       printf ("extern int get_cpu_unit_code PARAMS ((const char *));\n");
+       printf ("/* The following function returns nonzero if functional\n");
+       printf ("   unit with given code is currently reserved in given\n");
+       printf ("   DFA state.  */\n");
+       printf ("extern int cpu_unit_reservation_p PARAMS ((state_t, int));\n");
+       printf ("#endif\n\n");
+       printf ("/* Initiate and finish work with DFA.  They should be\n");
+       printf ("   called as the first and the last interface\n");
+       printf ("   functions.  */\n");
+       printf ("extern void dfa_start PARAMS ((void));\n");
+       printf ("extern void dfa_finish PARAMS ((void));\n");
      }
  
    /* Output flag masks for use by reorg.  
Index: genattrtab.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genattrtab.c,v
retrieving revision 1.92
diff -c -p -r1.92 genattrtab.c
*** genattrtab.c	2001/08/22 14:35:15	1.92
--- genattrtab.c	2001/08/26 21:23:46
*************** Software Foundation, 59 Temple Place - S
*** 110,115 ****
--- 110,117 ----
  #include "obstack.h"
  #include "errors.h"
  
+ #include "genattrtab.h"
+ 
  static struct obstack obstack1, obstack2;
  struct obstack *hash_obstack = &obstack1;
  struct obstack *temp_obstack = &obstack2;
*************** static int have_annul_true, have_annul_f
*** 304,309 ****
--- 306,313 ----
  static int num_units, num_unit_opclasses;
  static int num_insn_ents;
  
+ int num_dfa_decls;
+ 
  /* Used as operand to `operate_exp':  */
  
  enum operator {PLUS_OP, MINUS_OP, POS_MINUS_OP, EQ_OP, OR_OP, ORX_OP, MAX_OP, MIN_OP, RANGE_OP};
*************** rtx pic_offset_table_rtx;
*** 365,374 ****
  static void attr_hash_add_rtx	PARAMS ((int, rtx));
  static void attr_hash_add_string PARAMS ((int, char *));
  static rtx attr_rtx		PARAMS ((enum rtx_code, ...));
- static char *attr_printf	PARAMS ((int, const char *, ...))
-   ATTRIBUTE_PRINTF_2;
  static char *attr_string        PARAMS ((const char *, int));
- static rtx check_attr_test	PARAMS ((rtx, int, int));
  static rtx check_attr_value	PARAMS ((rtx, struct attr_desc *));
  static rtx convert_set_attr_alternative PARAMS ((rtx, struct insn_def *));
  static rtx convert_set_attr	PARAMS ((rtx, struct insn_def *));
--- 369,375 ----
*************** static void write_const_num_delay_slots 
*** 452,461 ****
  static int n_comma_elts		PARAMS ((const char *));
  static char *next_comma_elt	PARAMS ((const char **));
  static struct attr_desc *find_attr PARAMS ((const char *, int));
- static void make_internal_attr	PARAMS ((const char *, rtx, int));
  static struct attr_value *find_most_used  PARAMS ((struct attr_desc *));
  static rtx find_single_value	PARAMS ((struct attr_desc *));
- static rtx make_numeric_value	PARAMS ((int));
  static void extend_range	PARAMS ((struct range *, int, int));
  static rtx attr_eq		PARAMS ((const char *, const char *));
  static const char *attr_numeral	PARAMS ((int));
--- 453,460 ----
*************** attr_rtx VPARAMS ((enum rtx_code code, .
*** 742,748 ****
  
     rtx attr_printf (len, format, [arg1, ..., argn])  */
  
! static char *
  attr_printf VPARAMS ((register int len, const char *fmt, ...))
  {
  #ifndef ANSI_PROTOTYPES
--- 741,747 ----
  
     rtx attr_printf (len, format, [arg1, ..., argn])  */
  
! char *
  attr_printf VPARAMS ((register int len, const char *fmt, ...))
  {
  #ifndef ANSI_PROTOTYPES
*************** attr_copy_rtx (orig)
*** 930,936 ****
  
     Return the new expression, if any.   */
  
! static rtx
  check_attr_test (exp, is_const, lineno)
       rtx exp;
       int is_const;
--- 929,935 ----
  
     Return the new expression, if any.   */
  
! rtx
  check_attr_test (exp, is_const, lineno)
       rtx exp;
       int is_const;
*************** find_attr (name, create)
*** 5885,5891 ****
  
  /* Create internal attribute with the given default value.  */
  
! static void
  make_internal_attr (name, value, special)
       const char *name;
       rtx value;
--- 5884,5890 ----
  
  /* Create internal attribute with the given default value.  */
  
! void
  make_internal_attr (name, value, special)
       const char *name;
       rtx value;
*************** find_single_value (attr)
*** 5952,5958 ****
  
  /* Return (attr_value "n") */
  
! static rtx
  make_numeric_value (n)
       int n;
  {
--- 5951,5957 ----
  
  /* Return (attr_value "n") */
  
! rtx
  make_numeric_value (n)
       int n;
  {
*************** from the machine description file `md'. 
*** 6102,6107 ****
--- 6101,6107 ----
  
    /* Read the machine description.  */
  
+   initiate_automaton_gen (argc, argv);
    while (1)
      {
        int lineno;
*************** from the machine description file `md'. 
*** 6130,6135 ****
--- 6130,6175 ----
  	  gen_unit (desc, lineno);
  	  break;
  
+ 	case DEFINE_CPU_UNIT:
+ 	  gen_cpu_unit (desc);
+ 	  break;
+ 	  
+ 	case DEFINE_QUERY_CPU_UNIT:
+ 	  gen_query_cpu_unit (desc);
+ 	  break;
+ 	  
+ 	case DEFINE_BYPASS:
+ 	  gen_bypass (desc);
+ 	  break;
+ 	  
+ 	case EXCLUSION_SET:
+ 	  gen_excl_set (desc);
+ 	  break;
+ 	  
+ 	case PRESENCE_SET:
+ 	  gen_presence_set (desc);
+ 	  break;
+ 	  
+ 	case ABSENCE_SET:
+ 	  gen_absence_set (desc);
+ 	  break;
+ 	  
+ 	case DEFINE_AUTOMATON:
+ 	  gen_automaton (desc);
+ 	  break;
+ 	  
+ 	case AUTOMATA_OPTION:
+ 	  gen_automata_option (desc);
+ 	  break;
+ 	  
+ 	case DEFINE_RESERVATION:
+ 	  gen_reserv (desc);
+ 	  break;
+ 	  
+ 	case DEFINE_INSN_RESERVATION:
+ 	  gen_insn_reserv (desc);
+ 	  break;
+ 
  	default:
  	  break;
  	}
*************** from the machine description file `md'. 
*** 6154,6162 ****
    if (num_delays)
      expand_delays ();
  
!   /* Expand DEFINE_FUNCTION_UNIT information into new attributes.  */
!   if (num_units)
!     expand_units ();
  
    printf ("#include \"config.h\"\n");
    printf ("#include \"system.h\"\n");
--- 6194,6207 ----
    if (num_delays)
      expand_delays ();
  
!   if (num_units || num_dfa_decls)
!     {
!       /* Expand DEFINE_FUNCTION_UNIT information into new attributes.  */
!       expand_units ();
!       /* Build DFA, output some functions and expand DFA information
! 	 into new attributes.  */
!       expand_automata ();
!     }
  
    printf ("#include \"config.h\"\n");
    printf ("#include \"system.h\"\n");
*************** from the machine description file `md'. 
*** 6231,6239 ****
  	write_eligible_delay ("annul_false");
      }
  
!   /* Write out information about function units.  */
!   if (num_units)
!     write_function_unit_info ();
  
    /* Write out constant delay slot info */
    write_const_num_delay_slots ();
--- 6276,6289 ----
  	write_eligible_delay ("annul_false");
      }
  
!   if (num_units || num_dfa_decls)
!     {
!       /* Write out information about function units.  */
!       write_function_unit_info ();
!       /* Output code for pipeline hazards recognition based on DFA
! 	 (deterministic finite state automata. */
!       write_automata ();
!     }
  
    /* Write out constant delay slot info */
    write_const_num_delay_slots ();
Index: sched-int.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-int.h,v
retrieving revision 1.11
diff -c -p -r1.11 sched-int.h
*** sched-int.h	2001/08/22 14:35:40	1.11
--- sched-int.h	2001/08/26 21:23:46
*************** along with GCC; see the file COPYING.  I
*** 20,25 ****
--- 20,28 ----
  Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA
  02111-1307, USA.  */
  
+ /* Pointer to data describing the current DFA state.  */
+ extern state_t curr_state;
+ 
  /* Forward declaration.  */
  struct ready_list;
  
*************** struct haifa_insn_data
*** 181,187 ****
    int dep_count;
  
    /* An encoding of the blockage range function.  Both unit and range
!      are coded.  */
    unsigned int blockage;
  
    /* Number of instructions referring to this insn.  */
--- 184,190 ----
    int dep_count;
  
    /* An encoding of the blockage range function.  Both unit and range
!      are coded.  This member is used only for old pipeline interface.  */
    unsigned int blockage;
  
    /* Number of instructions referring to this insn.  */
*************** struct haifa_insn_data
*** 193,199 ****
  
    short cost;
  
!   /* An encoding of the function units used.  */
    short units;
  
    /* This weight is an estimation of the insn's contribution to
--- 196,203 ----
  
    short cost;
  
!   /* An encoding of the function units used.  This member is used only
!      for old pipeline interface.  */
    short units;
  
    /* This weight is an estimation of the insn's contribution to
Index: haifa-sched.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/haifa-sched.c,v
retrieving revision 1.184
diff -c -p -r1.184 haifa-sched.c
*** haifa-sched.c	2001/08/22 14:35:18	1.184
--- haifa-sched.c	2001/08/26 21:23:46
*************** Free Software Foundation, 59 Temple Plac
*** 158,163 ****
--- 158,169 ----
  
  static int issue_rate;
  
+ /* If the following variable value is non zero, the scheduler inserts
+    bubbles (nop insns).  The value of variable affects on scheduler
+    behavior only if automaton pipeline interface with multipass
+    scheduling is used and hook dfa_bubble is defined.  */
+ int insert_schedule_bubbles_p = 0;
+ 
  /* sched-verbose controls the amount of debugging output the
     scheduler prints.  It is controlled by -fsched-verbose=N:
     N>0 and no -DSR : the output is directed to stderr.
*************** static rtx note_list;
*** 254,268 ****
     passes or stalls are introduced.  */
  
  /* Implement a circular buffer to delay instructions until sufficient
!    time has passed.  INSN_QUEUE_SIZE is a power of two larger than
!    MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c.  This is the
!    longest time an isnsn may be queued.  */
! static rtx insn_queue[INSN_QUEUE_SIZE];
  static int q_ptr = 0;
  static int q_size = 0;
! #define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))
! #define NEXT_Q_AFTER(X, C) (((X)+C) & (INSN_QUEUE_SIZE-1))
  
  /* Describe the ready list of the scheduler.
     VEC holds space enough for all insns in the current region.  VECLEN
     says how many exactly.
--- 260,299 ----
     passes or stalls are introduced.  */
  
  /* Implement a circular buffer to delay instructions until sufficient
!    time has passed.  For the old pipeline description interface,
!    INSN_QUEUE_SIZE is a power of two larger than MAX_BLOCKAGE and
!    MAX_READY_COST computed by genattr.c.  For the new pipeline
!    description interface, MAX_INSN_QUEUE_INDEX is a power of two minus
!    one which is larger than maximal time of instruction execution
!    computed by genattr.c on the base maximal time of functional unit
!    reservations and geting a result.  This is the longest time an
!    insn may be queued.  */
! 
! #define MAX_INSN_QUEUE_INDEX max_insn_queue_index_macro_value
! 
! static rtx *insn_queue;
  static int q_ptr = 0;
  static int q_size = 0;
! #define NEXT_Q(X) (((X)+1) & MAX_INSN_QUEUE_INDEX)
! #define NEXT_Q_AFTER(X, C) (((X)+C) & MAX_INSN_QUEUE_INDEX)
  
+ /* The following variable defines value for macro
+    MAX_INSN_QUEUE_INDEX.  */
+ static int max_insn_queue_index_macro_value;
+ 
+ /* The following variable value refers for all current and future
+    reservations of the processor units.  */
+ state_t curr_state;
+ 
+ /* The following variable value is size of memory representing all
+    current and future reservations of the processor units.  It is used
+    only by DFA based scheduler.  */
+ static size_t dfa_state_size;
+ 
+ /* The following array is used to find the best insn from ready when
+    the automaton pipeline interface is used.  */
+ static char *ready_try;
+ 
  /* Describe the ready list of the scheduler.
     VEC holds space enough for all insns in the current region.  VECLEN
     says how many exactly.
*************** struct ready_list
*** 280,290 ****
--- 311,325 ----
  };
  
  /* Forward declarations.  */
+ 
+ /* The scheduler using only DFA description should never use the
+    following five functions:  */
  static unsigned int blockage_range PARAMS ((int, rtx));
  static void clear_units PARAMS ((void));
  static void schedule_unit PARAMS ((int, rtx, int));
  static int actual_hazard PARAMS ((int, rtx, int, int));
  static int potential_hazard PARAMS ((int, rtx, int));
+ 
  static int priority PARAMS ((rtx));
  static int rank_for_schedule PARAMS ((const PTR, const PTR));
  static void swap_sort PARAMS ((rtx *, int));
*************** static void debug_ready_list PARAMS ((st
*** 331,336 ****
--- 366,379 ----
  static rtx move_insn1 PARAMS ((rtx, rtx));
  static rtx move_insn PARAMS ((rtx, rtx));
  
+ /* The following functions are used to implement multi-pass scheduling
+    on the first cycle.  It is used only for DFA based scheduler.  */
+ static rtx ready_element PARAMS ((struct ready_list *, int));
+ static rtx ready_remove PARAMS ((struct ready_list *, int));
+ static int max_issue PARAMS ((struct ready_list *, state_t, int *, int *));
+ 
+ static rtx choose_ready PARAMS ((struct ready_list *));
+ 
  #endif /* INSN_SCHEDULING */
  
  /* Point to state used for the current scheduling pass.  */
*************** static rtx last_scheduled_insn;
*** 354,360 ****
     returned by function_units_used.  A function unit is encoded as the
     unit number if the value is non-negative and the compliment of a
     mask if the value is negative.  A function unit index is the
!    non-negative encoding.  */
  
  HAIFA_INLINE int
  insn_unit (insn)
--- 397,404 ----
     returned by function_units_used.  A function unit is encoded as the
     unit number if the value is non-negative and the compliment of a
     mask if the value is negative.  A function unit index is the
!    non-negative encoding.  The scheduler using only DFA description
!    should never use the following function.  */
  
  HAIFA_INLINE int
  insn_unit (insn)
*************** insn_unit (insn)
*** 391,397 ****
  /* Compute the blockage range for executing INSN on UNIT.  This caches
     the value returned by the blockage_range_function for the unit.
     These values are encoded in an int where the upper half gives the
!    minimum value and the lower half gives the maximum value.  */
  
  HAIFA_INLINE static unsigned int
  blockage_range (unit, insn)
--- 435,443 ----
  /* Compute the blockage range for executing INSN on UNIT.  This caches
     the value returned by the blockage_range_function for the unit.
     These values are encoded in an int where the upper half gives the
!    minimum value and the lower half gives the maximum value.  The
!    scheduler using only DFA description should never use the following
!    function.  */
  
  HAIFA_INLINE static unsigned int
  blockage_range (unit, insn)
*************** blockage_range (unit, insn)
*** 415,434 ****
    return range;
  }
  
! /* A vector indexed by function unit instance giving the last insn to use
!    the unit.  The value of the function unit instance index for unit U
!    instance I is (U + I * FUNCTION_UNITS_SIZE).  */
  static rtx unit_last_insn[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
  
! /* A vector indexed by function unit instance giving the minimum time when
!    the unit will unblock based on the maximum blockage cost.  */
  static int unit_tick[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
  
  /* A vector indexed by function unit number giving the number of insns
!    that remain to use the unit.  */
  static int unit_n_insns[FUNCTION_UNITS_SIZE];
  
! /* Access the unit_last_insn array.  Used by the visualization code.  */
  
  rtx
  get_unit_last_insn (instance)
--- 461,498 ----
    return range;
  }
  
! /* A vector indexed by function unit instance giving the last insn to
!    use the unit.  The value of the function unit instance index for
!    unit U instance I is (U + I * FUNCTION_UNITS_SIZE).  The scheduler
!    using only DFA description should never use the following variable.  */
! #if FUNCTION_UNITS_SIZE
  static rtx unit_last_insn[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
+ #else
+ static rtx unit_last_insn[1];
+ #endif
  
! /* A vector indexed by function unit instance giving the minimum time
!    when the unit will unblock based on the maximum blockage cost.  The
!    scheduler using only DFA description should never use the following
!    variable.  */
! #if FUNCTION_UNITS_SIZE
  static int unit_tick[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
+ #else
+ static int unit_tick[1];
+ #endif
  
  /* A vector indexed by function unit number giving the number of insns
!    that remain to use the unit.  The scheduler using only DFA
!    description should never use the following variable.  */
! #if FUNCTION_UNITS_SIZE
  static int unit_n_insns[FUNCTION_UNITS_SIZE];
+ #else
+ static int unit_n_insns[1];
+ #endif
  
! /* Access the unit_last_insn array.  Used by the visualization code.
!    The scheduler using only DFA description should never use the
!    following function.  */
  
  rtx
  get_unit_last_insn (instance)
*************** clear_units ()
*** 447,453 ****
    memset ((char *) unit_n_insns, 0, sizeof (unit_n_insns));
  }
  
! /* Return the issue-delay of an insn.  */
  
  HAIFA_INLINE int
  insn_issue_delay (insn)
--- 511,518 ----
    memset ((char *) unit_n_insns, 0, sizeof (unit_n_insns));
  }
  
! /* Return the issue-delay of an insn.  The scheduler using only DFA
!    description should never use the following function.  */
  
  HAIFA_INLINE int
  insn_issue_delay (insn)
*************** insn_issue_delay (insn)
*** 477,483 ****
  
  /* Return the actual hazard cost of executing INSN on the unit UNIT,
     instance INSTANCE at time CLOCK if the previous actual hazard cost
!    was COST.  */
  
  HAIFA_INLINE int
  actual_hazard_this_instance (unit, instance, insn, clock, cost)
--- 542,549 ----
  
  /* Return the actual hazard cost of executing INSN on the unit UNIT,
     instance INSTANCE at time CLOCK if the previous actual hazard cost
!    was COST.  The scheduler using only DFA description should never
!    use the following function.  */
  
  HAIFA_INLINE int
  actual_hazard_this_instance (unit, instance, insn, clock, cost)
*************** actual_hazard_this_instance (unit, insta
*** 513,520 ****
    return cost;
  }
  
! /* Record INSN as having begun execution on the units encoded by UNIT at
!    time CLOCK.  */
  
  HAIFA_INLINE static void
  schedule_unit (unit, insn, clock)
--- 579,587 ----
    return cost;
  }
  
! /* Record INSN as having begun execution on the units encoded by UNIT
!    at time CLOCK.  The scheduler using only DFA description should
!    never use the following function.  */
  
  HAIFA_INLINE static void
  schedule_unit (unit, insn, clock)
*************** schedule_unit (unit, insn, clock)
*** 545,552 ****
  	schedule_unit (i, insn, clock);
  }
  
! /* Return the actual hazard cost of executing INSN on the units encoded by
!    UNIT at time CLOCK if the previous actual hazard cost was COST.  */
  
  HAIFA_INLINE static int
  actual_hazard (unit, insn, clock, cost)
--- 612,621 ----
  	schedule_unit (i, insn, clock);
  }
  
! /* Return the actual hazard cost of executing INSN on the units
!    encoded by UNIT at time CLOCK if the previous actual hazard cost
!    was COST.  The scheduler using only DFA description should never
!    use the following function.  */
  
  HAIFA_INLINE static int
  actual_hazard (unit, insn, clock, cost)
*************** actual_hazard (unit, insn, clock, cost)
*** 591,601 ****
  }
  
  /* Return the potential hazard cost of executing an instruction on the
!    units encoded by UNIT if the previous potential hazard cost was COST.
!    An insn with a large blockage time is chosen in preference to one
!    with a smaller time; an insn that uses a unit that is more likely
!    to be used is chosen in preference to one with a unit that is less
!    used.  We are trying to minimize a subsequent actual hazard.  */
  
  HAIFA_INLINE static int
  potential_hazard (unit, insn, cost)
--- 660,672 ----
  }
  
  /* Return the potential hazard cost of executing an instruction on the
!    units encoded by UNIT if the previous potential hazard cost was
!    COST.  An insn with a large blockage time is chosen in preference
!    to one with a smaller time; an insn that uses a unit that is more
!    likely to be used is chosen in preference to one with a unit that
!    is less used.  We are trying to minimize a subsequent actual
!    hazard.  The scheduler using only DFA description should never use
!    the following function.  */
  
  HAIFA_INLINE static int
  potential_hazard (unit, insn, cost)
*************** insn_cost (insn, link, used)
*** 648,709 ****
  {
    register int cost = INSN_COST (insn);
  
!   if (cost == 0)
      {
!       recog_memoized (insn);
! 
!       /* A USE insn, or something else we don't need to understand.
!          We can't pass these directly to result_ready_cost because it will
!          trigger a fatal error for unrecognizable insns.  */
!       if (INSN_CODE (insn) < 0)
  	{
! 	  INSN_COST (insn) = 1;
! 	  return 1;
  	}
        else
  	{
! 	  cost = result_ready_cost (insn);
! 
! 	  if (cost < 1)
! 	    cost = 1;
! 
  	  INSN_COST (insn) = cost;
  	}
      }
  
    /* In this case estimate cost without caring how insn is used.  */
!   if (link == 0 && used == 0)
      return cost;
- 
-   /* A USE insn should never require the value used to be computed.  This
-      allows the computation of a function's result and parameter values to
-      overlap the return and call.  */
-   recog_memoized (used);
-   if (INSN_CODE (used) < 0)
-     LINK_COST_FREE (link) = 1;
- 
-   /* If some dependencies vary the cost, compute the adjustment.  Most
-      commonly, the adjustment is complete: either the cost is ignored
-      (in the case of an output- or anti-dependence), or the cost is
-      unchanged.  These values are cached in the link as LINK_COST_FREE
-      and LINK_COST_ZERO.  */
  
!   if (LINK_COST_FREE (link))
      cost = 0;
!   else if (!LINK_COST_ZERO (link) && targetm.sched.adjust_cost)
      {
!       int ncost = (*targetm.sched.adjust_cost) (used, link, insn, cost);
! 
!       if (ncost < 1)
  	{
! 	  LINK_COST_FREE (link) = 1;
! 	  ncost = 0;
  	}
-       if (cost == ncost)
- 	LINK_COST_ZERO (link) = 1;
-       cost = ncost;
-     }
  
    return cost;
  }
  
--- 719,785 ----
  {
    register int cost = INSN_COST (insn);
  
!   if (cost < 0)
      {
!       /* A USE insn, or something else we don't need to
! 	 understand.  We can't pass these directly to
! 	 result_ready_cost or insn_default_latency because it will
! 	 trigger a fatal error for unrecognizable insns.  */
!       if (recog_memoized (insn) < 0)
  	{
! 	  INSN_COST (insn) = 0;
! 	  return 0;
  	}
        else
  	{
! 	  if (targetm.sched.use_dfa_pipeline_interface)
! 	    cost = insn_default_latency (insn);
! 	  else
! 	    cost = result_ready_cost (insn);
! 	  
! 	  if (cost < 0)
! 	    cost = 0;
! 	  
  	  INSN_COST (insn) = cost;
  	}
      }
  
    /* In this case estimate cost without caring how insn is used.  */
!   if (link == 0 || used == 0)
      return cost;
  
!   /* A USE insn should never require the value used to be computed.
!      This allows the computation of a function's result and parameter
!      values to overlap the return and call.  */
!   if (recog_memoized (used) < 0)
      cost = 0;
!   else
      {
!       if (targetm.sched.use_dfa_pipeline_interface)
  	{
! 	  if (INSN_CODE (insn) >= 0)
! 	    {
! 	      if (REG_NOTE_KIND (link) == REG_DEP_ANTI)
! 		cost = 0;
! 	      else if (REG_NOTE_KIND (link) == REG_DEP_OUTPUT)
! 		{
! 		  cost = (insn_default_latency (insn)
! 			  - insn_default_latency (used));
! 		  if (cost <= 0)
! 		    cost = 1;
! 		}
! 	      else if (bypass_p (insn))
! 		cost = insn_latency (insn, used);
! 	    }
  	}
  
+       if (targetm.sched.adjust_cost)
+ 	cost = (*targetm.sched.adjust_cost) (used, link, insn, cost);
+ 
+       if (cost < 0)
+ 	cost = 0;
+     }
+   
    return cost;
  }
  
*************** ready_remove_first (ready)
*** 930,935 ****
--- 1006,1053 ----
    return t;
  }
  
+ /* The following code implements multi-pass scheduling for the first
+    cycle.  In other words, we will try to choose ready insn which
+    permits to start maximum number of insns on the same cycle.  */
+ 
+ /* Return a pointer to the element INDEX from the ready.  INDEX for
+    insn with the highest priority is 0, and the lowest priority has
+    N_READY - 1.  */
+ 
+ HAIFA_INLINE static rtx
+ ready_element (ready, index)
+      struct ready_list *ready;
+      int index;
+ {
+   if (ready->n_ready == 0 || index >= ready->n_ready)
+     abort ();
+   return ready->vec[ready->first - index];
+ }
+ 
+ /* Remove the element INDEX from the ready list and return it.  INDEX
+    for insn with the highest priority is 0, and the lowest priority
+    has N_READY - 1.  */
+ 
+ HAIFA_INLINE static rtx
+ ready_remove (ready, index)
+      struct ready_list *ready;
+      int index;
+ {
+   rtx t;
+   int i;
+ 
+   if (index == 0)
+     return ready_remove_first (ready);
+   if (ready->n_ready == 0 || index >= ready->n_ready)
+     abort ();
+   t = ready->vec[ready->first - index];
+   ready->n_ready--;
+   for (i = index; i < ready->n_ready; i++)
+     ready [ready->first - i] = ready [ready->first - i - 1];
+   return t;
+ }
+ 
+ 
  /* Sort the ready list READY by ascending priority, using the SCHED_SORT
     macro.  */
  
*************** schedule_insn (insn, ready, clock)
*** 976,1001 ****
       int clock;
  {
    rtx link;
!   int unit;
  
!   unit = insn_unit (insn);
  
    if (sched_verbose >= 2)
      {
!       fprintf (sched_dump, ";;\t\t--> scheduling insn <<<%d>>> on unit ",
! 	       INSN_UID (insn));
!       insn_print_units (insn);
        fprintf (sched_dump, "\n");
      }
  
!   if (sched_verbose && unit == -1)
!     visualize_no_unit (insn);
  
-   if (MAX_BLOCKAGE > 1 || issue_rate > 1 || sched_verbose)
-     schedule_unit (unit, insn, clock);
  
!   if (INSN_DEPEND (insn) == 0)
!     return;
  
    for (link = INSN_DEPEND (insn); link != 0; link = XEXP (link, 1))
      {
--- 1094,1140 ----
       int clock;
  {
    rtx link;
!   int unit = 0;
  
!   if (!targetm.sched.use_dfa_pipeline_interface)
!     unit = insn_unit (insn);
  
    if (sched_verbose >= 2)
      {
! 
!       if (targetm.sched.use_dfa_pipeline_interface)
! 	{
! 	  fprintf (sched_dump,
! 		   ";;\t\t--> scheduling insn <<<%d>>>:reservation ",
! 		   INSN_UID (insn));
! 	  
! 	  if (recog_memoized (insn) < 0)
! 	    fprintf (sched_dump, "nothing");
! 	  else
! 	    print_reservation (sched_dump, insn);
! 	}
!       else
! 	{
! 	  fprintf (sched_dump, ";;\t\t--> scheduling insn <<<%d>>> on unit ",
! 		   INSN_UID (insn));
! 	  insn_print_units (insn);
! 	}
! 
        fprintf (sched_dump, "\n");
      }
  
!   if (!targetm.sched.use_dfa_pipeline_interface)
!     {
!       if (sched_verbose && unit == -1)
! 	visualize_no_unit (insn);
  
  
!       if (MAX_BLOCKAGE > 1 || issue_rate > 1 || sched_verbose)
! 	schedule_unit (unit, insn, clock);
!       
!       if (INSN_DEPEND (insn) == 0)
! 	return;
!     }
  
    for (link = INSN_DEPEND (insn); link != 0; link = XEXP (link, 1))
      {
*************** schedule_insn (insn, ready, clock)
*** 1037,1043 ****
       to issue on the same cycle as the previous insn.  A machine
       may use this information to decide how the instruction should
       be aligned.  */
!   if (reload_completed && issue_rate > 1)
      {
        PUT_MODE (insn, clock > last_clock_var ? TImode : VOIDmode);
        last_clock_var = clock;
--- 1176,1184 ----
       to issue on the same cycle as the previous insn.  A machine
       may use this information to decide how the instruction should
       be aligned.  */
!   if (reload_completed && issue_rate > 1
!       && GET_CODE (PATTERN (insn)) != USE
!       && GET_CODE (PATTERN (insn)) != CLOBBER)
      {
        PUT_MODE (insn, clock > last_clock_var ? TImode : VOIDmode);
        last_clock_var = clock;
*************** queue_to_ready (ready)
*** 1464,1470 ****
      {
        register int stalls;
  
!       for (stalls = 1; stalls < INSN_QUEUE_SIZE; stalls++)
  	{
  	  if ((link = insn_queue[NEXT_Q_AFTER (q_ptr, stalls)]))
  	    {
--- 1605,1611 ----
      {
        register int stalls;
  
!       for (stalls = 1; stalls <= MAX_INSN_QUEUE_INDEX; stalls++)
  	{
  	  if ((link = insn_queue[NEXT_Q_AFTER (q_ptr, stalls)]))
  	    {
*************** queue_to_ready (ready)
*** 1483,1495 ****
  		}
  	      insn_queue[NEXT_Q_AFTER (q_ptr, stalls)] = 0;
  
  	      if (ready->n_ready)
  		break;
  	    }
  	}
  
!       if (sched_verbose && stalls)
  	visualize_stall_cycles (stalls);
        q_ptr = NEXT_Q_AFTER (q_ptr, stalls);
        clock_var += stalls;
      }
--- 1624,1651 ----
  		}
  	      insn_queue[NEXT_Q_AFTER (q_ptr, stalls)] = 0;
  
+ 	      /* Advance time on one cycle.  */
+ 	      if (targetm.sched.use_dfa_pipeline_interface)
+ 		{
+ 		  if (targetm.sched.dfa_pre_cycle_insn)
+ 		    state_transition (curr_state,
+ 				      (*targetm.sched.dfa_pre_cycle_insn) ());
+ 
+ 		  state_transition (curr_state, NULL);
+ 
+ 		  if (targetm.sched.dfa_post_cycle_insn)
+ 		    state_transition (curr_state,
+ 				      (*targetm.sched.dfa_post_cycle_insn) ());
+ 		}
+ 
  	      if (ready->n_ready)
  		break;
  	    }
  	}
  
!       if (!targetm.sched.use_dfa_pipeline_interface && sched_verbose && stalls)
  	visualize_stall_cycles (stalls);
+ 
        q_ptr = NEXT_Q_AFTER (q_ptr, stalls);
        clock_var += stalls;
      }
*************** debug_ready_list (ready)
*** 1505,1511 ****
    int i;
  
    if (ready->n_ready == 0)
!     return;
  
    p = ready_lastpos (ready);
    for (i = 0; i < ready->n_ready; i++)
--- 1661,1670 ----
    int i;
  
    if (ready->n_ready == 0)
!     {
!       fprintf (sched_dump, "\n");
!       return;
!     }
  
    p = ready_lastpos (ready);
    for (i = 0; i < ready->n_ready; i++)
*************** move_insn (insn, last)
*** 1617,1622 ****
--- 1776,1892 ----
    return retval;
  }
  
+ /* The following function returns maximal (or close to maximal) number
+    of insns which can be issued on the same cycle and one of which
+    insns is insns with the best rank (the last insn in READY).  To
+    make this function tries different samples of ready insns.  READY
+    is current queue `ready'.  Global array READY_TRY reflects what
+    insns are already issued in this try.  STATE is current processor
+    state.  If the function returns nonzero, INDEX will contain index
+    of the best insn in READY.  *LAST_P is nonzero if the insn with the
+    highest rank is in the current sample.  The following function is
+    used only for first cycle multipass scheduling.  */
+ 
+ static int
+ max_issue (ready, state, index, last_p)
+      struct ready_list *ready;
+      state_t state;
+      int *index;
+      int *last_p;
+      
+ {
+   int i, best, n, temp_index, delay;
+   state_t temp_state;
+   rtx insn;
+   int max_lookahead = (*targetm.sched.first_cycle_multipass_dfa_lookahead) ();
+ 
+   if (state_dead_lock_p (state))
+     return 0;
+   
+   temp_state = alloca (dfa_state_size);
+   best = 0;
+   
+   for (i = 0; i < ready->n_ready; i++)
+     if (!ready_try [i])
+       {
+ 	insn = ready_element (ready, i);
+ 	
+ 	if (INSN_CODE (insn) < 0)
+ 	  continue;
+ 	
+ 	memcpy (temp_state, state, dfa_state_size);
+ 	
+ 	delay = state_transition (temp_state, insn);
+ 	
+ 	if (delay == 0)
+ 	  {
+ 	    if (!targetm.sched.dfa_bubble)
+ 	      continue;
+ 	    else
+ 	      {
+ 		int j;
+ 		rtx bubble;
+ 		
+ 		for (j = 0;
+ 		     (bubble = (*targetm.sched.dfa_bubble) (j)) != NULL_RTX;
+ 		     j++)
+ 		  if (state_transition (temp_state, bubble) < 0
+ 		      && state_transition (temp_state, insn) < 0)
+ 		    break;
+ 		
+ 		if (bubble == NULL_RTX)
+ 		  continue;
+ 	      }
+ 	  }
+ 	else if (delay > 0)
+ 	  continue;
+ 	
+ 	--max_lookahead;
+ 	
+ 	if (max_lookahead < 0)
+ 	  break;
+ 	
+ 	ready_try [i] = 1;
+ 	*last_p = 0;
+ 	
+ 	n = max_issue (ready, temp_state, &temp_index, last_p) + 1;
+ 	
+ 	if (best < n && (ready_try [0] || *last_p))
+ 	  {
+ 	    best = n;
+ 	    *index = i;
+ 	    *last_p = 1;
+ 	  }
+ 	ready_try [i] = 0;
+       }
+   
+   return best;
+ }
+ 
+ /* The following function chooses insn from READY and modifies
+    *N_READY and READY.  The following function is used only for first
+    cycle multipass scheduling.  */
+ 
+ static rtx
+ choose_ready (ready)
+      struct ready_list *ready;
+ {
+   if (!targetm.sched.first_cycle_multipass_dfa_lookahead
+       || (*targetm.sched.first_cycle_multipass_dfa_lookahead) () <= 0)
+     return ready_remove_first (ready);
+   else
+     {
+       /* Try to choose the better insn.  */
+       int index;
+       int last_p = 0;
+       
+       if (max_issue (ready, curr_state, &index, &last_p) == 0)
+ 	return ready_remove_first (ready);
+       else
+ 	return ready_remove (ready, index);
+     }
+ }
+ 
  /* Use forward list scheduling to rearrange insns of block B in region RGN,
     possibly bringing insns from subsequent blocks in the same region.  */
  
*************** schedule_block (b, rgn_n_insns)
*** 1627,1633 ****
--- 1897,1905 ----
  {
    rtx last;
    struct ready_list ready;
+   int first_cycle_insn_p;
    int can_issue_more;
+   state_t temp_state = NULL;  /* It is used for multipass scheduling.  */
  
    /* Head/tail info for this block.  */
    rtx prev_head = current_sched_info->prev_head;
*************** schedule_block (b, rgn_n_insns)
*** 1660,1666 ****
        init_block_visualization ();
      }
  
!   clear_units ();
  
    /* Allocate the ready list.  */
    ready.veclen = rgn_n_insns + 1 + issue_rate;
--- 1932,1941 ----
        init_block_visualization ();
      }
  
!   if (targetm.sched.use_dfa_pipeline_interface)
!     state_reset (curr_state);
!   else
!     clear_units ();
  
    /* Allocate the ready list.  */
    ready.veclen = rgn_n_insns + 1 + issue_rate;
*************** schedule_block (b, rgn_n_insns)
*** 1668,1673 ****
--- 1943,1956 ----
    ready.vec = (rtx *) xmalloc (ready.veclen * sizeof (rtx));
    ready.n_ready = 0;
  
+   if (targetm.sched.use_dfa_pipeline_interface)
+     {
+       /* It is used for first cycle multipass scheduling.  */
+       temp_state = alloca (dfa_state_size);
+       ready_try = (char *) xmalloc ((rgn_n_insns + 1) * sizeof (char));
+       memset (ready_try, 0, (rgn_n_insns + 1) * sizeof (char));
+     }
+ 
    (*current_sched_info->init_ready_list) (&ready);
  
    if (targetm.sched.md_init)
*************** schedule_block (b, rgn_n_insns)
*** 1680,1688 ****
       queue.  */
    q_ptr = 0;
    q_size = 0;
-   last_clock_var = 0;
-   memset ((char *) insn_queue, 0, sizeof (insn_queue));
  
    /* Start just before the beginning of time.  */
    clock_var = -1;
  
--- 1963,1978 ----
       queue.  */
    q_ptr = 0;
    q_size = 0;
  
+   if (!targetm.sched.use_dfa_pipeline_interface)
+     max_insn_queue_index_macro_value = INSN_QUEUE_SIZE - 1;
+   else
+     max_insn_queue_index_macro_value = max_insn_queue_index;
+ 
+   insn_queue = (rtx *) alloca ((MAX_INSN_QUEUE_INDEX + 1) * sizeof (rtx));
+   memset ((char *) insn_queue, 0, (MAX_INSN_QUEUE_INDEX + 1) * sizeof (rtx));
+   last_clock_var = -1;
+ 
    /* Start just before the beginning of time.  */
    clock_var = -1;
  
*************** schedule_block (b, rgn_n_insns)
*** 1694,1699 ****
--- 1984,2003 ----
      {
        clock_var++;
  
+       if (targetm.sched.use_dfa_pipeline_interface)
+ 	{
+ 	  if (targetm.sched.dfa_pre_cycle_insn)
+ 	    state_transition (curr_state,
+ 			      (*targetm.sched.dfa_pre_cycle_insn) ());
+ 
+ 	  /* Advance time on one cycle.  */
+ 	  state_transition (curr_state, NULL);
+ 
+ 	  if (targetm.sched.dfa_post_cycle_insn)
+ 	    state_transition (curr_state,
+ 			      (*targetm.sched.dfa_post_cycle_insn) ());
+ 	}
+ 
        /* Add to the ready list all pending insns that can be issued now.
           If there are no ready insns, increment clock until one
           is ready and add all pending insns at that point to the ready
*************** schedule_block (b, rgn_n_insns)
*** 1725,1745 ****
        else
  	can_issue_more = issue_rate;
  
!       if (sched_verbose)
  	{
! 	  fprintf (sched_dump, "\n;;\tReady list (t =%3d):  ", clock_var);
! 	  debug_ready_list (&ready);
! 	}
  
!       /* Issue insns from ready list.  */
!       while (ready.n_ready != 0
! 	     && can_issue_more
! 	     && (*current_sched_info->schedule_more_p) ())
! 	{
! 	  /* Select and remove the insn from the ready list.  */
! 	  rtx insn = ready_remove_first (&ready);
! 	  int cost = actual_hazard (insn_unit (insn), insn, clock_var, 0);
  
  	  if (cost >= 1)
  	    {
  	      queue_insn (insn, cost);
--- 2029,2151 ----
        else
  	can_issue_more = issue_rate;
  
!       first_cycle_insn_p = 1;
!       for (;;)
  	{
! 	  rtx insn;
! 	  int cost;
  
! 	  if (sched_verbose)
! 	    {
! 	      fprintf (sched_dump, ";;\tReady list (t =%3d):  ",
! 		       clock_var);
! 	      debug_ready_list (&ready);
! 	    }
  
+ 	  if (!targetm.sched.use_dfa_pipeline_interface)
+ 	    {
+ 	      if (ready.n_ready == 0 || !can_issue_more
+ 		  || !(*current_sched_info->schedule_more_p) ())
+ 		break;
+ 	      insn = choose_ready (&ready);
+ 	      cost = actual_hazard (insn_unit (insn), insn, clock_var, 0);
+ 	    }
+ 	  else
+ 	    {
+ 	      if (ready.n_ready == 0 || !can_issue_more
+ 		  || state_dead_lock_p (curr_state)
+ 		  || !(*current_sched_info->schedule_more_p) ())
+ 		break;
+ 	      
+ 	      /* Select and remove the insn from the ready list.  */
+ 	      insn = choose_ready (&ready);
+ 	      
+ 	      if (recog_memoized (insn) < 0)
+ 		{
+ 		  if (!first_cycle_insn_p
+ 		      && (GET_CODE (PATTERN (insn)) == ASM_INPUT
+ 			  || asm_noperands (PATTERN (insn)) >= 0))
+ 		    /* This is asm insn which is tryed to be issued on the
+ 		       cycle not first.  Issue it on the next cycle.  */
+ 		    cost = 1;
+ 		  else
+ 		    /* A USE insn, or something else we don't need to
+ 		       understand.  We can't pass these directly to
+ 		       state_transition because it will trigger a
+ 		       fatal error for unrecognizable insns.  */
+ 		    cost = 0;
+ 		}
+ 	      else
+ 		{
+ 		  cost = state_transition (curr_state, insn);
+ 
+ 		  if (targetm.sched.first_cycle_multipass_dfa_lookahead
+ 		      && targetm.sched.dfa_bubble)
+ 		    {
+ 		      if (cost == 0)
+ 			{
+ 			  int j;
+ 			  rtx bubble;
+ 			  
+ 			  for (j = 0;
+ 			       (bubble = (*targetm.sched.dfa_bubble) (j))
+ 				 != NULL_RTX;
+ 			       j++)
+ 			    {
+ 			      memcpy (temp_state, curr_state, dfa_state_size);
+ 			      
+ 			      if (state_transition (temp_state, bubble) < 0
+ 				  && state_transition (temp_state, insn) < 0)
+ 				break;
+ 			    }
+ 			  
+ 			  if (bubble != NULL_RTX)
+ 			    {
+ 			      memcpy (curr_state, temp_state, dfa_state_size);
+ 			      
+ 			      if (insert_schedule_bubbles_p)
+ 				{
+ 				  rtx copy;
+ 				  
+ 				  copy = copy_rtx (PATTERN (bubble));
+ 				  emit_insn_after (copy, last);
+ 				  last = NEXT_INSN (last);
+ 				  INSN_CODE (last) = INSN_CODE (bubble);
+ 				  
+ 				  /* Annotate the same for the first insns
+ 				     scheduling by using mode.  */
+ 				  PUT_MODE (last, (clock_var > last_clock_var
+ 						   ? clock_var - last_clock_var
+ 						   : VOIDmode));
+ 				  last_clock_var = clock_var;
+ 				  
+ 				  if (sched_verbose >= 2)
+ 				    {
+ 				      fprintf (sched_dump,
+ 					       ";;\t\t--> scheduling bubble insn <<<%d>>>:reservation ",
+ 					       INSN_UID (last));
+ 				      
+ 				      if (recog_memoized (last) < 0)
+ 					fprintf (sched_dump, "nothing");
+ 				      else
+ 					print_reservation (sched_dump, last);
+ 				      
+ 				      fprintf (sched_dump, "\n");
+ 				    }
+ 				}
+ 			      cost = -1;
+ 			    }
+ 			}
+ 		    }
+ 
+ 		  if (cost < 0)
+ 		    cost = 0;
+ 		  else if (cost == 0)
+ 		    cost = 1;
+ 		}
+ 	    }
+ 
+ 
  	  if (cost >= 1)
  	    {
  	      queue_insn (insn, cost);
*************** schedule_block (b, rgn_n_insns)
*** 1762,1767 ****
--- 2168,2175 ----
  	  schedule_insn (insn, &ready, clock_var);
  
  	next:
+ 	  first_cycle_insn_p = 0;
+ 
  	  if (targetm.sched.reorder2)
  	    {
  	      /* Sort the ready list based on priority.  */
*************** schedule_block (b, rgn_n_insns)
*** 1775,1782 ****
  	    }
  	}
  
!       /* Debug info.  */
!       if (sched_verbose)
  	visualize_scheduled_insns (clock_var);
      }
  
--- 2183,2190 ----
  	    }
  	}
  
!       if (!targetm.sched.use_dfa_pipeline_interface && sched_verbose)
! 	/* Debug info.  */
  	visualize_scheduled_insns (clock_var);
      }
  
*************** schedule_block (b, rgn_n_insns)
*** 1788,1794 ****
      {
        fprintf (sched_dump, ";;\tReady list (final):  ");
        debug_ready_list (&ready);
!       print_block_visualization ("");
      }
  
    /* Sanity check -- queue must be empty now.  Meaningless if region has
--- 2196,2203 ----
      {
        fprintf (sched_dump, ";;\tReady list (final):  ");
        debug_ready_list (&ready);
!       if (!targetm.sched.use_dfa_pipeline_interface)
! 	print_block_visualization ("");
      }
  
    /* Sanity check -- queue must be empty now.  Meaningless if region has
*************** schedule_block (b, rgn_n_insns)
*** 1833,1838 ****
--- 2242,2250 ----
    current_sched_info->tail = tail;
  
    free (ready.vec);
+ 
+   if (targetm.sched.use_dfa_pipeline_interface)
+     free (ready_try);
  }
  
  /* Set_priorities: compute priority of each insn in the block.  */
*************** sched_init (dump_file)
*** 1874,1879 ****
--- 2286,2292 ----
  {
    int luid, b;
    rtx insn;
+   int i;
  
    /* Disable speculative loads in their presence if cc0 defined.  */
  #ifdef HAVE_cc0
*************** sched_init (dump_file)
*** 1901,1906 ****
--- 2314,2339 ----
  
    h_i_d = (struct haifa_insn_data *) xcalloc (old_max_uid, sizeof (*h_i_d));
  
+   for (i = 0; i < old_max_uid; i++)
+     h_i_d [i].cost = -1;
+ 
+   if (targetm.sched.use_dfa_pipeline_interface)
+     {
+       if (targetm.sched.init_dfa_pre_cycle_insn)
+ 	(*targetm.sched.init_dfa_pre_cycle_insn) ();
+       
+       if (targetm.sched.init_dfa_post_cycle_insn)
+ 	(*targetm.sched.init_dfa_post_cycle_insn) ();
+       
+       if (targetm.sched.first_cycle_multipass_dfa_lookahead
+ 	  && targetm.sched.init_dfa_bubbles)
+ 	(*targetm.sched.init_dfa_bubbles) ();
+       
+       dfa_start ();
+       dfa_state_size = state_size ();
+       curr_state = xmalloc (dfa_state_size);
+     }
+ 
    h_i_d[0].luid = 0;
    luid = 1;
    for (b = 0; b < n_basic_blocks; b++)
*************** sched_init (dump_file)
*** 1958,1965 ****
  	}
      }
  
!   /* Find units used in this fuction, for visualization.  */
!   if (sched_verbose)
      init_target_units ();
  
    /* ??? Add a NOTE after the last insn of the last basic block.  It is not
--- 2391,2398 ----
  	}
      }
  
!   if (!targetm.sched.use_dfa_pipeline_interface && sched_verbose)
!     /* Find units used in this function, for visualization.  */
      init_target_units ();
  
    /* ??? Add a NOTE after the last insn of the last basic block.  It is not
*************** void
*** 1985,1990 ****
--- 2418,2429 ----
  sched_finish ()
  {
    free (h_i_d);
+ 
+   if (targetm.sched.use_dfa_pipeline_interface)
+     {
+       free (curr_state);
+       dfa_finish ();
+     }
    free_dependency_caches ();
    end_alias_analysis ();
    if (write_symbols != NO_DEBUG)
Index: sched-rgn.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-rgn.c,v
retrieving revision 1.14
diff -c -p -r1.14 sched-rgn.c
*** sched-rgn.c	2001/08/22 14:35:40	1.14
--- sched-rgn.c	2001/08/26 21:23:46
*************** Free Software Foundation, 59 Temple Plac
*** 61,66 ****
--- 61,67 ----
  #include "toplev.h"
  #include "recog.h"
  #include "sched-int.h"
+ #include "target.h"
  
  #ifdef INSN_SCHEDULING
  /* Some accessor macros for h_i_d members only used within this file.  */
*************** init_ready_list (ready)
*** 2142,2148 ****
  
  	    if (!CANT_MOVE (insn)
  		&& (!IS_SPECULATIVE_INSN (insn)
! 		    || (insn_issue_delay (insn) <= 3
  			&& check_live (insn, bb_src)
  			&& is_exception_free (insn, bb_src, target_bb))))
  	      {
--- 2143,2155 ----
  
  	    if (!CANT_MOVE (insn)
  		&& (!IS_SPECULATIVE_INSN (insn)
! 		    || ((0
! 			 || (targetm.sched.use_dfa_pipeline_interface
! 			     && recog_memoized (insn) >= 0
! 			     && min_insn_conflict_delay (curr_state, insn,
! 							 insn) <= 3)
! 			 || (!targetm.sched.use_dfa_pipeline_interface
! 			     && insn_issue_delay (insn) <= 3))
  			&& check_live (insn, bb_src)
  			&& is_exception_free (insn, bb_src, target_bb))))
  	      {
*************** new_ready (next)
*** 2250,2256 ****
        && (!IS_VALID (INSN_BB (next))
  	  || CANT_MOVE (next)
  	  || (IS_SPECULATIVE_INSN (next)
! 	      && (insn_issue_delay (next) > 3
  		  || !check_live (next, INSN_BB (next))
  		  || !is_exception_free (next, INSN_BB (next), target_bb)))))
      return 0;
--- 2257,2269 ----
        && (!IS_VALID (INSN_BB (next))
  	  || CANT_MOVE (next)
  	  || (IS_SPECULATIVE_INSN (next)
! 	      && (0
! 		  || (targetm.sched.use_dfa_pipeline_interface
! 		      && (recog_memoized (next) < 0
! 			  || min_insn_conflict_delay (curr_state, next,
! 						      next) > 3))
! 		  || (!targetm.sched.use_dfa_pipeline_interface
! 		      && insn_issue_delay (next) > 3)
  		  || !check_live (next, INSN_BB (next))
  		  || !is_exception_free (next, INSN_BB (next), target_bb)))))
      return 0;
*************** debug_dependencies ()
*** 2642,2655 ****
  	  fprintf (sched_dump, "\n;;   --- Region Dependences --- b %d bb %d \n",
  		   BB_TO_BLOCK (bb), bb);
  
! 	  fprintf (sched_dump, ";;   %7s%6s%6s%6s%6s%6s%11s%6s\n",
! 	  "insn", "code", "bb", "dep", "prio", "cost", "blockage", "units");
! 	  fprintf (sched_dump, ";;   %7s%6s%6s%6s%6s%6s%11s%6s\n",
! 	  "----", "----", "--", "---", "----", "----", "--------", "-----");
  	  for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
  	    {
  	      rtx link;
- 	      int unit, range;
  
  	      if (! INSN_P (insn))
  		{
--- 2655,2680 ----
  	  fprintf (sched_dump, "\n;;   --- Region Dependences --- b %d bb %d \n",
  		   BB_TO_BLOCK (bb), bb);
  
! 	  if (targetm.sched.use_dfa_pipeline_interface)
! 	    {
! 	      fprintf (sched_dump, ";;   %7s%6s%6s%6s%6s%6s%14s\n",
! 		       "insn", "code", "bb", "dep", "prio", "cost",
! 		       "reservation");
! 	      fprintf (sched_dump, ";;   %7s%6s%6s%6s%6s%6s%14s\n",
! 		       "----", "----", "--", "---", "----", "----",
! 		       "-----------");
! 	    }
! 	  else
! 	    {
! 	      fprintf (sched_dump, ";;   %7s%6s%6s%6s%6s%6s%11s%6s\n",
! 	      "insn", "code", "bb", "dep", "prio", "cost", "blockage", "units");
! 	      fprintf (sched_dump, ";;   %7s%6s%6s%6s%6s%6s%11s%6s\n",
! 	      "----", "----", "--", "---", "----", "----", "--------", "-----");
! 	    }
! 
  	  for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
  	    {
  	      rtx link;
  
  	      if (! INSN_P (insn))
  		{
*************** debug_dependencies ()
*** 2668,2690 ****
  		    fprintf (sched_dump, " {%s}\n", GET_RTX_NAME (GET_CODE (insn)));
  		  continue;
  		}
  
- 	      unit = insn_unit (insn);
- 	      range = (unit < 0
- 		 || function_units[unit].blockage_range_function == 0) ? 0 :
- 		function_units[unit].blockage_range_function (insn);
- 	      fprintf (sched_dump,
- 		       ";;   %s%5d%6d%6d%6d%6d%6d  %3d -%3d   ",
- 		       (SCHED_GROUP_P (insn) ? "+" : " "),
- 		       INSN_UID (insn),
- 		       INSN_CODE (insn),
- 		       INSN_BB (insn),
- 		       INSN_DEP_COUNT (insn),
- 		       INSN_PRIORITY (insn),
- 		       insn_cost (insn, 0, 0),
- 		       (int) MIN_BLOCKAGE_COST (range),
- 		       (int) MAX_BLOCKAGE_COST (range));
- 	      insn_print_units (insn);
  	      fprintf (sched_dump, "\t: ");
  	      for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
  		fprintf (sched_dump, "%d ", INSN_UID (XEXP (link, 0)));
--- 2693,2738 ----
  		    fprintf (sched_dump, " {%s}\n", GET_RTX_NAME (GET_CODE (insn)));
  		  continue;
  		}
+ 
+ 	      if (targetm.sched.use_dfa_pipeline_interface)
+ 		{
+ 		  fprintf (sched_dump,
+ 			   ";;   %s%5d%6d%6d%6d%6d%6d   ",
+ 			   (SCHED_GROUP_P (insn) ? "+" : " "),
+ 			   INSN_UID (insn),
+ 			   INSN_CODE (insn),
+ 			   INSN_BB (insn),
+ 			   INSN_DEP_COUNT (insn),
+ 			   INSN_PRIORITY (insn),
+ 			   insn_cost (insn, 0, 0));
+ 		  
+ 		  if (recog_memoized (insn) < 0)
+ 		    fprintf (sched_dump, "nothing");
+ 		  else
+ 		    print_reservation (sched_dump, insn);
+ 		}
+ 	      else
+ 		{
+ 		  int unit = insn_unit (insn);
+ 		  int range
+ 		    = (unit < 0
+ 		       || function_units[unit].blockage_range_function == 0
+ 		       ? 0
+ 		       : function_units[unit].blockage_range_function (insn));
+ 		  fprintf (sched_dump,
+ 			   ";;   %s%5d%6d%6d%6d%6d%6d  %3d -%3d   ",
+ 			   (SCHED_GROUP_P (insn) ? "+" : " "),
+ 			   INSN_UID (insn),
+ 			   INSN_CODE (insn),
+ 			   INSN_BB (insn),
+ 			   INSN_DEP_COUNT (insn),
+ 			   INSN_PRIORITY (insn),
+ 			   insn_cost (insn, 0, 0),
+ 			   (int) MIN_BLOCKAGE_COST (range),
+ 			   (int) MAX_BLOCKAGE_COST (range));
+ 		  insn_print_units (insn);
+ 		}
  
  	      fprintf (sched_dump, "\t: ");
  	      for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
  		fprintf (sched_dump, "%d ", INSN_UID (XEXP (link, 0)));
Index: sched-vis.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-vis.c,v
retrieving revision 1.8
diff -c -p -r1.8 sched-vis.c
*** sched-vis.c	2001/08/22 14:35:40	1.8
--- sched-vis.c	2001/08/26 21:23:46
*************** Free Software Foundation, 59 Temple Plac
*** 31,36 ****
--- 31,37 ----
  #include "basic-block.h"
  #include "insn-attr.h"
  #include "sched-int.h"
+ #include "target.h"
  
  #ifdef INSN_SCHEDULING
  /* target_units bitmask has 1 for each unit in the cpu.  It should be
*************** Free Software Foundation, 59 Temple Plac
*** 38,44 ****
     But currently it is computed by examining the insn list.  Since
     this is only needed for visualization, it seems an acceptable
     solution.  (For understanding the mapping of bits to units, see
!    definition of function_units[] in "insn-attrtab.c".)  */
  
  static int target_units = 0;
  
--- 39,46 ----
     But currently it is computed by examining the insn list.  Since
     this is only needed for visualization, it seems an acceptable
     solution.  (For understanding the mapping of bits to units, see
!    definition of function_units[] in "insn-attrtab.c".)  The scheduler
!    using only DFA description should never use the following variable.  */
  
  static int target_units = 0;
  
*************** get_visual_tbl_length ()
*** 122,127 ****
--- 124,136 ----
    int n, n1;
    char *s;
  
+   if (targetm.sched.use_dfa_pipeline_interface)
+     {
+       visual_tbl_line_length = 1;
+       return 1; /* Can't return 0 because that will cause problems
+                    with alloca.  */
+     }
+ 
    /* Compute length of one field in line.  */
    s = (char *) alloca (INSN_LEN + 6);
    sprintf (s, "  %33s", "uname");
*************** print_insn (buf, x, verbose)
*** 809,815 ****
      }
  }				/* print_insn */
  
! /* Print visualization debugging info.  */
  
  void
  print_block_visualization (s)
--- 818,825 ----
      }
  }				/* print_insn */
  
! /* Print visualization debugging info.  The scheduler using only DFA
!    description should never use the following function.  */
  
  void
  print_block_visualization (s)
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.729
diff -c -p -r1.729 Makefile.in
*** Makefile.in	2001/08/22 14:34:40	1.729
--- Makefile.in	2001/08/26 21:23:47
*************** INTL_SUBDIRS = intl $(POSUB)
*** 346,351 ****
--- 346,355 ----
  # system library.
  OBSTACK=obstack.o
  
+ # The following object files is used by genautomata.
+ GETRUNTIME = getruntime.o
+ HASHTAB = hashtab.o
+ 
  # The GC method to be used on this system.
  GGC=@GGC@.o
  
*************** HOST_CPPFLAGS=$(ALL_CPPFLAGS)
*** 482,487 ****
--- 486,493 ----
  HOST_OBSTACK=$(OBSTACK)
  HOST_VFPRINTF=$(VFPRINTF)
  HOST_DOPRINT=$(DOPRINT)
+ HOST_GETRUNTIME=$(GETRUNTIME)
+ HOST_HASHTAB=$(HASHTAB)
  HOST_STRSTR=$(STRSTR)
  
  # Actual name to use when installing a native compiler.
*************** ALL_CPPFLAGS = $(CPPFLAGS) $(X_CPPFLAGS)
*** 609,614 ****
--- 615,622 ----
  USE_HOST_OBSTACK= ` case "${HOST_OBSTACK}" in ?*) echo ${HOST_PREFIX}${HOST_OBSTACK} ;; esac `
  USE_HOST_VFPRINTF= ` case "${HOST_VFPRINTF}" in ?*) echo ${HOST_PREFIX}${HOST_VFPRINTF} ;; esac `
  USE_HOST_DOPRINT= ` case "${HOST_DOPRINT}" in ?*) echo ${HOST_PREFIX}${HOST_DOPRINT} ;; esac `
+ USE_HOST_GETRUNTIME= ` case "${HOST_GETRUNTIME}" in ?*) echo ${HOST_PREFIX}${HOST_GETRUNTIME} ;; esac `
+ USE_HOST_HASHTAB= ` case "${HOST_HASHTAB}" in ?*) echo ${HOST_PREFIX}${HOST_HASHTAB} ;; esac `
  USE_HOST_STRSTR= ` case "${HOST_STRSTR}" in ?*) echo ${HOST_PREFIX}${HOST_STRSTR} ;; esac `
  
  # Dependency on obstack or whatever library facilities
*************** HOST_RTL = $(HOST_PREFIX)rtl.o read-rtl.
*** 637,642 ****
--- 645,651 ----
  
  HOST_PRINT = $(HOST_PREFIX)print-rtl.o
  HOST_ERRORS = $(HOST_PREFIX)errors.o
+ HOST_VARRAY = $(HOST_PREFIX)varray.o
  
  # Specify the directories to be searched for header files.
  # Both . and srcdir are used, in that order,
*************** obstack.o: $(srcdir)/../libiberty/obstac
*** 1333,1338 ****
--- 1342,1352 ----
  	$(CC) -c $(ALL_CFLAGS) -DGENERATOR_FILE $(ALL_CPPFLAGS) $(INCLUDES) \
  		obstack.c $(OUTPUT_OPTION)
  
+ getruntime.o: $(srcdir)/../libiberty/getruntime.c $(CONFIG_H)
+ 	rm -f getruntime.c
+ 	$(LN_S) $(srcdir)/../libiberty/getruntime.c getruntime.c
+ 	$(CC) -c $(ALL_CFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) getruntime.c
+ 
  prefix.o: prefix.c $(CONFIG_H) $(SYSTEM_H) Makefile prefix.h
  	$(CC) $(ALL_CFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
  	-DPREFIX=\"$(prefix)\" \
*************** sched-deps.o : sched-deps.c $(CONFIG_H) 
*** 1533,1544 ****
     $(INSN_ATTR_H) toplev.h $(RECOG_H) except.h cselib.h $(PARAMS_H) $(TM_P_H)
  sched-rgn.o : sched-rgn.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) sched-int.h \
     $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h flags.h insn-config.h function.h \
!    $(INSN_ATTR_H) toplev.h $(RECOG_H) except.h $(TM_P_H)
  sched-ebb.o : sched-ebb.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) sched-int.h \
     $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h flags.h insn-config.h function.h \
     $(INSN_ATTR_H) toplev.h $(RECOG_H) except.h $(TM_P_H)
  sched-vis.o : sched-vis.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) sched-int.h \
!    hard-reg-set.h $(BASIC_BLOCK_H) $(INSN_ATTR_H) $(REGS_H) $(TM_P_H)
  final.o : final.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) $(TREE_H) flags.h intl.h \
     $(REGS_H) $(RECOG_H) conditions.h insn-config.h $(INSN_ATTR_H) function.h \
     real.h output.h hard-reg-set.h except.h debug.h xcoffout.h \
--- 1547,1559 ----
     $(INSN_ATTR_H) toplev.h $(RECOG_H) except.h cselib.h $(PARAMS_H) $(TM_P_H)
  sched-rgn.o : sched-rgn.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) sched-int.h \
     $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h flags.h insn-config.h function.h \
!    $(INSN_ATTR_H) toplev.h $(RECOG_H) except.h $(TM_P_H) $(TARGET_H)
  sched-ebb.o : sched-ebb.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) sched-int.h \
     $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h flags.h insn-config.h function.h \
     $(INSN_ATTR_H) toplev.h $(RECOG_H) except.h $(TM_P_H)
  sched-vis.o : sched-vis.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) sched-int.h \
!    hard-reg-set.h $(BASIC_BLOCK_H) $(INSN_ATTR_H) $(REGS_H) $(TM_P_H) \
!    $(TARGET_H)
  final.o : final.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) $(TREE_H) flags.h intl.h \
     $(REGS_H) $(RECOG_H) conditions.h insn-config.h $(INSN_ATTR_H) function.h \
     real.h output.h hard-reg-set.h except.h debug.h xcoffout.h \
*************** genattr$(build_exeext) : genattr.o $(HOS
*** 1842,1855 ****
  genattr.o : genattr.c $(RTL_H) $(HCONFIG_H) $(SYSTEM_H) errors.h gensupport.h
  	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattr.c
  
! genattrtab$(build_exeext) : genattrtab.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBDEPS)
  	$(HOST_CC) $(HOST_CFLAGS) $(HOST_LDFLAGS) -o $@ \
! 	 genattrtab.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBS)
  
  genattrtab.o : genattrtab.c $(RTL_H) $(OBSTACK_H) $(HCONFIG_H) \
!   $(SYSTEM_H) errors.h $(GGC_H) gensupport.h
  	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattrtab.c
  
  genoutput$(build_exeext) : genoutput.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBDEPS)
  	$(HOST_CC) $(HOST_CFLAGS) $(HOST_LDFLAGS) -o $@ \
  	 genoutput.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBS)
--- 1857,1874 ----
  genattr.o : genattr.c $(RTL_H) $(HCONFIG_H) $(SYSTEM_H) errors.h gensupport.h
  	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattr.c
  
! genattrtab$(build_exeext) : genattrtab.o genautomata.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_VARRAY) $(HOST_PREFIX)$(HOST_GETRUNTIME) $(HOST_LIBDEPS)
  	$(HOST_CC) $(HOST_CFLAGS) $(HOST_LDFLAGS) -o $@ \
! 	 genattrtab.o genautomata.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_VARRAY) $(USE_HOST_GETRUNTIME) $(HOST_LIBS) -lm
  
  genattrtab.o : genattrtab.c $(RTL_H) $(OBSTACK_H) $(HCONFIG_H) \
!   $(SYSTEM_H) errors.h $(GGC_H) gensupport.h genattrtab.h
  	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattrtab.c
  
+ genautomata.o : genautomata.c $(RTL_H) $(OBSTACK_H) $(HCONFIG_H) \
+   $(SYSTEM_H) errors.h varray.h hash.h genattrtab.h
+ 	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genautomata.c
+ 
  genoutput$(build_exeext) : genoutput.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBDEPS)
  	$(HOST_CC) $(HOST_CFLAGS) $(HOST_LDFLAGS) -o $@ \
  	 genoutput.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBS)
*************** $(HOST_PREFIX_1)obstack.o: $(srcdir)/../
*** 1899,1904 ****
--- 1918,1933 ----
  	rm -f $(HOST_PREFIX)obstack.c
  	sed -e 's/config[.]h/hconfig.h/' $(srcdir)/../libiberty/obstack.c > $(HOST_PREFIX)obstack.c
  	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(HOST_PREFIX)obstack.c
+ 
+ $(HOST_PREFIX_1)getruntime.o: $(srcdir)/../libiberty/getruntime.c
+ 	rm -f $(HOST_PREFIX)getruntime.c
+ 	sed -e 's/config[.]h/hconfig.h/' $(srcdir)/../libiberty/getruntime.c > $(HOST_PREFIX)getruntime.c
+ 	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(HOST_PREFIX)getruntime.c
+ 
+ $(HOST_PREFIX_1)hashtab.o: $(srcdir)/../libiberty/hashtab.c
+ 	rm -f $(HOST_PREFIX)hashtab.c
+ 	sed -e 's/config[.]h/hconfig.h/' $(srcdir)/../libiberty/hashtab.c > $(HOST_PREFIX)hashtab.c
+ 	$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(HOST_PREFIX)hashtab.c
  
  $(HOST_PREFIX_1)vfprintf.o: $(srcdir)/../libiberty/vfprintf.c $(HCONFIG_H)
  	rm -f $(HOST_PREFIX)vfprintf.c
Index: doc/md.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
retrieving revision 1.21
diff -c -p -r1.21 md.texi
*** md.texi	2001/08/18 21:02:43	1.21
--- md.texi	2001/08/26 21:23:48
*************** in the compiler.
*** 3676,3688 ****
  @cindex instruction splitting
  @cindex splitting instructions
  
! There are two cases where you should specify how to split a pattern into
! multiple insns.  On machines that have instructions requiring delay
! slots (@pxref{Delay Slots}) or that have instructions whose output is
! not available for multiple cycles (@pxref{Function Units}), the compiler
! phases that optimize these cases need to be able to move insns into
! one-instruction delay slots.  However, some insns may generate more than one
! machine instruction.  These insns cannot be placed into a delay slot.
  
  Often you can rewrite the single insn as a list of individual insns,
  each corresponding to one machine instruction.  The disadvantage of
--- 3676,3689 ----
  @cindex instruction splitting
  @cindex splitting instructions
  
! There are two cases where you should specify how to split a pattern
! into multiple insns.  On machines that have instructions requiring
! delay slots (@pxref{Delay Slots}) or that have instructions whose
! output is not available for multiple cycles (@pxref{Processor pipeline
! description}), the compiler phases that optimize these cases need to
! be able to move insns into one-instruction delay slots.  However, some
! insns may generate more than one machine instruction.  These insns
! cannot be placed into a delay slot.
  
  Often you can rewrite the single insn as a list of individual insns,
  each corresponding to one machine instruction.  The disadvantage of
*************** to track the condition codes.
*** 4227,4233 ****
  * Insn Lengths::        Computing the length of insns.
  * Constant Attributes:: Defining attributes that are constant.
  * Delay Slots::         Defining delay slots required for a machine.
! * Function Units::      Specifying information for insn scheduling.
  @end menu
  
  @node Defining Attributes
--- 4228,4234 ----
  * Insn Lengths::        Computing the length of insns.
  * Constant Attributes:: Defining attributes that are constant.
  * Delay Slots::         Defining delay slots required for a machine.
! * Processor pipeline description:: Specifying information for insn scheduling.
  @end menu
  
  @node Defining Attributes
*************** branch is true, we might represent this 
*** 4857,4870 ****
  @end smallexample
  @c the above is *still* too long.  --mew 4feb93
  
! @node Function Units
! @subsection Specifying Function Units
  @cindex function units, for scheduling
  
! On most RISC machines, there are instructions whose results are not
! available for a specific number of cycles.  Common cases are instructions
! that load data from memory.  On many machines, a pipeline stall will result
! if the data is referenced too soon after the load instruction.
  
  In addition, many newer microprocessors have multiple function units, usually
  one for integer and one for floating point, and often will incur pipeline
--- 4858,4958 ----
  @end smallexample
  @c the above is *still* too long.  --mew 4feb93
  
! @node Processor pipeline description
! @subsection Specifying processor pipeline description
! @cindex processor pipeline description
! @cindex processor functional units
! @cindex instruction latency time
! @cindex interlock delays
! @cindex data dependence delays
! @cindex reservation delays
! @cindex pipeline hazard recognizer
! @cindex automaton based pipeline description
! @cindex regular expressions
! @cindex deterministic finite state automaton
! @cindex automaton based scheduler
! @cindex RISC
! @cindex VLIW
! 
! To achieve better productivity most modern processors
! (super-pipelined, superscalar @acronym{RISC}, and @acronym{VLIW}
! processors) have many @dfn{functional units} on which several
! instructions can be executed simultaneously.  An instruction starts
! execution if its issue conditions are satisfied.  If not, the
! instruction is interlocked until its conditions are satisfied.  Such
! @dfn{interlock (pipeline) delay} causes interruption of the fetching
! of successor instructions (or demands nop instructions, e.g. for some
! MIPS processors).
! 
! There are two major kinds of interlock delays in modern processors.
! The first one is a data dependence delay determining @dfn{instruction
! latency time}.  The instruction execution is not started until all
! source data have been evaluated by prior instructions (there are more
! complex cases when the instruction execution starts even when the data
! are not availaible but will be ready in given time after the
! instruction execution start).  Taking the data dependence delays into
! account is simple.  The data dependence (true, output, and
! anti-dependence) delay between two instructions is given by a
! constant.  In most cases this approach is adequate.  The second kind
! of interlock delays is a reservation delay.  The reservation delay
! means that two instructions under execution will be in need of shared
! processors resources, i.e. buses, internal registers, and/or
! functional units, which are reserved for some time.  Taking this kind
! of delay into account is complex especially for modern @acronym{RISC}
! processors.
! 
! The task of exploiting more processor parallelism is solved by an
! instruction scheduler.  For better solution of this problem, the
! instruction scheduler has to have an adequate description of the
! processor parallelism (or @dfn{pipeline description}).  Currently GCC
! has two ways to describe processor parallelism.  The first one is old
! and originated from instruction scheduler written by Michael Tiemann
! and described in the first subsequent section.  The second one was
! created later.  It is based on description of functional unit
! reservations by processor instructions with the aid of @dfn{regular
! expressions}.  This is so called @dfn{automaton based description}.
! 
! Gcc instruction scheduler uses a @dfn{pipeline hazard recognizer} to
! figure out the possibility of the instruction issue by the processor
! on given simulated processor cycle.  The pipeline hazard recognizer is
! a code generated from the processor pipeline description.  The
! pipeline hazard recognizer generated from the automaton based
! description is more sophisticated and based on deterministic finite
! state automaton (@acronym{DFA}) and therefore faster than one
! generated from the old description.  Also its speed is not depended on
! processor complexity.  The instruction issue is possible if there is
! a transition from one automaton state to another one.
! 
! You can use any model to describe processor pipeline characteristics
! or even a mix of them.  You could use the old description for some
! processor submodels and the @acronym{DFA}-based one for the rest
! processor submodels.
! 
! In general, the usage of the automaton based description is more
! preferable.  Its model is more rich.  It permits to describe more
! accurately pipeline characteristics of processors which results in
! improving code quality (although sometimes only on several percent
! fractions).  It will be also used as an infrastructure to implement
! sophisticated and practical insn scheduling which will try many
! instruction sequences to choose the best one.
! 
! 
! @menu
! * Old pipeline description:: Specifying information for insn scheduling.
! * Automaton pipeline description:: Describing insn pipeline characteristics.
! * Comparison of the two descriptions:: Drawbacks of the old pipeline description
! @end menu
! 
! @node Old pipeline description
! @subsubsection Specifying Function Units
! @cindex old pipeline description
  @cindex function units, for scheduling
  
! On most @acronym{RISC} machines, there are instructions whose results
! are not available for a specific number of cycles.  Common cases are
! instructions that load data from memory.  On many machines, a pipeline
! stall will result if the data is referenced too soon after the load
! instruction.
  
  In addition, many newer microprocessors have multiple function units, usually
  one for integer and one for floating point, and often will incur pipeline
*************** due to function unit conflicts.
*** 4878,4890 ****
  
  For the purposes of the specifications in this section, a machine is
  divided into @dfn{function units}, each of which execute a specific
! class of instructions in first-in-first-out order.  Function units that
! accept one instruction each cycle and allow a result to be used in the
! succeeding instruction (usually via forwarding) need not be specified.
! Classic RISC microprocessors will normally have a single function unit,
! which we can call @samp{memory}.  The newer ``superscalar'' processors
! will often have function units for floating point operations, usually at
! least a floating point adder and multiplier.
  
  @findex define_function_unit
  Each usage of a function units by a class of insns is specified with a
--- 4966,4979 ----
  
  For the purposes of the specifications in this section, a machine is
  divided into @dfn{function units}, each of which execute a specific
! class of instructions in first-in-first-out order.  Function units
! that accept one instruction each cycle and allow a result to be used
! in the succeeding instruction (usually via forwarding) need not be
! specified.  Classic @acronym{RISC} microprocessors will normally have
! a single function unit, which we can call @samp{memory}.  The newer
! ``superscalar'' processors will often have function units for floating
! point operations, usually at least a floating point adder and
! multiplier.
  
  @findex define_function_unit
  Each usage of a function units by a class of insns is specified with a
*************** Typical uses of this vector are where a 
*** 4947,4956 ****
  pipeline either single- or double-precision operations, but not both, or
  where a memory unit can pipeline loads, but not stores, etc.
  
! As an example, consider a classic RISC machine where the result of a
! load instruction is not available for two cycles (a single ``delay''
! instruction is required) and where only one load instruction can be executed
! simultaneously.  This would be specified as:
  
  @smallexample
  (define_function_unit "memory" 1 1 (eq_attr "type" "load") 2 0)
--- 5036,5045 ----
  pipeline either single- or double-precision operations, but not both, or
  where a memory unit can pipeline loads, but not stores, etc.
  
! As an example, consider a classic @acronym{RISC} machine where the
! result of a load instruction is not available for two cycles (a single
! ``delay'' instruction is required) and where only one load instruction
! can be executed simultaneously.  This would be specified as:
  
  @smallexample
  (define_function_unit "memory" 1 1 (eq_attr "type" "load") 2 0)
*************** units.  These insns will cause a potenti
*** 4975,4980 ****
--- 5064,5437 ----
  used during their execution and there is no way of representing that
  conflict.  We welcome any examples of how function unit conflicts work
  in such processors and suggestions for their representation.
+ 
+ @node Automaton pipeline description
+ @subsubsection Describing instruction pipeline characteristics
+ @cindex automaton based pipeline description
+ 
+ This section describes constructions of the automaton based processor
+ pipeline description.  The order of all mentioned below constructions
+ in the machine description file is not important.
+ 
+ @findex define_automaton
+ @cindex pipeline hazard recognizer
+ The following optional construction describes names of automata
+ generated and used for the pipeline hazards recognition.  Sometimes
+ the generated finite state automaton used by the pipeline hazard
+ recognizer is large.  If we use more one automaton and bind functional
+ units to the automata, the summary size of the automata usually is
+ less than the size of the single automaton.  If there is no one such
+ construction, only one finite state automaton is generated.
+ 
+ @smallexample
+ (define_automaton @var{automata-names})
+ @end smallexample
+ 
+ @var{automata-names} is a string giving names of the automata.  The
+ names are separated by commas.  All the automata should have unique names.
+ The automaton name is used in construction @code{define_cpu_unit} and
+ @code{define_query_cpu_unit}.
+ 
+ @findex define_cpu_unit
+ @cindex processor functional units
+ Each processor functional unit used in description of instruction
+ reservations should be described by the following construction.
+ 
+ @smallexample
+ (define_cpu_unit @var{unit-names} [@var{automaton-name}])
+ @end smallexample
+ 
+ @var{unit-names} is a string giving the names of the functional units
+ separated by commas.  Don't use name @samp{nothing}, it is reserved
+ for other goals.
+ 
+ @var{automaton-name} is a string giving the name of automaton with
+ which the unit is bound.  The automaton should be described in
+ construction @code{define_automaton}.  You should give
+ @dfn{automaton-name}, if there is a defined automaton.
+ 
+ @findex define_query_cpu_unit
+ @cindex querying function unit reservations
+ The following construction describes CPU functional units analogously
+ to @code{define_cpu_unit}.  If we use automata without their
+ minimization, the reservation of such units can be queried for an
+ automaton state.  The instruction scheduler never queries reservation
+ of functional units for given automaton state.  So as a rule, you
+ don't need this construction.  This construction could be used for
+ future code generation goals (e.g. to generate @acronym{VLIW} insn
+ templates).
+ 
+ @smallexample
+ (define_query_cpu_unit @var{unit-names} [@var{automaton-name}])
+ @end smallexample
+ 
+ @var{unit-names} is a string giving names of the functional units
+ separated by commas.
+ 
+ @var{automaton-name} is a string giving name of the automaton with
+ which the unit is bound.
+ 
+ @findex define_insn_reservation
+ @cindex instruction latency time
+ @cindex regular expressions
+ @cindex data bypass
+ The following construction is major one to describe pipeline
+ characteristics of an instruction.
+ 
+ @smallexample
+ (define_insn_reservation @var{insn-name} @var{default_latency}
+                          @var{condition} @var{regexp})
+ @end smallexample
+ 
+ @var{default_latency} is a number giving latency time of the
+ instruction.
+ 
+ @var{insn-names} is a string giving internal name of the insn.  The
+ internal names are used in constructions @code{define_bypass} and in
+ the automaton description file generated for debugging.  The internal
+ name has nothing common with the names in @code{define_insn}.  It is a
+ good practice to use insn classes described in the processor manual.
+ 
+ @var{condition} defines what RTL insns are described by this
+ construction.
+ 
+ @var{regexp} is a string describing reservation of the cpu functional
+ units by the instruction.  The reservations are described by a regular
+ expression according to the following syntax:
+ 
+ @smallexample
+        regexp = regexp "," oneof
+               | oneof
+ 
+        oneof = oneof "|" allof
+              | allof
+ 
+        allof = allof "+" repeat
+              | repeat
+  
+        repeat = element "*" number
+               | element
+ 
+        element = cpu_function_unit_name
+                | reservation_name
+                | result_name
+                | "nothing"
+                | "(" regexp ")"
+ @end smallexample
+ 
+ @itemize @bullet
+ @item
+ @samp{,} is used for describing the start of the next cycle in
+ the reservation.
+ 
+ @item
+ @samp{|} is used for describing a reservation described by the first
+ regular expression @strong{or} a reservation described by the second
+ regular expression @strong{or} etc.
+ 
+ @item
+ @samp{+} is used for describing a reservation described by the first
+ regular expression @strong{and} a reservation described by the
+ second regular expression @strong{and} etc.
+ 
+ @item
+ @samp{*} is used for convenience and simply means a sequence in which
+ the regular expression are repeated @var{number} times with cycle
+ advancing (see @samp{,}).
+ 
+ @item
+ @samp{cpu_function_unit_name} denotes reservation of the named
+ functional unit.
+ 
+ @item
+ @samp{reservation_name} --- see description of construction
+ @samp{define_reservation}.
+ 
+ @item
+ @samp{nothing} denotes no unit reservations.
+ @end itemize
+ 
+ @findex define_reservation
+ Sometimes unit reservations for different insns contain common parts.
+ In such case, you can simplify the pipeline description by describing
+ the common part by the following construction
+ 
+ @smallexample
+ (define_reservation @var{reservation-name} @var{regexp})
+ @end smallexample
+ 
+ @var{reservation-name} is a string giving name of @var{regexp}.
+ Functional unit names and reservation names are in the same name
+ space.  So the reservation names should be different from the
+ functional unit names and can not be reserved name @samp{nothing}.
+ 
+ @findex define_bypass
+ @cindex instruction latency time
+ @cindex data bypass
+ The following construction is used to describe exceptions in the
+ latency time for given instruction pair.  This is so called bypasses.
+ 
+ @smallexample
+ (define_bypass @var{number} @var{out_insn_names} @var{in_insn_names}
+                [@var{guard}])
+ @end smallexample
+ 
+ @var{number} defines when the result generated by the instructions
+ given in string @var{out_insn_names} will be ready for the
+ instructions given in string @var{in_insn_names}.  The instructions in
+ the string are separated by commas.
+ 
+ @var{guard} is an optional string giving name of a C function which
+ defines an additional guard for the bypass.  The function will get the
+ two insns as parameters.  If the function returns zero the bypass will
+ be ignored for this case.  The additional guard is necessary to
+ recognize complicated bypasses, e.g. when consumer is only an address
+ of insn @samp{store} (not a stored value).
+ 
+ @findex exclusion_set
+ @findex presence_set
+ @findex absence_set
+ @cindex VLIW
+ @cindex RISC
+ Usually the following three constructions are used to describe
+ @acronym{VLIW} processors (more correctly to describe a placement of
+ small insns into @acronym{VLIW} insn slots).  Although they can be
+ used for @acronym{RISC} processors too.
+ 
+ @smallexample
+ (exclusion_set @var{unit-names} @var{unit-names})
+ (presence_set @var{unit-names} @var{unit-names})
+ (absence_set @var{unit-names} @var{unit-names})
+ @end smallexample
+ 
+ @var{unit-names} is a string giving names of functional units
+ separated by commas.
+ 
+ The first construction (@samp{exclusion_set}) means that each
+ functional unit in the first string can not be reserved simultaneously
+ with a unit whose name is in the second string and vice versa.  For
+ example, the construction is useful for describing processors
+ (e.g. some SPARC processors) with a fully pipelined floating point
+ functional unit which can execute simultaneously only single floating
+ point insns or only double floating point insns.
+ 
+ The second construction (@samp{presence_set}) means that each
+ functional unit in the first string can not be reserved unless at
+ least one of units whose names are in the second string is reserved.
+ This is an asymmetric relation.  For example, it is useful for
+ description that @acronym{VLIW} @samp{slot1} is reserved after
+ @samp{slot0} reservation.
+ 
+ The third construction (@samp{absence_set}) means that each functional
+ unit in the first string can be reserved only if each unit whose name
+ is in the second string is not reserved.  This is an asymmetric
+ relation (actually @samp{exclusion_set} is analogous to this one but
+ it is symmetric).  For example, it is useful for description that
+ @acronym{VLIW} @samp{slot0} can not be reserved after @samp{slot1} or
+ @samp{slot2} reservation.
+ 
+ @findex automata_option
+ @cindex deterministic finite state automaton
+ @cindex nondeterministic finite state automaton
+ @cindex finite state automaton minimization
+ You can control the generator of the pipeline hazard recognizer with
+ the following construction.
+ 
+ @smallexample
+ (automata_option @var{options})
+ @end smallexample
+ 
+ @var{options} is a string giving options which affect the generated
+ code.  Currently there are the following options:
+ 
+ @itemize @bullet
+ @item
+ @dfn{no-minimization} makes no minimization of the automaton.  This is
+ only worth to do when we are going to query CPU functional unit
+ reservations in an automaton state.
+ 
+ @item
+ @dfn{w} means a generation of the file describing the result
+ automaton.  The file can be used to verify the description.
+ 
+ @item
+ @dfn{ndfa} makes nondeterministic finite state automata.  This affects
+ the treatment of operator @samp{|} in the regular expressions.  The
+ usual treatment of the operator is to try the first alternative and,
+ if the reservation is not possible, the second alternative.  The
+ nondeterministic treatment means trying all alternatives, some of them
+ may be rejected by reservations in the subsequent insns.  You can not
+ query functional unit reservations in nondeterministic automaton
+ states.
+ @end itemize
+ 
+ As an example, consider a superscalar @acronym{RISC} machine which can
+ issue three insns (two integer insns and one floating point insn) on
+ the cycle but can finish only two insns.  To describe this, we define
+ the following functional units.
+ 
+ @smallexample
+ (define_cpu_unit "i0_pipeline, i1_pipeline, f_pipeline")
+ (define_cpu_unit "port_0, port1")
+ @end smallexample
+ 
+ All simple integer insns can be executed in any integer pipeline and
+ their result is ready in two cycles.  The simple integer insns are
+ issued into the first pipeline unless it is reserved, otherwise they
+ are issued into the second pipeline.  Integer division and
+ multiplication insns can be executed only in the second integer
+ pipeline and their results are ready correspondingly in 8 and 4
+ cycles.  The integer division is not pipelined, i.e. the subsequent
+ integer division insn can not be issued until the current division
+ insn finished.  Floating point insns are fully pipelined and their
+ results are ready in 3 cycles.  There is also additional one cycle
+ delay in the usage by integer insns of result produced by floating
+ point insns.  To describe all of this we could specify
+ 
+ @smallexample
+ (define_cpu_unit "div")
+ 
+ (define_insn_reservation "simple" 2 (eq_attr "cpu" "int")
+                          "(i0_pipeline | i1_pipeline), (port_0 | port1)")
+ 
+ (define_insn_reservation "mult" 4 (eq_attr "cpu" "mult")
+                          "i1_pipeline, nothing*3, (port_0 | port1)")
+ 
+ (define_insn_reservation "div" 8 (eq_attr "cpu" "div")
+                          "i1_pipeline, div*7, (port_0 | port1)")
+ 
+ (define_insn_reservation "float" 3 (eq_attr "cpu" "float")
+                          "f_pipeline, nothing, (port_0 | port1))
+ 
+ (define_bypass 4 "float" "simple,mut,div")
+ @end smallexample
+ 
+ To simplify the description we could describe the following reservation
+ 
+ @smallexample
+ (define_reservation "finish" "port0|port1")
+ @end smallexample
+ 
+ and use it in all @code{define_insn_reservation} as in the following
+ construction
+ 
+ @smallexample
+ (define_insn_reservation "simple" 2 (eq_attr "cpu" "int")
+                          "(i0_pipeline | i1_pipeline), finish")
+ @end smallexample
+ 
+ 
+ @node Comparison of the two descriptions
+ @subsubsection Drawbacks of the old pipeline description
+ @cindex old pipeline description
+ @cindex automaton based pipeline description
+ @cindex processor functional units
+ @cindex interlock delays
+ @cindex instruction latency time
+ @cindex pipeline hazard recognizer
+ @cindex data bypass
+ 
+ The old instruction level parallelism description and the pipeline
+ hazards recognizer based on it have the following drawbacks in
+ comparison with the @acronym{DFA}-based ones:
+   
+ @itemize @bullet
+ @item
+ Each functional unit is believed to be reserved at the instruction
+ execution start.  This is a very inaccurate model for modern
+ processors.
+ 
+ @item
+ An inadequate description of instruction latency times.  The latency
+ time is bound with a functional unit reserved by an instruction not
+ with the instruction itself.  In other words, the description is
+ oriented to describe at most one unit reservation by each instruction.
+ It also does not permit to describe special bypasses between
+ instruction pairs.
+ 
+ @item
+ The implementation of the pipeline hazard recognizer interface has
+ constraints on number of functional units.  This is a number of bits
+ in integer on the host machine.
+ 
+ @item
+ The interface to the pipeline hazard recognizer is more complex than
+ one to the automaton based pipeline recognizer.
+ 
+ @item
+ An unnatural description when you write an unit and a condition which
+ selects instructions using the unit.  Writing all unit reservations
+ for an instruction (an instruction class) is more natural.
+ 
+ @item
+ The recognition of the interlock delays has slow implementation.  GCC
+ scheduler supports structures which describe the unit reservations.
+ The more processor has functional units, the slower pipeline hazard
+ recognizer.  Such implementation would become slower when we enable to
+ reserve functional units not only at the instruction execution start.
+ The automaton based pipeline hazard recognizer speed is not depended
+ on processor complexity.
+ @end itemize
  @end ifset
  
  @node Conditional Execution
Index: doc/tm.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/tm.texi,v
retrieving revision 1.53
diff -c -p -r1.53 tm.texi
*** tm.texi	2001/08/19 23:46:10	1.53
--- tm.texi	2001/08/26 21:23:49
*************** hooks for this purpose.  It is usually e
*** 5446,5456 ****
  them: try the first ones in this list first.
  
  @deftypefn {Target Hook} int TARGET_SCHED_ISSUE_RATE (void)
! This hook returns the maximum number of instructions that can ever issue
! at the same time on the target machine.  The default is one.  This value
! must be constant over the entire compilation.  If you need it to vary
! depending on what the instructions are, you must use
  @samp{TARGET_SCHED_VARIABLE_ISSUE}.
  @end deftypefn
  
  @deftypefn {Target Hook} int TARGET_SCHED_VARIABLE_ISSUE (FILE *@var{file}, int @var{verbose}, rtx @var{insn}, int @var{more})
--- 5446,5464 ----
  them: try the first ones in this list first.
  
  @deftypefn {Target Hook} int TARGET_SCHED_ISSUE_RATE (void)
! This hook returns the maximum number of instructions that can ever
! issue at the same time on the target machine.  The default is one.
! Although the insn scheduler can define itself the possibility of issue
! an insn on the same cycle, the value can serve as an additional
! constraint to issue insns on the same simulated processor cycle (see
! hooks @samp{TARGET_SCHED_REORDER} and @samp{TARGET_SCHED_REORDER2}).
! This value must be constant over the entire compilation.  If you need
! it to vary depending on what the instructions are, you must use
  @samp{TARGET_SCHED_VARIABLE_ISSUE}.
+ 
+ You could use the value of macro @samp{MAX_DFA_ISSUE_RATE} to return
+ the value of the hook @samp{TARGET_SCHED_ISSUE_RATE} for the automaton
+ based pipeline interface.
  @end deftypefn
  
  @deftypefn {Target Hook} int TARGET_SCHED_VARIABLE_ISSUE (FILE *@var{file}, int @var{verbose}, rtx @var{insn}, int @var{more})
*************** instruction that was scheduled.
*** 5466,5477 ****
  @end deftypefn
  
  @deftypefn {Target Hook} int TARGET_SCHED_ADJUST_COST (rtx @var{insn}, rtx @var{link}, rtx @var{dep_insn}, int @var{cost})
! This function corrects the value of @var{cost} based on the relationship
! between @var{insn} and @var{dep_insn} through the dependence @var{link}.
! It should return the new value.  The default is to make no adjustment to
! @var{cost}.  This can be used for example to specify to the scheduler
  that an output- or anti-dependence does not incur the same cost as a
! data-dependence.
  @end deftypefn
  
  @deftypefn {Target Hook} int TARGET_SCHED_ADJUST_PRIORITY (rtx @var{insn}, int @var{priority})
--- 5474,5490 ----
  @end deftypefn
  
  @deftypefn {Target Hook} int TARGET_SCHED_ADJUST_COST (rtx @var{insn}, rtx @var{link}, rtx @var{dep_insn}, int @var{cost})
! This function corrects the value of @var{cost} based on the
! relationship between @var{insn} and @var{dep_insn} through the
! dependence @var{link}.  It should return the new value.  The default
! is to make no adjustment to @var{cost}.  This can be used for example
! to specify to the scheduler using the traditional pipeline description
  that an output- or anti-dependence does not incur the same cost as a
! data-dependence.  If the scheduler using the automaton based pipeline
! description, the cost of anti-dependence is zero and the cost of
! output-dependence is maximum of one and the difference of latency
! times of the first and the second insns.  If these values are not
! acceptable, you could use the hook to modify them too.
  @end deftypefn
  
  @deftypefn {Target Hook} int TARGET_SCHED_ADJUST_PRIORITY (rtx @var{insn}, int @var{priority})
*************** over a basic block.  It should insert an
*** 5536,5541 ****
--- 5549,5688 ----
  RTL dumps and assembly output.  Define this hook only if you need this
  level of detail about what the scheduler is doing.
  @end deftypefn
+ 
+ @deftypefn {Target Hook} int TARGET_SCHED_USE_DFA_PIPELINE_INTERFACE (void)
+ This hook is called many times during insn scheduling.  If the hook
+ returns nonzero, the automaton based pipeline description is used for
+ insn scheduling.  Otherwise the traditional pipeline description is
+ used.  The default is usage of the traditional pipeline description.
+ 
+ You should also remember that to simplify the insn scheduler sources
+ an empty traditional pipeline description interface is generated even
+ if there is no a traditional pipeline description in the @file{.md}
+ file.  The same is true for the automaton based pipeline description.
+ That means that you should be accurate in defining the hook.
+ @end deftypefn
+ 
+ @deftypefn {Target Hook} int TARGET_SCHED_DFA_PRE_CYCLE_INSN (void)
+ The hook returns an RTL insn.  The automaton state used in the
+ pipeline hazard recognizer is changed as if the insn were scheduled
+ when the new simulated processor cycle starts.  Usage of the hook may
+ simplify the automaton pipeline description for some @acronym{VLIW}
+ processors.  If the hook is defined, it is used only for the automaton
+ based pipeline description.  The default is not to change the state
+ when the new simulated processor cycle starts.
+ @end deftypefn
+ 
+ @deftypefn {Target Hook} void TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN (void)
+ The hook can be used to initialize data used by the previous hook.
+ @end deftypefn
+ 
+ @deftypefn {Target Hook} int TARGET_SCHED_DFA_POST_CYCLE_INSN (void)
+ The hook is analogous to @samp{TARGET_SCHED_DFA_PRE_CYCLE_INSN} but used
+ to changed the state as if the insn were scheduled when the new
+ simulated processor cycle finishes.
+ @end deftypefn
+ 
+ @deftypefn {Target Hook} void TARGET_SCHED_INIT_DFA_POST_CYCLE_INSN (void)
+ The hook is analogous to @samp{TARGET_SCHED_INIT_DFA_PRE_CYCLE_INSN} but
+ used to initialize data used by the previous hook.
+ @end deftypefn
+ 
+ @deftypefn {Target Hook} int TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD (void)
+ This hook controls better choosing an insn from the ready insn queue
+ for the @acronym{DFA}-based insn scheduler.  Usually the scheduler
+ chooses the first insn from the queue.  If the hook returns a positive
+ value, an additional scheduler code tries all permutations of
+ @samp{TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD ()}
+ subsequent ready insns to choose an insn whose issue will result in
+ maximal number of issued insns on the same cycle.  For the
+ @acronym{VLIW} processor, the code could actually solve the problem of
+ packing simple insns into the @acronym{VLIW} insn.  Of course, if the
+ rules of @acronym{VLIW} packing are described in the automaton.
+ 
+ This code also could be used for superscalar @acronym{RISC}
+ processors.  Let us consider a superscalar @acronym{RISC} processor
+ with 3 pipelines.  Some insns can be executed in pipelines @var{A} or
+ @var{B}, some insns can be executed only in pipelines @var{B} or
+ @var{C}, and one insn can be executed in pipeline @var{B}.  The
+ processor may issue the 1st insn into @var{A} and the 2nd one into
+ @var{B}.  In this case, the 3rd insn will wait for freeing @var{B}
+ until the next cycle.  If the scheduler issues the 3rd insn the first,
+ the processor could issue all 3 insns per cycle.
+ 
+ Actually this code demonstrates advantages of the automaton based
+ pipeline hazard recognizer.  We try quickly and easy many insn
+ schedules to choose the best one.
+ 
+ The default is no multipass scheduling.
+ @end deftypefn
+ 
+ @deftypefn {Target Hook} void TARGET_SCHED_INIT_DFA_BUBBLES (void)
+ The @acronym{DFA}-based scheduler could take the insertion of nop
+ operations for better insn scheduling into account.  It can be done
+ only if the multi-pass insn scheduling works (see hook
+ @samp{TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD}).
+ 
+ Let us consider a @acronym{VLIW} processor insn with 3 slots.  Each
+ insn can be placed only in one of the three slots.  We have 3 ready
+ insns @var{A}, @var{B}, and @var{C}.  @var{A} and @var{C} can be
+ placed only in the 1st slot, @var{B} can be placed only in the 3rd
+ slot.  We described the automaton which does not permit empty slot
+ gaps between insns (usually such description is simpler).  Without
+ this code the scheduler would place each insn in 3 separate
+ @acronym{VLIW} insns.  If the scheduler places a nop insn into the 2nd
+ slot, it could place the 3 insns into 2 @acronym{VLIW} insns.  What is
+ the nop insn is returned by hook @samp{TARGET_SCHED_DFA_BUBBLE}.  Hook
+ @samp{TARGET_SCHED_INIT_DFA_BUBBLES} can be used to initialize or
+ create the nop insns.
+ 
+ You should remember that the scheduler does not insert the nop insns.
+ It is not wise because of the following optimizations.  The scheduler
+ only considers such possibility to improve the result schedule.  The
+ nop insns should be inserted lately, e.g. on the final phase.
+ @end deftypefn
+ 
+ @deftypefn {Target Hook} rtx TARGET_SCHED_DFA_BUBBLE (int @var{index})
+ This hook @samp{FIRST_CYCLE_MULTIPASS_SCHEDULING} is used to insert
+ nop operations for better insn scheduling when @acronym{DFA}-based
+ scheduler makes multipass insn scheduling (see also description of
+ hook @samp{TARGET_SCHED_INIT_DFA_BUBBLES}).  This hook
+ returns a nop insn with given @var{index}.  The indexes start with
+ zero.  The hook should return @code{NULL} if there are no more nop
+ insns with indexes greater than given index.
+ @end deftypefn
+ 
+ Macros in the following table are generated by the program
+ @file{genattr} and can be useful for writing the hooks.
+ 
+ @table @code
+ @findex TRADITIONAL_PIPELINE_INTERFACE
+ @item TRADITIONAL_PIPELINE_INTERFACE
+ The macro definition is generated if there is a traditional pipeline
+ description in @file{.md} file. You should also remember that to
+ simplify the insn scheduler sources an empty traditional pipeline
+ description interface is generated even if there is no a traditional
+ pipeline description in the @file{.md} file.  The macro can be used to
+ distinguish the two types of the traditional interface.
+ 
+ @findex DFA_PIPELINE_INTERFACE
+ @item DFA_PIPELINE_INTERFACE
+ The macro definition is generated if there is an automaton pipeline
+ description in @file{.md} file.  You should also remember that to
+ simplify the insn scheduler sources an empty automaton pipeline
+ description interface is generated even if there is no an automaton
+ pipeline description in the @file{.md} file.  The macro can be used to
+ distinguish the two types of the automaton interface.
+ 
+ @findex MAX_DFA_ISSUE_RATE
+ @item MAX_DFA_ISSUE_RATE
+ The macro definition is generated in the automaton based pipeline
+ description interface.  Its value is calculated from the automaton
+ based pipeline description and is equal to maximal number of all insns
+ described in constructions @samp{define_insn_reservation} which can be
+ issued on the same processor cycle.
+ 
+ @end table
  
  @node Sections
  @section Dividing the Output into Sections (Texts, Data, @dots{})
Index: doc/contrib.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/contrib.texi,v
retrieving revision 1.12
diff -c -p -r1.12 contrib.texi
*** contrib.texi	2001/08/03 01:19:19	1.12
--- contrib.texi	2001/08/26 21:23:49
*************** Andrew MacLeod for his ongoing work in b
*** 313,321 ****
  various code generation improvements, work on the global optimizer, etc.
  
  @item
! Vladimir Makarov for hacking some ugly i960 problems, PowerPC
! hacking improvements to compile-time performance and overall knowledge
! and direction in the area of instruction scheduling.
  
  @item
  Bob Manson for his behind the scenes work on dejagnu.
--- 313,322 ----
  various code generation improvements, work on the global optimizer, etc.
  
  @item
! Vladimir Makarov for hacking some ugly i960 problems, PowerPC hacking
! improvements to compile-time performance, overall knowledge and
! direction in the area of instruction scheduling, and design and
! implementation of the automaton based instruction scheduler.
  
  @item
  Bob Manson for his behind the scenes work on dejagnu.
Index: doc/gcc.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/gcc.texi,v
retrieving revision 1.34
diff -c -p -r1.34 gcc.texi
*** gcc.texi	2001/08/20 06:14:53	1.34
--- gcc.texi	2001/08/26 21:23:50
*************** Several passes use instruction attribute
*** 3826,3833 ****
  attributes defined for a particular machine is in file
  @file{insn-attr.h}, which is generated from the machine description by
  the program @file{genattr}.  The file @file{insn-attrtab.c} contains
! subroutines to obtain the attribute values for insns.  It is generated
! from the machine description by the program @file{genattrtab}.
  @end itemize
  @end ifset
  
--- 3826,3835 ----
  attributes defined for a particular machine is in file
  @file{insn-attr.h}, which is generated from the machine description by
  the program @file{genattr}.  The file @file{insn-attrtab.c} contains
! subroutines to obtain the attribute values for insns and information
! about processor pipeline characteristics for the instruction scheduler.
! It is generated from the machine description by the program
! @file{genattrtab}.
  @end itemize
  @end ifset
  


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]