This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Patch for automaton based pipeline hazard recognizer (part #1).
- To: gcc-patches at gcc dot gnu dot org
- Subject: Patch for automaton based pipeline hazard recognizer (part #1).
- From: Vladimir Makarov <vmakarov at toke dot toronto dot redhat dot com>
- Date: Wed, 31 Jan 2001 15:14:15 -0500
Hello, we'd like to contribute new code for more accurate
description of pipeline behavior of processors and for fast
recognition of pipeline hazards. Currently gcc has only one
construction `define_function_unit' for this. The proposed model is
based on describing processor functional unit reservations by
instruction with the aid of regular expressions. The patch translates
new description into code for fast pipeline hazard recognition based
on deterministic finite state automaton.
To feel the automaton description, let us consider automaton based
description of a hypothetic superscalar RISC machine which can issue
three insns (two integer insns and one floating point insn) on cycle
but finish only two insns. To describe this, we define the following
functional units.
(define_cpu_unit "i0_pipeline, i1_pipeline, f_pipeline")
(define_cpu_unit "port_0, port1")
All simple integer insns can be executed in any integer pipeline and
their result is ready in two cycles. The simple integer insns are
issued into the first pipeline unless it is reserved, otherwise they
are issued into the second pipeline. Integer division and
multiplication insns can be executed only in the second integer
pipeline and their results are ready correspondingly in 8 and 4
cycles. Integer division is not pipelined, i.e. subsequent integer
division insn can not be issued until current division insn finished.
Floating point insns are fully pipelined and their results are ready
in 3 cycles. There is also additional one cycle delay in usage by
integer insns of result produced by floating point insns. To describe
all of this we could specify
(define_cpu_unit "div")
(define_insn_reservation "simple" 2 (eq_attr "cpu" "int")
"(i0_pipeline | i1_pipeline), (port_0 | port1)")
(define_insn_reservation "mult" 4 (eq_attr "cpu" "mult")
"i1_pipeline, nothing*3, (port_0 | port1)")
(define_insn_reservation "div" 8 (eq_attr "cpu" "div")
"i1_pipeline, div*7, (port_0 | port1)")
(define_insn_reservation "float" 3 (eq_attr "cpu" "float")
"f_pipeline, nothing, (port_0 | port1))
(define_bypass 4 "float" "simple,mut,div")
To understand the reasons of writing the patch, I'd like to say
about drawbacks of the current model of processor pipeline
descriptions and pipeline hazard recognizer (PHR) based on it in
comparison with the proposed one.
1. Each functional unit is believed to be reserved at the
instruction execution start. This is very inaccurate model for
modern processors.
2. Inadequate description of instruction latency times. Latency
time is bound with functional unit reserved by instruction not with
instruction itself. In other words, the description is oriented
to describe at most one unit reservation by each instruction. It
also does not permit to describe special bypasses between
instruction pair.
3. Implementation of the pipeline hazard recognizer interface has
constraints on number of functional units. This is number of
bits in integer on the host machine.
4. Interface to the pipeline hazard recognizer is more complex than
one to automaton based pipeline recognizer.
5. Unnatural description when you write a unit and condition which
selects instructions using the unit. Writing all unit
reservations for an instruction (an instruction class) is more
natural.
6. Recognition of interlock delays has slow implementation. GCC
scheduler supports structures which describe the unit
reservations. The more processor has functional units, the
slower pipeline hazard recognizer. Such implementation would
become slower when we enable to reserve functional units not only
at the instruction execution start. The automaton based pipeline
hazard recognizer speed is not depended on processor complexity.
Because transition from old descriptions to new ones is assumed to
be long process (or will never finish fully), the patch permits to use
old description and PHR code based on it as new one too. You could
use old description for one processor submodels and new one for the
rest processor submodels. You only need to define correctly macro
USE_AUTOMATON_PIPELINE_INTERFACE. By default if there are old and new
one description in md file, the new one is used.
So the patch is safe, it will no affect the current ports unless
pipeline description in .md is rewritten.
In general, the usage of automaton based description is more
preferable. The model is more rich. It permits to describe more
accurately pipeline characteristics of processors which results in
improving code quality (although sometimes only on several percent
fractions). It could be also used as infrastructure to implement
sophisticated and practical insn scheduling which will try many
instruction sequences to choose the best one.
The new code (and description model) has been already used in
several projects (ports for a VLIW processor and for a few superscalar
RISC processors).
If the patch is approved, we could contribute a software pipeliner
for GCC. Without approving the patch, it has no sense because
the software pipeliner requires automaton based pipeline hazard
recognizer.
Best regards,
Vladimir Makarov
2001-01-31 Vladimir Makarov <vmakarov@touchme.toronto.redhat.com>
* rtl.def (DEFINE_CPU_UNIT, DEFINE_QUERY_CPU_UNIT, EXCLUSION_SET,
PRESENCE_SET, ABSENCE_SET, DEFINE_BYPASS, DEFINE_AUTOMATON,
AUTOMATA_OPTION, DEFINE_RESERVATION, DEFINE_INSN_RESERVATION): New
RTL constructions.
* genattr.c (main): New variable num_insn_reservations. Increase
it if there is DEFINE_INSN_RESERVATION. Output automaton based
pipeline hazard recognizer interface. Use macro preprocessor
conditionals to brace old pipeline hazard recognizer interface.
* genattrtab.c (main): Process DEFINE_CPU_UNIT,
DEFINE_QUERY_CPU_UNIT, DEFINE_BYPASS, EXCLUSION_SET, PRESENCE_SET,
ABSENCE_SET, DEFINE_AUTOMATON, AUTOMATA_OPTION,
DEFINE_RESERVATION, DEFINE_INSN_RESERVATION. Call expand_automata
and write_automata if there are automaton descriptions. Include
file `genautomata.c'. Use macro preprocessor conditionals to
brace old pipeline hazard recognizer code.
* genautomata.c: New file.
* sched-int.h: (OLD_PIPELINE_INTERFACE,
AUTOMATON_PIPELINE_INTERFACE, FIRST_CYCLE_MULTIPASS_SCHEDULING,
FIRST_CYCLE_MULTIPASS_SCHEDULING_LOOKAHEAD,
OLD_PIPELINE_INTERFACE, USE_AUTOMATON_PIPELINE_INTERFACE): Define
the default macro values.
(FUNCTION_UNITS_SIZE, BLOCKAGE_BITS, MAX_MULTIPLICITY,
MIN_BLOCKAGE, MAX_BLOCKAGE): Undefined these macros for automaton
pipeline interface.
(curr_state): Add the external definition for automaton pipeline
interface.
(reg_known_equiv_p, reg_known_value): Add external definitions.
(haifa_insn_data): Brace definitions `blockage' and `units' by
preprocessor conditionals with OLD_PIPELINE_INTERFACE.
(INSN_BLOCKAGE, UNIT_BITS, BLOCKAGE_MASK, MIN_BLOCKAGE_COST,
MAX_BLOCKAGE_COST, init_target_units, insn_print_units,
print_block_visualization, visualize_scheduled_insns,
visualize_no_unit, visualize_stall_cycles, insn_issue_delay,
insn_unit, get_unit_last_insn, actual_hazard_this_instance): Brace
them by preprocessor conditionals with OLD_PIPELINE_INTERFACE.
* haifa-sched.c (issue_rate, ISSUE_RATE): Brace them by
preprocessor conditionals with OLD_PIPELINE_INTERFACE.
(MAX_INSN_QUEUE_INDEX): New macro.
(insn_queue):
(NEXT_Q, NEXT_Q_AFTER): Use MAX_INSN_QUEUE_INDEX instead of
INSN_QUEUE_SIZE.
(max_insn_queue_index): New variable for old pipeline interface.
(insert_schedule_bubbles_p, curr_state, dfa_state_size,
ready_try): New varaibles for automaton interface.
(blockage_range, clear_units, schedule_unit, actual_hazard,
potential_hazard): Brace them by preprocessor conditionals with
OLD_PIPELINE_INTERFACE.
(ready_element, ready_remove, max_issue, choose_ready): New
function prototypes for automaton interface.
(insn_unit, blockage_range, unit_last_insn, unit_tick,
unit_n_insns, get_unit_last_insn, clear_units, insn_issue_delay,
actual_hazard_this_instance, schedule_unit, actual_hazard,
potential_hazard): Brace them by preprocessor conditionals with
OLD_PIPELINE_INTERFACE.
(insn_cost): Brace old code by preprocessor conditionals with
OLD_PIPELINE_INTERFACE. Add new code for automaton pipeline
interface.
(priority): Use 0 and -1 as undefined priority for correspondingly
old and automaton pipeline interface.
(ready_element): New function for automaton interface.
(schedule_insn): Brace old code by preprocessor conditionals with
OLD_PIPELINE_INTERFACE. Add new code for automaton pipeline
interface.
(queue_to_ready): Ditto. Use MAX_INSN_QUEUE_INDEX instead of
INSN_QUEUE_SIZE.
(MD_SCHED_INIT, MD_SCHED_REORDER, MD_SCHED_VARIABLE_ISSUE):
Undefine for automaton pipeline interface.
(MD_AUTOMATON_SCHED_INIT, MD_AUTOMATON_SCHED_REORDER): Undefine
for old pipeline interface.
(max_issue, choose_ready): New functions for automaton pipeline
interface.
(schedule_block): Brace old code by preprocessor conditionals with
OLD_PIPELINE_INTERFACE. Add new code for automaton pipeline
interface.
(sched_init): Ditto.
(sched_finish): Free the current automaton state and finalize
automaton pipeline interface.
* sched-rgn.c (remove_new_cpu_cycle_marks): New function
prototypes for automaton interface.
(init_ready_list, new_ready, debug_dependencies): Brace old code
by preprocessor conditionals with OLD_PIPELINE_INTERFACE. Add new
code for automaton pipeline interface.
(remove_new_cpu_cycle_marks): New function for automaton
interface.
(schedule_region): Add call of remove_new_cpu_cycle_marks.
* sched-vis.c (target_units, insn_print_units, init_target_units,
print_block_visualization, visualize_no_unit,
visualize_scheduled_insns, visualize_stall_cycles): Brace them by
preprocessor conditionals with OLD_PIPELINE_INTERFACE.
(get_visual_tbl_length): Add code for automaton interface. Brace
old code by preprocessor conditionals with OLD_PIPELINE_INTERFACE.
* Makefile.in (GETRUNTIME, HASHTAB, HOST_GETRUNTIME, HOST_HASHTAB,
HOST_VARRAY): New variables.
(getruntime.o): New entry.
(genattrtab.o): Add new dependency files.
(genattrtab): Ditto. Link it with `libm.a'.
(getruntime.o, hashtab.o): New entries for canadian cross.
* md.texi: Description of automaton based model.
* tm.texi (USE_AUTOMATON_PIPELINE_INTERFACE,
MD_AUTOMATON_SCHED_INIT, MD_AUTOMATON_SCHED_REORDER,
DFA_SCHEDULER_PRE_CYCLE_INSN, DFA_SCHEDULER_POST_CYCLE_INSN): The
new macro descriptions.
(ISSUE_RATE, MD_SCHED_INIT, MD_SCHED_REORDER, MD_SCHED_REORDER2,
MD_SCHED_VARIABLE_ISSUE): Add a comment.
Index: rtl.def
===================================================================
RCS file: /cvs/gcc/egcs/gcc/rtl.def,v
retrieving revision 1.43
diff -c -p -r1.43 rtl.def
*** rtl.def 2000/12/29 17:35:57 1.43
--- rtl.def 2001/01/31 18:47:20
*************** DEF_RTL_EXPR(DEFINE_DELAY, "define_delay
*** 308,313 ****
--- 308,450 ----
unit.) */
DEF_RTL_EXPR(DEFINE_FUNCTION_UNIT, "define_function_unit", "siieiiV", 'x')
+ /*--Start of constructions for CPU pipeline description described by NDFAs.--*/
+
+ /* (define_cpu_unit string [string]) describes cpu functional
+ units (separated by comma).
+
+ 1st operand: Names of cpu function units.
+ 2nd operand: Name of automaton (see comments for DEFINE_AUTOMATON).
+
+ All define_reservations, define_cpu_units, and
+ define_query_cpu_units should have unique names which can not be
+ "nothing". */
+ DEF_RTL_EXPR(DEFINE_CPU_UNIT, "define_cpu_unit", "sS", 'x')
+
+ /* (define_query_cpu_unit string [string]) describes cpu functional
+ units analogously to define_cpu_unit. If we use automaton without
+ minimization, the reservation of such units can be queried for
+ automaton state. */
+ DEF_RTL_EXPR(DEFINE_QUERY_CPU_UNIT, "define_query_cpu_unit", "sS", 'x')
+
+ /* (exclusion_set string string) means that each CPU function unit in
+ the first string can not be reserved simultaneously with each unit
+ whose name is in the second string and vise versa. CPU units in
+ the string are separated by commas. For example, it is useful for
+ description CPU with fully pipelined floating point functional unit
+ which can execute simultaneously only single floating point insns
+ or only double floating point insns. */
+ DEF_RTL_EXPR(EXCLUSION_SET, "exclusion_set", "ss", 'x')
+
+ /* (presence_set string string) means that each CPU function unit in
+ the first string can not be reserved unless at least one of units
+ whose names are in the second string is reserved. This is an
+ asymmetric relation. CPU units in the string are separated by
+ commas. For example, it is useful for description that slot1 is
+ reserved after slot0 reservation for a VLIW processor. */
+ DEF_RTL_EXPR(PRESENCE_SET, "presence_set", "ss", 'x')
+
+ /* (absence_set string string) means that each CPU function unit in
+ the first string can not be reserved only if each unit whose name
+ is in the second string is not reserved. This is an asymmetric
+ relation (actually exclusion set is analogous to this one but it is
+ symmetric). CPU units in the string are separated by commas. For
+ example, it is useful for description that slot0 can not be
+ reserved after slot1 or slot2 reservation for a VLIW processor. */
+ DEF_RTL_EXPR(ABSENCE_SET, "absence_set", "ss", 'x')
+
+ /* (define_bypass number out_insn_names in_insn_names) names bypass
+ with given latency (the first number) from insns given by the first
+ string (see define_insn_reservation) into insns given by the second
+ string. Insn names in the strings are separated by commas. The
+ third operand is optional name of function which is additional
+ guard for the bypass. The function will get the two insns as
+ parameters. If the function returns zero the bypass will be
+ ignored for this case. Additional guard is necessary to recognize
+ complicated bypasses, e.g. when consumer is load address. */
+ DEF_RTL_EXPR(DEFINE_BYPASS, "define_bypass", "issS", 'x')
+
+ /* (define_automaton string) describes names of automata generated and
+ used for pipeline hazards recognition. The names are separated by
+ comma. Actually it is possibly to generate the single automaton
+ but unfortunately it can be very large. If we use more one
+ automata, the summary size of the automata usually is less than the
+ single one. The automaton name is used in define_cpu_unit and
+ define_query_cpu_unit. All automata should have unique names. */
+ DEF_RTL_EXPR(DEFINE_AUTOMATON, "define_automaton", "s", 'x')
+
+ /* (automata_option string) describes option for generation of
+ automata. Currently there are the following options:
+
+ o "no-minimization" which makes no minimization of automata. This
+ is only worth to do when we are going to query CPU functional
+ unit reservations in an automaton state.
+
+ o "w" which means generation of file describing the result
+ automaton. The file can be used to the description verification.
+
+ o "ndfa" which makes nondeterministic finite state automata. */
+ DEF_RTL_EXPR(AUTOMATA_OPTION, "automata_option", "s", 'x')
+
+ /* (define_reservation string string) names reservation (the first
+ string) of cpu functional units (the 2nd string). Sometimes unit
+ reservations for different insns contain common parts. In such
+ case, you describe common part and use one its name (the 1st
+ parameter) in regular expression in define_insn_reservation. All
+ define_reservations, define_cpu_units, and define_query_cpu_units
+ should have unique names which can not be "nothing". */
+ DEF_RTL_EXPR(DEFINE_RESERVATION, "define_reservation", "ss", 'x')
+
+ /* (define_insn_reservation name default_latency condition regexpr)
+ describes reservation of cpu functional units (the 3nd operand) for
+ instruction which is selected by the condition (the 2nd parameter).
+ The first parameter is used for output of debugging information.
+ The reservations are described by a regular expression according
+ the following syntax:
+
+ regexp = regexp "," oneof
+ | oneof
+
+ oneof = oneof "|" allof
+ | allof
+
+ allof = allof "+" repeat
+ | repeat
+
+ repeat = element "*" number
+ | element
+
+ element = cpu_function_unit_name
+ | reservation_name
+ | result_name
+ | "nothing"
+ | "(" regexp ")"
+
+ 1. "," is used for describing start of the next cycle in
+ reservation.
+
+ 2. "|" is used for describing the reservation described by the
+ first regular expression *or* the reservation described by the
+ second regular expression *or* etc.
+
+ 3. "+" is used for describing the reservation described by the
+ first regular expression *and* the reservation described by the
+ second regular expression *and* etc.
+
+ 4. "*" is used for convinience and simply means sequence in
+ which the regular expression are repeated NUMBER times with
+ cycle advancing (see ",").
+
+ 5. cpu function unit name which means reservation.
+
+ 6. reservation name -- see define_reservation.
+
+ 7. string "nothing" means no units reservation. */
+
+ DEF_RTL_EXPR(DEFINE_INSN_RESERVATION, "define_insn_reservation", "sies", 'x')
+
+ /*---End of constructions for CPU pipeline description described by NDFAs.---*/
+
/* Define attribute computation for `asm' instructions. */
DEF_RTL_EXPR(DEFINE_ASM_ATTRIBUTES, "define_asm_attributes", "V", 'x' )
Index: genattr.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/genattr.c,v
retrieving revision 1.38
diff -c -p -r1.38 genattr.c
*** genattr.c 2000/11/10 16:01:14 1.38
--- genattr.c 2001/01/31 18:47:20
*************** main (argc, argv)
*** 193,198 ****
--- 193,199 ----
int have_delay = 0;
int have_annul_true = 0;
int have_annul_false = 0;
+ int num_insn_reservations = 0;
int num_units = 0;
struct range all_simultaneity, all_multiplicity;
struct range all_ready_cost, all_issue_delay, all_blockage;
*************** from the machine description file `md'.
*** 308,317 ****
extend_range (&all_issue_delay,
unit->issue_delay.min, unit->issue_delay.max);
}
}
! if (num_units > 0)
{
/* Compute the range of blockage cost values. See genattrtab.c
for the derivation. BLOCKAGE (E,C) when SIMULTANEITY is zero is
--- 309,379 ----
extend_range (&all_issue_delay,
unit->issue_delay.min, unit->issue_delay.max);
}
+ else if (GET_CODE (desc) == DEFINE_INSN_RESERVATION)
+ num_insn_reservations++;
}
! if (num_insn_reservations == 0)
{
+ printf ("\n#ifndef AUTOMATON_PIPELINE_INTERFACE\n");
+ printf ("#define AUTOMATON_PIPELINE_INTERFACE 0\n");
+ printf ("#endif\n\n");
+ }
+ else
+ {
+ /* Output interface for pipeline hazards recognition based on
+ DFA (deterministic finite state automata. */
+ printf ("\n#ifndef AUTOMATON_PIPELINE_INTERFACE\n");
+ printf ("#define AUTOMATON_PIPELINE_INTERFACE 1\n");
+ printf ("#endif\n\n");
+ printf ("#ifndef AUTOMATON_STATE_ALTS\n");
+ printf ("#define AUTOMATON_STATE_ALTS 0\n");
+ printf ("#endif\n\n");
+ printf ("#ifndef CPU_UNITS_QUERY\n");
+ printf ("#define CPU_UNITS_QUERY 0\n");
+ printf ("#endif\n\n");
+ /* Interface itself: */
+ printf ("#if AUTOMATON_PIPELINE_INTERFACE\n\n");
+ printf ("#define INSN_SCHEDULING\n\n");
+ printf ("extern int insn_default_latency PARAMS ((rtx));\n\n");
+ printf ("extern int bypass_p PARAMS ((rtx));\n\n");
+ printf ("extern int insn_latency PARAMS ((rtx, rtx));\n\n");
+ printf ("extern int insn_alts PARAMS ((rtx));\n\n");
+ printf ("extern int max_insn_queue_index;\n\n");
+ printf ("typedef void *state_t;\n\n");
+ printf ("extern int state_size PARAMS ((void));\n\n");
+ printf ("extern void state_reset PARAMS ((state_t));\n");
+ printf ("extern int state_transition PARAMS ((state_t, rtx));\n");
+ printf ("\n#if AUTOMATON_STATE_ALTS\n");
+ printf ("extern int state_alts PARAMS ((state_t, rtx));\n");
+ printf ("#endif\n");
+ printf ("extern int min_issue_delay PARAMS ((state_t, rtx));\n");
+ printf ("extern int state_dead_lock_p PARAMS ((state_t));\n");
+ printf
+ ("extern int min_insn_conflict_delay PARAMS ((state_t, rtx, rtx));\n");
+ printf ("extern void print_reservation PARAMS ((FILE *, rtx));\n");
+ printf ("\n#if CPU_UNITS_QUERY\n");
+ printf ("extern int get_cpu_unit_code PARAMS ((const char *));\n");
+ printf ("extern int cpu_unit_reservation_p PARAMS ((state_t, int));\n");
+ printf ("#endif\n");
+ printf ("extern void dfa_start PARAMS ((void));\n");
+ printf ("extern void dfa_finish PARAMS ((void));\n");
+ printf ("#endif /* #if AUTOMATON_PIPELINE_INTERFACE */\n\n");
+ }
+
+ if (num_units == 0)
+ {
+ printf ("\n#ifndef OLD_PIPELINE_INTERFACE\n");
+ printf ("#define OLD_PIPELINE_INTERFACE 0\n");
+ printf ("#endif\n\n");
+ }
+ else if (num_units > 0)
+ {
+ printf ("\n#ifndef OLD_PIPELINE_INTERFACE\n");
+ printf ("#define OLD_PIPELINE_INTERFACE 1\n");
+ printf ("#endif\n\n");
+ printf ("#if OLD_PIPELINE_INTERFACE\n\n");
+
/* Compute the range of blockage cost values. See genattrtab.c
for the derivation. BLOCKAGE (E,C) when SIMULTANEITY is zero is
*************** from the machine description file `md'.
*** 348,353 ****
--- 410,416 ----
write_units (num_units, &all_multiplicity, &all_simultaneity,
&all_ready_cost, &all_issue_delay, &all_blockage);
+ printf ("#endif /* #if OLD_PIPELINE_INTERFACE */\n\n");
}
/* Output flag masks for use by reorg.
Index: genattrtab.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/genattrtab.c,v
retrieving revision 1.87
diff -c -p -r1.87 genattrtab.c
*** genattrtab.c 2000/12/16 19:00:57 1.87
--- genattrtab.c 2001/01/31 18:47:21
*************** static int length_used;
*** 302,307 ****
--- 302,308 ----
static int num_delays;
static int have_annul_true, have_annul_false;
static int num_units, num_unit_opclasses;
+ static int num_dfa_decls;
static int num_insn_ents;
/* Used as operand to `operate_exp': */
*************** static const char *attr_numeral PARAMS (
*** 461,466 ****
--- 462,480 ----
static int attr_equal_p PARAMS ((rtx, rtx));
static rtx attr_copy_rtx PARAMS ((rtx));
static int attr_rtx_cost PARAMS ((rtx));
+ static void gen_cpu_unit PARAMS ((rtx));
+ static void gen_query_cpu_unit PARAMS ((rtx));
+ static void gen_bypass PARAMS ((rtx));
+ static void gen_excl_set PARAMS ((rtx));
+ static void gen_presence_set PARAMS ((rtx));
+ static void gen_absence_set PARAMS ((rtx));
+ static void gen_automaton PARAMS ((rtx));
+ static void gen_automata_option PARAMS ((rtx));
+ static void gen_reserv PARAMS ((rtx));
+ static void gen_insn_reserv PARAMS ((rtx));
+ static void initiate_automaton_gen PARAMS ((int, char **));
+ static void expand_automata PARAMS ((void));
+ static void write_automata PARAMS ((void));
#define oballoc(size) obstack_alloc (hash_obstack, size)
*************** from the machine description file `md'.
*** 6081,6086 ****
--- 6095,6101 ----
/* Read the machine description. */
+ initiate_automaton_gen (argc, argv);
while (1)
{
int lineno;
*************** from the machine description file `md'.
*** 6109,6114 ****
--- 6124,6169 ----
gen_unit (desc, lineno);
break;
+ case DEFINE_CPU_UNIT:
+ gen_cpu_unit (desc);
+ break;
+
+ case DEFINE_QUERY_CPU_UNIT:
+ gen_query_cpu_unit (desc);
+ break;
+
+ case DEFINE_BYPASS:
+ gen_bypass (desc);
+ break;
+
+ case EXCLUSION_SET:
+ gen_excl_set (desc);
+ break;
+
+ case PRESENCE_SET:
+ gen_presence_set (desc);
+ break;
+
+ case ABSENCE_SET:
+ gen_absence_set (desc);
+ break;
+
+ case DEFINE_AUTOMATON:
+ gen_automaton (desc);
+ break;
+
+ case AUTOMATA_OPTION:
+ gen_automata_option (desc);
+ break;
+
+ case DEFINE_RESERVATION:
+ gen_reserv (desc);
+ break;
+
+ case DEFINE_INSN_RESERVATION:
+ gen_insn_reserv (desc);
+ break;
+
default:
break;
}
*************** from the machine description file `md'.
*** 6137,6142 ****
--- 6192,6202 ----
if (num_units)
expand_units ();
+ /* Build DFA, output some functions and expand DFA information into
+ new attributes. */
+ if (num_dfa_decls)
+ expand_automata ();
+
printf ("#include \"config.h\"\n");
printf ("#include \"system.h\"\n");
printf ("#include \"rtl.h\"\n");
*************** from the machine description file `md'.
*** 6211,6217 ****
/* Write out information about function units. */
if (num_units)
! write_function_unit_info ();
/* Write out constant delay slot info */
write_const_num_delay_slots ();
--- 6271,6290 ----
/* Write out information about function units. */
if (num_units)
! {
! printf ("#if OLD_PIPELINE_INTERFACE\n");
! write_function_unit_info ();
! printf ("#endif /* #if OLD_PIPELINE_INTERFACE */\n\n");
! }
!
! if (num_dfa_decls)
! {
! /* Output code for pipeline hazards recognition based on
! DFA (deterministic finite state automata. */
! printf ("#if AUTOMATON_PIPELINE_INTERFACE\n");
! write_automata ();
! printf ("#endif /* #if AUTOMATON_PIPELINE_INTERFACE */\n\n");
! }
/* Write out constant delay slot info */
write_const_num_delay_slots ();
*************** get_insn_name (code)
*** 6229,6231 ****
--- 6302,6306 ----
{
return NULL;
}
+
+ #include "genautomata.c"
Index: sched-int.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/sched-int.h,v
retrieving revision 1.7
diff -c -p -r1.7 sched-int.h
*** sched-int.h 2001/01/12 18:00:49 1.7
--- sched-int.h 2001/01/31 18:47:21
*************** along with GNU CC; see the file COPYING.
*** 20,25 ****
--- 20,81 ----
the Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA. */
+ /* The following for old genattrtab which has no code for generation
+ of automata. */
+ #ifndef OLD_PIPELINE_INTERFACE
+ #define OLD_PIPELINE_INTERFACE 1
+ #endif
+
+ #ifndef AUTOMATON_PIPELINE_INTERFACE
+ #define AUTOMATON_PIPELINE_INTERFACE 0
+ #endif
+
+ /* If the following macro value is nonzero, we will make multi-pass
+ scheduling for the first cycle. */
+ #ifndef FIRST_CYCLE_MULTIPASS_SCHEDULING
+ #define FIRST_CYCLE_MULTIPASS_SCHEDULING 0
+ #endif
+
+ #ifndef FIRST_CYCLE_MULTIPASS_SCHEDULING_LOOKAHEAD
+ #define FIRST_CYCLE_MULTIPASS_SCHEDULING_LOOKAHEAD 0
+ #endif
+
+ #if !OLD_PIPELINE_INTERFACE && !AUTOMATON_PIPELINE_INTERFACE
+ #undef OLD_PIPELINE_INTERFACE
+ #define OLD_PIPELINE_INTERFACE 1
+ #endif
+
+ #if OLD_PIPELINE_INTERFACE && !AUTOMATON_PIPELINE_INTERFACE
+ #undef USE_AUTOMATON_PIPELINE_INTERFACE
+ #define USE_AUTOMATON_PIPELINE_INTERFACE 0
+ #elif !OLD_PIPELINE_INTERFACE && AUTOMATON_PIPELINE_INTERFACE
+ #undef USE_AUTOMATON_PIPELINE_INTERFACE
+ #define USE_AUTOMATON_PIPELINE_INTERFACE 1
+ #else
+ #ifndef USE_AUTOMATON_PIPELINE_INTERFACE
+ #define USE_AUTOMATON_PIPELINE_INTERFACE 1
+ #endif
+ #endif
+
+ /* The following is for safety to support the
+ two pipeline description interfaces. */
+ #if !OLD_PIPELINE_INTERFACE
+
+ #undef FUNCTION_UNITS_SIZE
+ #undef BLOCKAGE_BITS
+ #undef MAX_MULTIPLICITY
+ #undef MIN_BLOCKAGE
+ #undef MAX_BLOCKAGE
+
+ #endif /* #if !OLD_PIPELINE_INTERFACE */
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+ extern state_t curr_state;
+ #endif
+
+ extern char *reg_known_equiv_p;
+ extern rtx *reg_known_value;
+
/* Forward declaration. */
struct ready_list;
*************** struct haifa_insn_data
*** 171,179 ****
--- 227,237 ----
the ready queue when its counter reaches zero. */
int dep_count;
+ #if OLD_PIPELINE_INTERFACE
/* An encoding of the blockage range function. Both unit and range
are coded. */
unsigned int blockage;
+ #endif
/* Number of instructions referring to this insn. */
int ref_count;
*************** struct haifa_insn_data
*** 184,191 ****
--- 242,251 ----
short cost;
+ #if OLD_PIPELINE_INTERFACE
/* An encoding of the function units used. */
short units;
+ #endif
/* This weight is an estimation of the insn's contribution to
register pressure. */
*************** extern struct haifa_insn_data *h_i_d;
*** 213,218 ****
--- 273,280 ----
#define INSN_UNIT(INSN) (h_i_d[INSN_UID (INSN)].units)
#define INSN_REG_WEIGHT(INSN) (h_i_d[INSN_UID (INSN)].reg_weight)
+ #if OLD_PIPELINE_INTERFACE
+
#define INSN_BLOCKAGE(INSN) (h_i_d[INSN_UID (INSN)].blockage)
#define UNIT_BITS 5
#define BLOCKAGE_MASK ((1 << BLOCKAGE_BITS) - 1)
*************** extern struct haifa_insn_data *h_i_d;
*** 229,234 ****
--- 291,298 ----
#define MIN_BLOCKAGE_COST(R) ((R) >> (HOST_BITS_PER_INT / 2))
#define MAX_BLOCKAGE_COST(R) ((R) & ((1 << (HOST_BITS_PER_INT / 2)) - 1))
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
extern FILE *sched_dump;
extern int sched_verbose;
*************** extern int sched_verbose;
*** 241,253 ****
--- 305,325 ----
#endif
/* Functions in sched-vis.c. */
+
+ #if OLD_PIPELINE_INTERFACE
extern void init_target_units PARAMS ((void));
extern void insn_print_units PARAMS ((rtx));
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
extern void init_block_visualization PARAMS ((void));
+
+ #if OLD_PIPELINE_INTERFACE
extern void print_block_visualization PARAMS ((const char *));
extern void visualize_scheduled_insns PARAMS ((int));
extern void visualize_no_unit PARAMS ((rtx));
extern void visualize_stall_cycles PARAMS ((int));
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
extern void visualize_alloc PARAMS ((void));
extern void visualize_free PARAMS ((void));
*************** extern void restore_line_notes PARAMS ((
*** 276,282 ****
--- 348,357 ----
extern void rm_redundant_line_notes PARAMS ((void));
extern void rm_other_notes PARAMS ((rtx, rtx));
+ #if OLD_PIPELINE_INTERFACE
extern int insn_issue_delay PARAMS ((rtx));
+ #endif
+
extern int set_priorities PARAMS ((rtx, rtx));
extern void schedule_block PARAMS ((int, int));
*************** extern void ready_add PARAMS ((struct re
*** 287,294 ****
/* The following are exported for the benefit of debugging functions. It
would be nicer to keep them private to haifa-sched.c. */
extern int insn_unit PARAMS ((rtx));
extern int insn_cost PARAMS ((rtx, rtx, rtx));
extern rtx get_unit_last_insn PARAMS ((int));
extern int actual_hazard_this_instance PARAMS ((int, int, rtx, int, int));
!
--- 362,375 ----
/* The following are exported for the benefit of debugging functions. It
would be nicer to keep them private to haifa-sched.c. */
+
+ #if OLD_PIPELINE_INTERFACE
extern int insn_unit PARAMS ((rtx));
+ #endif
+
extern int insn_cost PARAMS ((rtx, rtx, rtx));
+
+ #if OLD_PIPELINE_INTERFACE
extern rtx get_unit_last_insn PARAMS ((int));
extern int actual_hazard_this_instance PARAMS ((int, int, rtx, int, int));
! #endif /* #if OLD_PIPELINE_INTERFACE */
Index: haifa-sched.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/haifa-sched.c,v
retrieving revision 1.177
diff -c -p -r1.177 haifa-sched.c
*** haifa-sched.c 2001/01/12 18:00:48 1.177
--- haifa-sched.c 2001/01/31 18:47:21
*************** the Free Software Foundation, 59 Temple
*** 152,157 ****
--- 152,159 ----
#ifdef INSN_SCHEDULING
+ #if OLD_PIPELINE_INTERFACE
+
/* issue_rate is the number of insns that can be scheduled in the same
machine cycle. It can be defined in the config/mach/mach.h file,
otherwise we set it to 1. */
*************** static int issue_rate;
*** 162,167 ****
--- 164,179 ----
#define ISSUE_RATE 1
#endif
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+ /* If the following variable value is non zero, the scheduler inserts
+ bubbles (nop insns). The value of variable affects on scheduler
+ behavior only if automaton pipeline interface with multipass
+ scheduling is used and macro SCHEDULER_BUBBLE is defined. */
+ int insert_schedule_bubbles_p = 0;
+ #endif
+
/* sched-verbose controls the amount of debugging output the
scheduler prints. It is controlled by -fsched-verbose=N:
N>0 and no -DSR : the output is directed to stderr.
*************** static rtx note_list;
*** 258,272 ****
passes or stalls are introduced. */
/* Implement a circular buffer to delay instructions until sufficient
! time has passed. INSN_QUEUE_SIZE is a power of two larger than
! MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c. This is the
! longest time an isnsn may be queued. */
! static rtx insn_queue[INSN_QUEUE_SIZE];
static int q_ptr = 0;
static int q_size = 0;
! #define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))
! #define NEXT_Q_AFTER(X, C) (((X)+C) & (INSN_QUEUE_SIZE-1))
/* Describe the ready list of the scheduler.
VEC holds space enough for all insns in the current region. VECLEN
says how many exactly.
--- 270,311 ----
passes or stalls are introduced. */
/* Implement a circular buffer to delay instructions until sufficient
! time has passed. For the old pipeline description interface,
! INSN_QUEUE_SIZE is a power of two larger than MAX_BLOCKAGE and
! MAX_READY_COST computed by genattr.c. For the new pipeline
! description interface, MAX_INSN_QUEUE_INDEX is a power of two minus
! one which is larger than maximal time of instruction execution
! computed by genattr.c on the base maximal time of functional unit
! reservations and geting a result. This is the longest time an
! insn may be queued. */
!
! #define MAX_INSN_QUEUE_INDEX max_insn_queue_index
!
! static rtx *insn_queue;
static int q_ptr = 0;
static int q_size = 0;
! #define NEXT_Q(X) (((X)+1) & MAX_INSN_QUEUE_INDEX)
! #define NEXT_Q_AFTER(X, C) (((X)+C) & MAX_INSN_QUEUE_INDEX)
+
+ #if !AUTOMATON_PIPELINE_INTERFACE
+ static int max_insn_queue_index;
+ #else
+
+ /* The following variable value refers for all current and future
+ reservations of the proccesor units. */
+ state_t curr_state;
+ /* The following variable value is size of memory representing all
+ current and future reservations of the processor units. */
+ static size_t dfa_state_size;
+
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING
+ /* The following array is used to find the best insn from ready. */
+ static char *ready_try;
+ #endif
+
+ #endif /* #if !AUTOMATON_PIPELINE_INTERFACE */
+
/* Describe the ready list of the scheduler.
VEC holds space enough for all insns in the current region. VECLEN
says how many exactly.
*************** struct ready_list
*** 284,294 ****
--- 323,337 ----
};
/* Forward declarations. */
+
+ #if OLD_PIPELINE_INTERFACE
static unsigned int blockage_range PARAMS ((int, rtx));
static void clear_units PARAMS ((void));
static void schedule_unit PARAMS ((int, rtx, int));
static int actual_hazard PARAMS ((int, rtx, int, int));
static int potential_hazard PARAMS ((int, rtx, int));
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
static int priority PARAMS ((rtx));
static int rank_for_schedule PARAMS ((const PTR, const PTR));
static void swap_sort PARAMS ((rtx *, int));
*************** static rtx *ready_lastpos PARAMS ((struc
*** 328,333 ****
--- 371,381 ----
static void ready_sort PARAMS ((struct ready_list *));
static rtx ready_remove_first PARAMS ((struct ready_list *));
+ #if AUTOMATON_PIPELINE_INTERFACE && FIRST_CYCLE_MULTIPASS_SCHEDULING
+ static rtx ready_element PARAMS ((struct ready_list *, int));
+ static rtx ready_remove PARAMS ((struct ready_list *, int));
+ #endif
+
static void queue_to_ready PARAMS ((struct ready_list *));
static void debug_ready_list PARAMS ((struct ready_list *));
*************** static void debug_ready_list PARAMS ((st
*** 335,340 ****
--- 383,398 ----
static rtx move_insn1 PARAMS ((rtx, rtx));
static rtx move_insn PARAMS ((rtx, rtx));
+ #if AUTOMATON_PIPELINE_INTERFACE
+
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING
+ static int max_issue PARAMS ((struct ready_list *, state_t, int *, int *));
+ #endif
+
+ static rtx choose_ready PARAMS ((struct ready_list *));
+
+ #endif
+
#endif /* INSN_SCHEDULING */
/* Point to state used for the current scheduling pass. */
*************** schedule_insns (dump_file)
*** 354,359 ****
--- 412,419 ----
static rtx last_scheduled_insn;
+ #if OLD_PIPELINE_INTERFACE
+
/* Compute the function units used by INSN. This caches the value
returned by function_units_used. A function unit is encoded as the
unit number if the value is non-negative and the compliment of a
*************** potential_hazard (unit, insn, cost)
*** 642,647 ****
--- 702,709 ----
return cost;
}
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
/* Compute cost of executing INSN given the dependence LINK on the insn USED.
This is the number of cycles between instruction issue and
instruction results. */
*************** insn_cost (insn, link, used)
*** 652,715 ****
{
register int cost = INSN_COST (insn);
! if (cost == 0)
{
recog_memoized (insn);
! /* A USE insn, or something else we don't need to understand.
! We can't pass these directly to result_ready_cost because it will
! trigger a fatal error for unrecognizable insns. */
! if (INSN_CODE (insn) < 0)
{
! INSN_COST (insn) = 1;
! return 1;
}
! else
! {
! cost = result_ready_cost (insn);
! if (cost < 1)
! cost = 1;
! INSN_COST (insn) = cost;
}
}
/* In this case estimate cost without caring how insn is used. */
if (link == 0 && used == 0)
return cost;
! /* A USE insn should never require the value used to be computed. This
! allows the computation of a function's result and parameter values to
! overlap the return and call. */
! recog_memoized (used);
! if (INSN_CODE (used) < 0)
! LINK_COST_FREE (link) = 1;
!
! /* If some dependencies vary the cost, compute the adjustment. Most
! commonly, the adjustment is complete: either the cost is ignored
! (in the case of an output- or anti-dependence), or the cost is
! unchanged. These values are cached in the link as LINK_COST_FREE
! and LINK_COST_ZERO. */
! if (LINK_COST_FREE (link))
! cost = 0;
! #ifdef ADJUST_COST
! else if (!LINK_COST_ZERO (link))
{
int ncost = cost;
! ADJUST_COST (used, link, insn, ncost);
! if (ncost < 1)
{
! LINK_COST_FREE (link) = 1;
! ncost = 0;
}
if (cost == ncost)
LINK_COST_ZERO (link) = 1;
cost = ncost;
}
#endif
return cost;
}
--- 714,863 ----
{
register int cost = INSN_COST (insn);
! if ((!USE_AUTOMATON_PIPELINE_INTERFACE && cost == 0)
! || (USE_AUTOMATON_PIPELINE_INTERFACE && cost < 0))
{
recog_memoized (insn);
! #if OLD_PIPELINE_INTERFACE
!
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
{
! /* A USE insn, or something else we don't need to
! understand. We can't pass these directly to
! result_ready_cost because it will trigger a fatal error
! for unrecognizable insns. */
! if (INSN_CODE (insn) < 0)
! {
! INSN_COST (insn) = 1;
! return 1;
! }
! else
! {
! cost = result_ready_cost (insn);
!
! if (cost < 1)
! cost = 1;
!
! INSN_COST (insn) = cost;
! }
}
!
! #endif /* #if OLD_PIPELINE_INTERFACE */
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! /* A USE insn, or something else we don't need to
! understand. We can't pass these directly to
! result_ready_cost or insn_default_latency because it will
! trigger a fatal error for unrecognizable insns. */
! if (INSN_CODE (insn) < 0)
! {
! INSN_COST (insn) = 0;
! return 0;
! }
! else
! {
! cost = insn_default_latency (insn);
!
! if (cost < 0)
! cost = 0;
!
! INSN_COST (insn) = cost;
! }
}
+
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
}
/* In this case estimate cost without caring how insn is used. */
if (link == 0 && used == 0)
return cost;
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE && !LINK_COST_ZERO (link))
{
int ncost = cost;
! /* A USE insn, or something else we don't need to understand.
! We can't pass these directly to bypass_p because it will
! trigger a fatal error for unrecognizable insns. */
! recog_memoized (insn);
!
! if (INSN_CODE (insn) >= 0 && used != 0)
{
! recog_memoized (used);
!
! if (INSN_CODE (used) < 0)
! ncost = 0;
! else if (REG_NOTE_KIND (link) == REG_DEP_ANTI)
! ncost = 0;
! else if (REG_NOTE_KIND (link) == REG_DEP_OUTPUT)
! {
! ncost
! = insn_default_latency (insn) - insn_default_latency (used);
! if (ncost <= 0)
! ncost = 1;
! }
! else if (bypass_p (insn))
! ncost = insn_latency (insn, used);
! else
! ncost = cost;
! #ifdef ADJUST_DFA_DEPENDENCY_COST
! ADJUST_DFA_DEPENDENCY_COST (used, link, insn, ncost);
! #endif
}
+
if (cost == ncost)
LINK_COST_ZERO (link) = 1;
+
cost = ncost;
}
+
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
+ #if OLD_PIPELINE_INTERFACE
+
+ if (!USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
+ /* A USE insn should never require the value used to be
+ computed. This allows the computation of a function's result
+ and parameter values to overlap the return and call. */
+ recog_memoized (used);
+ if (INSN_CODE (used) < 0)
+ LINK_COST_FREE (link) = 1;
+
+ /* If some dependencies vary the cost, compute the adjustment.
+ Most commonly, the adjustment is complete: either the cost is
+ ignored (in the case of an output- or anti-dependence), or
+ the cost is unchanged. These values are cached in the link
+ as LINK_COST_FREE and LINK_COST_ZERO. */
+
+ if (LINK_COST_FREE (link))
+ cost = 0;
+ #ifdef ADJUST_COST
+ else if (!LINK_COST_ZERO (link))
+ {
+ int ncost = cost;
+
+ ADJUST_COST (used, link, insn, ncost);
+ if (ncost < 1)
+ {
+ LINK_COST_FREE (link) = 1;
+ ncost = 0;
+ }
+ if (cost == ncost)
+ LINK_COST_ZERO (link) = 1;
+ cost = ncost;
+ }
#endif
+ }
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
return cost;
}
*************** priority (insn)
*** 725,731 ****
if (! INSN_P (insn))
return 0;
! if ((this_priority = INSN_PRIORITY (insn)) == 0)
{
if (INSN_DEPEND (insn) == 0)
this_priority = insn_cost (insn, 0, 0);
--- 873,881 ----
if (! INSN_P (insn))
return 0;
! this_priority = INSN_PRIORITY (insn);
! if ((this_priority == 0 && !USE_AUTOMATON_PIPELINE_INTERFACE)
! || (this_priority < 0 && USE_AUTOMATON_PIPELINE_INTERFACE))
{
if (INSN_DEPEND (insn) == 0)
this_priority = insn_cost (insn, 0, 0);
*************** ready_remove_first (ready)
*** 931,936 ****
--- 1081,1127 ----
return t;
}
+ #if AUTOMATON_PIPELINE_INTERFACE && FIRST_CYCLE_MULTIPASS_SCHEDULING
+
+ /* Return a pointer to the element INDEX from the ready. INDEX for
+ insn with the highest priority is 0, and the lowest priority has
+ N_READY - 1. */
+
+ HAIFA_INLINE static rtx
+ ready_element (ready, index)
+ struct ready_list *ready;
+ int index;
+ {
+ if (ready->n_ready == 0 || index >= ready->n_ready)
+ abort ();
+ return ready->vec[ready->first - index];
+ }
+
+ /* Remove the element INDEX from the ready list and return it. INDEX
+ for insn with the highest priority is 0, and the lowest priority
+ has N_READY - 1. */
+
+ HAIFA_INLINE static rtx
+ ready_remove (ready, index)
+ struct ready_list *ready;
+ int index;
+ {
+ rtx t;
+ int i;
+
+ if (index == 0)
+ return ready_remove_first (ready);
+ if (ready->n_ready == 0 || index >= ready->n_ready)
+ abort ();
+ t = ready->vec[ready->first - index];
+ ready->n_ready--;
+ for (i = index; i < ready->n_ready; i++)
+ ready [ready->first - i] = ready [ready->first - i - 1];
+ return t;
+ }
+
+ #endif /*#if AUTOMATON_PIPELINE_INTERFACE && FIRST_CYCLE_MULTIPASS_SCHEDULING*/
+
/* Sort the ready list READY by ascending priority, using the SCHED_SORT
macro. */
*************** schedule_insn (insn, ready, clock)
*** 977,1002 ****
int clock;
{
rtx link;
int unit;
! unit = insn_unit (insn);
if (sched_verbose >= 2)
{
! fprintf (sched_dump, ";;\t\t--> scheduling insn <<<%d>>> on unit ",
! INSN_UID (insn));
! insn_print_units (insn);
fprintf (sched_dump, "\n");
}
! if (sched_verbose && unit == -1)
! visualize_no_unit (insn);
- if (MAX_BLOCKAGE > 1 || issue_rate > 1 || sched_verbose)
- schedule_unit (unit, insn, clock);
! if (INSN_DEPEND (insn) == 0)
! return;
for (link = INSN_DEPEND (insn); link != 0; link = XEXP (link, 1))
{
--- 1168,1227 ----
int clock;
{
rtx link;
+ #if OLD_PIPELINE_INTERFACE
int unit;
+ #endif
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! unit = insn_unit (insn);
! #endif
if (sched_verbose >= 2)
{
!
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! fprintf (sched_dump,
! ";;\t\t--> scheduling insn <<<%d>>>:reservation ",
! INSN_UID (insn));
!
! recog_memoized (insn);
!
! if (INSN_CODE (insn) < 0)
! fprintf (sched_dump, "nothing");
! else
! print_reservation (sched_dump, insn);
! }
! #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
!
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! fprintf (sched_dump, ";;\t\t--> scheduling insn <<<%d>>> on unit ",
! INSN_UID (insn));
! insn_print_units (insn);
! }
! #endif /* #if OLD_PIPELINE_INTERFACE */
!
fprintf (sched_dump, "\n");
}
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! if (sched_verbose && unit == -1)
! visualize_no_unit (insn);
! if (MAX_BLOCKAGE > 1 || issue_rate > 1 || sched_verbose)
! schedule_unit (unit, insn, clock);
!
! if (INSN_DEPEND (insn) == 0)
! return;
! }
! #endif /* #if OLD_PIPELINE_INTERFACE */
for (link = INSN_DEPEND (insn); link != 0; link = XEXP (link, 1))
{
*************** schedule_insn (insn, ready, clock)
*** 1033,1048 ****
}
}
! /* Annotate the instruction with issue information -- TImode
! indicates that the instruction is expected not to be able
! to issue on the same cycle as the previous insn. A machine
! may use this information to decide how the instruction should
! be aligned. */
! if (reload_completed && issue_rate > 1)
{
PUT_MODE (insn, clock > last_clock_var ? TImode : VOIDmode);
last_clock_var = clock;
}
}
/* Functions for handling of notes. */
--- 1258,1294 ----
}
}
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE && reload_completed && issue_rate > 1)
{
+ /* Annotate the instruction with issue information -- TImode
+ indicates that the instruction is expected not to be able to
+ issue on the same cycle as the previous insn. A machine may
+ use this information to decide how the instruction should be
+ aligned. */
PUT_MODE (insn, clock > last_clock_var ? TImode : VOIDmode);
last_clock_var = clock;
}
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+ if (USE_AUTOMATON_PIPELINE_INTERFACE
+ && GET_CODE (PATTERN (insn)) != USE
+ && GET_CODE (PATTERN (insn)) != CLOBBER)
+ {
+ /* Annotate insn by using mode if it is issued on new processor
+ cycle. */
+ #ifdef ANNOTATE_INSNS_BY_DELAY_IN_CYCLES
+ PUT_MODE (insn, (clock > last_clock_var
+ ? (reload_completed ? clock - last_clock_var : TImode)
+ : VOIDmode));
+ #else
+ PUT_MODE (insn, (clock > last_clock_var ? TImode : VOIDmode));
+ #endif
+ last_clock_var = clock;
+ }
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
}
/* Functions for handling of notes. */
*************** queue_to_ready (ready)
*** 1466,1472 ****
{
register int stalls;
! for (stalls = 1; stalls < INSN_QUEUE_SIZE; stalls++)
{
if ((link = insn_queue[NEXT_Q_AFTER (q_ptr, stalls)]))
{
--- 1712,1718 ----
{
register int stalls;
! for (stalls = 1; stalls <= MAX_INSN_QUEUE_INDEX; stalls++)
{
if ((link = insn_queue[NEXT_Q_AFTER (q_ptr, stalls)]))
{
*************** queue_to_ready (ready)
*** 1485,1497 ****
}
insn_queue[NEXT_Q_AFTER (q_ptr, stalls)] = 0;
if (ready->n_ready)
break;
}
}
! if (sched_verbose && stalls)
visualize_stall_cycles (stalls);
q_ptr = NEXT_Q_AFTER (q_ptr, stalls);
clock_var += stalls;
}
--- 1731,1764 ----
}
insn_queue[NEXT_Q_AFTER (q_ptr, stalls)] = 0;
+ #if AUTOMATON_PIPELINE_INTERFACE
+
+ /* Advance time on one cycle. */
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
+ #ifdef DFA_SCHEDULER_PRE_CYCLE_INSN
+ state_transition (curr_state, DFA_SCHEDULER_PRE_CYCLE_INSN);
+ #endif
+
+ state_transition (curr_state, NULL);
+
+ #ifdef DFA_SCHEDULER_POST_CYCLE_INSN
+ state_transition (curr_state, DFA_SCHEDULER_POST_CYCLE_INSN);
+ #endif
+ }
+
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
if (ready->n_ready)
break;
}
}
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE && sched_verbose && stalls)
visualize_stall_cycles (stalls);
+ #endif
+
q_ptr = NEXT_Q_AFTER (q_ptr, stalls);
clock_var += stalls;
}
*************** move_insn (insn, last)
*** 1626,1631 ****
--- 1893,2026 ----
return retval;
}
+ /* The following is for safety to support the
+ two pipeline description interfaces. */
+ #if !OLD_PIPELINE_INTERFACE
+ #undef MD_SCHED_INIT
+ #undef MD_SCHED_REORDER
+ #undef MD_SCHED_VARIABLE_ISSUE
+ #endif
+
+ #if !AUTOMATON_PIPELINE_INTERFACE
+ #undef MD_AUTOMATON_SCHED_INIT
+ #undef MD_AUTOMATON_SCHED_REORDER
+ #endif
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING
+
+ /* The following function returns maximal (or close to maximal) number
+ of insns which can be issued on the same cycle and one of which
+ insns is insns with the best rank (the last insn in READY). To
+ make this function tries different samples of ready insns. READY
+ is current queue `ready'. Global array READY_TRY reflects what
+ insns are already issued in this try. STATE is current processor
+ state. If the function returns nonzero, INDEX will contain index
+ of the best insn in READY. *LAST_P is nonzero if the insn with the
+ highest rank is in the current sample. */
+
+ static int
+ max_issue (ready, state, index, last_p)
+ struct ready_list *ready;
+ state_t state;
+ int *index;
+ int *last_p;
+
+ {
+ int i, best, n, temp_index, delay;
+ state_t temp_state;
+ rtx insn;
+ int max_lookahead = FIRST_CYCLE_MULTIPASS_SCHEDULING_LOOKAHEAD;
+
+ if (state_dead_lock_p (state))
+ return 0;
+
+ temp_state = alloca (dfa_state_size);
+ best = 0;
+
+ for (i = 0; i < ready->n_ready; i++)
+ if (!ready_try [i])
+ {
+ insn = ready_element (ready, i);
+
+ if (INSN_CODE (insn) < 0)
+ continue;
+
+ memcpy (temp_state, state, dfa_state_size);
+
+ delay = state_transition (temp_state, insn);
+
+ if (delay == 0)
+ {
+ #ifdef SCHEDULER_BUBBLE
+ int j;
+ rtx bubble;
+
+ for (j = 0; (bubble = SCHEDULER_BUBBLE (j)) != NULL_RTX; j++)
+ if (state_transition (temp_state, bubble) < 0
+ && state_transition (temp_state, insn) < 0)
+ break;
+
+ if (bubble == NULL_RTX)
+ #endif
+ continue;
+ }
+ else if (delay > 0)
+ continue;
+
+ --max_lookahead;
+
+ if (max_lookahead < 0)
+ break;
+
+ ready_try [i] = 1;
+ *last_p = 0;
+
+ n = max_issue (ready, temp_state, &temp_index, last_p) + 1;
+
+ if (best < n && (ready_try [0] || *last_p))
+ {
+ best = n;
+ *index = i;
+ *last_p = 1;
+ }
+ ready_try [i] = 0;
+ }
+
+ return best;
+ }
+
+ #endif /* #if FIRST_CYCLE_MULTIPASS_SCHEDULING */
+
+ /* The following function chooses insn from READY and modifies *N_READY
+ and READY. */
+
+ static rtx
+ choose_ready (ready)
+ struct ready_list *ready;
+ {
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING
+ if (FIRST_CYCLE_MULTIPASS_SCHEDULING_LOOKAHEAD <= 0)
+ #endif
+ return ready_remove_first (ready);
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING
+ else
+ {
+ /* Try to choose the better insn. */
+ int index;
+ int last_p = 0;
+
+ if (max_issue (ready, curr_state, &index, &last_p) == 0)
+ return ready_remove_first (ready);
+ else
+ return ready_remove (ready, index);
+ }
+ #endif /* #if FIRST_CYCLE_MULTIPASS_SCHEDULING */
+ }
+
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
/* Use forward list scheduling to rearrange insns of block B in region RGN,
possibly bringing insns from subsequent blocks in the same region. */
*************** schedule_block (b, rgn_n_insns)
*** 1636,1643 ****
--- 2031,2047 ----
{
rtx last;
struct ready_list ready;
+ #if OLD_PIPELINE_INTERFACE
int can_issue_more;
+ #endif
+ #if AUTOMATON_PIPELINE_INTERFACE
+ int first_cycle_insn_p;
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING && defined (SCHEDULER_BUBBLE)
+ state_t temp_state = alloca (dfa_state_size);
+ #endif
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
/* Head/tail info for this block. */
rtx prev_head = current_sched_info->prev_head;
rtx next_tail = current_sched_info->next_tail;
*************** schedule_block (b, rgn_n_insns)
*** 1669,1675 ****
init_block_visualization ();
}
! clear_units ();
/* Allocate the ready list. */
ready.veclen = rgn_n_insns + 1 + ISSUE_RATE;
--- 2073,2087 ----
init_block_visualization ();
}
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! state_reset (curr_state);
! #endif
!
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! clear_units ();
! #endif
/* Allocate the ready list. */
ready.veclen = rgn_n_insns + 1 + ISSUE_RATE;
*************** schedule_block (b, rgn_n_insns)
*** 1677,1688 ****
ready.vec = (rtx *) xmalloc (ready.veclen * sizeof (rtx));
ready.n_ready = 0;
(*current_sched_info->init_ready_list) (&ready);
! #ifdef MD_SCHED_INIT
! MD_SCHED_INIT (sched_dump, sched_verbose, ready.veclen);
#endif
/* No insns scheduled in this block yet. */
last_scheduled_insn = 0;
--- 2089,2114 ----
ready.vec = (rtx *) xmalloc (ready.veclen * sizeof (rtx));
ready.n_ready = 0;
+ #if AUTOMATON_PIPELINE_INTERFACE && FIRST_CYCLE_MULTIPASS_SCHEDULING
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
+ ready_try = (char *) xmalloc ((rgn_n_insns + 1) * sizeof (char));
+ memset (ready_try, 0, (rgn_n_insns + 1) * sizeof (char));
+ }
+ #endif
+
(*current_sched_info->init_ready_list) (&ready);
! #if AUTOMATON_PIPELINE_INTERFACE && defined (MD_AUTOMATON_SCHED_INIT)
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! MD_AUTOMATON_SCHED_INIT (sched_dump, sched_verbose);
#endif
+ #if OLD_PIPELINE_INTERFACE && defined (MD_SCHED_INIT)
+ if (!USE_AUTOMATON_PIPELINE_INTERFACE)
+ MD_SCHED_INIT (sched_dump, sched_verbose, ready.veclen);
+ #endif
+
/* No insns scheduled in this block yet. */
last_scheduled_insn = 0;
*************** schedule_block (b, rgn_n_insns)
*** 1690,1697 ****
queue. */
q_ptr = 0;
q_size = 0;
! last_clock_var = 0;
! memset ((char *) insn_queue, 0, sizeof (insn_queue));
/* Start just before the beginning of time. */
clock_var = -1;
--- 2116,2130 ----
queue. */
q_ptr = 0;
q_size = 0;
!
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! max_insn_queue_index = INSN_QUEUE_SIZE - 1;
! #endif
!
! insn_queue = (rtx *) alloca ((MAX_INSN_QUEUE_INDEX + 1) * sizeof (rtx));
! memset ((char *) insn_queue, 0, (MAX_INSN_QUEUE_INDEX + 1) * sizeof (rtx));
! last_clock_var = (USE_AUTOMATON_PIPELINE_INTERFACE ? -1 : 0);
/* Start just before the beginning of time. */
clock_var = -1;
*************** schedule_block (b, rgn_n_insns)
*** 1704,1709 ****
--- 2137,2158 ----
{
clock_var++;
+ #if AUTOMATON_PIPELINE_INTERFACE
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
+ #ifdef DFA_SCHEDULER_PRE_CYCLE_INSN
+ state_transition (curr_state, DFA_SCHEDULER_PRE_CYCLE_INSN);
+ #endif
+
+ /* Advance time on one cycle. */
+ state_transition (curr_state, NULL);
+
+ #ifdef DFA_SCHEDULER_POST_CYCLE_INSN
+ state_transition (curr_state, DFA_SCHEDULER_POST_CYCLE_INSN);
+ #endif
+ }
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
/* Add to the ready list all pending insns that can be issued now.
If there are no ready insns, increment clock until one
is ready and add all pending insns at that point to the ready
*************** schedule_block (b, rgn_n_insns)
*** 1724,1755 ****
debug_ready_list (&ready);
}
! /* Sort the ready list based on priority. */
! ready_sort (&ready);
/* Allow the target to reorder the list, typically for
better instruction bundling. */
#ifdef MD_SCHED_REORDER
! MD_SCHED_REORDER (sched_dump, sched_verbose, ready_lastpos (&ready),
! ready.n_ready, clock_var, can_issue_more);
#else
! can_issue_more = issue_rate;
#endif
! if (sched_verbose)
! {
! fprintf (sched_dump, "\n;;\tReady list (t =%3d): ", clock_var);
! debug_ready_list (&ready);
}
! /* Issue insns from ready list. */
! while (ready.n_ready != 0
! && can_issue_more
! && (*current_sched_info->schedule_more_p) ())
! {
! /* Select and remove the insn from the ready list. */
! rtx insn = ready_remove_first (&ready);
! int cost = actual_hazard (insn_unit (insn), insn, clock_var, 0);
if (cost >= 1)
{
--- 2173,2343 ----
debug_ready_list (&ready);
}
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! /* Sort the ready list based on priority. */
! ready_sort (&ready);
! #endif
/* Allow the target to reorder the list, typically for
better instruction bundling. */
+ #if AUTOMATON_PIPELINE_INTERFACE && defined (MD_AUTOMATON_SCHED_REORDER)
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ MD_AUTOMATON_SCHED_REORDER (sched_dump, sched_verbose, ready,
+ ready.n_ready, clock_var);
+ #endif
+
+ #if OLD_PIPELINE_INTERFACE
+
+ if (!USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
#ifdef MD_SCHED_REORDER
! MD_SCHED_REORDER (sched_dump, sched_verbose, ready_lastpos (&ready),
! ready.n_ready, clock_var, can_issue_more);
#else
! can_issue_more = issue_rate;
#endif
! if (sched_verbose)
! {
! fprintf (sched_dump, "\n;;\tReady list (t =%3d): ", clock_var);
! debug_ready_list (&ready);
! }
}
+
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+ first_cycle_insn_p = 1;
+ #endif
+ for (;;)
+ {
+ rtx insn;
+ int cost;
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
+ if (ready.n_ready == 0 || state_dead_lock_p (curr_state)
+ || !(*current_sched_info->schedule_more_p) ())
+ break;
+
+ /* Sort the ready list based on priority. We make it
+ here because there are some processors which permits
+ to issue some depended insns even on the same
+ cycle. */
+ ready_sort (&ready);
+
+ if (sched_verbose)
+ {
+ fprintf (sched_dump, "\n;;\tReady list (t =%3d): ",
+ clock_var);
+ debug_ready_list (&ready);
+ }
+
+ /* Select and remove the insn from the ready list. */
+ insn = choose_ready (&ready);
+
+ recog_memoized (insn);
+
+ if (INSN_CODE (insn) < 0)
+ {
+ if (!first_cycle_insn_p
+ && (GET_CODE (PATTERN (insn)) == ASM_INPUT
+ || asm_noperands (PATTERN (insn)) >= 0))
+ /* This is asm insn which is tryed to be issued on the
+ cycle not first. Issue it on the next cycle. */
+ cost = 1;
+ else
+ /* A USE insn, or something else we don't need to
+ understand. We can't pass these directly to
+ state_transition because it will trigger a
+ fatal error for unrecognizable insns. */
+ cost = 0;
+ }
+ else
+ {
+ cost = state_transition (curr_state, insn);
+
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING && defined (SCHEDULER_BUBBLE)
+
+ if (cost == 0)
+ {
+ int j;
+ rtx bubble;
+
+ for (j = 0;
+ (bubble = SCHEDULER_BUBBLE (j)) != NULL_RTX;
+ j++)
+ {
+ memcpy (temp_state, curr_state, dfa_state_size);
+
+ if (state_transition (temp_state, bubble) < 0
+ && state_transition (temp_state, insn) < 0)
+ break;
+ }
+
+ if (bubble != NULL_RTX)
+ {
+ memcpy (curr_state, temp_state, dfa_state_size);
+
+ if (insert_schedule_bubbles_p)
+ {
+ rtx copy;
+
+ copy = copy_rtx (PATTERN (bubble));
+ emit_insn_after (copy, last);
+ last = NEXT_INSN (last);
+ INSN_CODE (last) = INSN_CODE (bubble);
+
+ /* Annotate the same for the first insns
+ scheduling by using mode. */
+ PUT_MODE (last, (clock_var > last_clock_var
+ ? clock_var - last_clock_var
+ : VOIDmode));
+ last_clock_var = clock_var;
+
+ if (sched_verbose >= 2)
+ {
+ fprintf (sched_dump,
+ ";;\t\t--> scheduling bubble insn <<<%d>>>:reservation ",
+ INSN_UID (last));
+
+ recog_memoized (last);
+
+ if (INSN_CODE (last) < 0)
+ fprintf (sched_dump, "nothing");
+ else
+ print_reservation (sched_dump, last);
+
+ fprintf (sched_dump, "\n");
+ }
+ }
+ cost = -1;
+ }
+ }
+ #endif /* #if FIRST_CYCLE_MULTIPASS_SCHEDULING && defined (SCHEDULER_BUBBLE) */
+
+ if (cost < 0)
+ cost = 0;
+ else if (cost == 0)
+ cost = 1;
+ }
+ }
+
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! if (ready.n_ready == 0 || !can_issue_more
! || !(*current_sched_info->schedule_more_p) ())
! break;
! insn = ready_remove_first (&ready);
! cost = actual_hazard (insn_unit (insn), insn, clock_var, 0);
! }
! #endif /* #if OLD_PIPELINE_INTERFACE */
if (cost >= 1)
{
*************** schedule_block (b, rgn_n_insns)
*** 1763,1792 ****
last_scheduled_insn = insn;
last = move_insn (insn, last);
#ifdef MD_SCHED_VARIABLE_ISSUE
! MD_SCHED_VARIABLE_ISSUE (sched_dump, sched_verbose, insn,
! can_issue_more);
#else
! can_issue_more--;
#endif
schedule_insn (insn, &ready, clock_var);
next:
! ;
! #ifdef MD_SCHED_REORDER2
! /* Sort the ready list based on priority. */
! if (ready.n_ready > 0)
! ready_sort (&ready);
! MD_SCHED_REORDER2 (sched_dump, sched_verbose,
! ready.n_ready ? ready_lastpos (&ready) : NULL,
! ready.n_ready, clock_var, can_issue_more);
! #endif
}
! /* Debug info. */
! if (sched_verbose)
visualize_scheduled_insns (clock_var);
}
#ifdef MD_SCHED_FINISH
--- 2351,2393 ----
last_scheduled_insn = insn;
last = move_insn (insn, last);
+ #if OLD_PIPELINE_INTERFACE
+ if (!USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
#ifdef MD_SCHED_VARIABLE_ISSUE
! MD_SCHED_VARIABLE_ISSUE (sched_dump, sched_verbose, insn,
! can_issue_more);
#else
! can_issue_more--;
#endif
+ }
+ #endif /* #if OLD_PIPELINE_INTERFACE */
schedule_insn (insn, &ready, clock_var);
next:
! #if AUTOMATON_PIPELINE_INTERFACE
! first_cycle_insn_p = 0;
! #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
!
! #if OLD_PIPELINE_INTERFACE && defined (MD_SCHED_REORDER2)
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! /* Sort the ready list based on priority. */
! if (ready.n_ready > 0)
! ready_sort (&ready);
! MD_SCHED_REORDER2 (sched_dump, sched_verbose,
! ready.n_ready ? ready_lastpos (&ready) : NULL,
! ready.n_ready, clock_var, can_issue_more);
! }
! #endif /* #if OLD_PIPELINE_INTERFACE && defined (MD_SCHED_REORDER2) */
}
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE && sched_verbose)
! /* Debug info. */
visualize_scheduled_insns (clock_var);
+ #endif
}
#ifdef MD_SCHED_FINISH
*************** schedule_block (b, rgn_n_insns)
*** 1798,1804 ****
{
fprintf (sched_dump, ";;\tReady list (final): ");
debug_ready_list (&ready);
! print_block_visualization ("");
}
/* Sanity check -- queue must be empty now. Meaningless if region has
--- 2399,2408 ----
{
fprintf (sched_dump, ";;\tReady list (final): ");
debug_ready_list (&ready);
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! print_block_visualization ("");
! #endif /* #if OLD_PIPELINE_INTERFACE */
}
/* Sanity check -- queue must be empty now. Meaningless if region has
*************** schedule_block (b, rgn_n_insns)
*** 1843,1848 ****
--- 2447,2457 ----
current_sched_info->tail = tail;
free (ready.vec);
+
+ #if AUTOMATON_PIPELINE_INTERFACE && FIRST_CYCLE_MULTIPASS_SCHEDULING
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ free (ready_try);
+ #endif
}
/* Set_priorities: compute priority of each insn in the block. */
*************** sched_init (dump_file)
*** 1884,1889 ****
--- 2493,2501 ----
{
int luid, b;
rtx insn;
+ #if AUTOMATON_PIPELINE_INTERFACE
+ int i;
+ #endif
/* Disable speculative loads in their presence if cc0 defined. */
#ifdef HAVE_cc0
*************** sched_init (dump_file)
*** 1899,1906 ****
--- 2511,2520 ----
sched_dump = ((sched_verbose_param >= 10 || !dump_file)
? stderr : dump_file);
+ #if OLD_PIPELINE_INTERFACE
/* Initialize issue_rate. */
issue_rate = ISSUE_RATE;
+ #endif
split_all_insns (1);
*************** sched_init (dump_file)
*** 1910,1915 ****
--- 2524,2553 ----
h_i_d = (struct haifa_insn_data *) xcalloc (old_max_uid, sizeof (*h_i_d));
+ #if AUTOMATON_PIPELINE_INTERFACE
+
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ for (i = 0; i < old_max_uid; i++)
+ h_i_d [i].priority = h_i_d [i].cost = -1;
+
+ #ifdef INIT_DFA_SCHEDULER_PRE_CYCLE_INSN
+ INIT_DFA_SCHEDULER_PRE_CYCLE_INSN ();
+ #endif
+
+ #ifdef INIT_DFA_SCHEDULER_POST_CYCLE_INSN
+ INIT_DFA_SCHEDULER_POST_CYCLE_INSN ();
+ #endif
+
+ #if FIRST_CYCLE_MULTIPASS_SCHEDULING && defined (INIT_SCHEDULER_BUBBLES)
+ INIT_SCHEDULER_BUBBLES ();
+ #endif
+
+ dfa_start ();
+ dfa_state_size = state_size ();
+ curr_state = xmalloc (dfa_state_size);
+
+ #endif /* if AUTOMATON_PIPELINE_INTERFACE */
+
h_i_d[0].luid = 0;
luid = 1;
for (b = 0; b < n_basic_blocks; b++)
*************** sched_init (dump_file)
*** 1967,1975 ****
}
}
! /* Find units used in this fuction, for visualization. */
! if (sched_verbose)
init_target_units ();
/* ??? Add a NOTE after the last insn of the last basic block. It is not
known why this is done. */
--- 2605,2615 ----
}
}
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE && sched_verbose)
! /* Find units used in this fuction, for visualization. */
init_target_units ();
+ #endif /* if OLD_PIPELINE_INTERFACE */
/* ??? Add a NOTE after the last insn of the last basic block. It is not
known why this is done. */
*************** void
*** 1994,1999 ****
--- 2634,2645 ----
sched_finish ()
{
free (h_i_d);
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+ free (curr_state);
+ dfa_finish ();
+ #endif /* if AUTOMATON_PIPELINE_INTERFACE */
+
free_dependency_caches ();
end_alias_analysis ();
if (write_symbols != NO_DEBUG)
Index: sched-rgn.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/sched-rgn.c,v
retrieving revision 1.11
diff -c -p -r1.11 sched-rgn.c
*** sched-rgn.c 2001/01/19 18:28:58 1.11
--- sched-rgn.c 2001/01/31 18:47:22
*************** static void compute_block_backward_depen
*** 303,308 ****
--- 303,313 ----
void debug_dependencies PARAMS ((void));
static void init_regions PARAMS ((void));
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+ static void remove_new_cpu_cycle_marks PARAMS ((int));
+ #endif
+
static void schedule_region PARAMS ((int));
static void propagate_deps PARAMS ((int, struct deps *));
static void free_pending_lists PARAMS ((void));
*************** init_ready_list (ready)
*** 2137,2150 ****
for (insn = src_head; insn != src_next_tail; insn = NEXT_INSN (insn))
{
if (! INSN_P (insn))
continue;
! if (!CANT_MOVE (insn)
&& (!IS_SPECULATIVE_INSN (insn)
|| (insn_issue_delay (insn) <= 3
&& check_live (insn, bb_src)
&& is_exception_free (insn, bb_src, target_bb))))
{
rtx next;
--- 2142,2175 ----
for (insn = src_head; insn != src_next_tail; insn = NEXT_INSN (insn))
{
+ int move_p = 0;
+
if (! INSN_P (insn))
continue;
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! recog_memoized (insn);
!
! if (USE_AUTOMATON_PIPELINE_INTERFACE && !CANT_MOVE (insn)
&& (!IS_SPECULATIVE_INSN (insn)
+ || (INSN_CODE (insn) >= 0
+ && min_insn_conflict_delay (curr_state, insn, insn) <= 3
+ && check_live (insn, bb_src)
+ && is_exception_free (insn, bb_src, target_bb))))
+ move_p = 0;
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
+ #if OLD_PIPELINE_INTERFACE
+ if (!USE_AUTOMATON_PIPELINE_INTERFACE && !CANT_MOVE (insn)
+ && (!IS_SPECULATIVE_INSN (insn)
|| (insn_issue_delay (insn) <= 3
&& check_live (insn, bb_src)
&& is_exception_free (insn, bb_src, target_bb))))
+ move_p = 0;
+ #endif
+
+ if (move_p)
{
rtx next;
*************** new_ready (next)
*** 2246,2252 ****
{
/* For speculative insns, before inserting to ready/queue,
check live, exception-free, and issue-delay. */
! if (INSN_BB (next) != target_bb
&& (!IS_VALID (INSN_BB (next))
|| CANT_MOVE (next)
|| (IS_SPECULATIVE_INSN (next)
--- 2271,2296 ----
{
/* For speculative insns, before inserting to ready/queue,
check live, exception-free, and issue-delay. */
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! recog_memoized (next);
!
! if (INSN_BB (next) != target_bb
! && (!IS_VALID (INSN_BB (next))
! || CANT_MOVE (next)
! || (IS_SPECULATIVE_INSN (next)
! && (INSN_CODE (next) < 0
! || min_insn_conflict_delay (curr_state, next, next) > 3
! || !check_live (next, INSN_BB (next))
! || !is_exception_free (next, INSN_BB (next),
! target_bb)))))
! return 0;
! }
! #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
!
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE && INSN_BB (next) != target_bb
&& (!IS_VALID (INSN_BB (next))
|| CANT_MOVE (next)
|| (IS_SPECULATIVE_INSN (next)
*************** new_ready (next)
*** 2254,2259 ****
--- 2298,2305 ----
|| !check_live (next, INSN_BB (next))
|| !is_exception_free (next, INSN_BB (next), target_bb)))))
return 0;
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
return 1;
}
*************** debug_dependencies ()
*** 2642,2655 ****
fprintf (sched_dump, "\n;; --- Region Dependences --- b %d bb %d \n",
BB_TO_BLOCK (bb), bb);
! fprintf (sched_dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
! "insn", "code", "bb", "dep", "prio", "cost", "blockage", "units");
! fprintf (sched_dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
! "----", "----", "--", "---", "----", "----", "--------", "-----");
for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
{
rtx link;
int unit, range;
if (! INSN_P (insn))
{
--- 2688,2721 ----
fprintf (sched_dump, "\n;; --- Region Dependences --- b %d bb %d \n",
BB_TO_BLOCK (bb), bb);
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! fprintf (sched_dump, ";; %7s%6s%6s%6s%6s%6s%14s\n",
! "insn", "code", "bb", "dep", "prio", "cost",
! "reservation");
! fprintf (sched_dump, ";; %7s%6s%6s%6s%6s%6s%14s\n",
! "----", "----", "--", "---", "----", "----",
! "-----------");
! }
! #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
!
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! fprintf (sched_dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
! "insn", "code", "bb", "dep", "prio", "cost", "blockage", "units");
! fprintf (sched_dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
! "----", "----", "--", "---", "----", "----", "--------", "-----");
! }
! #endif /* #if OLD_PIPELINE_INTERFACE */
!
for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
{
rtx link;
+ #if OLD_PIPELINE_INTERFACE
int unit, range;
+ #endif
if (! INSN_P (insn))
{
*************** debug_dependencies ()
*** 2669,2690 ****
continue;
}
! unit = insn_unit (insn);
! range = (unit < 0
! || function_units[unit].blockage_range_function == 0) ? 0 :
! function_units[unit].blockage_range_function (insn);
! fprintf (sched_dump,
! ";; %s%5d%6d%6d%6d%6d%6d %3d -%3d ",
! (SCHED_GROUP_P (insn) ? "+" : " "),
! INSN_UID (insn),
! INSN_CODE (insn),
! INSN_BB (insn),
! INSN_DEP_COUNT (insn),
! INSN_PRIORITY (insn),
! insn_cost (insn, 0, 0),
! (int) MIN_BLOCKAGE_COST (range),
! (int) MAX_BLOCKAGE_COST (range));
! insn_print_units (insn);
fprintf (sched_dump, "\t: ");
for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
fprintf (sched_dump, "%d ", INSN_UID (XEXP (link, 0)));
--- 2735,2785 ----
continue;
}
! #if AUTOMATON_PIPELINE_INTERFACE
! if (USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! fprintf (sched_dump,
! ";; %s%5d%6d%6d%6d%6d%6d ",
! (SCHED_GROUP_P (insn) ? "+" : " "),
! INSN_UID (insn),
! INSN_CODE (insn),
! INSN_BB (insn),
! INSN_DEP_COUNT (insn),
! INSN_PRIORITY (insn),
! insn_cost (insn, 0, 0));
!
! recog_memoized (insn);
! if (INSN_CODE (insn) < 0)
! fprintf (sched_dump, "nothing");
! else
! print_reservation (sched_dump, insn);
! }
! #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
!
! #if OLD_PIPELINE_INTERFACE
! if (!USE_AUTOMATON_PIPELINE_INTERFACE)
! {
! unit = insn_unit (insn);
! range
! = (unit < 0
! || function_units[unit].blockage_range_function == 0
! ? 0
! : function_units[unit].blockage_range_function (insn));
! fprintf (sched_dump,
! ";; %s%5d%6d%6d%6d%6d%6d %3d -%3d ",
! (SCHED_GROUP_P (insn) ? "+" : " "),
! INSN_UID (insn),
! INSN_CODE (insn),
! INSN_BB (insn),
! INSN_DEP_COUNT (insn),
! INSN_PRIORITY (insn),
! insn_cost (insn, 0, 0),
! (int) MIN_BLOCKAGE_COST (range),
! (int) MAX_BLOCKAGE_COST (range));
! insn_print_units (insn);
! }
! #endif /* #if OLD_PIPELINE_INTERFACE */
!
fprintf (sched_dump, "\t: ");
for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
fprintf (sched_dump, "%d ", INSN_UID (XEXP (link, 0)));
*************** debug_dependencies ()
*** 2695,2700 ****
--- 2790,2821 ----
fprintf (sched_dump, "\n");
}
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+
+ /* The function removes marks about start of new cycle made in the first
+ instruction scheduling. Although regmove may remove them too. */
+
+ static void
+ remove_new_cpu_cycle_marks (bb)
+ int bb;
+ {
+ rtx next_tail;
+ rtx tail;
+ rtx head;
+ rtx insn;
+
+ get_block_head_tail (BB_TO_BLOCK (bb), &head, &tail);
+
+ next_tail = NEXT_INSN (tail);
+
+ for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
+ if (INSN_P (insn) && GET_MODE (insn) == TImode)
+ PUT_MODE (insn, VOIDmode);
+ }
+
+ #endif /* #if AUTOMATON_PIPELINE_INTERFACE */
+
/* Schedule a region. A region is either an inner loop, a loop-free
subroutine, or a single basic block. Each bb in the region is
scheduled after its flow predecessors. */
*************** schedule_region (rgn)
*** 2712,2717 ****
--- 2833,2844 ----
current_blocks = RGN_BLOCKS (rgn);
init_deps_global ();
+
+ #if AUTOMATON_PIPELINE_INTERFACE
+ if (reload_completed && USE_AUTOMATON_PIPELINE_INTERFACE)
+ for (bb = 0; bb < current_nr_blocks; bb++)
+ remove_new_cpu_cycle_marks (bb);
+ #endif
/* Initializations for region data dependence analyisis. */
bb_deps = (struct deps *) xmalloc (sizeof (struct deps) * current_nr_blocks);
Index: sched-vis.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/sched-vis.c,v
retrieving revision 1.5
diff -c -p -r1.5 sched-vis.c
*** sched-vis.c 2000/12/22 12:27:36 1.5
--- sched-vis.c 2001/01/31 18:47:22
*************** the Free Software Foundation, 59 Temple
*** 33,38 ****
--- 33,40 ----
#include "sched-int.h"
#ifdef INSN_SCHEDULING
+
+ #if OLD_PIPELINE_INTERFACE
/* target_units bitmask has 1 for each unit in the cpu. It should be
possible to compute this variable from the machine description.
But currently it is computed by examining the insn list. Since
*************** the Free Software Foundation, 59 Temple
*** 41,46 ****
--- 43,49 ----
definition of function_units[] in "insn-attrtab.c".) */
static int target_units = 0;
+ #endif /* #if OLD_PIPELINE_INTERFACE */
static char *safe_concat PARAMS ((char *, char *, const char *));
static int get_visual_tbl_length PARAMS ((void));
*************** static void print_value PARAMS ((char *,
*** 49,54 ****
--- 52,59 ----
static void print_pattern PARAMS ((char *, rtx, int));
static void print_insn PARAMS ((char *, rtx, int));
+ #if OLD_PIPELINE_INTERFACE
+
/* Print names of units on which insn can/should execute, for debugging. */
void
*************** insn_print_units (insn)
*** 76,81 ****
--- 81,88 ----
}
}
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
/* MAX_VISUAL_LINES is the maximum number of lines in visualization table
of a basic block. If more lines are needed, table is splitted to two.
n_visual_lines is the number of lines printed so far for a block.
*************** char *visual_tbl;
*** 89,94 ****
--- 96,103 ----
int n_vis_no_unit;
rtx vis_no_unit[10];
+ #if OLD_PIPELINE_INTERFACE
+
/* Finds units that are in use in this fuction. Required only
for visualization. */
*************** init_target_units ()
*** 112,126 ****
--- 121,147 ----
}
}
+ #endif /* #if OLD_PIPELINE_INTERFACE */
+
/* Return the length of the visualization table. */
static int
get_visual_tbl_length ()
{
+ #if OLD_PIPELINE_INTERFACE
int unit, i;
int n, n1;
char *s;
+ #endif
+ if (USE_AUTOMATON_PIPELINE_INTERFACE)
+ {
+ visual_tbl_line_length = 1;
+ return 1; /* Can't return 0 because that will cause problems
+ with alloca. */
+ }
+
+ #if OLD_PIPELINE_INTERFACE
/* Compute length of one field in line. */
s = (char *) alloca (INSN_LEN + 6);
sprintf (s, " %33s", "uname");
*************** get_visual_tbl_length ()
*** 140,145 ****
--- 161,167 ----
/* Compute length of visualization string. */
return (MAX_VISUAL_LINES * n);
+ #endif /* #if OLD_PIPELINE_INTERFACE */
}
/* Init block visualization debugging info. */
*************** print_insn (buf, x, verbose)
*** 808,813 ****
--- 830,837 ----
}
} /* print_insn */
+ #if OLD_PIPELINE_INTERFACE
+
/* Print visualization debugging info. */
void
*************** visualize_stall_cycles (stalls)
*** 930,935 ****
--- 954,961 ----
strcpy (p, suffix);
}
+
+ #endif /* #if OLD_PIPELINE_INTERFACE */
/* Allocate data used for visualization during scheduling. */
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/egcs/gcc/Makefile.in,v
retrieving revision 1.593
diff -c -p -r1.593 Makefile.in
*** Makefile.in 2001/01/29 01:48:06 1.593
--- Makefile.in 2001/01/31 18:47:22
*************** CLIB=
*** 364,369 ****
--- 364,373 ----
# system library.
OBSTACK=obstack.o
+ # The following object files is used by genattrtab.
+ GETRUNTIME = getruntime.o
+ HASHTAB = hashtab.o
+
# The GC method to be used on this system.
GGC=@GGC@.o
*************** HOST_MALLOC=$(MALLOC)
*** 519,524 ****
--- 523,530 ----
HOST_OBSTACK=$(OBSTACK)
HOST_VFPRINTF=$(VFPRINTF)
HOST_DOPRINT=$(DOPRINT)
+ HOST_GETRUNTIME=$(GETRUNTIME)
+ HOST_HASHTAB=$(HASHTAB)
HOST_STRSTR=$(STRSTR)
# Actual name to use when installing a native compiler.
*************** USE_HOST_MALLOC= ` case "${HOST_MALLOC}"
*** 616,621 ****
--- 622,629 ----
USE_HOST_OBSTACK= ` case "${HOST_OBSTACK}" in ?*) echo ${HOST_PREFIX}${HOST_OBSTACK} ;; esac `
USE_HOST_VFPRINTF= ` case "${HOST_VFPRINTF}" in ?*) echo ${HOST_PREFIX}${HOST_VFPRINTF} ;; esac `
USE_HOST_DOPRINT= ` case "${HOST_DOPRINT}" in ?*) echo ${HOST_PREFIX}${HOST_DOPRINT} ;; esac `
+ USE_HOST_GETRUNTIME= ` case "${HOST_GETRUNTIME}" in ?*) echo ${HOST_PREFIX}${HOST_GETRUNTIME} ;; esac `
+ USE_HOST_HASHTAB= ` case "${HOST_HASHTAB}" in ?*) echo ${HOST_PREFIX}${HOST_HASHTAB} ;; esac `
USE_HOST_STRSTR= ` case "${HOST_STRSTR}" in ?*) echo ${HOST_PREFIX}${HOST_STRSTR} ;; esac `
# Dependency on obstack, alloca, malloc or whatever library facilities
*************** HOST_RTL = $(HOST_PREFIX)rtl.o $(HOST_PR
*** 643,648 ****
--- 651,657 ----
HOST_PRINT = $(HOST_PREFIX)print-rtl.o
HOST_ERRORS = $(HOST_PREFIX)errors.o
+ HOST_VARRAY = $(HOST_PREFIX)varray.o
# Specify the directories to be searched for header files.
# Both . and srcdir are used, in that order,
*************** obstack.o: $(srcdir)/../libiberty/obstac
*** 1321,1326 ****
--- 1330,1340 ----
$(CC) -c $(ALL_CFLAGS) -DGENERATOR_FILE $(ALL_CPPFLAGS) $(INCLUDES) \
obstack.c $(OUTPUT_OPTION)
+ getruntime.o: $(srcdir)/../libiberty/getruntime.c $(CONFIG_H)
+ rm -f getruntime.c
+ $(LN_S) $(srcdir)/../libiberty/getruntime.c getruntime.c
+ $(CC) -c $(ALL_CFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) getruntime.c
+
prefix.o: prefix.c $(CONFIG_H) system.h Makefile prefix.h
$(CC) $(ALL_CFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-DPREFIX=\"$(prefix)\" \
*************** genattr$(build_exeext) : genattr.o $(HOS
*** 1783,1794 ****
genattr.o : genattr.c $(RTL_H) $(build_xm_file) system.h errors.h gensupport.h
$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattr.c
! genattrtab$(build_exeext) : genattrtab.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBDEPS)
$(HOST_CC) $(HOST_CFLAGS) $(HOST_LDFLAGS) -o $@ \
! genattrtab.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBS)
genattrtab.o : genattrtab.c $(RTL_H) $(OBSTACK_H) $(build_xm_file) \
! system.h errors.h $(GGC_H) gensupport.h
$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattrtab.c
genoutput$(build_exeext) : genoutput.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBDEPS)
--- 1797,1808 ----
genattr.o : genattr.c $(RTL_H) $(build_xm_file) system.h errors.h gensupport.h
$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattr.c
! genattrtab$(build_exeext) : genattrtab.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_VARRAY) $(HOST_PREFIX)$(HOST_GETRUNTIME) $(HOST_LIBDEPS)
$(HOST_CC) $(HOST_CFLAGS) $(HOST_LDFLAGS) -o $@ \
! genattrtab.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_VARRAY) $(USE_HOST_GETRUNTIME) $(HOST_LIBS) -lm
genattrtab.o : genattrtab.c $(RTL_H) $(OBSTACK_H) $(build_xm_file) \
! system.h errors.h $(GGC_H) gensupport.h $(srcdir)/genautomata.c
$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(srcdir)/genattrtab.c
genoutput$(build_exeext) : genoutput.o $(HOST_RTL) $(HOST_PRINT) $(HOST_ERRORS) $(HOST_LIBDEPS)
*************** $(HOST_PREFIX_1)obstack.o: $(srcdir)/../
*** 1842,1847 ****
--- 1856,1871 ----
rm -f $(HOST_PREFIX)obstack.c
sed -e 's/config[.]h/hconfig.h/' $(srcdir)/../libiberty/obstack.c > $(HOST_PREFIX)obstack.c
$(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(HOST_PREFIX)obstack.c
+
+ $(HOST_PREFIX_1)getruntime.o: $(srcdir)/../libiberty/getruntime.c
+ rm -f $(HOST_PREFIX)getruntime.c
+ sed -e 's/config[.]h/hconfig.h/' $(srcdir)/../libiberty/getruntime.c > $(HOST_PREFIX)getruntime.c
+ $(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(HOST_PREFIX)getruntime.c
+
+ $(HOST_PREFIX_1)hashtab.o: $(srcdir)/../libiberty/hashtab.c
+ rm -f $(HOST_PREFIX)hashtab.c
+ sed -e 's/config[.]h/hconfig.h/' $(srcdir)/../libiberty/hashtab.c > $(HOST_PREFIX)hashtab.c
+ $(HOST_CC) -c $(HOST_CFLAGS) $(HOST_CPPFLAGS) $(INCLUDES) $(HOST_PREFIX)hashtab.c
$(HOST_PREFIX_1)vfprintf.o: $(srcdir)/../libiberty/vfprintf.c
rm -f $(HOST_PREFIX)vfprintf.c
Index: md.texi
===================================================================
RCS file: /cvs/gcc/egcs/gcc/md.texi,v
retrieving revision 1.55
diff -c -p -r1.55 md.texi
*** md.texi 2001/01/02 02:56:01 1.55
--- md.texi 2001/01/31 18:47:23
*************** in the compiler.@refill
*** 3574,3580 ****
There are two cases where you should specify how to split a pattern into
multiple insns. On machines that have instructions requiring delay
slots (@pxref{Delay Slots}) or that have instructions whose output is
! not available for multiple cycles (@pxref{Function Units}), the compiler
phases that optimize these cases need to be able to move insns into
one-instruction delay slots. However, some insns may generate more than one
machine instruction. These insns cannot be placed into a delay slot.
--- 3574,3580 ----
There are two cases where you should specify how to split a pattern into
multiple insns. On machines that have instructions requiring delay
slots (@pxref{Delay Slots}) or that have instructions whose output is
! not available for multiple cycles (@pxref{Processor pipeline description}), the compiler
phases that optimize these cases need to be able to move insns into
one-instruction delay slots. However, some insns may generate more than one
machine instruction. These insns cannot be placed into a delay slot.
*************** to track the condition codes.
*** 4107,4113 ****
* Insn Lengths:: Computing the length of insns.
* Constant Attributes:: Defining attributes that are constant.
* Delay Slots:: Defining delay slots required for a machine.
! * Function Units:: Specifying information for insn scheduling.
@end menu
@node Defining Attributes
--- 4107,4113 ----
* Insn Lengths:: Computing the length of insns.
* Constant Attributes:: Defining attributes that are constant.
* Delay Slots:: Defining delay slots required for a machine.
! * Processor pipeline description:: Specifying information for insn scheduling.
@end menu
@node Defining Attributes
*************** branch is true, we might represent this
*** 4737,4744 ****
@end smallexample
@c the above is *still* too long. --mew 4feb93
! @node Function Units
! @subsection Specifying Function Units
@cindex function units, for scheduling
On most RISC machines, there are instructions whose results are not
--- 4737,4812 ----
@end smallexample
@c the above is *still* too long. --mew 4feb93
! @node Processor pipeline description
! @subsection Specifying processor pipeline description
!
! To achieve better productivity the most of modern processors
! (super-pipelined, superscalar RISC, and VLIW processors) have many
! @dfn{functional units} on which several instructions can be executed
! simultaneously. An instruction execution can be started only if its
! issue conditions are satisfied. If not, instruction is interlocked
! until its conditions are satisfied. Such an @dfn{interlock (pipeline)
! delay} causes interruption of the fetching of successor instructions
! (or demands @var{nop} instructions, e.g. for some MIPS processors).
!
! There are two major kind of interlock delays in modern processors.
! The first one is data dependence delay determining @dfn{instruction
! latency time}. The instruction execution is not started until all
! source data has been evaluated by previous instructions (there are
! more complex cases when the instruction execution starts even when the
! data are not evaluated but will be ready till given time after the
! instruction execution start). Taking into account of the data
! dependence delays is simple. Data dependence (true, output, and
! anti-dependence) delay between two instructions is given by constant.
! In the most cases this approach is adequate. The second kind of
! interlock delays is reservation delay. Two such way dependent
! instructions under execution will be in need of shared processors
! resources, i.e. buses, internal registers, and/or functional units,
! which are reserved for some time. Taking into account of this kind of
! delay is complex especially for modern RISC processors.
!
! The task of exploiting more processor parallelism is solved by
! instruction scheduler. For better solution of this problem, the
! instruction scheduler has to have adequate description of processor
! parallelism (or @dfn{pipeline description}). Currently GCC has two
! ways to describe processor parallelism. The first one is old and
! originated from instruction scheduler written by Michael Tiemann and
! described in the first subsequent section. The second one is new and
! based on description of functional unit reservations by processor
! instructions with the aid of @dfn{regular expressions}. This is so
! called @dfn{automaton based description}.
!
! Gcc instruction scheduler uses @dfn{pipeline hazard recognizer} to
! figure out possibility of instruction issue by processor on given
! simulated processor cycle. The pipeline hazard recognizer is code
! generated from processor pipeline description. The pipeline hazard
! recognizer generated from new description is more sophisticated and
! based on finite state automaton and therefore faster than one
! generated from the old description. Also its speed is not depended on
! processor complexity. The instruction issue is possible if there is
! transition from one automaton state to another one.
!
! You can use any model to describe processor pipeline characteristics or
! even mix of them. You could use old description for some processor
! submodels and new one for the rest processor submodels.
!
! In general, the usage of automaton based description is more
! preferable. The model is more rich. It permits to describe more
! accurately pipeline characteristics of processors which results in
! improving code quality (although sometimes only on several percent
! fractions). It will be also used as infrastructure to implement
! sophisticated and practical insn scheduling which will try many
! instruction sequences to choose the best one.
!
!
! @menu
! * Old pipeline description:: Specifying information for insn scheduling.
! * Automaton pipeline description:: Describing insn pipeline characteristics.
! * Comparison of two descriptions:: Drawbacks of old pipeline description
! @end menu
!
! @node Old pipeline description
! @subsubsection Specifying Function Units
@cindex function units, for scheduling
On most RISC machines, there are instructions whose results are not
*************** units. These insns will cause a potenti
*** 4855,4860 ****
--- 4923,5268 ----
used during their execution and there is no way of representing that
conflict. We welcome any examples of how function unit conflicts work
in such processors and suggestions for their representation.
+
+ @node Automaton pipeline description
+ @subsubsection Describing instruction pipeline characteristics
+ @cindex automaton based pipeline description
+
+ This section describes constructions of automaton based processor
+ pipeline description. The order of all mentioned below constructions
+ in machine description file is not important.
+
+ @findex define_automaton
+
+ The following optional construction describes names of automata
+ generated and used for pipeline hazards recognition. Sometimes the
+ generated finite state automaton used by pipeline hazard recognizer is
+ large. If we use more one automata and bind functional units to the
+ automata, the summary size of the automata usually is less than the
+ size of the single one. If there is no one such construction, only
+ one finite state automaton is generated.
+
+ @smallexample
+ (define_automaton @var{automata-names})
+ @end smallexample
+
+ @var{automata-names} is a string giving names of the automata. The
+ names are separated by comma. All automata should have unique names.
+ The automaton name is used in construction @code{define_cpu_unit} and
+ @code{define_query_cpu_unit}.
+
+ @findex define_cpu_unit
+ Each processor functional unit used in description of instruction
+ reservations should be described by the following construction.
+
+ @smallexample
+ (define_cpu_unit @var{unit-names} [@var{automaton-name}])
+ @end smallexample
+
+ @var{names} is a string giving the names of the functional units
+ separated by commas. Don't use name @dfn{nothing}, it is reserved for
+ other goals.
+
+ @var{automaton-name} is a string giving the name of automaton with
+ which the unit is bound. The automaton should be described in
+ construction @code{define_automaton}. You should give
+ @dfn{automaton-name}, if there is a defined automaton.
+
+ @findex define_query_cpu_unit
+
+ The following construction describes CPU functional units analogously
+ to @code{define_cpu_unit}. If we use automaton without its
+ minimization, the reservation of such units can be queried for
+ automaton state. Instruction scheduler never queries reservation of
+ functional units for given automaton state. So as rule, you don't
+ need this construction. This construction could be used to future
+ code generation goals (e.g. to generate VLIW insn templates).
+
+ @smallexample
+ (define_query_cpu_unit @var{unit-names} [@var{automaton-name}])
+ @end smallexample
+
+ @var{names} is a string giving the names of the functional units
+ separated by commas.
+
+ @var{automaton-name} is a string giving the name of automaton with
+ which the unit is bound.
+
+ @findex define_insn_reservation
+
+ The following construction is major one to describe pipeline
+ characteristics of instruction.
+
+ @smallexample
+ (define_insn_reservation @var{insn-name} @var{default_latency}
+ @var{condition} @var{regexp})
+ @end smallexample
+
+ @var{default_latency} is number giving latency time of the
+ instruction.
+
+ @var{insn-names} is string giving an internal name of insn. The
+ internal names are used in constructions @code{define_bypass} and in
+ automaton description file used for debugging. The internal name has
+ nothing common with names in @code{define_insn}. It is a good
+ practice to use insn classes described in the processor manual.
+
+ @var{condition} defines what RTL insns are described by this construction.
+
+ @var{regexp} is string describing reservation of cpu functional units
+ by the instruction. The reservations are described by a regular
+ expression according the following syntax:
+
+ @smallexample
+ regexp = regexp "," oneof
+ | oneof
+
+ oneof = oneof "|" allof
+ | allof
+
+ allof = allof "+" repeat
+ | repeat
+
+ repeat = element "*" number
+ | element
+
+ element = cpu_function_unit_name
+ | reservation_name
+ | result_name
+ | "nothing"
+ | "(" regexp ")"
+ @end smallexample
+
+ @itemize @bullet
+ @item
+ @samp{","} is used for describing start of the next cycle in
+ reservation.
+
+ @item
+ @samp{"|"} is used for describing the reservation described by the
+ first regular expression *or* the reservation described by the second
+ regular expression *or* etc.
+
+ @item
+ @samp{"+"} is used for describing the reservation described by the
+ first regular expression *and* the reservation described by the second
+ regular expression *and* etc.
+
+ @item
+ @samp{"*"} is used for convenience and simply means sequence in which
+ the regular expression are repeated @var{number} times with cycle
+ advancing (see @samp{","}).
+
+ @item
+ @samp{cpu_function_unit_name} denotes reservation of the named
+ functional unit.
+
+ @item
+ @samp{reservation_name} -- see description of construction
+ @samp{define_reservation}.
+
+ @item
+ @samp{"nothing"} denotes no unit reservations.
+ @end itemize
+
+ @findex define_reservation
+
+ Sometimes unit reservations for different insns contain common parts.
+ In such case, you can simplify pipeline description by describing the
+ common part by the following construction
+
+ @smallexample
+ (define_reservation @var{reservation-name} @var{regexp})
+ @end smallexample
+
+ @var{reservation-name} is string giving name of @var{regexp}. The
+ functional unit names and reservation names are in the same name
+ space. So the reservation names should be different from functional
+ unit names and can not be reserved name @dfn{nothing}.
+
+ @findex define_bypass
+ The following construction is used to describe exceptions in latency
+ time for given instruction pair. This is so called bypasses.
+
+ @smallexample
+ (define_reservation @var{number} @var{out_insn_names} @var{in_insn_names}
+ [@var{guard}])
+ @end smallexample
+
+ @var{number} gives when the result generated by instructions given in
+ string @var{out_insn_names} will be ready for instructions given in
+ string @var{in_insn_names}. Instructions in the string are separated
+ by comma.
+
+ @var{guard} is optional string given name of C function which defines
+ additional guard for the bypass. The function will get the two insns
+ as parameters. If the function returns zero the bypass will be
+ ignored for this case. Additional guard is necessary to recognize
+ complicated bypasses, e.g. when consumer is only address of insn
+ @samp{store} (not stored value).
+
+ Usually the following tree constructions are used to describe VLIW
+ processors (more correctly to describe placement of small insns into
+ VLIW insn slots). Although they can be used for RISC processor too.
+
+ @smallexample
+ (exclussion_set @var{unit-names} @var{unit-names})
+ (presense_set @var{unit-names} @var{unit-names})
+ (absense_set @var{unit-names} @var{unit-names})
+ @end smallexample
+
+ @var{unit-names} is string giving names of functional units separated by comma.
+
+ The first construction means that each functional unit in the first
+ string can not be reserved simultaneously with unit whose name is in
+ the second string and vise versa. For example, the construction is
+ useful for description processors (e.g. some SPARC processors) with
+ fully pipelined floating point functional unit which can execute
+ simultaneously only single floating point insns or only double
+ floating point insns.
+
+ The second construction means that each functional unit in the first
+ string can not be reserved unless at least one of units whose names
+ are in the second string is reserved. This is an asymmetric relation.
+ For example, it is useful for description that VLIW @samp{slot1} is
+ reserved after @samp{slot0} reservation.
+
+ The third construction means that each functional unit in the first
+ string can be reserved only if each unit whose name is in the second
+ string is not reserved. This is an asymmetric relation (actually
+ @samp{exclusion_set} is analogous to this one but it is symmetric).
+ For example, it is useful for description that VLIW @samp{slot0} can
+ not be reserved after @samp{slot1} or @samp{slot2} reservation.
+
+ You can control generator of pipeline hazard recognizer with the
+ following construction.
+
+ @smallexample
+ (automata_option @var{options})
+ @end smallexample
+
+ @var{options} is a string giving options which affect generated code.
+ Currently there are the following options:
+
+ @itemize @bullet
+ @item
+ @dfn{no-minimization} makes no minimization of automaton. This is only
+ worth to do when we are going to query CPU functional unit
+ reservations in an automaton state.
+
+ @item
+ @dfn{w} means generation of file describing the result automaton. The
+ file can be used to the description verification.
+
+ @item
+ @dfn{ndfa} makes nondeterministic finite state automata. This affects
+ treatment of operator `|' in the regular expressions. The usual
+ treatment of the operator is to try the first alternative and, if the
+ reservation is not possible, the second alternative. The
+ nondeterministic treatment means trying all alternatives, some of them
+ may be rejected by reservations in subsequent insns. You can not
+ query functional unit reservation in state of nondeterministic
+ automaton.
+ @end itemize
+
+ As an example, consider a superscalar RISC machine which can issue
+ three insns (two integer insns and one floating point insn) on cycle
+ but finish only two insns. To describe this, we define the following
+ functional units.
+
+ @smallexample
+ (define_cpu_unit "i0_pipeline, i1_pipeline, f_pipeline")
+ (define_cpu_unit "port_0, port1")
+ @end smallexample
+
+ All simple integer insns can be executed in any integer pipeline and
+ their result is ready in two cycles. The simple integer insns are
+ issued into the first pipeline unless it is reserved, otherwise they
+ are issued into the second pipeline. Integer division and
+ multiplication insns can be executed only in the second integer
+ pipeline and their results are ready correspondingly in 8 and 4
+ cycles. Integer division is not pipelined, i.e. subsequent integer
+ division insn can not be issued until current division insn finished.
+ Floating point insns are fully pipelined and their results are ready
+ in 3 cycles. There is also additional one cycle delay in usage by
+ integer insns of result produced by floating point insns. To describe
+ all of this we could specify
+
+ @smallexample
+ (define_cpu_unit "div")
+
+ (define_insn_reservation "simple" 2 (eq_attr "cpu" "int")
+ "(i0_pipeline | i1_pipeline), (port_0 | port1)")
+
+ (define_insn_reservation "mult" 4 (eq_attr "cpu" "mult")
+ "i1_pipeline, nothing*3, (port_0 | port1)")
+
+ (define_insn_reservation "div" 8 (eq_attr "cpu" "div")
+ "i1_pipeline, div*7, (port_0 | port1)")
+
+ (define_insn_reservation "float" 3 (eq_attr "cpu" "float")
+ "f_pipeline, nothing, (port_0 | port1))
+
+ (define_bypass 4 "float" "simple,mut,div")
+ @end smallexample
+
+ To simplify the description we could describe the following reservation
+
+ @smallexample
+ (define_reservation "finish" "port0|port1")
+ @end smallexample
+
+ and use it in all @code{define_insn_reservation} as in the following
+ construction
+
+ @smallexample
+ (define_insn_reservation "simple" 2 (eq_attr "cpu" "int")
+ "(i0_pipeline | i1_pipeline), finish")
+ @end smallexample
+
+
+ @node Comparison of two descriptions
+ @subsubsection Drawbacks of old pipeline description
+
+ The old instruction level parallelism description and pipeline hazards
+ recognizer based on it have the following drawbacks in comparisons
+ with new one:
+
+ @itemize @bullet
+ @item
+ Each functional unit is believed to be reserved at the instruction
+ execution start. This is very inaccurate model for modern processors.
+
+ @item
+ Inadequate description of instruction latency times. Latency time is
+ bound with functional unit reserved by instruction not with instruction
+ itself. In other words, the description is oriented to describe at
+ most one unit reservation by each instruction. It also does not
+ permit to describe special bypasses between instruction pairs.
+
+ @item
+ Implementation of the pipeline hazard recognizer interface has
+ constraints on number of functional units. This is number of bits in
+ integer on the host machine.
+
+ @item
+ Interface to the pipeline hazard recognizer is more complex than one
+ to automaton based pipeline recognizer.
+
+ @item
+ Unnatural description when you write a unit and condition which
+ selects instructions using the unit. Writing all unit reservations
+ for an instruction (an instruction class) is more natural.
+
+ @item
+ Recognition of interlock delays has slow implementation. GCC
+ scheduler supports structures which describe the unit reservations.
+ The more processor has functional units, the slower pipeline hazard
+ recognizer. Such implementation would become slower when we enable to
+ reserve functional units not only at the instruction execution start.
+ The automaton based pipeline hazard recognizer speed is not depended
+ on processor complexity.
+ @end itemize
@end ifset
@node Conditional Execution
Index: tm.texi
===================================================================
RCS file: /cvs/gcc/egcs/gcc/tm.texi,v
retrieving revision 1.168
diff -c -p -r1.168 tm.texi
*** tm.texi 2001/01/30 05:42:06 1.168
--- tm.texi 2001/01/31 18:47:24
*************** symbols must be explicitly imported from
*** 8244,8264 ****
A C statement that adds to @var{CLOBBERS} @code{STRING_CST} trees for
any hard regs the port wishes to automatically clobber for all asms.
@findex ISSUE_RATE
@item ISSUE_RATE
A C expression that returns how many instructions can be issued at the
same time if the machine is a superscalar machine.
@findex MD_SCHED_INIT
@item MD_SCHED_INIT (@var{file}, @var{verbose}, @var{max_ready})
! A C statement which is executed by the scheduler at the
! beginning of each block of instructions that are to be scheduled.
! @var{file} is either a null pointer, or a stdio stream to write any
! debug output to. @var{verbose} is the verbose level provided by
@samp{-fsched-verbose-}@var{n}. @var{max_ready} is the maximum number
of insns in the current scheduling region that can be live at the same
time. This can be used to allocate scratch space if it is needed.
@findex MD_SCHED_FINISH
@item MD_SCHED_FINISH (@var{file}, @var{verbose})
A C statement which is executed by the scheduler at the end of each block
--- 8244,8284 ----
A C statement that adds to @var{CLOBBERS} @code{STRING_CST} trees for
any hard regs the port wishes to automatically clobber for all asms.
+ @findex USE_AUTOMATON_PIPELINE_INTERFACE
+ @item USE_AUTOMATON_PIPELINE_INTERFACE
+ A C expression that is used only when the machine description file
+ contains old pipeline description and automaton based one
+ (@pxref{Processor pipeline description,,Specifying processor pipeline
+ description}). If the expression returns nonzero, the automaton based
+ pipeline description is used for insn scheduling, otherwise the old
+ pipeline description is used. The default value is one. In other
+ words, by default the automaton based pipeline description will be
+ always used.
+
@findex ISSUE_RATE
@item ISSUE_RATE
A C expression that returns how many instructions can be issued at the
same time if the machine is a superscalar machine.
+ This is used only for old pipeline description.
+
@findex MD_SCHED_INIT
@item MD_SCHED_INIT (@var{file}, @var{verbose}, @var{max_ready})
! A C statement which is executed by the scheduler at the beginning of
! each block of instructions that are to be scheduled. @var{file} is
! either a null pointer, or a stdio stream to write any debug output to.
! @var{verbose} is the verbose level provided by
@samp{-fsched-verbose-}@var{n}. @var{max_ready} is the maximum number
of insns in the current scheduling region that can be live at the same
time. This can be used to allocate scratch space if it is needed.
+ This macro is used only for old pipeline description.
+
+ @findex MD_AUTOMATON_SCHED_INIT
+ @item MD_AUTOMATON_SCHED_INIT (@var{file}, @var{verbose})
+ Like @samp{MD_SCHED_INIT} but used only for automaton based
+ pipeline description.
+
@findex MD_SCHED_FINISH
@item MD_SCHED_FINISH (@var{file}, @var{verbose})
A C statement which is executed by the scheduler at the end of each block
*************** is the timer tick of the scheduler. @va
*** 8284,8289 ****
--- 8304,8316 ----
parameter that is set to the number of insns that can issue this clock;
normally this is just @code{issue_rate}. See also @samp{MD_SCHED_REORDER2}.
+ This macro is used only for old pipeline description.
+
+ @findex MD_AUTOMATON_SCHED_REORDER
+ @item MD_AUTOMATON_SCHED_REORDER (@var{file}, @var{verbose}, @var{ready}, @var{n_ready}, @var{clock})
+ Like @samp{MD_SCHED_REORDER} but used only for automaton based
+ pipeline description.
+
@findex MD_SCHED_REORDER2
@item MD_SCHED_REORDER2 (@var{file}, @var{verbose}, @var{ready}, @var{n_ready}, @var{clock}, @var{can_issue_more})
Like @samp{MD_SCHED_REORDER}, but called at a different time. While the
*************** Defining this macro can be useful if the
*** 8295,8300 ****
--- 8322,8329 ----
scheduling one insn causes other insns to become ready in the same cycle,
these other insns can then be taken into account properly.
+ This macro is used only for old pipeline description.
+
@findex MD_SCHED_VARIABLE_ISSUE
@item MD_SCHED_VARIABLE_ISSUE (@var{file}, @var{verbose}, @var{insn}, @var{more})
A C statement which is executed by the scheduler after it
*************** is the verbose level provided by @samp{-
*** 8305,8310 ****
--- 8334,8355 ----
number of instructions that can be issued in the current cycle. The
@samp{MD_SCHED_VARIABLE_ISSUE} macro is responsible for updating the
value of @var{more} (typically by @var{more}--).
+
+ This macro is used only for old pipeline description.
+
+ @findex DFA_SCHEDULER_PRE_CYCLE_INSN
+ @item DFA_SCHEDULER_PRE_CYCLE_INSN
+ A C statement which returns an RTL insn. The automaton state used in
+ pipeline hazard recognizer is changed as if the insn were scheduled
+ when the new simulated processor cycle starts. Usage of the macro may
+ simplify automaton pipeline description for some VLIW processors. If
+ the macro is defined, it is used only for automaton based pipeline
+ description.
+
+ @findex DFA_SCHEDULER_POST_CYCLE_INSN
+ @item DFA_SCHEDULER_POST_CYCLE_INSN
+ Like @samp{DFA_SCHEDULER_PRE_CYCLE_INSN} but it is used at the end of
+ simulated processor cycle.
@findex MAX_INTEGER_COMPUTATION_MODE
@item MAX_INTEGER_COMPUTATION_MODE