This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
| Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
|---|---|---|
| Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
| Other format: | [Raw text] | |
Here is the initial patch. To use IRA, just add option -fira for gcc command-line. You can use additionaly option -fira-algorithm=[regional,CB,priority] to choose the regional allocation (default), Chaitin-Briggs, or Chow's priority-based allocation. I'd recommend to use -flower-subreg option too for 32-bit targets (there is no harm to use it for 64-bit machines too).
* ira.h, ira-int.h, ira.c, ira-build.c, ira-costs.c ira-conflicts.c, ira-color.c, ira-emit.c: New files.
* flags.h (ira_algorithm): New enumeration. (flag_ira_algorithm): New external variable declaration.
* toplev.c (ira.h): New include. (flag_ira_algorithm): New global variable. (backend_init): Call init_ira_once.
* toplev.h (flag_ira, flag_ira_biased_coloring): New external variable declarations.
* regs.h (contains_reg_of_mode, move_cost, may_move_in_cost, may_move_out_cost): New external variable declarations. * caller-save.c (no_caller_save_reg_set): New global variable. (init_caller_save): Set up no_caller_save_reg_set. * global.c (eliminable_regset): Make it external. (gate_handle_global_alloc): New function. (pass_global_alloc): Add the gate function.
* opts.c (decode_options): Print the warning for -fira. (common_handle_option): Process -fira-algorithm option.
* hard-reg-set.h (no_caller_save_reg_set): New external variable declaration.
* regmove.c (regmove_optimize): Don't do replacement of output
operands by input operands.
(rest_of_handle_regmove): Don't do CFG cleanup for IRA.* local-alloc.c (update_equiv_regs): Make it external. Return true if jump label rebuilding should be done. (gate_handle_local_alloc): New function. (pass_local_alloc): Add the gate function.
* alias.c (stack_addr_p): New function. (nonoverlapping_memrefs_p): Add code for IRA.
* common.opt (fira, fira-algorithm, fira-biased-coloring): New options.
* regclass.c (contains_reg_of_mode, move_cost, may_move_in_cost, may_move_out_cost): Make the variables external. * rtl.h (eliminable_regset): New external variable declaration. (update_equiv_regs): New external function definition.
* Makefile.in (IRA_INT_H): New definition. (OBJS-common): Add ira.o, ira-build.o, ira-costs.o, ira-conflicts.o, ira-color.o, and ira-emit.o. (reload1.o, toplev.o): Add dependence on ira.h. (ira.o, ira-build.o, ira-costs.o, ira-conflicts.o, ira-color.o, ira-emit.o): New entries.
* reload1.c (alter_reg): Add a new parameter. (pseudo_reg_compare): New function. (reload): Sort pseudos for IRA. Call alter_reg with the additional parameter. (count_spilled_pseudo): New variable freq. Use it. (alter_reg): Add code for IRA. (eliminate_regs_1, finish_spills, emit_input_reload_insns, delete_output_reload): Use additional parameter for alter_reg. (finish_spills, emit_input_reload_insns, delete_output_reload): Call mark_allocation_change. (finish_spills): Call retry_ira_color. * doc/invoke.texi: Describe new options -fira, -fira-biased-coloring, and -fira-algorithm. * doc/passes.texi: Decribe IRA.
* doc/tm.texi: Decribe macro IRA_COVER_CLASSES. * config/sparc/sparc.h (IRA_COVER_CLASSES): New macro.
* config/rs6000/rs6000.h (IRA_COVER_CLASSES): Ditto. Richard Henderson <rth@redhat.com>
* tree-pass.h (pass_lower_subreg): New external definition. * toplev.h (flag_lower_subreg): New external definition. * rtl.def (CONCATN): New rtl expression. * dwarf2out.c (concatn_loc_descriptor): New function. (loc_descriptor): Process CONCATN. * timevar.def (TV_LOWER_SUBREG): New definition. * emit-rtl.c (gen_reg_rtx_offset): New function. (gen_lowpart_common): Process CONCATN. * simplify-rtx.c (simplify_subreg): Process CONCATN. * common.opt (flower-subreg): New options. * rtl.h (gen_reg_rtx_offset): New external definition. * Makefile.in (OBJS-common): Add lower-subreg.o. (lower-subreg.o): New entry. * passes.c (pass_lower_subreg): Add new pass. * lower-subreg.c: New file.
Index: doc/passes.texi
===================================================================
--- doc/passes.texi (revision 119879)
+++ doc/passes.texi (working copy)
@@ -822,6 +822,17 @@ Global register allocation. This pass a
the remaining pseudo registers (those whose life spans are not
contained in one basic block). The pass is located in @file{global.c}.
+@item
+The integrated register allocator (@acronym{IRA}). It can be used
+instead of the local and global allocator. It is called integrated
+because coalescing and register live range splitting are done
+on-the-fly during coloring. Source files of the allocator are
+@file{ira.c}, @file{ira-build.c}, @file{ira-costs.c},
+@file{ira-conflicts.c}, @file{ira-color.c}, @file{ira-emit.c}, plus
+header files @file{ira.h} and @file{ira-int.h} used for the
+communication between the allocator and the rest of the compiler and
+between the IRA files.
+
@cindex reloading
@item
Reloading. This pass renumbers pseudo registers with the hardware
Index: doc/tm.texi
===================================================================
--- doc/tm.texi (revision 119879)
+++ doc/tm.texi (working copy)
@@ -2730,6 +2730,19 @@ as below:
@end smallexample
@end defmac
+@defmac IRA_COVER_CLASSES
+The macro defines cover classes for the Integrated Register Allocator
+(@acronym{IRA}). Cover classes is a set of non-intersected register
+classes covering all hard registers used for register allocation
+purpose. Any move between two registers of a cover class should be
+cheaper than load or store of the registers. The macro value is array
+of register classes with @code{LIM_REG_CLASSES} used as the end
+marker.
+
+If the macro is defined, the integrated register allocator can be used
+for the target.
+@end defmac
+
@node Old Constraints
@section Obsolete Macros for Defining Constraints
@cindex defining constraints, obsolete method
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 119879)
+++ doc/invoke.texi (working copy)
@@ -313,7 +313,8 @@ Objective-C and Objective-C++ Dialects}.
-fgcse -fgcse-lm -fgcse-sm -fgcse-las -fgcse-after-reload @gol
-fcrossjumping -fif-conversion -fif-conversion2 @gol
-finline-functions -finline-functions-called-once @gol
--finline-limit=@var{n} -fkeep-inline-functions @gol
+-finline-limit=@var{n} -fira -fira-alogirthm=@var{algorithm} @gol
+-fira-biased-coloring -fkeep-inline-functions @gol
-fkeep-static-consts -fmerge-constants -fmerge-all-constants @gol
-fmodulo-sched -fno-branch-count-reg @gol
-fno-default-inline -fno-defer-pop -fmove-loop-invariants @gol
@@ -4999,6 +5000,29 @@ optimization.
Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+@item -fira
+@opindex fira
+If supported for the target machine, use the integrated register
+allocator (@acronym{IRA}) for the register allocation.
+
+@item -fira-biased-coloring
+@opindex fira-biased-coloring
+Use biased coloring for the integrated register allocator for possible
+improvement of the generated code. It makes register allocator
+slower.
+
+@item -fira-algorithm=@var{algorithm}
+Use specified algorithm for the integrated register allocator. The
+@var{algorithm} argument should be one of @code{regional}, @code{CB},
+or @code{priority}. The second algorithm specifies Chaitin-Briggs
+coloring, the third one specifies Chow's priority based coloring, and
+the third one which is the default specifies regional coloring based
+on Chaitin-Briggs coloring. The first algorithm is the best for the
+generated code quality especially for machines with small or moderate
+size register set, the second one is faster and generates decent code
+and the smallest size code, and the priority-based one is the fastest
+one.
+
@item -fdelayed-branch
@opindex fdelayed-branch
If supported for the target machine, attempt to reorder instructions
Index: flags.h
===================================================================
--- flags.h (revision 119879)
+++ flags.h (working copy)
@@ -192,6 +192,16 @@ extern int flag_dump_rtl_in_asm;
unused UIDs if there are a lot of instructions. If greater than
one, unconditionally renumber instruction UIDs. */
extern int flag_renumber_insns;
+
+/* The algorithm used for the integrated register allocator (IRA). */
+enum ira_algorithm {
+ IRA_ALGORITHM_REGIONAL,
+ IRA_ALGORITHM_CB,
+ IRA_ALGORITHM_PRIORITY
+};
+
+extern enum ira_algorithm flag_ira_algorithm;
+
/* Other basic status info about current function. */
Index: tree-pass.h
===================================================================
--- tree-pass.h (revision 119879)
+++ tree-pass.h (working copy)
@@ -332,6 +332,7 @@ extern struct tree_opt_pass pass_instant
extern struct tree_opt_pass pass_rtl_fwprop;
extern struct tree_opt_pass pass_rtl_fwprop_addr;
extern struct tree_opt_pass pass_jump2;
+extern struct tree_opt_pass pass_lower_subreg;
extern struct tree_opt_pass pass_cse;
extern struct tree_opt_pass pass_gcse;
extern struct tree_opt_pass pass_jump_bypass;
@@ -362,6 +363,7 @@ extern struct tree_opt_pass pass_sms;
extern struct tree_opt_pass pass_sched;
extern struct tree_opt_pass pass_local_alloc;
extern struct tree_opt_pass pass_global_alloc;
+extern struct tree_opt_pass pass_ira;
extern struct tree_opt_pass pass_postreload;
extern struct tree_opt_pass pass_clean_state;
extern struct tree_opt_pass pass_branch_prob;
Index: toplev.c
===================================================================
--- toplev.c (revision 119879)
+++ toplev.c (working copy)
@@ -67,6 +67,7 @@ Software Foundation, 51 Franklin Street,
#include "diagnostic.h"
#include "params.h"
#include "reload.h"
+#include "ira.h"
#include "dwarf2asm.h"
#include "integrate.h"
#include "real.h"
@@ -306,6 +307,10 @@ int flag_next_runtime = 0;
enum tls_model flag_tls_default = TLS_MODEL_GLOBAL_DYNAMIC;
+/* Set the default algorithm for the integrated register allocator. */
+
+enum ira_algorithm flag_ira_algorithm = IRA_ALGORITHM_REGIONAL;
+
/* Nonzero means change certain warnings into errors.
Usually these are warnings about failure to conform to some standard. */
@@ -1952,6 +1957,7 @@ backend_init (void)
init_alias_once ();
init_reload ();
init_varasm_once ();
+ init_ira_once ();
/* The following initialization functions need to generate rtl, so
provide a dummy function context for them. */
Index: toplev.h
===================================================================
--- toplev.h (revision 119879)
+++ toplev.h (working copy)
@@ -122,6 +122,7 @@ extern int flag_crossjumping;
extern int flag_if_conversion;
extern int flag_if_conversion2;
extern int flag_keep_static_consts;
+extern int flag_lower_subreg;
extern int flag_peel_loops;
extern int flag_rerun_cse_after_loop;
extern int flag_thread_jumps;
@@ -131,6 +132,8 @@ extern int flag_unroll_all_loops;
extern int flag_unswitch_loops;
extern int flag_cprop_registers;
extern int time_report;
+extern int flag_ira;
+extern int flag_ira_biased_coloring;
/* Things to do with target switches. */
extern void print_version (FILE *, const char *);
Index: regs.h
===================================================================
--- regs.h (revision 119879)
+++ regs.h (working copy)
@@ -237,7 +237,23 @@ extern int caller_save_needed;
/* Allocate reg_n_info tables */
extern void allocate_reg_info (size_t, int, int);
+/* 1 if the corresponding class does contain register of given
+ mode. */
+extern char contains_reg_of_mode [N_REG_CLASSES] [MAX_MACHINE_MODE];
+
+/* Maximum cost of moving from a register in one class to a register
+ in another class. */
+extern int move_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
+
/* Specify number of hard registers given machine mode occupy. */
extern unsigned char hard_regno_nregs[FIRST_PSEUDO_REGISTER][MAX_MACHINE_MODE];
+/* Similar, but here we don't have to move if the first index is a
+ subset of the second so in that case the cost is zero. */
+extern int may_move_in_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
+
+/* Similar, but here we don't have to move if the first index is a
+ superset of the second so in that case the cost is zero. */
+extern int may_move_out_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
+
#endif /* GCC_REGS_H */
Index: caller-save.c
===================================================================
--- caller-save.c (revision 119879)
+++ caller-save.c (working copy)
@@ -37,6 +37,10 @@ Software Foundation, 51 Franklin Street,
#include "tm_p.h"
#include "addresses.h"
+/* Call used hard registers which can not be saved because there is no
+ insn for this. */
+HARD_REG_SET no_caller_save_reg_set;
+
#ifndef MAX_MOVE_MAX
#define MAX_MOVE_MAX MOVE_MAX
#endif
@@ -117,6 +121,7 @@ init_caller_save (void)
rtx test_reg, test_mem;
rtx saveinsn, restinsn;
+ CLEAR_HARD_REG_SET (no_caller_save_reg_set);
/* First find all the registers that we need to deal with and all
the modes that they can have. If we can't find a mode to use,
we can't have the register live over calls. */
@@ -244,6 +249,8 @@ init_caller_save (void)
{
call_fixed_regs[i] = 1;
SET_HARD_REG_BIT (call_fixed_reg_set, i);
+ if (call_used_regs[i])
+ SET_HARD_REG_BIT (no_caller_save_reg_set, i);
}
}
}
Index: rtl.def
===================================================================
--- rtl.def (revision 119879)
+++ rtl.def (working copy)
@@ -411,6 +411,13 @@ DEF_RTL_EXPR(SYMBOL_REF, "symbol_ref", "
pretend to be looking at the entire value and comparing it. */
DEF_RTL_EXPR(CC0, "cc0", "", RTX_OBJ)
+/* (CONCATN [a1 a2 .. an]) represents the virtual concatenation of all
+ An to make a value. This is an extension of the CONCAT to larger
+ numbers of components. This is used for decomposing large values
+ into register sized components. Like CONCAT, it should not appear
+ in the insn chain. */
+DEF_RTL_EXPR (CONCATN, "concatn", "E", RTX_OBJ)
+
/* ----------------------------------------------------------------------
Expressions for operators in an rtl pattern
---------------------------------------------------------------------- */
Index: global.c
===================================================================
--- global.c (revision 119879)
+++ global.c (working copy)
@@ -292,7 +292,7 @@ static int n_regs_set;
/* All registers that can be eliminated. */
-static HARD_REG_SET eliminable_regset;
+HARD_REG_SET eliminable_regset;
static int allocno_compare (const void *, const void *);
static void global_conflicts (void);
@@ -2501,6 +2501,13 @@ make_accurate_live_analysis (void)
}
free_bb_info ();
}
+
+static bool
+gate_handle_global_alloc (void)
+{
+ return ! flag_ira;
+}
+
/* Run old register allocator. Return TRUE if we must exit
rest_of_compilation upon return. */
static unsigned int
@@ -2534,7 +2541,7 @@ rest_of_handle_global_alloc (void)
struct tree_opt_pass pass_global_alloc =
{
"greg", /* name */
- NULL, /* gate */
+ gate_handle_global_alloc, /* gate */
rest_of_handle_global_alloc, /* execute */
NULL, /* sub */
NULL, /* next */
Index: dwarf2out.c
===================================================================
--- dwarf2out.c (revision 119879)
+++ dwarf2out.c (working copy)
@@ -4149,6 +4149,7 @@ static dw_loc_descr_ref based_loc_descr
static int is_based_loc (rtx);
static dw_loc_descr_ref mem_loc_descriptor (rtx, enum machine_mode mode);
static dw_loc_descr_ref concat_loc_descriptor (rtx, rtx);
+static dw_loc_descr_ref concatn_loc_descriptor (rtx);
static dw_loc_descr_ref loc_descriptor (rtx);
static dw_loc_descr_ref loc_descriptor_from_tree_1 (tree, int);
static dw_loc_descr_ref loc_descriptor_from_tree (tree);
@@ -9045,6 +9046,31 @@ concat_loc_descriptor (rtx x0, rtx x1)
return cc_loc_result;
}
+/* Return a descriptor that describes the concatenation of N locations. */
+
+static dw_loc_descr_ref
+concatn_loc_descriptor (rtx concatn)
+{
+ dw_loc_descr_ref cc_loc_result = NULL;
+ unsigned int i, n = XVECLEN (concatn, 0);
+
+ for (i = 0; i < n; ++i)
+ {
+ dw_loc_descr_ref ref;
+ rtx x = XVECEXP (concatn, 0, i);
+
+ ref = loc_descriptor (x);
+ if (ref == NULL)
+ return NULL;
+
+ add_loc_descr (&cc_loc_result, ref);
+ ref = new_loc_descr (DW_OP_piece, GET_MODE_SIZE (GET_MODE (x)), 0);
+ add_loc_descr (&cc_loc_result, ref);
+ }
+
+ return cc_loc_result;
+}
+
/* Output a proper Dwarf location descriptor for a variable or parameter
which is either allocated in a register or in a memory location. For a
register, we just generate an OP_REG and the register number. For a
@@ -9082,6 +9108,10 @@ loc_descriptor (rtx rtl)
loc_result = concat_loc_descriptor (XEXP (rtl, 0), XEXP (rtl, 1));
break;
+ case CONCATN:
+ loc_result = concatn_loc_descriptor (rtl);
+ break;
+
case VAR_LOCATION:
/* Single part. */
if (GET_CODE (XEXP (rtl, 1)) != PARALLEL)
Index: opts.c
===================================================================
--- opts.c (revision 119879)
+++ opts.c (working copy)
@@ -628,6 +628,14 @@ decode_options (unsigned int argc, const
flag_reorder_blocks_and_partition = 0;
flag_reorder_blocks = 1;
}
+
+#ifndef IRA_COVER_CLASSES
+ if (flag_ira)
+ {
+ inform ("-fira does not work on this architecture");
+ flag_ira = 0;
+ }
+#endif
}
/* Handle target- and language-independent options. Return zero to
@@ -946,6 +954,17 @@ common_handle_option (size_t scode, cons
warning (0, "unknown tls-model \"%s\"", arg);
break;
+ case OPT_fira_algorithm_:
+ if (!strcmp (arg, "regional"))
+ flag_ira_algorithm = IRA_ALGORITHM_REGIONAL;
+ else if (!strcmp (arg, "CB"))
+ flag_ira_algorithm = IRA_ALGORITHM_CB;
+ else if (!strcmp (arg, "priority"))
+ flag_ira_algorithm = IRA_ALGORITHM_PRIORITY;
+ else
+ warning (0, "unknown ira algorithm \"%s\"", arg);
+ break;
+
case OPT_ftracer:
flag_tracer_set = true;
break;
Index: timevar.def
===================================================================
--- timevar.def (revision 119879)
+++ timevar.def (working copy)
@@ -128,6 +128,7 @@ DEFTIMEVAR (TV_OVERLOAD , "
DEFTIMEVAR (TV_TEMPLATE_INSTANTIATION, "template instantiation")
DEFTIMEVAR (TV_EXPAND , "expand")
DEFTIMEVAR (TV_VARCONST , "varconst")
+DEFTIMEVAR (TV_LOWER_SUBREG , "lower subreg")
DEFTIMEVAR (TV_JUMP , "jump")
DEFTIMEVAR (TV_FWPROP , "forward prop")
DEFTIMEVAR (TV_CSE , "CSE")
@@ -154,6 +155,7 @@ DEFTIMEVAR (TV_SMS , "sms modulo s
DEFTIMEVAR (TV_SCHED , "scheduling")
DEFTIMEVAR (TV_LOCAL_ALLOC , "local alloc")
DEFTIMEVAR (TV_GLOBAL_ALLOC , "global alloc")
+DEFTIMEVAR (TV_IRA , "integrated RA")
DEFTIMEVAR (TV_RELOAD_CSE_REGS , "reload CSE regs")
DEFTIMEVAR (TV_SEQABSTR , "sequence abstraction")
DEFTIMEVAR (TV_GCSE_AFTER_RELOAD , "load CSE after reload")
Index: hard-reg-set.h
===================================================================
--- hard-reg-set.h (revision 119879)
+++ hard-reg-set.h (working copy)
@@ -446,6 +446,11 @@ extern char global_regs[FIRST_PSEUDO_REG
extern HARD_REG_SET regs_invalidated_by_call;
+/* Call used hard registers which can not be saved because there is no
+ insn for this. */
+
+extern HARD_REG_SET no_caller_save_reg_set;
+
#ifdef REG_ALLOC_ORDER
/* Table of register numbers in the order in which to try to use them. */
Index: regmove.c
===================================================================
--- regmove.c (revision 119879)
+++ regmove.c (working copy)
@@ -1082,7 +1082,8 @@ regmove_optimize (rtx f, int nregs)
for (pass = 0; pass <= 2; pass++)
{
- if (! flag_regmove && pass >= flag_expensive_optimizations)
+ if ((! flag_regmove || flag_ira)
+ && pass >= flag_expensive_optimizations)
goto done;
if (dump_file)
@@ -1130,7 +1131,7 @@ regmove_optimize (rtx f, int nregs)
}
}
}
- if (! flag_regmove)
+ if (! flag_regmove || flag_ira)
continue;
if (! find_matches (insn, &match))
@@ -2486,7 +2487,8 @@ static unsigned int
rest_of_handle_regmove (void)
{
regmove_optimize (get_insns (), max_reg_num ());
- cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_UPDATE_LIFE);
+ if (! flag_ira)
+ cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_UPDATE_LIFE);
return 0;
}
Index: local-alloc.c
===================================================================
--- local-alloc.c (revision 119879)
+++ local-alloc.c (working copy)
@@ -81,6 +81,7 @@ Software Foundation, 51 Franklin Street,
#include "ggc.h"
#include "timevar.h"
#include "tree-pass.h"
+#include "params.h"
/* Next quantity number available for allocation. */
@@ -292,7 +293,6 @@ static int equiv_init_movable_p (rtx, in
static int contains_replace_regs (rtx);
static int memref_referenced_p (rtx, rtx);
static int memref_used_between_p (rtx, rtx, rtx);
-static void update_equiv_regs (void);
static void no_equiv (rtx, rtx, void *);
static void block_alloc (int);
static int qty_sugg_compare (int, int);
@@ -788,9 +788,11 @@ memref_used_between_p (rtx memref, rtx s
into the using insn. If it succeeds, we can eliminate the register
completely.
- Initialize the REG_EQUIV_INIT array of initializing insns. */
+ Initialize the REG_EQUIV_INIT array of initializing insns.
-static void
+ Return non-zero if jump label rebuilding should be done. */
+
+int
update_equiv_regs (void)
{
rtx insn;
@@ -1240,6 +1242,7 @@ update_equiv_regs (void)
end_alias_analysis ();
CLEAR_REG_SET (&cleared_regs);
free (reg_equiv);
+ return recorded_label_ref;
}
/* Mark REG as having no known equivalence.
@@ -2522,6 +2525,12 @@ dump_local_alloc (FILE *file)
fprintf (file, ";; Register %d in %d.\n", i, reg_renumber[i]);
}
+static bool
+gate_handle_local_alloc (void)
+{
+ return ! flag_ira;
+}
+
/* Run old register allocator. Return TRUE if we must exit
rest_of_compilation upon return. */
static unsigned int
@@ -2575,7 +2584,7 @@ rest_of_handle_local_alloc (void)
struct tree_opt_pass pass_local_alloc =
{
"lreg", /* name */
- NULL, /* gate */
+ gate_handle_local_alloc, /* gate */
rest_of_handle_local_alloc, /* execute */
NULL, /* sub */
NULL, /* next */
Index: alias.c
===================================================================
--- alias.c (revision 119879)
+++ alias.c (working copy)
@@ -162,6 +162,7 @@ static alias_set_entry get_alias_set_ent
static rtx fixed_scalar_and_varying_struct_p (rtx, rtx, rtx, rtx,
int (*) (rtx, int));
static int aliases_everything_p (rtx);
+static int stack_addr_p (rtx);
static bool nonoverlapping_component_refs_p (tree, tree);
static tree decl_for_component_ref (tree);
static rtx adjust_offset_for_component_ref (tree, rtx);
@@ -2008,6 +2009,23 @@ adjust_offset_for_component_ref (tree x,
return GEN_INT (ioffset);
}
+/* The function returns nonzero if X is a stack address. */
+static int
+stack_addr_p (rtx x)
+{
+ if (x == hard_frame_pointer_rtx || x == frame_pointer_rtx
+ || x == arg_pointer_rtx || x == stack_pointer_rtx)
+ return 1;
+ if (GET_CODE (x) == PLUS
+ && (XEXP (x, 0) == hard_frame_pointer_rtx
+ || XEXP (x, 0) == frame_pointer_rtx
+ || XEXP (x, 0) == arg_pointer_rtx
+ || XEXP (x, 0) == stack_pointer_rtx)
+ && CONSTANT_P (XEXP (x, 1)))
+ return 1;
+ return 0;
+}
+
/* Return nonzero if we can determine the exprs corresponding to memrefs
X and Y and they do not overlap. */
@@ -2017,9 +2035,24 @@ nonoverlapping_memrefs_p (rtx x, rtx y)
tree exprx = MEM_EXPR (x), expry = MEM_EXPR (y);
rtx rtlx, rtly;
rtx basex, basey;
+ rtx x_addr, y_addr;
rtx moffsetx, moffsety;
HOST_WIDE_INT offsetx = 0, offsety = 0, sizex, sizey, tem;
+ if (flag_ira)
+ {
+ /* We need this code for IRA because stack slot sharing. RTL in
+ decl can be different than RTL used in insns. It is safe
+ code although it can be conservative sometime. */
+ x_addr = canon_rtx (get_addr (XEXP (x, 0)));
+ y_addr = canon_rtx (get_addr (XEXP (y, 0)));
+
+ if (stack_addr_p (x_addr) && stack_addr_p (y_addr)
+ && memrefs_conflict_p (SIZE_FOR_MODE (y), y_addr,
+ SIZE_FOR_MODE (x), x_addr, 0))
+ return 0;
+ }
+
/* Unless both have exprs, we can't tell anything. */
if (exprx == 0 || expry == 0)
return 0;
Index: emit-rtl.c
===================================================================
--- emit-rtl.c (revision 119879)
+++ emit-rtl.c (working copy)
@@ -812,13 +812,12 @@ gen_reg_rtx (enum machine_mode mode)
return val;
}
-/* Generate a register with same attributes as REG, but offsetted by OFFSET.
+/* Update NEW with same attributes as REG, but offsetted by OFFSET.
Do the big endian correction if needed. */
-rtx
-gen_rtx_REG_offset (rtx reg, enum machine_mode mode, unsigned int regno, int offset)
+static void
+update_reg_offset (rtx new, rtx reg, int offset)
{
- rtx new = gen_rtx_REG (mode, regno);
tree decl;
HOST_WIDE_INT var_size;
@@ -860,7 +859,7 @@ gen_rtx_REG_offset (rtx reg, enum machin
if ((BYTES_BIG_ENDIAN || WORDS_BIG_ENDIAN)
&& decl != NULL
&& offset > 0
- && GET_MODE_SIZE (GET_MODE (reg)) > GET_MODE_SIZE (mode)
+ && GET_MODE_SIZE (GET_MODE (reg)) > GET_MODE_SIZE (GET_MODE (new))
&& ((var_size = int_size_in_bytes (TREE_TYPE (decl))) > 0
&& var_size < GET_MODE_SIZE (GET_MODE (reg))))
{
@@ -904,6 +903,27 @@ gen_rtx_REG_offset (rtx reg, enum machin
REG_ATTRS (new) = get_reg_attrs (REG_EXPR (reg),
REG_OFFSET (reg) + offset);
+}
+
+/* Generate a register with same attributes as REG, but offsetted by OFFSET. */
+
+rtx
+gen_rtx_REG_offset (rtx reg, enum machine_mode mode,
+ unsigned int regno, int offset)
+{
+ rtx new = gen_rtx_REG (mode, regno);
+ update_reg_offset (new, reg, offset);
+ return new;
+}
+
+/* Generate a new pseudo register with same attributes as REG, but
+ offsetted by OFFSET. */
+
+rtx
+gen_reg_rtx_offset (rtx reg, enum machine_mode mode, int offset)
+{
+ rtx new = gen_reg_rtx (mode);
+ update_reg_offset (new, reg, offset);
return new;
}
@@ -1153,8 +1173,9 @@ gen_lowpart_common (enum machine_mode mo
return gen_rtx_fmt_e (GET_CODE (x), mode, XEXP (x, 0));
}
else if (GET_CODE (x) == SUBREG || REG_P (x)
- || GET_CODE (x) == CONCAT || GET_CODE (x) == CONST_VECTOR
- || GET_CODE (x) == CONST_DOUBLE || GET_CODE (x) == CONST_INT)
+ || GET_CODE (x) == CONCAT || GET_CODE (x) == CONCATN
+ || GET_CODE (x) == CONST_VECTOR || GET_CODE (x) == CONST_DOUBLE
+ || GET_CODE (x) == CONST_INT)
return simplify_gen_subreg (mode, x, innermode, offset);
/* Otherwise, we can't do this. */
Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c (revision 119879)
+++ simplify-rtx.c (working copy)
@@ -4646,13 +4646,23 @@ simplify_subreg (enum machine_mode outer
/* Handle complex values represented as CONCAT
of real and imaginary part. */
- if (GET_CODE (op) == CONCAT)
+ if (GET_CODE (op) == CONCAT || GET_CODE (op) == CONCATN)
{
unsigned int inner_size, final_offset;
rtx part, res;
- inner_size = GET_MODE_UNIT_SIZE (innermode);
- part = byte < inner_size ? XEXP (op, 0) : XEXP (op, 1);
+ if (GET_CODE (op) == CONCAT)
+ {
+ inner_size = GET_MODE_SIZE (innermode) / 2;
+ part = byte < inner_size ? XEXP (op, 0) : XEXP (op, 1);
+ }
+ else
+ {
+ /* ??? We've got room; perhaps we should store the inner size
+ of the CONCATN in one of the subsequent unused fields. */
+ inner_size = GET_MODE_SIZE (innermode) / XVECLEN (op, 0);
+ part = XVECEXP (op, 0, byte / inner_size);
+ }
final_offset = byte % inner_size;
if (final_offset + GET_MODE_SIZE (outermode) > inner_size)
return NULL_RTX;
Index: common.opt
===================================================================
--- common.opt (revision 119879)
+++ common.opt (working copy)
@@ -557,6 +557,18 @@ fipa-type-escape
Common Report Var(flag_ipa_type_escape) Init(0)
Type based escape and alias analysis
+fira
+Common Report Var(flag_ira)
+Use integrated register allocator.
+
+fira-algorithm=
+Common Joined RejectNegative
+-fira-algorithm=[regional|CB|priority] Set the used IRA algorithm
+
+fira-biased-coloring
+Common Report Var(flag_ira_biased_coloring)
+Use biased coloring for the integrated register allocator.
+
fivopts
Common Report Var(flag_ivopts) Init(1)
Optimize induction variables on trees
@@ -581,6 +593,10 @@ floop-optimize
Common
Does nothing. Preserved for backward compatibility.
+flower-subreg
+Common Report Var(flag_lower_subreg)
+Subreg lowering
+
fmath-errno
Common Report Var(flag_errno_math) Init(1)
Set errno after built-in math functions
Index: regclass.c
===================================================================
--- regclass.c (revision 119879)
+++ regclass.c (working copy)
@@ -206,22 +206,22 @@ bool have_regs_of_mode [MAX_MACHINE_MODE
/* 1 if class does contain register of given mode. */
-static char contains_reg_of_mode [N_REG_CLASSES] [MAX_MACHINE_MODE];
+char contains_reg_of_mode [N_REG_CLASSES] [MAX_MACHINE_MODE];
/* Maximum cost of moving from a register in one class to a register in
another class. Based on REGISTER_MOVE_COST. */
-static int move_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
+int move_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
/* Similar, but here we don't have to move if the first index is a subset
of the second so in that case the cost is zero. */
-static int may_move_in_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
+int may_move_in_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
/* Similar, but here we don't have to move if the first index is a superset
of the second so in that case the cost is zero. */
-static int may_move_out_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
+int may_move_out_cost[MAX_MACHINE_MODE][N_REG_CLASSES][N_REG_CLASSES];
#ifdef FORBIDDEN_INC_DEC_CLASSES
Index: rtl.h
===================================================================
--- rtl.h (revision 119879)
+++ rtl.h (working copy)
@@ -1474,6 +1474,7 @@ extern int rtx_equal_p (rtx, rtx);
extern rtvec gen_rtvec_v (int, rtx *);
extern rtx gen_reg_rtx (enum machine_mode);
extern rtx gen_rtx_REG_offset (rtx, enum machine_mode, unsigned int, int);
+extern rtx gen_reg_rtx_offset (rtx, enum machine_mode, int);
extern rtx gen_label_rtx (void);
extern rtx gen_lowpart_common (enum machine_mode, rtx);
@@ -2147,6 +2148,9 @@ extern bool can_copy_p (enum machine_mod
extern rtx fis_get_condition (rtx);
/* In global.c */
+#ifdef HARD_CONST
+extern HARD_REG_SET eliminable_regset;
+#endif
extern void mark_elimination (int, int);
extern void dump_global_regs (FILE *);
#ifdef HARD_CONST
@@ -2181,6 +2185,7 @@ extern void dbr_schedule (rtx);
/* In local-alloc.c */
extern void dump_local_alloc (FILE *);
+extern int update_equiv_regs (void);
/* In reload1.c */
extern int function_invariant_p (rtx);
Index: Makefile.in
===================================================================
--- Makefile.in (revision 119879)
+++ Makefile.in (working copy)
@@ -814,6 +814,7 @@ TREE_DATA_REF_H = tree-data-ref.h $(LAMB
VARRAY_H = varray.h $(MACHMODE_H) $(SYSTEM_H) coretypes.h $(TM_H)
TREE_INLINE_H = tree-inline.h $(VARRAY_H) $(SPLAY_TREE_H)
REAL_H = real.h $(MACHMODE_H)
+IRA_INT_H = ira.h ira-int.h $(CFGLOOP_H)
#
# Now figure out from those variables how to compile and link.
@@ -1018,7 +1019,8 @@ OBJS-common = \
lambda-trans.o lambda-code.o tree-loop-linear.o tree-ssa-sink.o \
tree-vrp.o tree-stdarg.o tree-cfgcleanup.o tree-ssa-reassoc.o \
tree-ssa-structalias.o tree-object-size.o \
- rtl-factoring.o
+ rtl-factoring.o ira.o ira-build.o ira-costs.o ira-conflicts.o \
+ ira-color.o ira-emit.o lower-subreg.o
OBJS-md = $(out_object_file)
@@ -2154,7 +2156,7 @@ toplev.o : toplev.c $(CONFIG_H) $(SYSTEM
$(INSN_ATTR_H) output.h $(DIAGNOSTIC_H) debug.h insn-config.h intl.h \
$(RECOG_H) Makefile toplev.h dwarf2out.h sdbout.h dbxout.h $(EXPR_H) \
hard-reg-set.h $(BASIC_BLOCK_H) graph.h except.h $(REGS_H) $(TIMEVAR_H) \
- value-prof.h $(PARAMS_H) $(TM_P_H) reload.h dwarf2asm.h $(TARGET_H) \
+ value-prof.h $(PARAMS_H) $(TM_P_H) reload.h ira.h dwarf2asm.h $(TARGET_H) \
langhooks.h insn-flags.h $(CFGLAYOUT_H) $(CFGLOOP_H) hosthooks.h \
$(CGRAPH_H) $(COVERAGE_H) alloc-pool.h $(GGC_H) $(INTEGRATE_H) \
$(CPPLIB_H) opts.h params.def tree-mudflap.h $(REAL_H)
@@ -2525,7 +2527,7 @@ reload1.o : reload1.c $(CONFIG_H) $(SYST
$(EXPR_H) $(OPTABS_H) reload.h $(REGS_H) hard-reg-set.h insn-config.h \
$(BASIC_BLOCK_H) $(RECOG_H) output.h $(FUNCTION_H) toplev.h $(TM_P_H) \
addresses.h except.h $(TREE_H) $(REAL_H) $(FLAGS_H) $(MACHMODE_H) \
- $(OBSTACK_H) $(TARGET_H)
+ $(OBSTACK_H) $(TARGET_H) ira.h
rtlhooks.o : rtlhooks.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
rtlhooks-def.h $(EXPR_H) $(RECOG_H)
postreload.o : postreload.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
@@ -2554,6 +2556,33 @@ alias.o : alias.c $(CONFIG_H) $(SYSTEM_H
$(ALIAS_H) $(EMIT_RTL_H) $(GGC_H) $(FUNCTION_H) cselib.h $(TREE_H) $(TM_P_H) \
langhooks.h $(TARGET_H) gt-alias.h $(TIMEVAR_H) $(CGRAPH_H) \
$(SPLAY_TREE_H) $(VARRAY_H) $(IPA_TYPE_ESCAPE_H) tree-pass.h
+ira-build.o: ira-build.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
+ $(TARGET_H) $(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) \
+ insn-config.h $(RECOG_H) $(BASIC_BLOCK_H) toplev.h $(TM_P_H) \
+ $(PARAMS_H) $(DF_H) $(IRA_INT_H)
+ira-costs.o: ira-costs.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
+ $(TARGET_H) $(RTL_H) insn-config.h $(RECOG_H) \
+ $(REGS_H) hard-reg-set.h $(FLAGS_H) errors.h \
+ $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) \
+ $(IRA_INT_H)
+ira-conflicts.o: ira-conflicts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
+ $(TARGET_H) $(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) \
+ insn-config.h $(RECOG_H) $(BASIC_BLOCK_H) toplev.h $(TM_P_H) $(PARAMS_H) \
+ $(DF_H) $(IRA_INT_H)
+ira-color.o: ira-color.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
+ $(TARGET_H) $(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) \
+ $(EXPR_H) $(BASIC_BLOCK_H) toplev.h $(TM_P_H) $(PARAMS_H) \
+ $(DF_H) $(IRA_INT_H)
+ira-emit.o: ira-emit.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
+ $(TARGET_H) $(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) \
+ $(EXPR_H) $(BASIC_BLOCK_H) toplev.h $(TM_P_H) $(PARAMS_H) \
+ $(IRA_INT_H)
+ira.o: ira.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+ $(TARGET_H) $(TM_H) $(RTL_H) $(RECOG_H) \
+ $(REGS_H) hard-reg-set.h $(FLAGS_H) $(OBSTACK_H) \
+ $(EXPR_H) $(BASIC_BLOCK_H) toplev.h $(TM_P_H) \
+ $(DF_H) $(IRA_INT_H) $(PARAMS_H) $(TIMEVAR_H) $(INTEGRATE_H) \
+ tree-pass.h output.h
regmove.o : regmove.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
insn-config.h $(TIMEVAR_H) tree-pass.h \
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
@@ -2649,6 +2678,8 @@ hooks.o: hooks.c $(CONFIG_H) $(SYSTEM_H)
pretty-print.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h intl.h $(PRETTY_PRINT_H) \
$(TREE_H)
errors.o : errors.c $(CONFIG_H) $(SYSTEM_H) errors.h $(BCONFIG_H)
+lower-subreg.o : lower-subreg.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+ $(MACHMODE_H) $(RTL_H) bitmap.h $(TM_H)
$(out_object_file): $(out_file) $(CONFIG_H) coretypes.h $(TM_H) $(TREE_H) \
$(RTL_H) $(REGS_H) hard-reg-set.h insn-config.h conditions.h \
Index: passes.c
===================================================================
--- passes.c (revision 119879)
+++ passes.c (working copy)
@@ -634,6 +634,7 @@ init_optimization_passes (void)
NEXT_PASS (pass_unshare_all_rtl);
NEXT_PASS (pass_instantiate_virtual_regs);
NEXT_PASS (pass_jump2);
+ NEXT_PASS (pass_lower_subreg);
NEXT_PASS (pass_cse);
NEXT_PASS (pass_rtl_fwprop);
NEXT_PASS (pass_gcse);
@@ -660,6 +661,7 @@ init_optimization_passes (void)
NEXT_PASS (pass_sched);
NEXT_PASS (pass_local_alloc);
NEXT_PASS (pass_global_alloc);
+ NEXT_PASS (pass_ira);
NEXT_PASS (pass_postreload);
*p = NULL;
Index: config/sparc/sparc.h
===================================================================
--- config/sparc/sparc.h (revision 119879)
+++ config/sparc/sparc.h (working copy)
@@ -1061,6 +1061,19 @@ enum reg_class { NO_REGS, FPCC_REGS, I64
{-1, -1, -1, 0x20}, /* GENERAL_OR_EXTRA_FP_REGS */ \
{-1, -1, -1, 0x3f}} /* ALL_REGS */
+/* The following macro defines cover classes for Integrated Register
+ Allocator. Cover classes is a set of non-intersected register
+ classes covering all hard registers used for register allocation
+ purpose. Any move between two registers of a cover class should be
+ cheaper than load or store of the registers. The macro value is
+ array of register classes with LIM_REG_CLASSES used as the end
+ marker. */
+
+#define IRA_COVER_CLASSES \
+{ \
+ GENERAL_REGS, EXTRA_FP_REGS, FPCC_REGS, LIM_REG_CLASSES \
+}
+
/* Defines invalid mode changes. Borrowed from pa64-regs.h.
SImode loads to floating-point registers are not zero-extended.
Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h (revision 119879)
+++ config/i386/i386.h (working copy)
@@ -1217,6 +1217,19 @@ enum reg_class
{ 0xffffffff,0x1fffff } \
}
+/* The following macro defines cover classes for Integrated Register
+ Allocator. Cover classes is a set of non-intersected register
+ classes covering all hard registers used for register allocation
+ purpose. Any move between two registers of a cover class should be
+ cheaper than load or store of the registers. The macro value is
+ array of register classes with LIM_REG_CLASSES used as the end
+ marker. */
+
+#define IRA_COVER_CLASSES \
+{ \
+ GENERAL_REGS, FLOAT_REGS, MMX_REGS, SSE_REGS, LIM_REG_CLASSES \
+}
+
/* The same information, inverted:
Return the class number of the smallest class containing
reg number REGNO. This could be a conditional expression
Index: config/ia64/ia64.h
===================================================================
--- config/ia64/ia64.h (revision 119879)
+++ config/ia64/ia64.h (working copy)
@@ -797,6 +797,19 @@ enum reg_class
0xFFFFFFFF, 0xFFFFFFFF, 0x3FFF }, \
}
+/* The following macro defines cover classes for Integrated Register
+ Allocator. Cover classes is a set of non-intersected register
+ classes covering all hard registers used for register allocation
+ purpose. Any move between two registers of a cover class should be
+ cheaper than load or store of the registers. The macro value is
+ array of register classes with LIM_REG_CLASSES used as the end
+ marker. */
+
+#define IRA_COVER_CLASSES \
+{ \
+ PR_REGS, BR_REGS, AR_M_REGS, AR_I_REGS, GR_REGS, FR_REGS, LIM_REG_CLASSES \
+}
+
/* A C expression whose value is a register class containing hard register
REGNO. In general there is more than one such class; choose a class which
is "minimal", meaning that no smaller class also contains the register. */
Index: config/rs6000/rs6000.h
===================================================================
--- config/rs6000/rs6000.h (revision 119879)
+++ config/rs6000/rs6000.h (working copy)
@@ -1061,6 +1061,22 @@ enum reg_class
{ 0xffffffff, 0xffffffff, 0xffffffff, 0x0003ffff } /* ALL_REGS */ \
}
+/* The following macro defines cover classes for Integrated Register
+ Allocator. Cover classes is a set of non-intersected register
+ classes covering all hard registers used for register allocation
+ purpose. Any move between two registers of a cover class should be
+ cheaper than load or store of the registers. The macro value is
+ array of register classes with LIM_REG_CLASSES used as the end
+ marker. */
+
+#define IRA_COVER_CLASSES \
+{ \
+ SPEC_OR_GEN_REGS /* GENERAL_REGS */, FLOAT_REGS, ALTIVEC_REGS, \
+ /*VRSAVE_REGS,*/ VSCR_REGS, SPE_ACC_REGS, SPEFSCR_REGS, \
+ /* MQ_REGS, LINK_REGS, CTR_REGS, */ \
+ CR_REGS, XER_REGS, LIM_REG_CLASSES \
+}
+
/* The same information, inverted:
Return the class number of the smallest class containing
reg number REGNO. This could be a conditional expression
Index: reload1.c
===================================================================
--- reload1.c (revision 119879)
+++ reload1.c (working copy)
@@ -45,6 +45,8 @@ Software Foundation, 51 Franklin Street,
#include "toplev.h"
#include "except.h"
#include "tree.h"
+#include "ira.h"
+#include "params.h"
#include "target.h"
/* This file contains the reload pass of the compiler, which is
@@ -388,7 +390,7 @@ static void delete_caller_save_insns (vo
static void spill_failure (rtx, enum reg_class);
static void count_spilled_pseudo (int, int, int);
static void delete_dead_insn (rtx);
-static void alter_reg (int, int);
+static void alter_reg (int, int, bool);
static void set_label_offsets (rtx, rtx, int);
static void check_eliminable_occurrences (rtx);
static void elimination_effects (rtx, enum machine_mode);
@@ -633,6 +635,20 @@ static int something_needs_operands_chan
/* Nonzero means we couldn't get enough spill regs. */
static int failure;
+/* The function is used to sort pseudos according their usage
+ frequencies (putting most frequently ones first). */
+static int
+pseudo_reg_compare (const void *v1p, const void *v2p)
+{
+ int regno1 = *(int *) v1p;
+ int regno2 = *(int *) v2p;
+ int diff;
+
+ if ((diff = REG_FREQ (regno2) - REG_FREQ (regno1)) != 0)
+ return diff;
+ return regno1 - regno2;
+}
+
/* Main entry point for the reload pass.
FIRST is the first insn of the function being compiled.
@@ -827,12 +843,27 @@ reload (rtx first, int global)
offsets_known_at = XNEWVEC (char, num_labels);
offsets_at = (HOST_WIDE_INT (*)[NUM_ELIMINABLE_REGS]) xmalloc (num_labels * NUM_ELIMINABLE_REGS * sizeof (HOST_WIDE_INT));
- /* Alter each pseudo-reg rtx to contain its hard reg number.
- Assign stack slots to the pseudos that lack hard regs or equivalents.
- Do not touch virtual registers. */
+ {
+ int n, *pseudo_regs;
- for (i = LAST_VIRTUAL_REGISTER + 1; i < max_regno; i++)
- alter_reg (i, -1);
+ /* Alter each pseudo-reg rtx to contain its hard reg number.
+ Assign stack slots to the pseudos that lack hard regs or
+ equivalents. Do not touch virtual registers. */
+
+ pseudo_regs = XNEWVEC (int, max_regno - LAST_VIRTUAL_REGISTER - 1);
+ for (n = 0, i = LAST_VIRTUAL_REGISTER + 1; i < max_regno; i++)
+ pseudo_regs [n++] = i;
+
+ if (flag_ira)
+ qsort (pseudo_regs, n, sizeof (int), pseudo_reg_compare);
+ if (frame_pointer_needed || ! flag_ira)
+ for (i = 0; i < n; i++)
+ alter_reg (pseudo_regs [i], -1, false);
+ else
+ for (i = n - 1; i >= 0; i--)
+ alter_reg (pseudo_regs [i], -1, false);
+ free (pseudo_regs);
+ }
/* If we have some registers we think can be eliminated, scan all insns to
see if there is an insn that sets one of these registers to something
@@ -954,7 +985,7 @@ reload (rtx first, int global)
the loop. */
reg_equiv_memory_loc[i] = 0;
reg_equiv_init[i] = 0;
- alter_reg (i, -1);
+ alter_reg (i, -1, true);
}
}
@@ -1689,6 +1720,7 @@ static HARD_REG_SET used_spill_regs_loca
static void
count_spilled_pseudo (int spilled, int spilled_nregs, int reg)
{
+ int freq = REG_FREQ (reg);
int r = reg_renumber[reg];
int nregs = hard_regno_nregs[r][PSEUDO_REGNO_MODE (reg)];
@@ -1698,9 +1730,9 @@ count_spilled_pseudo (int spilled, int s
SET_REGNO_REG_SET (&spilled_pseudos, reg);
- spill_add_cost[r] -= REG_FREQ (reg);
+ spill_add_cost[r] -= freq;
while (nregs-- > 0)
- spill_cost[r + nregs] -= REG_FREQ (reg);
+ spill_cost[r + nregs] -= freq;
}
/* Find reload register to use for reload number ORDER. */
@@ -1969,7 +2001,7 @@ delete_dead_insn (rtx insn)
can share one stack slot. */
static void
-alter_reg (int i, int from_reg)
+alter_reg (int i, int from_reg, bool dont_share_p)
{
/* When outputting an inline function, this can happen
for a reg that isn't actually used. */
@@ -2002,7 +2034,13 @@ alter_reg (int i, int from_reg)
unsigned int total_size = MAX (inherent_size, reg_max_ref_width[i]);
unsigned int min_align = reg_max_ref_width[i] * BITS_PER_UNIT;
int adjust = 0;
+ bool shared_p = false;
+
+ x = (dont_share_p || ! flag_ira
+ ? NULL_RTX : reuse_stack_slot (i, inherent_size, total_size));
+ if (x)
+ shared_p = true;
/* Each pseudo reg has an inherent size which comes from its own mode,
and a total size which provides room for paradoxical subregs
which refer to the pseudo reg in wider modes.
@@ -2011,7 +2049,7 @@ alter_reg (int i, int from_reg)
enough inherent space and enough total space.
Otherwise, we allocate a new slot, making sure that it has no less
inherent space, and no less total space, then the previous slot. */
- if (from_reg == -1)
+ else if (from_reg == -1 || (! dont_share_p && flag_ira))
{
/* No known place to spill from => no slot to reuse. */
x = assign_stack_local (mode, total_size,
@@ -2026,6 +2064,9 @@ alter_reg (int i, int from_reg)
/* Nothing can alias this slot except this pseudo. */
set_mem_alias_set (x, new_alias_set ());
+
+ if (! dont_share_p && flag_ira)
+ mark_new_stack_slot (x, i, total_size);
}
/* Reuse a stack slot if possible. */
@@ -2096,8 +2137,13 @@ alter_reg (int i, int from_reg)
/* If we have a decl for the original register, set it for the
memory. If this is a shared MEM, make a copy. */
- if (REG_EXPR (regno_reg_rtx[i])
- && DECL_P (REG_EXPR (regno_reg_rtx[i])))
+ if (shared_p)
+ {
+ x = copy_rtx (x);
+ set_mem_attrs_from_reg (x, regno_reg_rtx[i]);
+ }
+ else if (REG_EXPR (regno_reg_rtx[i])
+ && DECL_P (REG_EXPR (regno_reg_rtx[i])))
{
rtx decl = DECL_RTL_IF_SET (REG_EXPR (regno_reg_rtx[i]));
@@ -2361,7 +2407,7 @@ eliminate_regs_1 (rtx x, enum machine_mo
/* There exists at least one use of REGNO that cannot be
eliminated. Prevent the defining insn from being deleted. */
reg_equiv_init[regno] = NULL_RTX;
- alter_reg (regno, -1);
+ alter_reg (regno, -1, true);
}
return x;
@@ -3710,6 +3756,8 @@ finish_spills (int global)
SET_HARD_REG_BIT (pseudo_previous_regs[i], reg_renumber[i]);
/* Mark it as no longer having a hard register home. */
reg_renumber[i] = -1;
+ if (flag_ira)
+ mark_allocation_change (i);
/* We will need to scan everything again. */
something_changed = 1;
}
@@ -3746,10 +3794,23 @@ finish_spills (int global)
if (reg_old_renumber[i] != reg_renumber[i])
{
HARD_REG_SET forbidden;
+
COPY_HARD_REG_SET (forbidden, bad_spill_regs_global);
IOR_HARD_REG_SET (forbidden, pseudo_forbidden_regs[i]);
IOR_HARD_REG_SET (forbidden, pseudo_previous_regs[i]);
- retry_global_alloc (i, forbidden);
+ if (flag_ira)
+ {
+ /* We might migrate pseudo to another hard register on
+ previous iteration. So check this. */
+ if (reg_renumber [i] < 0)
+ {
+ retry_ira_color (i, forbidden);
+ if (reg_renumber[i] >= 0)
+ something_changed = 1;
+ }
+ }
+ else
+ retry_global_alloc (i, forbidden);
if (reg_renumber[i] >= 0)
CLEAR_REGNO_REG_SET (&spilled_pseudos, i);
}
@@ -3796,7 +3857,7 @@ finish_spills (int global)
if (reg_old_renumber[i] == regno)
continue;
- alter_reg (i, reg_old_renumber[i]);
+ alter_reg (i, reg_old_renumber[i], false);
reg_old_renumber[i] = regno;
if (dump_file)
{
@@ -6610,7 +6671,9 @@ emit_input_reload_insns (struct insn_cha
&& REG_N_SETS (REGNO (old)) == 1)
{
reg_renumber[REGNO (old)] = REGNO (rl->reg_rtx);
- alter_reg (REGNO (old), -1);
+ if (flag_ira)
+ mark_allocation_change (REGNO (old));
+ alter_reg (REGNO (old), -1, false);
}
special = 1;
}
@@ -8081,7 +8144,9 @@ delete_output_reload (rtx insn, int j, i
/* For the debugging info, say the pseudo lives in this reload reg. */
reg_renumber[REGNO (reg)] = REGNO (rld[j].reg_rtx);
- alter_reg (REGNO (reg), -1);
+ if (flag_ira)
+ mark_allocation_change (REGNO (reg));
+ alter_reg (REGNO (reg), -1, false);
}
else
{
/* Integrated Register Allocator entry point.
Contributed by Vladimir Makarov.
Copyright (C) 2006 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING. If not, write to the Free
Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA. */
/* The integrated register allocator (IRA) is called integrated
because register coalescing and register live range splitting are
done on-the-fly during coloring. Register coalescing is done by
hard register preferencing during hard register assigning. The
live range splitting is a byproduct of the regional register
allocation.
The regional allocation is top-down process. The first we do
allocation for all function then we improve it for loops then their
subloops and so on. To reduce register shuffling, the same
mechanism of hard register prefrencing is used. This approach
works as good as Callahan-Koblentz algorithm but it is simpler.
We use Chaitin-Briggs coloring for each loop (or function) with
optional biased coloring. If pseudo-registers got different
location on loop borders we rename them inside the loop and
generate pseudo-register move insns. Some optimizations (like
removing redundant stores, moving register shuffling to less
frequent points, and code duplication reducing) to minimize effect
of register shuffling is done
If we don't improve register allocation for loops we get classic
Chaitin-Briggs coloring (only instead of separate pass of
coalescing, we use hard register preferencing).
Optionally we implements Chow's priority coloring only for all
function. It is quite analogous to the current gcc global register
allocator only we use more sophisticated hard register
preferencing.
Literature is worth to read for better understanding the code:
o Preston Briggs, Keith D. Cooper, Linda Torczon. Improvements to
Graph Coloring Register Allocation.
o David Callahan, Brian Koblenz. Register allocation via
hierarchical graph coloring
o Keith Cooper, Anshuman Dasgupta, Jason Eckhardt. Revisiting Graph
Coloring Register Allocation: A Study of the Chaitin-Briggs and
Callahan-Koblenz Algorithms.
o Guei-Yuan Lueh, Thomas Gross, and Ali-Reza Adl-Tabatabai. Global
Register Allocation Based on Graph Fusion.
*/
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "regs.h"
#include "rtl.h"
#include "tm_p.h"
#include "target.h"
#include "flags.h"
#include "obstack.h"
#include "bitmap.h"
#include "hard-reg-set.h"
#include "basic-block.h"
#include "expr.h"
#include "recog.h"
#include "params.h"
#include "timevar.h"
#include "tree-pass.h"
#include "output.h"
#include "reload.h"
#include "errors.h"
#include "integrate.h"
#include "df.h"
#include "ggc.h"
#include "ira-int.h"
static void setup_inner_mode (void);
static void setup_reg_mode_hard_regset (void);
static void setup_class_subset_and_move_costs (void);
static void setup_class_hard_regs (void);
static void setup_available_class_regs (void);
static void setup_alloc_regs (int);
static void setup_reg_subclasses (void);
static void setup_cover_classes (void);
static void setup_class_translate (void);
static void print_class_cover (FILE *);
static void find_reg_class_closure (void);
static void setup_reg_class_nregs (void);
static void setup_prohibited_class_mode_regs (void);
static void setup_eliminable_regset (void);
static void find_reg_equiv_invariant_const (void);
static void setup_reg_renumber (void);
static void calculate_allocation_cost (void);
#ifdef ENABLE_IRA_CHECKING
static void check_allocation (void);
#endif
static void fix_reg_equiv_init (void);
#ifdef ENABLE_IRA_CHECKING
static void print_redundant_copies (void);
#endif
static bool gate_ira (void);
static unsigned int rest_of_handle_ira (void);
/* Dump file for IRA. */
FILE *ira_dump_file;
/* The number of elements in the following array. */
int spilled_reg_stack_slots_num;
/* The following array contains description of spilled registers stack
slots have been used in the current function so far. */
struct spilled_reg_stack_slot *spilled_reg_stack_slots;
/* The following variable values are correspondingly overall cost of
the allocation, cost of hard register usage for the pseudos, cost
of memory usage for the pseudos, cost of loads, stores and register
move insns generated for register live range splitting. */
int overall_cost;
int reg_cost, mem_cost;
int load_cost, store_cost, shuffle_cost;
int move_loops_num, additional_jumps_num;
/* A mode whose value is immediately contained in given mode
value. */
unsigned char mode_inner_mode [NUM_MACHINE_MODES];
/* The following array is a map hard regs X modes -> number registers
for store value of given mode starting with given hard
register. */
HARD_REG_SET reg_mode_hard_regset [FIRST_PSEUDO_REGISTER] [NUM_MACHINE_MODES];
/* The following two variables are array analog of macros
MEMORY_MOVE_COST and REGISTER_MOVE_COST. */
int memory_move_cost [MAX_MACHINE_MODE] [N_REG_CLASSES] [2];
int register_move_cost [MAX_MACHINE_MODE] [N_REG_CLASSES] [N_REG_CLASSES];
/* Nonzero value of element of the following array means that the
1st class is a subset of the 2nd class. */
int class_subset_p [N_REG_CLASSES] [N_REG_CLASSES];
/* Temporary hard reg set used for different calculation. */
static HARD_REG_SET temp_hard_regset;
/* The function sets up mode_inner_mode array. */
static void
setup_inner_mode (void)
{
int i;
enum machine_mode wider;
for (i = 0; i < NUM_MACHINE_MODES; i++)
mode_inner_mode [i] = VOIDmode;
for (i = 0; i < NUM_MACHINE_MODES; i++)
{
wider = GET_MODE_WIDER_MODE (i);
if (wider != VOIDmode)
{
ira_assert (mode_inner_mode [wider] == VOIDmode);
mode_inner_mode [wider] = i;
}
}
}
/* The function sets up map REG_MODE_HARD_REGSET. */
static void
setup_reg_mode_hard_regset (void)
{
int i, m, hard_regno;
for (m = 0; m < NUM_MACHINE_MODES; m++)
for (hard_regno = 0; hard_regno < FIRST_PSEUDO_REGISTER; hard_regno++)
{
CLEAR_HARD_REG_SET (reg_mode_hard_regset [hard_regno] [m]);
for (i = hard_regno_nregs [hard_regno] [m] - 1; i >= 0; i--)
if (hard_regno + i < FIRST_PSEUDO_REGISTER)
SET_HARD_REG_BIT (reg_mode_hard_regset [hard_regno] [m],
hard_regno + i);
}
}
/* The function sets up MEMORY_MOVE_COST, REGISTER_MOVE_COST and
CLASS_SUBSET_P. */
static void
setup_class_subset_and_move_costs (void)
{
int cl, cl2;
enum machine_mode mode;
for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
{
for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
{
memory_move_cost [mode] [cl] [0] = MEMORY_MOVE_COST (mode, cl, 0);
memory_move_cost [mode] [cl] [1] = MEMORY_MOVE_COST (mode, cl, 1);
}
for (cl2 = (int) N_REG_CLASSES - 1; cl2 >= 0; cl2--)
{
if (cl != NO_REGS && cl2 != NO_REGS)
for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
register_move_cost [mode] [cl] [cl2]
= REGISTER_MOVE_COST (mode, cl, cl2);
GO_IF_HARD_REG_SUBSET (reg_class_contents[cl],
reg_class_contents[cl2],
subset);
class_subset_p [cl] [cl2] = FALSE;
continue;
subset:
class_subset_p [cl] [cl2] = TRUE;
}
}
}
/* Hard registers which can be used for the allocation of given
register class. The order is defined by the allocation order. */
short class_hard_regs [N_REG_CLASSES] [FIRST_PSEUDO_REGISTER];
/* The size of the above array for given register class. */
int class_hard_regs_num [N_REG_CLASSES];
/* Index (in class_hard_regs) for given register class and hard
register (in general case a hard register can belong to several
register classes). */
short class_hard_reg_index [N_REG_CLASSES] [FIRST_PSEUDO_REGISTER];
/* The function sets up the three arrays declared above. */
static void
setup_class_hard_regs (void)
{
int cl, i, hard_regno, n;
HARD_REG_SET processed_hard_reg_set;
ira_assert (SHRT_MAX >= FIRST_PSEUDO_REGISTER);
/* We could call ORDER_REGS_FOR_LOCAL_ALLOC here (it is usually
putting hard callee-used hard registers first). But our
heuristics work better. */
for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
{
COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents [cl]);
AND_COMPL_HARD_REG_SET (temp_hard_regset, no_alloc_regs);
CLEAR_HARD_REG_SET (processed_hard_reg_set);
for (n = 0, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
{
#ifdef REG_ALLOC_ORDER
hard_regno = reg_alloc_order [i];
#else
hard_regno = i;
#endif
if (TEST_HARD_REG_BIT (processed_hard_reg_set, hard_regno))
continue;
SET_HARD_REG_BIT (processed_hard_reg_set, hard_regno);
if (! TEST_HARD_REG_BIT (temp_hard_regset, hard_regno))
class_hard_reg_index [cl] [hard_regno] = -1;
else
{
class_hard_reg_index [cl] [hard_regno] = n;
class_hard_regs [cl] [n++] = hard_regno;
}
}
class_hard_regs_num [cl] = n;
}
}
/* Number of class hard registers available for the register
allocation for given classes. */
int available_class_regs [N_REG_CLASSES];
/* Function setting up AVAILABLE_CLASS_REGS. */
static void
setup_available_class_regs (void)
{
int i, j;
memset (available_class_regs, 0, sizeof (available_class_regs));
for (i = 0; i < N_REG_CLASSES; i++)
{
COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents [i]);
AND_COMPL_HARD_REG_SET (temp_hard_regset, no_alloc_regs);
for (j = 0; j < FIRST_PSEUDO_REGISTER; j++)
if (TEST_HARD_REG_BIT (temp_hard_regset, j))
available_class_regs [i]++;
}
}
/* Hard registers can not be used for the register allocator. */
HARD_REG_SET no_alloc_regs;
/* The function setting up different global variables defining hard
registers for the allocation. It depends on USE_HARD_FRAME_P whose
nonzero value means that we can use hard frame pointer for the
allocation. */
static void
setup_alloc_regs (int use_hard_frame_p)
{
COPY_HARD_REG_SET (no_alloc_regs, fixed_reg_set);
if (! use_hard_frame_p)
SET_HARD_REG_BIT (no_alloc_regs, HARD_FRAME_POINTER_REGNUM);
setup_class_hard_regs ();
setup_available_class_regs ();
}
/* Define the following macro if allocation through malloc if
preferable. */
/*#define IRA_NO_OBSTACK*/
#ifndef IRA_NO_OBSTACK
/* Obstack used for storing all dynamic data (except bitmaps) of the
IRA. */
static struct obstack ira_obstack;
#endif
/* Obstack used for storing all bitmaps of the IRA. */
static struct bitmap_obstack ira_bitmap_obstack;
/* The function allocates memory of size LEN for IRA data. */
void *
ira_allocate (size_t len)
{
void *res;
#ifndef IRA_NO_OBSTACK
res = obstack_alloc (&ira_obstack, len);
#else
res = xmalloc (len);
#endif
return res;
}
/* The function free memory ADDR allocated for IRA data. */
void
ira_free (void *addr ATTRIBUTE_UNUSED)
{
#ifndef IRA_NO_OBSTACK
/* do nothing */
#else
free (addr);
#endif
}
/* The function allocates bitmap for IRA. */
bitmap
ira_allocate_bitmap (void)
{
return BITMAP_ALLOC (&ira_bitmap_obstack);
}
/* The function frees bitmap B allocated for IRA. */
void
ira_free_bitmap (bitmap b ATTRIBUTE_UNUSED)
{
/* do nothing */
}
/* The function allocates regset for IRA. */
regset
ira_allocate_regset (void)
{
return ALLOC_REG_SET (&ira_bitmap_obstack);
}
/* The function frees regset R allocated for IRA. */
void
ira_free_regset (regset r ATTRIBUTE_UNUSED)
{
/* do nothing */
}
/* The function returns nonzero if hard registers starting with
HARD_REGNO and containing value of MODE are not in set
HARD_REGSET. */
int
hard_reg_not_in_set_p (int hard_regno, enum machine_mode mode,
HARD_REG_SET hard_regset)
{
int i;
ira_assert (hard_regno >= 0);
for (i = hard_regno_nregs [hard_regno] [mode] - 1; i >= 0; i--)
if (TEST_HARD_REG_BIT (hard_regset, hard_regno + i))
return FALSE;
return TRUE;
}
/* The function outputs information about allocation of all pseudos
into file F. */
void
print_disposition (FILE *f)
{
int i, n, max_regno;
pseudo_t p;
basic_block bb;
fprintf (f, "Disposition:");
max_regno = max_reg_num ();
for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
for (p = regno_pseudo_map [i]; p != NULL; p = PSEUDO_NEXT_REGNO_PSEUDO (p))
{
if (n % 4 == 0)
fprintf (f, "\n");
n++;
fprintf (f, " %4d:r%-4d", PSEUDO_NUM (p), PSEUDO_REGNO (p));
if ((bb = PSEUDO_LOOP_TREE_NODE (p)->bb) != NULL)
fprintf (f, "b%-3d", bb->index);
else
fprintf (f, "l%-3d", PSEUDO_LOOP_TREE_NODE (p)->loop->num);
if (PSEUDO_HARD_REGNO (p) >= 0)
fprintf (f, " %3d", PSEUDO_HARD_REGNO (p));
else
fprintf (f, " mem");
}
fprintf (f, "\n");
}
/* The function outputs information about allocation of all pseudos
into stderr. */
void
debug_disposition (void)
{
print_disposition (stderr);
}
/* For each reg class, table listing all the classes contained in it
(excluding the class itself. Fixed registers are excluded from the
consideration). */
static enum reg_class alloc_reg_class_subclasses[N_REG_CLASSES][N_REG_CLASSES];
/* The function initializes the tables of subclasses of each reg
class. */
static void
setup_reg_subclasses (void)
{
int i, j;
for (i = 0; i < N_REG_CLASSES; i++)
for (j = 0; j < N_REG_CLASSES; j++)
alloc_reg_class_subclasses [i] [j] = LIM_REG_CLASSES;
for (i = 0; i < N_REG_CLASSES; i++)
{
if (i == (int) NO_REGS)
continue;
COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents [i]);
AND_COMPL_HARD_REG_SET (temp_hard_regset, fixed_reg_set);
GO_IF_HARD_REG_EQUAL (temp_hard_regset, zero_hard_reg_set, cont);
goto ok;
cont:
continue;
ok:
for (j = 0; j < N_REG_CLASSES; j++)
if (i != j)
{
enum reg_class *p;
GO_IF_HARD_REG_SUBSET (reg_class_contents [i],
reg_class_contents [j], subclass);
continue;
subclass:
p = &alloc_reg_class_subclasses [j] [0];
while (*p != LIM_REG_CLASSES) p++;
*p = (enum reg_class) i;
}
}
}
/* The value is size of the subsequent array. */
int reg_class_cover_size;
/* The array containing cover classes whose hard registers are used
for the allocation -- see also comments for macro
IRA_COVER_CLASSES. */
enum reg_class reg_class_cover [N_REG_CLASSES];
/* The function checks IRA_COVER_CLASSES and sets the two global
variables defined above. */
static void
setup_cover_classes (void)
{
int i, j;
enum reg_class cl;
static enum reg_class classes [] = IRA_COVER_CLASSES;
reg_class_cover_size = 0;
for (i = 0; (cl = classes [i]) != LIM_REG_CLASSES; i++)
{
for (j = 0; j < i; j++)
if (reg_classes_intersect_p (cl, classes [j]))
gcc_unreachable ();
COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents [cl]);
AND_COMPL_HARD_REG_SET (temp_hard_regset, fixed_reg_set);
GO_IF_HARD_REG_EQUAL (temp_hard_regset, zero_hard_reg_set, cont);
reg_class_cover [reg_class_cover_size++] = cl;
cont:
;
}
}
/* Map of register classes to corresponding cover class containing the
given class. If given class is not a subset of a cover class, we
translate it into the cheapest cover class. */
enum reg_class class_translate [N_REG_CLASSES];
/* The function sets up array CLASS_TRANSLATE. */
static void
setup_class_translate (void)
{
enum reg_class cl, cover_class, best_class, *cl_ptr;
enum machine_mode mode;
int i, cost, min_cost, best_cost;
for (cl = 0; cl < N_REG_CLASSES; cl++)
class_translate [cl] = NO_REGS;
for (i = 0; i < reg_class_cover_size; i++)
{
cover_class = reg_class_cover [i];
for (cl_ptr = &alloc_reg_class_subclasses [cover_class] [0];
(cl = *cl_ptr) != LIM_REG_CLASSES;
cl_ptr++)
{
if (class_translate [cl] == NO_REGS)
class_translate [cl] = cover_class;
#ifdef ENABLE_IRA_CHECKING
else
{
COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents [cl]);
AND_COMPL_HARD_REG_SET (temp_hard_regset, fixed_reg_set);
GO_IF_HARD_REG_SUBSET (temp_hard_regset, zero_hard_reg_set, ok);
gcc_unreachable ();
ok:
;
}
#endif
}
class_translate [cover_class] = cover_class;
}
/* For classes which are not fully covered by a cover class (in
other words covered by more one cover class), use the cheapest
cover class. */
for (cl = 0; cl < N_REG_CLASSES; cl++)
{
if (cl == NO_REGS || class_translate [cl] != NO_REGS)
continue;
best_class = NO_REGS;
best_cost = INT_MAX;
for (i = 0; i < reg_class_cover_size; i++)
{
cover_class = reg_class_cover [i];
COPY_HARD_REG_SET (temp_hard_regset,
reg_class_contents [cover_class]);
AND_HARD_REG_SET (temp_hard_regset, reg_class_contents [cl]);
GO_IF_HARD_REG_EQUAL (temp_hard_regset, zero_hard_reg_set,
no_intersection);
min_cost = INT_MAX;
for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
{
cost = (memory_move_cost [mode] [cl] [0]
+ memory_move_cost [mode] [cl] [1]);
if (min_cost > cost)
min_cost = cost;
}
if (best_class == NO_REGS || best_cost > min_cost)
{
best_class = cover_class;
best_cost = min_cost;
}
no_intersection:
;
}
class_translate [cl] = best_class;
}
}
/* The function outputs all cover classes and the translation map into
file F. */
static void
print_class_cover (FILE *f)
{
static const char *const reg_class_names[] = REG_CLASS_NAMES;
int i;
fprintf (f, "Class cover:\n");
for (i = 0; i < reg_class_cover_size; i++)
fprintf (f, " %s", reg_class_names [reg_class_cover [i]]);
fprintf (f, "\nClass translation:\n");
for (i = 0; i < N_REG_CLASSES; i++)
fprintf (f, " %s -> %s\n", reg_class_names [i],
reg_class_names [class_translate [i]]);
}
/* The function outputs all cover classes and the translation map into
stderr. */
void
debug_class_cover (void)
{
print_class_cover (stderr);
}
/* Function setting up different arrays concerning class subsets and
cover classes. */
static void
find_reg_class_closure (void)
{
setup_reg_subclasses ();
setup_cover_classes ();
setup_class_translate ();
}
/* Map: register class x machine mode -> number of hard registers of
given class needed to store value of given mode. If the number is
different, the size will be negative. */
int reg_class_nregs [N_REG_CLASSES] [MAX_MACHINE_MODE];
/* Maximal value of the previous array elements. */
int max_nregs;
/* Function forming REG_CLASS_NREGS map. */
static void
setup_reg_class_nregs (void)
{
int m;
enum reg_class cl;
max_nregs = -1;
for (cl = 0; cl < N_REG_CLASSES; cl++)
for (m = 0; m < MAX_MACHINE_MODE; m++)
{
reg_class_nregs [cl] [m] = CLASS_MAX_NREGS (cl, m);
if (max_nregs < reg_class_nregs [cl] [m])
max_nregs = reg_class_nregs [cl] [m];
}
}
/* Array whose values are hard regset of hard registers of given
register class whose HARD_REGNO_MODE_OK values are zero. */
HARD_REG_SET prohibited_class_mode_regs [N_REG_CLASSES] [NUM_MACHINE_MODES];
/* The function setting up PROHIBITED_CLASS_MODE_REGS. */
static void
setup_prohibited_class_mode_regs (void)
{
int i, j, k, hard_regno;
enum reg_class cl;
for (i = 0; i < reg_class_cover_size; i++)
{
cl = reg_class_cover [i];
for (j = 0; j < NUM_MACHINE_MODES; j++)
{
CLEAR_HARD_REG_SET (prohibited_class_mode_regs [cl] [j]);
for (k = class_hard_regs_num [cl] - 1; k >= 0; k--)
{
hard_regno = class_hard_regs [cl] [k];
if (! HARD_REGNO_MODE_OK (hard_regno, j))
SET_HARD_REG_BIT (prohibited_class_mode_regs [cl] [j],
hard_regno);
}
}
}
}
/* Hard regsets whose all bits are correspondingly zero or one. */
HARD_REG_SET zero_hard_reg_set;
HARD_REG_SET one_hard_reg_set;
/* Function called once during compiler work. It sets up different
arrays whose values don't depend on the compiled function. */
void
init_ira_once (void)
{
CLEAR_HARD_REG_SET (zero_hard_reg_set);
SET_HARD_REG_SET (one_hard_reg_set);
setup_inner_mode ();
setup_reg_mode_hard_regset ();
setup_class_subset_and_move_costs ();
setup_alloc_regs (flag_omit_frame_pointer != 0);
find_reg_class_closure ();
setup_reg_class_nregs ();
setup_prohibited_class_mode_regs ();
init_ira_costs_once ();
}
/* The function sets up ELIMINABLE_REGSET and REGS_EVER_LIVE. */
static void
setup_eliminable_regset (void)
{
int i;
#ifdef ELIMINABLE_REGS
static const struct {const int from, to; } eliminables[] = ELIMINABLE_REGS;
#endif
int need_fp
= (! flag_omit_frame_pointer
|| (current_function_calls_alloca && EXIT_IGNORE_STACK)
|| FRAME_POINTER_REQUIRED);
CLEAR_HARD_REG_SET (eliminable_regset);
/* Build the regset of all eliminable registers and show we can't
use those that we already know won't be eliminated. */
#ifdef ELIMINABLE_REGS
for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++)
{
bool cannot_elim
= (! CAN_ELIMINATE (eliminables [i].from, eliminables [i].to)
|| (eliminables [i].to == STACK_POINTER_REGNUM && need_fp));
if (! regs_asm_clobbered [eliminables [i].from])
{
if (! cannot_elim)
SET_HARD_REG_BIT (eliminable_regset, eliminables [i].from);
}
else if (cannot_elim)
error ("%s cannot be used in asm here",
reg_names [eliminables [i].from]);
else
regs_ever_live [eliminables [i].from] = 1;
}
#if FRAME_POINTER_REGNUM != HARD_FRAME_POINTER_REGNUM
if (! regs_asm_clobbered [HARD_FRAME_POINTER_REGNUM])
{
if (! need_fp)
SET_HARD_REG_BIT (eliminable_regset, HARD_FRAME_POINTER_REGNUM);
}
else if (need_fp)
error ("%s cannot be used in asm here",
reg_names [HARD_FRAME_POINTER_REGNUM]);
else
regs_ever_live [HARD_FRAME_POINTER_REGNUM] = 1;
#endif
#else
if (! regs_asm_clobbered [FRAME_POINTER_REGNUM])
{
if (! need_fp)
SET_HARD_REG_BIT (eliminable_regset, FRAME_POINTER_REGNUM);
}
else if (need_fp)
error ("%s cannot be used in asm here", reg_names [FRAME_POINTER_REGNUM]);
else
regs_ever_live [FRAME_POINTER_REGNUM] = 1;
#endif
}
/* The element value is nonzero if the corresponding regno value is
invariant. */
int *reg_equiv_invariant_p;
/* The element value is equiv constant or NULL_RTX. */
rtx *reg_equiv_const;
/* The function sets up the two array declaraed above. */
static void
find_reg_equiv_invariant_const (void)
{
int i, invariant_p;
rtx list, insn, note, constant, x;
for (i = FIRST_PSEUDO_REGISTER; i < reg_equiv_init_size; i++)
{
constant = NULL_RTX;
invariant_p = FALSE;
for (list = reg_equiv_init [i]; list != NULL_RTX; list = XEXP (list, 1))
{
insn = XEXP (list, 0);
note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
if (note == NULL_RTX)
continue;
x = XEXP (note, 0);
if (! function_invariant_p (x)
|| ! flag_pic
/* A function invariant is often CONSTANT_P but may
include a register. We promise to only pass CONSTANT_P
objects to LEGITIMATE_PIC_OPERAND_P. */
|| (CONSTANT_P (x) && LEGITIMATE_PIC_OPERAND_P (x)))
{
/* It can happen that a REG_EQUIV note contains a MEM that
is not a legitimate memory operand. As later stages of
reload assume that all addresses found in the
reg_equiv_* arrays were originally legitimate, we
ignore such REG_EQUIV notes. */
if (memory_operand (x, VOIDmode))
continue;
else if (function_invariant_p (x))
{
if (GET_CODE (x) == PLUS
|| x == frame_pointer_rtx || x == arg_pointer_rtx)
invariant_p = TRUE;
else
constant = x;
}
}
}
reg_equiv_invariant_p [i] = invariant_p;
reg_equiv_const [i] = constant;
}
}
/* The function sets up REG_RENUMBER and CALLER_SAVE_NEEDED used by
reload from the allocation found by IRA. */
static void
setup_reg_renumber (void)
{
int i, hard_regno;
pseudo_t p;
for (i = 0; i < pseudos_num; i++)
{
p = pseudos [i];
if (PSEUDO_REGNO (p) < 0)
/* It is a cap. */
continue;
hard_regno = PSEUDO_HARD_REGNO (p);
reg_renumber [REGNO (PSEUDO_REG (p))]
= (hard_regno < 0 ? -1 : hard_regno);
if (hard_regno >= 0 && PSEUDO_CALLS_CROSSED_NUM (p) != 0
&& ! hard_reg_not_in_set_p (hard_regno, PSEUDO_MODE (p),
call_used_reg_set))
{
ira_assert (flag_caller_saves);
caller_save_needed = 1;
}
}
}
/* The function calculates cost of the found register allocation. */
static void
calculate_allocation_cost (void)
{
int i, hard_regno, cost;
pseudo_t p;
overall_cost = reg_cost = mem_cost = 0;
for (i = 0; i < pseudos_num; i++)
{
p = pseudos [i];
hard_regno = PSEUDO_HARD_REGNO (p) = reg_renumber [PSEUDO_REGNO (p)];
PSEUDO_ASSIGNED_P (p) = TRUE;
if (hard_regno < 0)
{
cost = PSEUDO_MEMORY_COST (p);
mem_cost += cost;
}
else
{
cost = (PSEUDO_HARD_REG_COSTS (p)
[class_hard_reg_index
[PSEUDO_COVER_CLASS (p)] [hard_regno]]);
reg_cost += cost;
}
overall_cost += cost;
}
if (ira_dump_file != NULL)
{
fprintf (ira_dump_file,
"+++Costs: overall %d, reg %d, mem %d, ld %d, st %d, move %d\n",
overall_cost, reg_cost, mem_cost,
load_cost, store_cost, shuffle_cost);
fprintf (ira_dump_file, "+++ move loops %d, new jumps %d\n",
move_loops_num, additional_jumps_num);
}
}
#ifdef ENABLE_IRA_CHECKING
/* The function checks correctness of the allocation. */
static void
check_allocation (void)
{
pseudo_t p, conflict_p, *pseudo_vec;
int i, hard_regno, conflict_hard_regno, j, nregs, conflict_nregs;
for (i = 0; i < pseudos_num; i++)
{
p = pseudos [i];
if (PSEUDO_REGNO (p) < 0 || (hard_regno = PSEUDO_HARD_REGNO (p)) < 0)
continue;
nregs = hard_regno_nregs [hard_regno] [PSEUDO_MODE (p)];
pseudo_vec = PSEUDO_CONFLICT_PSEUDO_VEC (p);
for (j = 0; (conflict_p = pseudo_vec [j]) != NULL; j++)
if ((conflict_hard_regno = PSEUDO_HARD_REGNO (conflict_p)) >= 0)
{
conflict_nregs
= (hard_regno_nregs
[conflict_hard_regno] [PSEUDO_MODE (conflict_p)]);
if ((conflict_hard_regno <= hard_regno
&& hard_regno < conflict_hard_regno + conflict_nregs)
|| (hard_regno <= conflict_hard_regno
&& conflict_hard_regno < hard_regno + nregs))
{
fprintf (stderr, "bad allocation for %d and %d\n",
PSEUDO_REGNO (p), PSEUDO_REGNO (conflict_p));
gcc_unreachable ();
}
}
}
}
#endif
/* The function fixes values of array REG_EQUIV_INIT after live range
splitting done by IRA. */
static void
fix_reg_equiv_init (void)
{
int max_regno = max_reg_num ();
int i, new_regno;
rtx x, prev, next, insn, set;
if (reg_equiv_init_size < max_regno)
{
reg_equiv_init = ggc_realloc (reg_equiv_init, max_regno * sizeof (rtx));
while (reg_equiv_init_size < max_regno)
reg_equiv_init [reg_equiv_init_size++] = NULL_RTX;
for (i = FIRST_PSEUDO_REGISTER; i < reg_equiv_init_size; i++)
for (prev = NULL_RTX, x = reg_equiv_init [i]; x != NULL_RTX; x = next)
{
next = XEXP (x, 1);
insn = XEXP (x, 0);
set = single_set (insn);
ira_assert (set != NULL_RTX
&& (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))));
if (REG_P (SET_DEST (set))
&& ((int) REGNO (SET_DEST (set)) == i
|| (int) ORIGINAL_REGNO (SET_DEST (set)) == i))
new_regno = REGNO (SET_DEST (set));
else if (REG_P (SET_SRC (set))
&& ((int) REGNO (SET_SRC (set)) == i
|| (int) ORIGINAL_REGNO (SET_SRC (set)) == i))
new_regno = REGNO (SET_SRC (set));
else
gcc_unreachable ();
if (new_regno == i)
prev = x;
else
{
if (prev == NULL_RTX)
reg_equiv_init [i] = next;
else
XEXP (prev, 1) = next;
XEXP (x, 1) = reg_equiv_init [new_regno];
reg_equiv_init [new_regno] = x;
}
}
}
}
#ifdef ENABLE_IRA_CHECKING
/* The function prints redundant memory memory copies. */
static void
print_redundant_copies (void)
{
int i, hard_regno;
pseudo_t p;
struct pseudo_copy *cp, *next_cp;
for (i = 0; i < pseudos_num; i++)
{
p = pseudos [i];
if (PSEUDO_REGNO (p) < 0)
/* It is a cap. */
continue;
hard_regno = PSEUDO_HARD_REGNO (p);
if (hard_regno >= 0)
continue;
for (cp = PSEUDO_COPIES (p); cp != NULL; cp = next_cp)
if (cp->first == p)
next_cp = cp->next_first_pseudo_copy;
else
{
next_cp = cp->next_second_pseudo_copy;
if (ira_dump_file != NULL && cp->move_insn != NULL_RTX
&& PSEUDO_HARD_REGNO (cp->first) == hard_regno)
fprintf (ira_dump_file, "move %d(freq %d):%d\n",
INSN_UID (cp->move_insn), cp->freq, hard_regno);
}
}
}
#endif
/* This is the main entry of IRA. */
void
ira (FILE *f)
{
int overall_cost_before;
int rebuild_p;
ira_dump_file = f;
no_new_pseudos = 0;
rebuild_p = update_equiv_regs ();
#ifndef IRA_NO_OBSTACK
gcc_obstack_init (&ira_obstack);
#endif
bitmap_obstack_initialize (&ira_bitmap_obstack);
reg_equiv_invariant_p = ira_allocate (max_reg_num () * sizeof (int));
memset (reg_equiv_invariant_p, 0, max_reg_num () * sizeof (int));
reg_equiv_const = ira_allocate (max_reg_num () * sizeof (rtx));
memset (reg_equiv_const, 0, max_reg_num () * sizeof (rtx));
find_reg_equiv_invariant_const ();
if (rebuild_p)
{
rebuild_jump_labels (get_insns ());
purge_all_dead_edges ();
delete_unreachable_blocks ();
}
setup_eliminable_regset ();
overall_cost = reg_cost = mem_cost = 0;
load_cost = store_cost = shuffle_cost = 0;
move_loops_num = additional_jumps_num = 0;
ira_build (flag_ira_algorithm != IRA_ALGORITHM_CB
&& flag_ira_algorithm != IRA_ALGORITHM_PRIORITY);
ira_color ();
ira_emit ();
max_regno = max_reg_num ();
/* Allocate the reg_renumber array. */
allocate_reg_info (max_regno, FALSE, TRUE);
setup_reg_renumber ();
no_new_pseudos = 1;
if (flag_ira_algorithm == IRA_ALGORITHM_REGIONAL)
{
/* Even if new registers are not created rebuild IRA internal
representation to use correct regno pseudo map. */
ira_destroy ();
ira_build (FALSE);
}
calculate_allocation_cost ();
#ifdef ENABLE_IRA_CHECKING
check_allocation ();
#endif
max_regno = max_reg_num ();
delete_trivially_dead_insns (get_insns (), max_regno);
/* The allocator makes live register information inaccurate. */
life_analysis (PROP_DEATH_NOTES | PROP_LOG_LINKS | PROP_REG_INFO);
max_regno = max_reg_num ();
/* Determine if the current function is a leaf before running IRA
since this can impact optimizations done by the prologue and
epilogue thus changing register elimination offsets. */
current_function_is_leaf = leaf_function_p ();
/* Allocate the reg_renumber array. */
allocate_reg_info (max_regno, FALSE, TRUE);
/* And the reg_equiv_memory_loc array. */
VEC_safe_grow (rtx, gc, reg_equiv_memory_loc_vec, max_regno);
memset (VEC_address (rtx, reg_equiv_memory_loc_vec), 0,
sizeof (rtx) * max_regno);
reg_equiv_memory_loc = VEC_address (rtx, reg_equiv_memory_loc_vec);
allocate_initial_values (reg_equiv_memory_loc);
fix_reg_equiv_init ();
/* ??? We need it only because subsequent optimization like
post-reload needs it. */
regclass (get_insns (), max_regno);
#ifdef ENABLE_IRA_CHECKING
print_redundant_copies ();
#endif
overall_cost_before = overall_cost;
spilled_reg_stack_slots_num = 0;
spilled_reg_stack_slots
= ira_allocate (max_regno * sizeof (struct spilled_reg_stack_slot));
build_insn_chain (get_insns ());
reload_completed = ! reload (get_insns (), 1);
ira_free (spilled_reg_stack_slots);
if (ira_dump_file != NULL && overall_cost_before != overall_cost)
fprintf (ira_dump_file, "+++Overall after reload %d\n", overall_cost);
ira_destroy ();
cleanup_cfg (CLEANUP_EXPENSIVE);
ira_free (reg_equiv_invariant_p);
ira_free (reg_equiv_const);
bitmap_obstack_release (&ira_bitmap_obstack);
#ifndef IRA_NO_OBSTACK
obstack_free (&ira_obstack, NULL);
#endif
reload_completed = 1;
}
static bool
gate_ira (void)
{
return flag_ira != 0;
}
/* Run the integrated register allocator. */
static unsigned int
rest_of_handle_ira (void)
{
ira (dump_file);
return 0;
}
struct tree_opt_pass pass_ira =
{
"ira", /* name */
gate_ira, /* gate */
rest_of_handle_ira, /* execute */
NULL, /* sub */
NULL, /* next */
0, /* static_pass_number */
TV_IRA, /* tv_id */
0, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
TODO_dump_func |
TODO_ggc_collect, /* todo_flags_finish */
'y' /* letter */
};
/* Communication between the Integrated Register Allocator and rest of the compiler. Contributed by Vladimir Makarov. Copyright (C) 2006 Free Software Foundation, Inc. This file is part of GCC. GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. GCC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with GCC; see the file COPYING. If not, write to the Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */ /* If secondary reloads are the same for inputs and outputs, define those macros here. */ #ifdef SECONDARY_RELOAD_CLASS #define SECONDARY_INPUT_RELOAD_CLASS(CLASS, MODE, X) \ SECONDARY_RELOAD_CLASS (CLASS, MODE, X) #define SECONDARY_OUTPUT_RELOAD_CLASS(CLASS, MODE, X) \ SECONDARY_RELOAD_CLASS (CLASS, MODE, X) #endif /* If either macro is defined, show that we need secondary reloads. */ #if defined(SECONDARY_INPUT_RELOAD_CLASS) || defined(SECONDARY_OUTPUT_RELOAD_CLASS) #define HAVE_SECONDARY_RELOADS #endif /* If MEMORY_MOVE_COST isn't defined, give it a default here. */ #ifndef MEMORY_MOVE_COST #ifdef HAVE_SECONDARY_RELOADS #define MEMORY_MOVE_COST(MODE,CLASS,IN) \ (4 + memory_move_secondary_cost ((MODE), (CLASS), (IN))) #else #define MEMORY_MOVE_COST(MODE,CLASS,IN) 4 #endif #endif extern GTY (()) struct varray_head_tag *reg_equiv_memory_loc_varray; extern rtx *reg_equiv_constant; extern rtx *reg_equiv_memory_loc; extern rtx *reg_equiv_address; extern rtx *reg_equiv_mem; extern void init_ira_once (void); extern rtx ira_eliminate_regs (rtx, enum machine_mode); extern void ira (FILE *); extern void mark_allocation_change (int); extern void try_to_migrate (int, HARD_REG_SET, void *, HARD_REG_SET *); extern void retry_ira_color (int, HARD_REG_SET); extern rtx reuse_stack_slot (int, unsigned int, unsigned int); extern void mark_new_stack_slot (rtx, int, unsigned int);
/* Building pseudos for IRA.
Contributed by Vladimir Makarov.
Copyright (C) 2006 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING. If not, write to the Free
Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA. */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "rtl.h"
#include "tm_p.h"
#include "target.h"
#include "regs.h"
#include "flags.h"
#include "hard-reg-set.h"
#include "basic-block.h"
#include "insn-config.h"
#include "recog.h"
#include "toplev.h"
#include "params.h"
#include "df.h"
#include "output.h"
#include "reload.h"
#include "ira-int.h"
static void create_loop_tree_nodes (int);
static void finish_loop_tree_nodes (void);
static void add_loop_to_tree (struct loop *);
static void form_loop_tree (void);
static void initiate_pseudos (void);
static void check_pseudo_conflict_vec (pseudo_t, int);
static void add_pseudo_conflict (pseudo_t, pseudo_t);
static pseudo_t create_cap_pseudo (pseudo_t);
static void finish_pseudos (void);
static void initiate_copies (void);
static void finish_copies (void);
static void create_insn_pseudos (rtx, int);
static void create_bb_pseudos (struct ira_loop_tree_node *);
static void create_loop_pseudos (edge);
static void create_loop_tree_node_pseudos (struct ira_loop_tree_node *);
static void create_pseudos (void);
static void create_loop_tree_node_caps (struct ira_loop_tree_node *);
/* All natural loops. */
struct loops ira_loops;
/* The root of the loop tree corresponding to the all function. */
struct ira_loop_tree_node *ira_loop_tree_root;
/* All basic block data are referred through the following array. We
can not use member `aux' for this because it is used for insertion
of insns on edges. */
struct ira_loop_tree_node *ira_bb_nodes;
/* All loop data are referred through the following array. */
struct ira_loop_tree_node *ira_loop_nodes;
/* Map regno -> pseudo for the current loop tree node. */
pseudo_t *regno_pseudo_map;
/* Array of references to all pseudos and its size. The order number
of the pseudo corresponds to the index in the array. */
pseudo_t *pseudos;
int pseudos_num;
/* Array of references to copies and its size. The order number of
the copy corresponds to the index in the array. */
copy_t *copies;
int copies_num;
/* Data flow data used for IRA data flow analysis. */
struct df *build_df;
/* LAST_BASIC_BLOCK before generating additional insns because of live
range splitting. Emitting insns on a critical edge creates a new
basic block. */
static int last_basic_block_before_change;
/* The following function creates the loop tree nodes. If LOOPS_P is
zero, the nodes corresponding to the loops (except the root which
corresponds the all function) will be not created (it will be done
only for basic blocks). */
static void
create_loop_tree_nodes (int loops_p)
{
unsigned int i, j;
int max_regno, skip_p;
edge_iterator ei;
edge e;
VEC (edge, heap) *edges;
loop_p loop;
ira_bb_nodes
= ira_allocate (sizeof (struct ira_loop_tree_node) * last_basic_block);
last_basic_block_before_change = last_basic_block;
for (i = 0; i < (unsigned int) last_basic_block; i++)
{
ira_bb_nodes [i].regno_pseudo_map = NULL;
ira_bb_nodes [i].mentioned_pseudos = NULL;
ira_bb_nodes [i].modified_regnos = NULL;
ira_bb_nodes [i].border_pseudos = NULL;
ira_bb_nodes [i].local_copies = NULL;
}
ira_loop_nodes = ira_allocate (sizeof (struct ira_loop_tree_node)
* VEC_length (loop_p, ira_loops.larray));
max_regno = max_reg_num ();
for (i = 0; VEC_iterate (loop_p, ira_loops.larray, i, loop); i++)
{
if (loop != ira_loops.tree_root)
{
ira_loop_nodes [i].regno_pseudo_map = NULL;
if (! loops_p)
continue;
skip_p = FALSE;
FOR_EACH_EDGE (e, ei, loop->header->preds)
if (e->src != loop->latch
&& (e->flags & EDGE_ABNORMAL) && EDGE_CRITICAL_P (e))
{
skip_p = TRUE;
break;
}
if (skip_p)
continue;
edges = get_loop_exit_edges (loop);
for (j = 0; VEC_iterate (edge, edges, j, e); j++)
if ((e->flags & EDGE_ABNORMAL) && EDGE_CRITICAL_P (e))
{
skip_p = TRUE;
break;
}
VEC_free (edge, heap, edges);
if (skip_p)
continue;
}
ira_loop_nodes [i].regno_pseudo_map
= ira_allocate (sizeof (pseudo_t) * max_regno);
memset (ira_loop_nodes [i].regno_pseudo_map, 0,
sizeof (pseudo_t) * max_regno);
ira_loop_nodes [i].mentioned_pseudos = ira_allocate_bitmap ();
ira_loop_nodes [i].modified_regnos = ira_allocate_bitmap ();
ira_loop_nodes [i].border_pseudos = ira_allocate_bitmap ();
ira_loop_nodes [i].local_copies = ira_allocate_bitmap ();
}
}
/* The function frees the loop tree nodes. */
static void
finish_loop_tree_nodes (void)
{
unsigned int i;
loop_p loop;
for (i = 0; VEC_iterate (loop_p, ira_loops.larray, i, loop); i++)
if (ira_loop_nodes [i].regno_pseudo_map != NULL)
{
ira_free_bitmap (ira_loop_nodes [i].local_copies);
ira_free_bitmap (ira_loop_nodes [i].border_pseudos);
ira_free_bitmap (ira_loop_nodes [i].modified_regnos);
ira_free_bitmap (ira_loop_nodes [i].mentioned_pseudos);
ira_free (ira_loop_nodes [i].regno_pseudo_map);
}
ira_free (ira_loop_nodes);
for (i = 0; i < (unsigned int) last_basic_block_before_change; i++)
{
if (ira_bb_nodes [i].local_copies != NULL)
ira_free_bitmap (ira_bb_nodes [i].local_copies);
if (ira_bb_nodes [i].border_pseudos != NULL)
ira_free_bitmap (ira_bb_nodes [i].border_pseudos);
if (ira_bb_nodes [i].modified_regnos != NULL)
ira_free_bitmap (ira_bb_nodes [i].modified_regnos);
if (ira_bb_nodes [i].mentioned_pseudos != NULL)
ira_free_bitmap (ira_bb_nodes [i].mentioned_pseudos);
if (ira_bb_nodes [i].regno_pseudo_map != NULL)
ira_free (ira_bb_nodes [i].regno_pseudo_map);
}
ira_free (ira_bb_nodes);
}
/* The following recursive functions adds loop to the loop tree
hierarchy. The loop is added only once. */
static void
add_loop_to_tree (struct loop *loop)
{
struct loop *father;
struct ira_loop_tree_node *loop_node, *father_node;
/* Can not use loop node access macros because of potential checking
and because the nodes are not initialized yet. */
if (loop->outer != NULL)
add_loop_to_tree (loop->outer);
if (ira_loop_nodes [loop->num].regno_pseudo_map != NULL
&& ira_loop_nodes [loop->num].inner == NULL)
{
/* We have not added loop node to the tree yet. */
loop_node = &ira_loop_nodes [loop->num];
loop_node->loop = loop;
loop_node->bb = NULL;
for (father = loop->outer; father != NULL; father = father->outer)
if (ira_loop_nodes [father->num].regno_pseudo_map != NULL)
break;
if (father == NULL)
{
loop_node->next = NULL;
loop_node->father = NULL;
}
else
{
father_node = &ira_loop_nodes [father->num];
loop_node->next = father_node->inner;
father_node->inner = loop_node;
loop_node->father = father_node;
}
}
}
/* The following function creates the loop tree. The algorithm is
designed to provide correct order of loops (by the last loop BB)
and basic blocks in chain formed by next. */
static void
form_loop_tree (void)
{
unsigned int i;
basic_block bb;
struct loop *father;
struct ira_loop_tree_node *bb_node, *loop_node;
loop_p loop;
/* Can not use loop/bb node access macros because of potential
checking and the nodes are not initialized yet. */
for (i = 0; VEC_iterate (loop_p, ira_loops.larray, i, loop); i++)
if (ira_loop_nodes [i].regno_pseudo_map != NULL)
ira_loop_nodes [i].inner = NULL;
FOR_EACH_BB_REVERSE (bb)
{
bb_node = &ira_bb_nodes [bb->index];
bb_node->bb = bb;
bb_node->loop = NULL;
bb_node->inner = NULL;
bb_node->next = NULL;
for (father = bb->loop_father; father != NULL; father = father->outer)
if (ira_loop_nodes [father->num].regno_pseudo_map != NULL)
break;
add_loop_to_tree (father);
loop_node = &ira_loop_nodes [father->num];
bb_node->next = loop_node->inner;
bb_node->father = loop_node;
loop_node->inner = bb_node;
}
ira_loop_tree_root = IRA_LOOP_NODE_BY_INDEX (ira_loops.tree_root->num);
ira_assert (ira_loop_tree_root->regno_pseudo_map != NULL);
}
/* Varray containing references to all created pseudos. It is a
container of array pseudos. */
static varray_type pseudo_varray;
/* The function initializes data concerning pseudos. */
static void
initiate_pseudos (void)
{
VARRAY_GENERIC_PTR_NOGC_INIT (pseudo_varray, max_reg_num () * 2, "pseudos");
pseudos = NULL;
pseudos_num = 0;
regno_pseudo_map = ira_allocate (max_reg_num () * sizeof (pseudo_t));
memset (regno_pseudo_map, 0, max_reg_num () * sizeof (pseudo_t));
}
/* The function creates and returns pseudo corresponding to REGNO in
LOOP_TREE_NODE. */
pseudo_t
create_pseudo (int regno, struct ira_loop_tree_node *loop_tree_node)
{
pseudo_t p;
p = ira_allocate (sizeof (struct pseudo));
PSEUDO_REGNO (p) = regno;
PSEUDO_LOOP_TREE_NODE (p) = loop_tree_node;
if (regno >= 0)
{
PSEUDO_NEXT_REGNO_PSEUDO (p) = regno_pseudo_map [regno];
regno_pseudo_map [regno] = p;
if (loop_tree_node->regno_pseudo_map [regno] == NULL)
/* Remember that we can create temporary pseudos to break
cycles in register shuffle. */
loop_tree_node->regno_pseudo_map [regno] = p;
}
PSEUDO_CAP (p) = NULL;
PSEUDO_CAP_MEMBER (p) = NULL;
PSEUDO_NUM (p) = pseudos_num;
PSEUDO_CONFLICT_PSEUDO_VEC (p) = NULL;
PSEUDO_CONFLICT_PSEUDO_VEC_ACTIVE_SIZE (p) = 0;
CLEAR_HARD_REG_SET (PSEUDO_CONFLICT_HARD_REGS (p));
PSEUDO_FREQ (p) = 1;
PSEUDO_HARD_REGNO (p) = -1;
PSEUDO_CALL_FREQ (p) = 0;
PSEUDO_CALLS_CROSSED_NUM (p) = 0;
PSEUDO_CALLS_CROSSED (p) = NULL;
/* ??? Too conservative. */
if (regno >= 0 && REG_N_CALLS_CROSSED (regno) != 0)
PSEUDO_CALLS_CROSSED (p)
= ira_allocate (sizeof (rtx) * REG_N_CALLS_CROSSED (regno));
#ifdef STACK_REGS
PSEUDO_NO_STACK_REG_P (p) = FALSE;
#endif
PSEUDO_IN_GRAPH_P (p) = FALSE;
PSEUDO_ASSIGNED_P (p) = FALSE;
PSEUDO_MAY_BE_SPILLED_P (p) = FALSE;
PSEUDO_MODE (p) = (regno < 0 ? VOIDmode : PSEUDO_REGNO_MODE (regno));
PSEUDO_COPIES (p) = NULL;
PSEUDO_HARD_REG_COSTS (p) = NULL;
PSEUDO_CONFLICT_HARD_REG_COSTS (p) = NULL;
PSEUDO_CURR_HARD_REG_COSTS (p) = NULL;
PSEUDO_CURR_CONFLICT_HARD_REG_COSTS (p) = NULL;
PSEUDO_LEFT_CONFLICTS_NUM (p) = -1;
PSEUDO_COVER_CLASS (p) = NO_REGS;
PSEUDO_COVER_CLASS_COST (p) = 0;
PSEUDO_MEMORY_COST (p) = 0;
PSEUDO_NEXT_BUCKET_PSEUDO (p) = NULL;
PSEUDO_PREV_BUCKET_PSEUDO (p) = NULL;
VARRAY_PUSH_GENERIC_PTR (pseudo_varray, p);
pseudos = (pseudo_t *) &VARRAY_GENERIC_PTR (pseudo_varray, 0);
pseudos_num = VARRAY_ACTIVE_SIZE (pseudo_varray);
return p;
}
/* The function allocates conflict vector of P for NUM pseudos. */
void
allocate_pseudo_conflicts (pseudo_t p, int num)
{
ira_assert (PSEUDO_CONFLICT_PSEUDO_VEC (p) == NULL);
PSEUDO_CONFLICT_PSEUDO_VEC (p)
= ira_allocate (sizeof (pseudo_t) * (num + 1));
PSEUDO_CONFLICT_PSEUDO_VEC (p) [0] = NULL;
PSEUDO_CONFLICT_PSEUDO_VEC_ACTIVE_SIZE (p) = 0;
PSEUDO_CONFLICT_PSEUDO_VEC_SIZE (p) = num;
}
/* The function checks that conflict vector of P has enough space to
contain NUM pseudo references. If the space is not enough, the
function expands the conflict vector. */
static void
check_pseudo_conflict_vec (pseudo_t p, int num)
{
int size;
pseudo_t *vec;
ira_assert (PSEUDO_CONFLICT_PSEUDO_VEC (p) != NULL);
if (PSEUDO_CONFLICT_PSEUDO_VEC_SIZE (p) >= num)
return;
size = 3 * num / 2 + 1;
vec = ira_allocate (sizeof (pseudo_t) * (size + 1));
memcpy (vec, PSEUDO_CONFLICT_PSEUDO_VEC (p),
sizeof (pseudo_t)
* (PSEUDO_CONFLICT_PSEUDO_VEC_ACTIVE_SIZE (p) + 1));
ira_free (PSEUDO_CONFLICT_PSEUDO_VEC (p));
PSEUDO_CONFLICT_PSEUDO_VEC (p) = vec;
PSEUDO_CONFLICT_PSEUDO_VEC_SIZE (p) = size;
}
/* The function adds P1 to conflict vector of P2 and vise versa. */
static void
add_pseudo_conflict (pseudo_t p1, pseudo_t p2)
{
int size1, size2;
pseudo_t *vec1, *vec2;
size1 = PSEUDO_CONFLICT_PSEUDO_VEC_ACTIVE_SIZE (p1);
size2 = PSEUDO_CONFLICT_PSEUDO_VEC_ACTIVE_SIZE (p2);
check_pseudo_conflict_vec (p1, size1 + 1);
check_pseudo_conflict_vec (p2, size2 + 1);
vec1 = PSEUDO_CONFLICT_PSEUDO_VEC (p1);
vec2 = PSEUDO_CONFLICT_PSEUDO_VEC (p2);
vec1 [size1] = p2;
vec2 [size2] = p1;
vec1 [size1 + 1] = NULL;
vec2 [size2 + 1] = NULL;
PSEUDO_CONFLICT_PSEUDO_VEC_ACTIVE_SIZE (p1)++;
PSEUDO_CONFLICT_PSEUDO_VEC_ACTIVE_SIZE (p2)++;
}
/* This recursive function outputs pseudo P and if it is cap the
function outputs members. */
void
print_expanded_pseudo (pseudo_t p)
{
basic_block bb;
fprintf (ira_dump_file, " p%d(r%d", PSEUDO_NUM (p), PSEUDO_REGNO (p));
if ((bb = PSEUDO_LOOP_TREE_NODE (p)->bb) != NULL)
fprintf (ira_dump_file, ",b%d", bb->index);
else
fprintf (ira_dump_file, ",l%d", PSEUDO_LOOP_TREE_NODE (p)->loop->num);
if (PSEUDO_REGNO (p) < 0)
{
fprintf (ira_dump_file, ":");
print_expanded_pseudo (PSEUDO_CAP_MEMBER (p));
}
fprintf (ira_dump_file, ")");
}
/* The function creates and returns cap representing pseudo P in the
father loop. */
static pseudo_t
create_cap_pseudo (pseudo_t p)
{
int i, regno, hard_regs_num, conflicts_num;
int *reg_costs, *conflict_reg_costs;
basic_block bb;
pseudo_t cap, conflict_pseudo, conflict_father_pseudo;
pseudo_t *pseudo_vec;
struct ira_loop_tree_node *father;
father = PSEUDO_LOOP_TREE_NODE (p)->father;
cap = create_pseudo (-1, father);
/* We just need to set a mode giving the same size. */
PSEUDO_MODE (cap) = PSEUDO_MODE (p);
PSEUDO_COVER_CLASS (cap) = PSEUDO_COVER_CLASS (p);
PSEUDO_AVAILABLE_REGS_NUM (cap) = PSEUDO_AVAILABLE_REGS_NUM (p);
PSEUDO_CAP_MEMBER (cap) = p;
hard_regs_num = class_hard_regs_num [PSEUDO_COVER_CLASS (p)];
PSEUDO_HARD_REG_COSTS (cap) = reg_costs
= ira_allocate (hard_regs_num * sizeof (int));
memcpy (reg_costs, PSEUDO_HARD_REG_COSTS (p), hard_regs_num * sizeof (int));
PSEUDO_CONFLICT_HARD_REG_COSTS (cap) = conflict_reg_costs
= ira_allocate (hard_regs_num * sizeof (int));
memcpy (conflict_reg_costs, PSEUDO_CONFLICT_HARD_REG_COSTS (p),
hard_regs_num * sizeof (int));
PSEUDO_CURR_HARD_REG_COSTS (cap)
= ira_allocate (hard_regs_num * sizeof (int));
PSEUDO_CURR_CONFLICT_HARD_REG_COSTS (cap)
= ira_allocate (hard_regs_num * sizeof (int));
/* ??? costs, call_p etc. */
bitmap_set_bit (father->mentioned_pseudos, PSEUDO_NUM (cap));
PSEUDO_CAP (p) = cap;
PSEUDO_AVAILABLE_REGS_NUM (cap) = PSEUDO_AVAILABLE_REGS_NUM (p);
PSEUDO_COVER_CLASS_COST (cap) = PSEUDO_COVER_CLASS_COST (p);
PSEUDO_MEMORY_COST (cap) = PSEUDO_MEMORY_COST (p);
PSEUDO_FREQ (cap) = PSEUDO_FREQ (p);
PSEUDO_CALL_FREQ (cap) = PSEUDO_CALL_FREQ (p);
IOR_HARD_REG_SET (PSEUDO_CONFLICT_HARD_REGS (cap),
PSEUDO_CONFLICT_HARD_REGS (p));
PSEUDO_CALLS_CROSSED_NUM (cap) = PSEUDO_CALLS_CROSSED_NUM (p);
PSEUDO_CALLS_CROSSED (cap)
= ira_allocate (sizeof (rtx) * PSEUDO_CALLS_CROSSED_NUM (p));
memcpy (PSEUDO_CALLS_CROSSED (cap), PSEUDO_CALLS_CROSSED (p),
sizeof (rtx) * PSEUDO_CALLS_CROSSED_NUM (p));
#ifdef STACK_REGS
PSEUDO_NO_STACK_REG_P (cap) = PSEUDO_NO_STACK_REG_P (p);
#endif
pseudo_vec = PSEUDO_CONFLICT_PSEUDO_VEC (p);
for (conflicts_num = i = 0; (conflict_pseudo = pseudo_vec [i]) != NULL; i++)
conflicts_num++;
allocate_pseudo_conflicts (cap, conflicts_num);
for (conflicts_num = i = 0; (conflict_pseudo = pseudo_vec [i]) != NULL; i++)
{
regno = PSEUDO_REGNO (conflict_pseudo);
if (regno < 0)
conflict_father_pseudo = PSEUDO_CAP (conflict_pseudo);
else
{
conflict_father_pseudo = father->regno_pseudo_map [regno];
if (conflict_father_pseudo == NULL)
conflict_father_pseudo = PSEUDO_CAP (conflict_pseudo);
}
if (conflict_father_pseudo != NULL)
add_pseudo_conflict (cap, conflict_father_pseudo);
}
if (ira_dump_file != NULL)
{
fprintf (ira_dump_file, " Creating cap ");
print_expanded_pseudo (cap);
fprintf (ira_dump_file, "\n Conflicts:");
pseudo_vec = PSEUDO_CONFLICT_PSEUDO_VEC (cap);
for (i = 0; (conflict_pseudo = pseudo_vec [i]) != NULL; i++)
{
fprintf (ira_dump_file, " p%d(r%d,", PSEUDO_NUM (conflict_pseudo),
PSEUDO_REGNO (conflict_pseudo));
if ((bb = PSEUDO_LOOP_TREE_NODE (conflict_pseudo)->bb) != NULL)
fprintf (ira_dump_file, "b%d)", bb->index);
else
fprintf (ira_dump_file, "l%d)",
PSEUDO_LOOP_TREE_NODE (conflict_pseudo)->loop->num);
}
fprintf (ira_dump_file, "\n");
}
return cap;
}
/* The function frees memory allocated for all pseudos. */
static void
finish_pseudos (void)
{
int i;
pseudo_t p;
for (i = 0; i < pseudos_num; i++)
{
p = pseudos [i];
if (PSEUDO_CONFLICT_PSEUDO_VEC (p) != NULL)
ira_free (PSEUDO_CONFLICT_PSEUDO_VEC (p));
if (PSEUDO_HARD_REG_COSTS (p) != NULL)
ira_free (PSEUDO_HARD_REG_COSTS (p));
if (PSEUDO_CONFLICT_HARD_REG_COSTS (p) != NULL)
ira_free (PSEUDO_CONFLICT_HARD_REG_COSTS (p));
if (PSEUDO_CURR_HARD_REG_COSTS (p) != NULL)
ira_free (PSEUDO_CURR_HARD_REG_COSTS (p));
if (PSEUDO_CURR_CONFLICT_HARD_REG_COSTS (p) != NULL)
ira_free (PSEUDO_CURR_CONFLICT_HARD_REG_COSTS (p));
if (PSEUDO_CALLS_CROSSED (p) != NULL)
ira_free (PSEUDO_CALLS_CROSSED (p));
ira_free (p);
}
ira_free (regno_pseudo_map);
VARRAY_FREE (pseudo_varray);
}
/* Varray containing references to all created copies. It is a
container of array copies. */
static varray_type copy_varray;
/* The function initializes data concerning pseudo copies. */
static void
initiate_copies (void)
{
VARRAY_GENERIC_PTR_NOGC_INIT (copy_varray, get_max_uid (), "copies");
copies = NULL;
copies_num = 0;
}
/* The function creates and returns copy of pseudos FIRST and SECOND
with frequency FREQ and move insn MOVE_INSN. */
copy_t
create_copy (pseudo_t first, pseudo_t second, int freq, rtx move_insn)
{
copy_t cp;
cp = ira_allocate (sizeof (struct pseudo_copy));
cp->num = copies_num;
cp->first = first;
cp->second = second;
cp->freq = freq;
cp->move_insn = move_insn;
VARRAY_PUSH_GENERIC_PTR (copy_varray, cp);
copies = (copy_t *) &VARRAY_GENERIC_PTR (copy_varray, 0);
copies_num = VARRAY_ACTIVE_SIZE (copy_varray);
return cp;
}
/* The function frees memory allocated for all copies. */
static void
finish_copies (void)
{
int i;
for (i = 0; i < copies_num; i++)
ira_free (copies [i]);
VARRAY_FREE (copy_varray);
}
/* The current loop tree node. */
struct ira_loop_tree_node *ira_curr_loop_tree_node;
/* The recursive function traverses loop tree node with root LOOP_NODE
calling non-null functions PREORDER_FUNC and POSTORDER_FUNC
correspondingly in preorder and postorder. The function sets up
IRA_CURR_LOOP_TREE_NODE. */
void
traverse_loop_tree (struct ira_loop_tree_node *loop_node,
void (*preorder_func) (struct ira_loop_tree_node *),
void (*postorder_func) (struct ira_loop_tree_node *))
{
struct ira_loop_tree_node *subloop_node;
if (loop_node->bb == NULL)
ira_curr_loop_tree_node = loop_node;
if (preorder_func != NULL)
(*preorder_func) (loop_node);
for (subloop_node = loop_node->inner;
subloop_node != NULL;
subloop_node = subloop_node->next)
{
traverse_loop_tree (subloop_node, preorder_func, postorder_func);
ira_curr_loop_tree_node = loop_node;
}
if (postorder_func != NULL)
(*postorder_func) (loop_node);
}
/* The current basic block. */
static basic_block curr_bb;
/* This recursive function creates pseudos corresponding to
pseudo-registers containing in X. Nonzero OUTPUT_P means that X is
lvalue. */
static void
create_insn_pseudos (rtx x, int output_p)
{
int i, j;
const char *fmt;
enum rtx_code code = GET_CODE (x);
if (code == REG)
{
int regno;
if ((regno = REGNO (x)) >= FIRST_PSEUDO_REGISTER)
{
pseudo_t p;
if ((p = ira_curr_loop_tree_node->regno_pseudo_map [regno]) == NULL)
p = create_pseudo (regno, ira_curr_loop_tree_node);
PSEUDO_FREQ (p) += REG_FREQ_FROM_BB (curr_bb);
bitmap_set_bit (ira_curr_loop_tree_node->mentioned_pseudos,
PSEUDO_NUM (p));
if (output_p)
bitmap_set_bit (ira_curr_loop_tree_node->modified_regnos, regno);
}
return;
}
else if (code == SET)
{
create_insn_pseudos (SET_DEST (x), TRUE);
create_insn_pseudos (SET_SRC (x), FALSE);
return;
}
else if (code == CLOBBER)
{
create_insn_pseudos (XEXP (x, 0), TRUE);
return;
}
else if (code == MEM)
{
create_insn_pseudos (XEXP (x, 0), FALSE);
return;
}
else if (code == PRE_DEC || code == POST_DEC || code == PRE_INC ||
code == POST_INC || code == POST_MODIFY || code == PRE_MODIFY)
{
create_insn_pseudos (XEXP (x, 0), TRUE);
create_insn_pseudos (XEXP (x, 0), FALSE);
return;
}
fmt = GET_RTX_FORMAT (code);
for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
{
if (fmt[i] == 'e')
create_insn_pseudos (XEXP (x, i), output_p);
else if (fmt[i] == 'E')
for (j = 0; j < XVECLEN (x, i); j++)
create_insn_pseudos (XVECEXP (x, i, j), output_p);
}
}
/* The function creates pseudos corresponding to pseudo-registers
living in basic blocks represented by the corresponding loop tree
node BB_NODE. */
static void
create_bb_pseudos (struct ira_loop_tree_node *bb_node)
{
basic_block bb;
rtx insn;
unsigned int i;
bitmap_iterator bi;
pseudo_t *map;
curr_bb = bb = bb_node->bb;
ira_assert (bb != NULL);
FOR_BB_INSNS (bb, insn)
if (INSN_P (insn))
create_insn_pseudos (PATTERN (insn), FALSE);
/* It might be a pseudo living through from one subloop to
another. */
map = ira_curr_loop_tree_node->regno_pseudo_map;
EXECUTE_IF_SET_IN_REG_SET (DF_UPWARD_LIVE_IN (build_df, bb),
FIRST_PSEUDO_REGISTER, i, bi)
if (map [i] == NULL)
create_pseudo (i, ira_curr_loop_tree_node);
}
/* The function creates pseudos corresponding to pseudo-registers
living on edge E (a loop enter or exit). It also finds pseudos
living on the loop border. */
static void
create_loop_pseudos (edge e)
{
unsigned int i;
bitmap live_in_regs, border_pseudos;
bitmap_iterator bi;
pseudo_t *map;
live_in_regs = DF_UPWARD_LIVE_IN (build_df, e->dest);
map = ira_curr_loop_tree_node->regno_pseudo_map;
border_pseudos = ira_curr_loop_tree_node->border_pseudos;
EXECUTE_IF_SET_IN_REG_SET (DF_UPWARD_LIVE_OUT (build_df, e->src),
FIRST_PSEUDO_REGISTER, i, bi)
if (bitmap_bit_p (live_in_regs, i))
{
if (map [i] == NULL)
create_pseudo (i, ira_curr_loop_tree_node);
bitmap_set_bit (border_pseudos, PSEUDO_NUM (map [i]));
}
}
/* The function creates pseudos corresponding to pseudo-registers
living in loop represented by the corresponding loop tree node
LOOP_NODE. */
static void
create_loop_tree_node_pseudos (struct ira_loop_tree_node *loop_node)
{
if (loop_node->bb != NULL)
create_bb_pseudos (loop_node);
else if (loop_node != ira_loop_tree_root)
{
int i;
edge_iterator ei;
edge e;
VEC (edge, heap) *edges;
FOR_EACH_EDGE (e, ei, loop_node->loop->header->preds)
if (e->src != loop_node->loop->latch)
create_loop_pseudos (e);
edges = get_loop_exit_edges (loop_node->loop);
for (i = 0; VEC_iterate (edge, edges, i, e); i++)
create_loop_pseudos (e);
VEC_free (edge, heap, edges);
}
}
/* The function creates pseudos corresponding to pseudo-registers in
the current function. The function traverses the loop tree for
this. */
static void
create_pseudos (void)
{
int i;
pseudo_t p, father_p;
struct ira_loop_tree_node *father;
traverse_loop_tree (ira_loop_tree_root, create_loop_tree_node_pseudos, NULL);
if (flag_ira_algorithm != IRA_ALGORITHM_REGIONAL)
return;
/* Propagate frequencies for regional register allocator. */
for (i = max_reg_num () - 1; i >= FIRST_PSEUDO_REGISTER; i--)
for (p = regno_pseudo_map [i]; p != NULL; p = PSEUDO_NEXT_REGNO_PSEUDO (p))
if ((father = PSEUDO_LOOP_TREE_NODE (p)->father) != NULL
&& (father_p = father->regno_pseudo_map [i]) != NULL)
PSEUDO_FREQ (father_p) += PSEUDO_FREQ (p);
}
/* Bitmap of pseudos living only inside the current loop. */
static bitmap local_pseudos_bitmap;
/* The function creates caps representing pseudos living only inside
the loop given by LOOP_NODE on higher loop level. */
static void
create_loop_tree_node_caps (struct ira_loop_tree_node *loop_node)
{
unsigned int j;
bitmap_iterator bi;
pseudo_t p, cap, another_p, father_p;
struct pseudo_copy *cp, *next_cp;
struct ira_loop_tree_node *father;
if (loop_node->bb != NULL || loop_node == ira_loop_tree_root)
return;
bitmap_and_compl (local_pseudos_bitmap, loop_node->mentioned_pseudos,
loop_node->border_pseudos);
EXECUTE_IF_SET_IN_BITMAP (local_pseudos_bitmap, 0, j, bi)
create_cap_pseudo (pseudos [j]);
father = loop_node->father;
EXECUTE_IF_SET_IN_BITMAP (local_pseudos_bitmap, 0, j, bi)
{
p = pseudos [j];
cap = PSEUDO_CAP (p);
for (cp = PSEUDO_COPIES (p); cp != NULL; cp = next_cp)
{
if (cp->first == p)
{
next_cp = cp->next_first_pseudo_copy;
another_p = cp->second;
}
else if (cp->second == p)
{
next_cp = cp->next_second_pseudo_copy;
another_p = cp->first;
}
else
gcc_unreachable ();
if ((father_p = PSEUDO_CAP (another_p)) == NULL)
father_p = father->regno_pseudo_map [PSEUDO_REGNO (another_p)];
if (father_p != NULL)
/* Upper level pseudo might be not existing because it is
not mentioned or lived on the border. It is just
living on bb start of the loop. */
add_pseudo_copy (cap, father_p, cp->freq, cp->move_insn);
}
}
}
/* This entry function creates internal representation for IRA
(pseudos, copies, loop tree nodes). If LOOP_P is zero the nodes
corresponding to the loops (except the root which corresponds the
all function) and correspondingly pseudos for the loops will be not
created (it will be done only for basic blocks). Such value is
used for Chaitin-Briggs and Chow's priority coloring. */
void
ira_build (int loop_p)
{
build_df = df_init (DF_HARD_REGS);
df_lr_add_problem (build_df, 0);
df_ri_add_problem (build_df, DF_RI_LIFE);
df_analyze (build_df);
initiate_pseudos ();
initiate_copies ();
ira_assert (current_loops == NULL);
flow_loops_find (&ira_loops);
current_loops = &ira_loops;
create_loop_tree_nodes (loop_p);
form_loop_tree ();
free_dominance_info (CDI_DOMINATORS);
create_pseudos ();
ira_costs ();
ira_build_conflicts ();
tune_pseudo_costs_and_cover_classes ();
if (flag_ira_algorithm == IRA_ALGORITHM_REGIONAL)
{
local_pseudos_bitmap = ira_allocate_bitmap ();
traverse_loop_tree (ira_loop_tree_root, NULL,
create_loop_tree_node_caps);
ira_free_bitmap (local_pseudos_bitmap);
}
}
/* The function releases data created by function ira_build. */
void
ira_destroy (void)
{
basic_block bb;
finish_loop_tree_nodes ();
ira_assert (current_loops == &ira_loops);
flow_loops_free (&ira_loops);
FOR_ALL_BB (bb)
bb->loop_father = NULL;
current_loops = NULL;
finish_copies ();
finish_pseudos ();
df_finish (build_df);
}
/* conflict costs for conflict hard regs ??? */
/* IRA allocation based on graph coloring.
Contributed by Vladimir Makarov.
Copyright (C) 2006 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING. If not, write to the Free
Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA. */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "rtl.h"
#include "tm_p.h"
#include "target.h"
#include "varray.h"
#include "regs.h"
#include "flags.h"
#include "sbitmap.h"
#include "bitmap.h"
#include "hard-reg-set.h"
#include "basic-block.h"
#include "expr.h"
#include "toplev.h"
#include "reload.h"
#include "params.h"
#include "df.h"
#include "ira-int.h"
/* We use optimistic colouring with optional biased colouring. */
static void update_copy_costs (pseudo_t, int);
static int assign_hard_reg (pseudo_t, int);
static void add_pseudo_to_bucket (pseudo_t, pseudo_t *);
static void add_pseudo_to_ordered_bucket (pseudo_t, pseudo_t *);
static void delete_pseudo_from_bucket (pseudo_t, pseudo_t *);
static void push_pseudo_to_stack (pseudo_t);
static void remove_pseudo_from_bucket_and_push (pseudo_t, int);
static void push_only_colorable (void);
static void push_pseudo_to_spill (pseudo_t);
static int loop_edge_freq (struct ira_loop_tree_node *, int, int);
static int calculate_pseudo_spill_cost (pseudo_t);
static void push_pseudos_to_stack (void);
static void pop_pseudos_from_stack (void);
static void put_pseudo_into_bucket (pseudo_t);
static void color_pseudos (void);
static void print_loop_title (struct ira_loop_tree_node *);
static void color_pass (struct ira_loop_tree_node *);
static int pseudo_compare_func (const void *, const void *);
static void priority_coloring (void);
static void do_coloring (void);
static void move_spill_restore (void);
static void update_pseudo_avaialable_regs (void);
/* Bitmap of pseudos which should be colored. */
static bitmap coloring_pseudo_bitmap;
/* Bitmap of pseudos which should be taken into account during
coloring. In general case it contains pseudos from
coloring_pseudo_bitmap plus other already colored conflicting
pseudos. */
static bitmap consideration_pseudo_bitmap;
/* This page contains function to choose hard register for pseudos. */
/* Array whose element value is TRUE if the corresponding hard
register already allocated for a pseudo. */
static int allocated_hardreg_p [FIRST_PSEUDO_REGISTER];
/* The function updates costs (decrease if DECR_P) of the pseudos
connected by copies with PSEUDO. */
static void
update_copy_costs (pseudo_t pseudo, int decr_p)
{
int i, hard_regno, cost;
enum machine_mode mode;
enum reg_class class;
pseudo_t another_pseudo;
struct pseudo_copy *cp, *next_cp;
if (PSEUDO_COVER_CLASS (pseudo) == NO_REGS)
return;
hard_regno = PSEUDO_HARD_REGNO (pseudo);
ira_assert (hard_regno >= 0 && PSEUDO_COVER_CLASS (pseudo) != NO_REGS);
i = class_hard_reg_index [PSEUDO_COVER_CLASS (pseudo)] [hard_regno];
ira_assert (i >= 0);
class = REGNO_REG_CLASS (hard_regno);
mode = PSEUDO_MODE (pseudo);
for (cp = PSEUDO_COPIES (pseudo); cp != NULL; cp = next_cp)
{
if (cp->first == pseudo)
{
next_cp = cp->next_first_pseudo_copy;
another_pseudo = cp->second;
}
else if (cp->second == pseudo)
{
next_cp = cp->next_second_pseudo_copy;
another_pseudo = cp->first;
}
else
gcc_unreachable ();
if (PSEUDO_COVER_CLASS (pseudo) != PSEUDO_COVER_CLASS (another_pseudo))
continue;
cost = (cp->second == pseudo
? register_move_cost [mode] [class]
[PSEUDO_COVER_CLASS (another_pseudo)]
: register_move_cost [mode] [PSEUDO_COVER_CLASS (another_pseudo)]
[class]);
if (decr_p)
cost = -cost;
PSEUDO_CURR_HARD_REG_COSTS (another_pseudo) [i] += cp->freq * cost;
PSEUDO_CURR_CONFLICT_HARD_REG_COSTS (another_pseudo) [i]
+= cp->freq * cost;
}
}
/* Function choosing a hard register for PSEUDO. If RETRY_P is
nonzero, it means that the function called from
`retry_ira_color'. */
static int
assign_hard_reg (pseudo_t pseudo, int retry_p)
{
HARD_REG_SET conflicting_regs, biased_regs;
int i, j, hard_regno, best_hard_regno, class_size;
int cost, mem_cost, min_cost, full_cost, min_full_cost, add_cost, *costs;
int *conflict_costs;
enum reg_class cover_class, class;
enum machine_mode mode;
pseudo_t conflict_pseudo, conflict_pseudo2;
pseudo_t *pseudo_vec, *pseudo_vec2;
pseudo_t another_pseudo;
struct pseudo_copy *cp, *next_cp;
static int full_costs [FIRST_PSEUDO_REGISTER];
ira_assert (! PSEUDO_ASSIGNED_P (pseudo));
cover_class = PSEUDO_COVER_CLASS (pseudo);
class_size = class_hard_regs_num [cover_class];
CLEAR_HARD_REG_SET (biased_regs);
memset (full_costs, 0, sizeof (int) * class_size);
mem_cost = PSEUDO_MEMORY_COST (pseudo);
mode = PSEUDO_MODE (pseudo);
costs = PSEUDO_CURR_HARD_REG_COSTS (pseudo);
memcpy (full_costs, costs, sizeof (int) * class_size);
COPY_HARD_REG_SET (conflicting_regs, PSEUDO_CONFLICT_HARD_REGS (pseudo));
pseudo_vec = PSEUDO_CONFLICT_PSEUDO_VEC (pseudo);
for (i = 0; (conflict_pseudo = pseudo_vec [i]) != NULL; i++)
/* Reload can give another class so we need to check all
pseudos. */
if (retry_p
|| (cover_class == PSEUDO_COVER_CLASS (conflict_pseudo)
&& bitmap_bit_p (consideration_pseudo_bitmap,
PSEUDO_NUM (conflict_pseudo))))
{
if (PSEUDO_ASSIGNED_P (conflict_pseudo))
{
if ((hard_regno = PSEUDO_HARD_REGNO (conflict_pseudo)) >= 0)
IOR_HARD_REG_SET
(conflicting_regs,
reg_mode_hard_regset
[hard_regno] [PSEUDO_MODE (conflict_pseudo)]);
continue;
}
else
{
conflict_costs
= PSEUDO_CURR_CONFLICT_HARD_REG_COSTS (conflict_pseudo);
if (conflict_costs != NULL)
for (j = class_size - 1; j >= 0; j--)
full_costs [j] -= conflict_costs [j];
}
if (retry_p || ! flag_ira_biased_coloring)
continue;
pseudo_vec2 = PSEUDO_CONFLICT_PSEUDO_VEC (conflict_pseudo);
for (j = 0; (conflict_pseudo2 = pseudo_vec2 [j]) != NULL; j++)
if (cover_class == PSEUDO_COVER_CLASS (conflict_pseudo2)
&& PSEUDO_ASSIGNED_P (conflict_pseudo2)
&& (hard_regno = PSEUDO_HARD_REGNO (conflict_pseudo2)) >= 0
&& (retry_p
|| bitmap_bit_p (consideration_pseudo_bitmap,
PSEUDO_NUM (conflict_pseudo2))))
IOR_HARD_REG_SET (biased_regs,
reg_mode_hard_regset
[hard_regno] [PSEUDO_MODE (conflict_pseudo2)]);
}
for (cp = PSEUDO_COPIES (pseudo); cp != NULL; cp = next_cp)
{
if (cp->first == pseudo)
{
next_cp = cp->next_first_pseudo_copy;
another_pseudo = cp->second;
}
else if (cp->second == pseudo)
{
next_cp = cp->next_second_pseudo_copy;
another_pseudo = cp->first;
}
else
gcc_unreachable ();
if (PSEUDO_ASSIGNED_P (another_pseudo))
continue;
conflict_costs = PSEUDO_CURR_CONFLICT_HARD_REG_COSTS (another_pseudo);
if (conflict_costs != NULL)
for (j = class_size - 1; j >= 0; j--)
full_costs [j] += conflict_costs [j];
}
IOR_HARD_REG_SET (conflicting_regs, no_alloc_regs);
IOR_COMPL_HARD_REG_SET (conflicting_regs, reg_class_contents [cover_class]);
best_hard_regno = -1;
GO_IF_HARD_REG_SUBSET (reg_class_contents [cover_class], conflicting_regs,
fail);
min_cost = min_full_cost = INT_MAX;
/* We don't care about giving callee saved registers to pseudos no
living through calls because call used register are allocated
first (it is usual practice to put them first in
REG_ALLOC_ORDER). */
for (i = 0; i < class_size; i++)
{
hard_regno = class_hard_regs [cover_class] [i];
#ifdef STACK_REGS
if (PSEUDO_NO_STACK_REG_P (pseudo)
&& FIRST_STACK_REG <= hard_regno && hard_regno <= LAST_STACK_REG)
continue;
#endif
if (TEST_HARD_REG_BIT (prohibited_class_mode_regs
[cover_class] [mode], hard_regno))
continue;
if (hard_reg_not_in_set_p (hard_regno, mode, conflicting_regs))
{
cost = costs [i];
full_cost = full_costs [i];
if (! allocated_hardreg_p [hard_regno]
&& hard_reg_not_in_set_p (hard_regno, mode, call_used_reg_set))
/* We need to save/restore the register in
epilogue/prologue. Therefore we increase the cost. */
{
/* ??? If only part is call clobbered. */
class = REGNO_REG_CLASS (hard_regno);
add_cost = (memory_move_cost [mode] [class] [0]
+ memory_move_cost [mode] [class] [1] - 1);
cost += add_cost;
full_cost += add_cost;
}
if (min_cost > cost)
min_cost = cost;
if (min_full_cost > full_cost)
{
min_full_cost = full_cost;
best_hard_regno = hard_regno;
ira_assert (hard_regno >= 0);
}
else if (min_full_cost == full_cost && flag_ira_biased_coloring != 0
&& TEST_HARD_REG_BIT (biased_regs, hard_regno)
&& ! TEST_HARD_REG_BIT (biased_regs, best_hard_regno))
best_hard_regno = hard_regno;
}
}
if (min_cost > mem_cost)
best_hard_regno = -1;
fail:
PSEUDO_HARD_REGNO (pseudo) = best_hard_regno;
PSEUDO_ASSIGNED_P (pseudo) = TRUE;
if (best_hard_regno >= 0)
{
allocated_hardreg_p [best_hard_regno] = TRUE;
update_copy_costs (pseudo, TRUE);
}
return best_hard_regno >= 0;
}
/* This page contains allocator based on Chaitin algorithm. */
/* Bucket of pseudos pseudo be colored currently without spilling. */
static pseudo_t colorable_pseudo_bucket;
/* Bucket of pseudos pseudo might be not colored currently without
spilling. */
static pseudo_t uncolorable_pseudo_bucket;
/* Add PSEUDO to *BUCKET_PTR bucket. PSEUDO should be not in a bucket
before the call. */
static void
add_pseudo_to_bucket (pseudo_t pseudo, pseudo_t *bucket_ptr)
{
pseudo_t first_pseudo;
first_pseudo = *bucket_ptr;
PSEUDO_NEXT_BUCKET_PSEUDO (pseudo) = first_pseudo;
PSEUDO_PREV_BUCKET_PSEUDO (pseudo) = NULL;
if (first_pseudo != NULL)
PSEUDO_PREV_BUCKET_PSEUDO (first_pseudo) = pseudo;
*bucket_ptr = pseudo;
}
/* Add PSEUDO to *BUCKET_PTR bucket maintaining the order according
their frequency. PSEUDO should be not in a bucket before the
call. */
static void
add_pseudo_to_ordered_bucket (pseudo_t pseudo, pseudo_t *bucket_ptr)
{
pseudo_t before, after;
enum reg_class cover_class;
int freq, nregs;
freq = PSEUDO_FREQ (pseudo);
cover_class = PSEUDO_COVER_CLASS (pseudo);
nregs = reg_class_nregs [cover_class] [PSEUDO_MODE (pseudo)];
for (before = *bucket_ptr, after = NULL;
before != NULL;
after = before, before = PSEUDO_NEXT_BUCKET_PSEUDO (before))
if (PSEUDO_FREQ (before) > freq)
break;
PSEUDO_NEXT_BUCKET_PSEUDO (pseudo) = before;
PSEUDO_PREV_BUCKET_PSEUDO (pseudo) = after;
if (after == NULL)
*bucket_ptr = pseudo;
else
PSEUDO_NEXT_BUCKET_PSEUDO (after) = pseudo;
if (before != NULL)
PSEUDO_PREV_BUCKET_PSEUDO (before) = pseudo;
}
/* Delete PSEUDO from *BUCKET_PTR bucket. It should be there before
the call. */
static void
delete_pseudo_from_bucket (pseudo_