RFA: patch for register pressure sensitive insn scheduling

Vladimir Makarov vmakarov@redhat.com
Tue Sep 1 01:30:00 GMT 2009


  The following patch fixes bug #24319, which has been in bugzilla
for the last 4 years.  The bug causes compiler crashes in the 1st insn
scheduling pass on x86/x86_64.

  The patch also implements register pressure sensitive insn
scheduling.  The code for both purposes is tightly coupled, so it is
sent as one patch.

  The solution for bug #24319 is not ideal.  It is possible to create
a target with a specific ABI where the bug might still occur.  It is
also a bit conservative: to prevent the bug it sometimes creates
unnecessary dependencies with insns containing hard registers.
Solving these two small problems would require much more effort and I
don't think it is worth doing.

  As for register pressure sensitive insn scheduling, I tried a lot
of approaches and found that the one in the patch works best.

  Usually the compiler literature proposes two-mode insn scheduling:
when the register pressure *at the current scheduling point* is low,
use the usual scheduling heuristics (like longest critical path
length), and when the register pressure is high, use heuristics to
decrease register pressure.  I found that in practice this approach
works pretty badly, giving practically no improvement at all, because
even if the register pressure is low at the current scheduling point,
it can be high at subsequent points, and scheduling an insn can
increase the pressure at those points even more.

  So the approach used in this patch monitors max register pressure
not only at the current point but at all subsequent points, and how
scheduling a particular insn affects them.
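
  To make the difference between the two approaches concrete, here is
a minimal sketch in C.  It is not code from the patch.  It assumes a
toy model with a single register class, where pressure[i] is the
register pressure at point i in the original insn order, and where
moving an insn from its original position up to the current point
changes the pressure by a fixed delta at every point in between.  The
names excess, two_mode_high_pressure_p and lookahead_excess_change are
made up for this illustration, and summing the excess over the
affected points is just one possible cost measure.

```c
#include <stddef.h>

/* Hypothetical toy model, not the data structures used by the patch.  */

/* Return how far pressure P exceeds the number of available
   registers AVAIL, or 0 if it does not exceed it.  */
static int
excess (int p, int avail)
{
  return p > avail ? p - avail : 0;
}

/* Two-mode test: looks only at the current scheduling point CUR.  */
static int
two_mode_high_pressure_p (const int *pressure, size_t cur, int avail)
{
  return pressure[cur] > avail;
}

/* Look-ahead cost: moving an insn from ORIG_POS up to CUR is assumed
   to change the pressure by DELTA at each point in [CUR, ORIG_POS).
   Return the total change in excess pressure over those points.  */
static int
lookahead_excess_change (const int *pressure, size_t cur,
                         size_t orig_pos, int delta, int avail)
{
  int change = 0;

  for (size_t i = cur; i < orig_pos; i++)
    change += excess (pressure[i] + delta, avail)
              - excess (pressure[i], avail);
  return change;
}
```

  With pressure = {2, 3, 6, 7, 4} and 6 available registers, the
two-mode test reports low pressure at point 0, while hoisting an insn
from position 4 that adds one live register gives an excess change of
2, because the pressure rises at the later points 2 and 3.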

  Insns before the 1st insn scheduling are already ordered in a way
that minimizes register pressure (I guess this is done mostly by
TER).  So minimizing register pressure is based on the original order
of insns.

  Implementing register pressure sensitive insn scheduling required
some changes in haifa scheduling which can increase compile time
(about 3% in the worst case for power6).  It is mostly because we need
to look at insns in the ready list *and* the queue (which contains
insns with resolved dependencies that cannot be issued because input
data are not ready or the needed functional units are not free),
versus looking only at the ready list in usual scheduling.  Another
source of slowdown is the costly additional calculation of register
classes (before the patch it was done once, in IRA) and the updating
of max register pressure.  Also, finding dying registers is done
differently than in the rest of the compiler because insns move.  I
worked hard to decrease this compile time degradation, but register
pressure sensitive insn scheduling is still expensive.
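
  As an illustration of why queued insns have to be considered too,
here is a small hedged sketch in C of one way to rank candidates from
the ready list and the queue together.  It is not the patch's
rank_for_schedule.  The struct cand, its fields, the clock_now
variable and the rule of adding the stall cycles to the pressure cost
are all assumptions made for this example.

```c
#include <stdlib.h>

/* Hypothetical candidate record; the field names are illustrative.  */
struct cand
{
  int excess_change; /* change in excess register pressure if issued */
  int ready_tick;    /* earliest cycle at which the insn can issue */
  int priority;      /* critical path length; larger is better */
};

/* Current scheduling cycle (a global so the qsort comparator can
   see it).  */
static int clock_now;

/* Number of cycles candidate C would still have to wait.  Insns from
   the ready list wait 0 cycles; queued insns wait longer.  */
static int
stall (const struct cand *c)
{
  return c->ready_tick > clock_now ? c->ready_tick - clock_now : 0;
}

/* Order candidates by pressure cost plus stall (smaller first), then
   by priority (larger first).  */
static int
cand_cmp (const void *a, const void *b)
{
  const struct cand *x = a, *y = b;
  int cx = x->excess_change + stall (x);
  int cy = y->excess_change + stall (y);

  if (cx != cy)
    return cx < cy ? -1 : 1;
  if (x->priority != y->priority)
    return x->priority > y->priority ? -1 : 1;
  return 0;
}
```

  Sorting an array of such records with qsort (c, n, sizeof c[0],
cand_cmp) puts first the insn that is cheapest in pressure and stall
cycles combined, whether it came from the ready list or the queue.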

  Here is the comparison of register pressure sensitive insn
scheduling and the regular one for some targets on SPEC2000:

                          x86 (CoreI7)  x86_64 (CoreI7)  Power6 (-mtune=power6)
SPECInt2000 score         +1.6%         +0.3%            +1.2%
SPECFP2000 score          +6.4%         +1.9%            +0.2%

SPECInt2000 code size     -0.86%        -0.12%           -0.07%
SPECFP2000 code size      -5.33%        -2.27%           -0.53%

SPECInt2000 compile time  +0.7%         +1.6%            +2.6%
SPECFP2000 compile time   -0.2%         +1.1%            +3.1%

  Somebody could ask me why register pressure sensitive insn
scheduling is not the default.  Although it generates much better
results for x86/x86_64 than usual scheduling, it still generates
bigger code (about 0.4% for SPECInt and 1.3%-2.2% for SPECFP) and
practically the same code performance as compiling *without* the 1st
scheduling, which is now the default for x86/x86_64.  Plus it requires
more compile time (1st insn scheduling is an expensive pass by
itself).  So I don't see a need to make it the default.  However, I
can see that it could be beneficial for some programs and for
x86/x86_64 processors which have pipeline descriptions and are more
sensitive to insn scheduling than Core I7.  I guess target maintainers
should decide whether it is worth making it the default for their
target.

  The patch was successfully bootstrapped on x86/x86_64 with two
option combinations, '-O2 -g -fschedule-insns' and '-O2 -g
-fschedule-insns -fsched-pressure', and on ppc64 and itanium with
'-O2 -g -fsched-pressure'.

As an insn scheduler maintainer, I don't need approval for changes
in *sched*.[ch] files.  But I need approval for changes in the rest of
the files.  And of course, any comments on *sched*.[ch] would be
appreciated.

Is the patch ok to commit to the trunk?

2009-08-31  Vladimir Makarov  <vmakarov@redhat.com>

    * doc/invoke.texi (-fsched-pressure): Document it.
    (-fsched-reg-pressure-heuristic): Remove it.
    
    * reload.c (ira.h): Include.
    (find_reloads): Add choosing reload on number of small spilled
    classes.
    
    * haifa-sched.c (ira.h): Include.
    (sched_pressure_p, sched_regno_cover_class, curr_reg_pressure,
    saved_reg_pressure, curr_reg_live, saved_reg_live,
    region_ref_regs): New variables.
    (sched_init_region_reg_pressure_info, mark_regno_birth_or_death,
    initiate_reg_pressure_info, setup_ref_regs,
    initiate_bb_reg_pressure_info, save_reg_pressure,
    restore_reg_pressure, dying_use_p, print_curr_reg_pressure): New
    functions.
    (setup_insn_reg_pressure_info): New function.
    (rank_for_schedule): Add pressure checking and insn issue time.
    Remove comparison of insn reg weights.
    (ready_sort): Set insn reg pressure info.
    (update_register_pressure, setup_insn_max_reg_pressure,
    update_reg_and_insn_max_reg_pressure,
    sched_setup_bb_reg_pressure_info): New functions.
    (schedule_insn): Add code for printing and updating reg pressure
    info.
    (find_set_reg_weight, find_insn_reg_weight): Remove.
    (ok_for_early_queue_removal): Do nothing if pressure_only_p.
    (debug_ready_list): Print reg pressure info.
    (schedule_block): Ditto.  Check insn issue time.
    (sched_init): Set up sched_pressure_p.  Allocate and set up some
    reg pressure related info.
    (sched_finish): Free some reg pressure related info.
    (fix_tick_ready): Make insn always ready if pressure_p.
    (init_h_i_d): Don't call find_insn_reg_weight.
    (haifa_finish_h_i_d): Free insn reg pressure info.
    
    * ira-int.h (ira_hard_regno_cover_class, ira_reg_class_nregs,
    ira_memory_move_cost, ira_class_hard_regs,
    ira_class_hard_regs_num, ira_no_alloc_regs,
    ira_available_class_regs, ira_reg_class_cover_size,
    ira_reg_class_cover, ira_class_translate): Move to ira.h.

    * ira-lives.c (single_reg_class): Check mode to find how many
    registers are necessary for operand.
    (ira_implicitly_set_insn_hard_regs): New.

    * common.opt (fsched-pressure): New option.
    (fsched-reg-pressure-heuristic): Remove.

    * ira.c (setup_eliminable_regset): Rename to
    ira_setup_eliminable_regset.  Make it external.
    (expand_reg_info): Pass cover class to setup_reg_classes.
    (ira): Call resize_reg_info instead of allocate_reg_info.

    * sched-deps.c: Include ira.h.
    (implicit_reg_pending_clobbers, implicit_reg_pending_uses): New.
    (create_insn_reg_use, create_insn_reg_set, setup_insn_reg_uses,
    reg_pressure_info, insn_use_p, mark_insn_pseudo_birth,
    mark_insn_hard_regno_birth, mark_insn_reg_birth,
    mark_pseudo_death, mark_hard_regno_death, mark_reg_death,
    mark_insn_reg_store, mark_insn_reg_clobber,
    setup_insn_reg_pressure_info): New.
    (sched_analyze_1): Update implicit_reg_pending_uses.
    (sched_analyze_insn): Find implicit sets, uses, clobbers of regs.
    Use them to create dependencies.  Set insn reg uses and pressure
    info.  Process reg_pending_uses in one place.
    (free_deps): Free implicit sets.
    (remove_from_deps): Remove implicit sets if necessary.  Check
    implicit sets when clearing reg_last_in_use.
    (init_deps_global): Clear implicit_reg_pending_clobbers and
    implicit_reg_pending_uses.
    
    * ira.h (ira_hard_regno_cover_class, ira_reg_class_nregs,
    ira_memory_move_cost, ira_class_hard_regs,
    ira_class_hard_regs_num, ira_no_alloc_regs,
    ira_available_class_regs, ira_reg_class_cover_size,
    ira_reg_class_cover, ira_class_translate): Move from ira-int.h.
    (ira_setup_eliminable_regset, ira_set_pseudo_classes,
    ira_implicitly_set_insn_hard_regs): New prototypes.
    
    * ira-costs.c (pseudo_classes_defined_p, allocno_p,
    cost_elements_num): New variables.
    (allocno_costs, total_costs): Rename to costs and
    total_allocno_costs.
    (COSTS_OF_ALLOCNO): Rename to COSTS.
    (allocno_pref): Rename to pref.
    (allocno_pref_buffer): Rename to pref_buffer.
    (common_classes): Rename to regno_cover_class.
    (COST_INDEX): New.
    (record_reg_classes): Set allocno attributes only if allocno_p.
    (record_address_regs): Ditto.  Use COST_INDEX instead of
    ALLOCNO_NUM.
    (scan_one_insn): Use COST_INDEX and COSTS instead of ALLOCNO_NUM
    and COSTS_OF_ALLOCNO.
    (print_costs): Rename to print_allocno_costs.
    (print_pseudo_costs): New.
    (process_bb_node_for_costs): Split into 2 functions with new
    function process_bb_for_costs.  Pass BB to process_bb_for_costs.
    (find_allocno_class_costs): Rename to find_costs_and_classes.  Add
    new parameter dump_file.  Use cost_elements_num instead of
    ira_allocnos_num.  Make one iteration if preferred classes were
    already calculated for scheduler.  Make 2 versions of code
    depending on allocno_p.
    (setup_allocno_cover_class_and_costs): Check allocno_p.  Use
    regno_cover_class and COSTS instead of common_classes and
    COSTS_OF_ALLOCNO.
    (init_costs, finish_costs): New.
    (ira_costs): Set up allocno_p and cost_elements_num.  Call
    init_costs and finish_costs.
    (ira_set_pseudo_classes): New.

    * rtl.h (allocate_reg_info): Remove.
    (resize_reg_info): Change return type.
    (reg_cover_class): New.
    (setup_reg_classes): Add new parameter.
    
    * sched-int.h (struct deps_reg): New member implicit_sets.
    (sched_pressure_p, sched_regno_cover_class): New external
    definitions.
    (INCREASE_BITS): New macro.
    (struct reg_pressure_data, struct reg_use_data): New.
    (struct _haifa_insn_data): Remove reg_weight.  Add members
    reg_pressure, reg_use_list, reg_set_list, and
    reg_pressure_excess_cost_change.
    (struct deps): New member implicit_sets.
    (pressure_p): New variable.
    (COVER_CLASS_BITS, INCREASE_BITS): New macros.
    (struct reg_pressure_data, struct reg_use_data): New.
    (INSN_REG_WEIGHT): Remove.
    (INSN_REG_PRESSURE, INSN_MAX_REG_PRESSURE, INSN_REG_USE_LIST,
    INSN_REG_SET_LIST, INSN_REG_PRESSURE_EXCESS_COST_CHANGE): New
    macros.
    (sched_init_region_reg_pressure_info,
    sched_setup_bb_reg_pressure_info): New prototypes.
    
    * reginfo.c (struct reg_pref): New member coverclass.
    (reg_cover_class): New function.
    (reginfo_init, pass_reginfo_init): Move after free_reg_info.
    (reg_info_size): New variable.
    (allocate_reg_info): Make static.  Setup reg_info_size.
    (resize_reg_info): Use reg_info_size.  Return flag of resizing.
    (setup_reg_classes): Add a new parameter.  Setup cover class too.

    * Makefile.in (reload.o, haifa-sched.o, sched-deps.o): Add ira.h to the
    dependencies.

    * sched-rgn.c (deps_join): Set up implicit_sets.
    (schedule_region): Set up region and basic blocks pressure
    relative info.
    
    * passes.c (init_optimization_passes): Move
    pass_subregs_of_mode_init before pass_sched.
    

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pressure-sensitive-scheduling.patch
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20090901/285051eb/attachment.ksh>

