This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] New Optimization: Partitioning hot & cold basic blocks


The following patch implements an optimization we have had in the Apple version of gcc for
the past 6 months, and which we would like the FSF gcc community to adopt. This
optimization builds on the basic block reordering optimization. As with the basic block
reordering optimization, it uses feedback profile information. With this information it tags
every basic block as either 'hot' or 'cold'. When the assembly and/or .o files are written,
the hot and cold basic blocks are written into separate sections. The idea behind this
optimization is to improve paging and cache locality. Because basic blocks that are
adjacent in the CFG may now be written far apart in the assembly and .o files, the patch
also includes code for cleaning up edges that cross between the hot and cold sections.
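
To illustrate the idea described above, here is a minimal sketch in C. It is not code from
the patch: the structures, the function names (mark_partitions, mark_crossing_edges) and
the simple execution-count threshold are all assumptions made for illustration. The patch
itself works on GCC's real CFG data structures, and its actual criterion for calling a
block cold is not stated in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the compiler's CFG structures.  */
enum partition { PART_HOT, PART_COLD };

struct block
{
  long exec_count;      /* execution count taken from the feedback profile */
  enum partition part;  /* filled in by mark_partitions */
};

struct edge
{
  size_t src, dest;     /* indices into the block array */
  int crossing;         /* nonzero if src and dest are in different sections */
};

/* Tag each block.  ASSUMPTION: a block is cold when it executed fewer than
   cold_threshold times; the real patch may use a different test.  */
void
mark_partitions (struct block *blocks, size_t n_blocks, long cold_threshold)
{
  for (size_t i = 0; i < n_blocks; i++)
    blocks[i].part
      = blocks[i].exec_count < cold_threshold ? PART_COLD : PART_HOT;
}

/* Flag every edge whose endpoints were placed in different partitions and
   return how many such edges there are.  These are the edges that need
   fixing up once the two partitions are emitted into separate sections.  */
size_t
mark_crossing_edges (const struct block *blocks, size_t n_blocks,
                     struct edge *edges, size_t n_edges)
{
  size_t n_crossing = 0;

  for (size_t i = 0; i < n_edges; i++)
    {
      assert (edges[i].src < n_blocks && edges[i].dest < n_blocks);
      edges[i].crossing
        = blocks[edges[i].src].part != blocks[edges[i].dest].part;
      n_crossing += edges[i].crossing;
    }
  return n_crossing;
}
```

In this simplified model, a fall-through edge that ends up flagged as crossing would need an
explicit jump and label, since its two blocks are no longer adjacent in the output.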


This patch has been tested on Apple G4 and G5 machines, running both the Jaguar and
Panther operating systems. It was tested by: 1) running it on a test case specifically designed
to test the hot/cold partitioning, and verifying that it compiled and ran correctly (and did the
partitioning); 2) running it on the SpecINT 2000 tests that the FSF gcc 3.4 compiler passes
(gzip, vpr, mcf, parser, and twolf); 3) bootstrapping the compiler with this patch; and
4) running the DejaGnu test suite with this patch. The attached patch was generated with
'diff -c3p'.


Below is the ChangeLog entry for this patch:


2003-10-08 Caroline Tice <ctice@apple.com>


* basic-block.h (partition_hot_cold_basic_blocks): Add extern function declaration.
* bb-reorder.c (function.h, obstack.h): Add two new include statements.
(find_rarely_executed_basic_blocks): New function.
(mark_bb_for_unlikely_executed_section): New function.
(color_basic_blocks): New function.
(find_all_crossing_edges): New function.
(add_labels_and_missing_jumps): New function.
(add_section_boundary_notes): New function.
(fix_up_fall_thru_edges): New function.
(fix_edges_for_rarely_executed_code): New function.
(partition_hot_cold_basic_blocks): New function.
* cfgcleanup.c (has_section_boundary_note): New function.
(has_dont_shorten_note): New function.
(try_simplify_cond_jump): Added a test to not perform this optimization on a basic block
containing a jump that crosses between hot and cold sections.
(try_forward_edges): Added a test to not perform this optimization on a basic block
containing a jump that crosses between hot and cold sections.
(merge_blocks_move_predecessor_nojumps): Added a test to not perform this
optimization on a basic block containing a jump that crosses between hot and cold
sections.
(merge_blocks_move_successor_nojumps): Added a test to not perform this
optimization on basic blocks containing a jump that crosses between hot and cold
sections.
(merge_blocks_move): Added a test to not perform this optimization on basic blocks
containing a jump that crosses between hot and cold sections.
(try_crossjump_bb): Added a test to not perform this optimization if the predecessor
basic block contains a jump that crosses between hot and cold sections.
(try_optimize_cfg): Added a test to avoid simplifying a jump if the basic block contains a
jump that crosses between hot and cold sections.
* cfglayout.c (update_unlikely_executed_notes): New function.
(has_dont_shorten_branch): New function.
(fixup_reorder_chain): Moved an ifdef to make it valid with this new optimization.
Added code so that when a new jumping basic block is added, it is given the appropriate
notes and tags for this new optimization.
Added code to update basic block indices correctly for the new NOTE insns introduced by
this optimization.
(duplicate_insn_chain): Added code to correctly duplicate the new NOTE insns
introduced by this optimization.
* cfglayout.h (has_section_boundary_note, has_dont_shorten_note,
scan_ahead_for_unlikely_executed_note): Added new extern function declarations.
* common.opt (freorder-blocks-and-partition): Added new flag for this optimization.
* dbxout.c (dbx_function_end): Added code to make sure scope labels at the end of
functions are written into the correct (hot or cold) section.
* final.c (shorten_branches): Added #ifdef code that checks for the definition of
LONG_COND_BRANCH_SIZE, which is to be defined in the machine-specific code for
the compiler (for those architectures which use "short" conditional branches, which may
not be able to span the distance between hot and cold sections in the .s or .o file). If this
size is defined, and if we are performing the partitioning optimization, and if the current
instruction is a jump instruction that crosses between hot and cold sections, the size of
the jump insn is updated to be the size defined by LONG_COND_BRANCH_SIZE. This
size is then used later, in machine-specific code, to convert conditional branches that are
too short into appropriate unconditional branches.
(scan_ahead_for_unlikely_executed_note): New function.
(is_jump_table_basic_block): New function.
(final_scan_insn): Added code to check for NOTE instruction indicating whether basic
block belongs in hot or cold section, and to make sure the current basic block is
being written to the appropriate section. Also added code to ensure that jump table
basic blocks end up in the correct section.
* flags.h (flag_reorder_blocks_and_partition): New flag.
* opts.c (decode_options): Code to handle new flag, flag_reorder_blocks_and_partition.
(common_handle_option): Code to handle new flag, flag_reorder_blocks_and_partition.
* output.h (unlikely_text_section): New extern function declaration.
(in_unlikely_text_section): New extern function declaration.
(HOT_TEXT_SECTION_NAME): Make sure this macro is defined.
(UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Make sure this macro is defined.
(SECTION_FORMAT_STRING): Make sure this macro is defined.
* print-rtl.c (print_rtx): Added code for handling the new NOTE insns introduced by this
optimization.
* rtl.c (NOTE_INSN_UNLIKELY_EXECUTED_CODE): New note (see below).
(NOTE_INSN_DONT_SHORTEN_BRANCH): New note (see below).
(NOTE_INSN_SECTION_BOUNDARY): New note (see below).
* rtl.h (NOTE_INSN_UNLIKELY_EXECUTED_CODE): New note instruction, indicating
the basic block containing it belongs in the cold section.
(NOTE_INSN_DONT_SHORTEN_BRANCH): New note indicating basic block containing
it should not have jump/basic-block optimizations performed on it.
(NOTE_INSN_SECTION_BOUNDARY): New note indicating the basic block containing
it contains a conditional jump that crosses between hot and cold sections.
(insn_on_section_boundary): New extern function declaration.
* toplev.c (flag_reorder_blocks_and_partition): Added code to initialize this flag, and to
tie it to the command-line option freorder-blocks-and-partition.
(rest_of_handle_stack_regs): Added flag_reorder_blocks_and_partition as an 'or'
condition for calling reorder_basic_blocks.
(rest_of_handle_reorder_blocks): Added flag_reorder_blocks_and_partition as an 'or'
condition for calling reorder_basic_blocks.
(rest_of_compilation): Added call to partition_hot_cold_basic_blocks.
* varasm.c (cfglayout.h): Added new include statement.
(unlikely_section_label_printed): New global variable, used for determining when to
output section name labels for cold sections.
(in_section): Added in_unlikely_executed_text to enum data structure.
(text_section): Modified code to use SECTION_FORMAT_STRING and
HOT_TEXT_SECTION_NAME macros.
(unlikely_text_section): New function.
(in_unlikely_text_section): New function.
(function_section): Added code to make sure beginning of function is written into
correct section (hot or cold).
(assemble_start_function): Added code to make sure stuff is written to the correct
section.
(assemble_zeros): Added in_unlikely_text_section as an 'or' condition to an if
statement that was checking 'in_text_section'.
(assemble_variable): Added 'in_unlikely_text_section' as an 'or' condition to an if
statement that was checking 'in_text_section'.
* config/rs6000/darwin.h (UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Changed
text string to something more informative.
(SECTION_FORMAT_STRING): Added new definition.
* config/rs6000/rs6000.c (rs6000_assemble_integer): Added '!in_unlikely_text_section'
as an 'and' condition to an if statement that was already checking '!in_text_section'.
(output_cbranch): Modified 'need_longbranch' to be true if an insn size is
LONG_COND_BRANCH_SIZE.
* config/rs6000/rs6000.h (LONG_COND_BRANCH_SIZE): Added new definition.
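
As a rough illustration of the final.c (shorten_branches) and rs6000.c (output_cbranch)
entries above, the long-branch handling might be sketched as follows. This is not code from
the patch: struct jump_insn, adjusted_length, and need_longbranch_p are hypothetical names,
and the value 8 given to LONG_COND_BRANCH_SIZE here is an assumption for the example only.

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMPTION: value chosen for illustration; the real definition lives in
   the machine-specific header (config/rs6000/rs6000.h in the patch).  */
#define LONG_COND_BRANCH_SIZE 8

/* Hypothetical, simplified view of a jump instruction.  */
struct jump_insn
{
  int length;             /* size in bytes computed so far */
  bool is_jump;           /* true for jump instructions */
  bool crosses_sections;  /* true if the target is in the other section */
};

/* Mirror of the shorten_branches rule described above: when partitioning
   is enabled and a jump crosses between the hot and cold sections, force
   its size to LONG_COND_BRANCH_SIZE; otherwise keep the computed size.  */
int
adjusted_length (const struct jump_insn *insn, bool partitioning_enabled)
{
  assert (insn);
#ifdef LONG_COND_BRANCH_SIZE
  if (partitioning_enabled && insn->is_jump && insn->crosses_sections)
    return LONG_COND_BRANCH_SIZE;
#endif
  return insn->length;
}

/* Mirror of the output_cbranch change: the machine-specific code treats an
   insn of size LONG_COND_BRANCH_SIZE as needing the long branch form.  */
bool
need_longbranch_p (int length)
{
  return length == LONG_COND_BRANCH_SIZE;
}
```

The point of routing the decision through the insn size is that the generic code only has to
record a size, while the choice of the actual long branch sequence stays in the
machine-specific code.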




Attachment: hot-cold-patch.txt
Description: Text document

