This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] New Optimization: Partitioning hot & cold basic blocks
- From: Caroline Tice <ctice at apple dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Wed, 8 Oct 2003 12:20:33 -0700
- Subject: [PATCH] New Optimization: Partitioning hot & cold basic blocks
The following patch implements an optimization that has been in the
Apple version of gcc for the past 6 months, and which we would like the
FSF gcc community to adopt. This optimization builds on the basic block
reordering optimization and, like it, uses feedback profile information.
With this information it tags every basic block as either 'hot' or
'cold'. When the assembly and/or .o files are written, the hot and cold
basic blocks are written into separate sections. The idea behind this
optimization is to improve paging and cache locality. Because basic
blocks that are close together in the CFG may now be written far apart
in the assembly and .o files, there is some code for cleaning up edges
that cross between hot and cold sections.
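To make the idea concrete, here is a minimal sketch of the two steps
described above: tagging blocks from profile counts, then finding the
edges that cross between partitions. It is not code from the patch. The
struct layouts, the function names tag_blocks and find_crossing_edges,
and the simple count threshold are all assumptions made for
illustration; the patch itself works on GCC's real CFG structures and
may use a different criterion for "rarely executed".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's CFG structures. */
enum partition { PART_HOT, PART_COLD };

struct block {
    long count;             /* execution count from profile feedback */
    enum partition part;    /* filled in by tag_blocks() */
};

struct edge {
    size_t src, dest;       /* indices into the block array */
    int crossing;           /* 1 if src and dest are in different partitions */
};

/* Tag every block: cold if its count is below `threshold` (an assumed
   criterion), hot otherwise. */
void tag_blocks(struct block *bbs, size_t nblocks, long threshold)
{
    for (size_t i = 0; i < nblocks; i++)
        bbs[i].part = bbs[i].count < threshold ? PART_COLD : PART_HOT;
}

/* Mark each edge whose endpoints lie in different partitions and
   return how many such crossing edges were found. */
size_t find_crossing_edges(const struct block *bbs, size_t nblocks,
                           struct edge *edges, size_t nedges)
{
    size_t ncrossing = 0;

    for (size_t i = 0; i < nedges; i++) {
        assert(edges[i].src < nblocks && edges[i].dest < nblocks);
        edges[i].crossing =
            bbs[edges[i].src].part != bbs[edges[i].dest].part;
        ncrossing += (size_t) edges[i].crossing;
    }
    return ncrossing;
}
```

The crossing edges are the ones that need attention afterwards, since
their source and destination may end up far apart in the output file.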
This patch has been tested on Apple G4 and G5 machines, running both
the Jaguar and Panther operating systems. It was tested by:
1) Running it on a test case specifically designed to test the hot/cold
partitioning, and verifying that it compiled and ran correctly (and did
the partitioning).
2) Running it on the SPECint 2000 tests that the FSF gcc 3.4 compiler
passes (gzip, vpr, mcf, parser, and twolf).
3) Bootstrapping the compiler with this patch.
4) Running the DejaGnu test suite with this patch.
The attached patch was generated with 'diff -c3p'.
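For readers who want to try the partitioning themselves, a function of
roughly the following shape gives the optimization something to work
with. This is not the test case used in step 1; it is a hypothetical
example, and the function name is made up. The assumption is that the
training input contains almost no negative values, so the profile marks
the error branch as rarely executed.

```c
#include <stddef.h>

/* Sums an array. Negative values are treated as rare errors: they are
   counted in *errors and their magnitude is added instead. With
   training data that is almost entirely non-negative, the body of the
   `if` is the block one would expect to be tagged cold. */
long sum_with_rare_errors(const int *data, size_t n, long *errors)
{
    long sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (data[i] < 0) {
            /* Rarely executed path. */
            (*errors)++;
            sum -= data[i];
        } else {
            /* Frequently executed path. */
            sum += data[i];
        }
    }
    return sum;
}
```

Built with profile feedback and the new -freorder-blocks-and-partition
flag, the generated assembly can then be inspected to see which section
each block landed in.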
Below is the ChangeLog entry for this patch:
2003-10-08 Caroline Tice <ctice@apple.com>
* basic-block.h (partition_hot_cold_basic_blocks): Add extern function
declaration.
* bb-reorder.c (function.h, obstack.h): Add two new include
statements.
(find_rarely_executed_basic_blocks): New function.
(mark_bb_for_unlikely_executed_section): New function.
(color_basic_blocks): New function.
(find_all_crossing_edges): New function.
(add_labels_and_missing_jumps): New function.
(add_section_boundary_notes): New function.
(fix_up_fall_thru_edges): New function.
(fix_edges_for_rarely_executed_code): New function.
(partition_hot_cold_basic_blocks): New function.
* cfgcleanup.c (has_section_boundary_note): New function.
(has_dont_shorten_note): New function.
(try_simplify_cond_jump): Added a test to not perform this
optimization on a basic block containing a jump that crosses between
hot and cold sections.
(try_forward_edges): Added a test to not perform this optimization on
a basic block containing a jump that crosses between hot and cold
sections.
(merge_blocks_move_predecessor_nojumps): Added a test to not perform
this optimization on a basic block containing a jump that crosses
between hot and cold sections.
(merge_blocks_move_successor_nojumps): Added a test to not perform
this optimization on basic blocks containing a jump that crosses
between hot and cold sections.
(merge_blocks_move): Added a test to not perform this optimization on
basic blocks containing a jump that crosses between hot and cold
sections.
(try_crossjump_bb): Added a test to not perform this optimization if
the predecessor basic block contains a jump that crosses between hot
and cold sections.
(try_optimize_cfg): Added a test to avoid simplifying a jump if the
basic block contains a jump that crosses between hot and cold
sections.
* cfglayout.c (update_unlikely_executed_notes): New function.
(has_dont_shorten_branch): New function.
(fixup_reorder_chain): Moved an ifdef to make it valid with this new
optimization. Added code so that when a new jumping basic block is
added, it is given the appropriate notes and tags for this new
optimization. Added code to update basic block indices correctly for
the new NOTE insns introduced by this optimization.
(duplicate_insn_chain): Added code to correctly duplicate the new
NOTE insns introduced by this optimization.
* cfglayout.h (has_section_boundary_note, has_dont_shorten_note,
scan_ahead_for_unlikely_executed_note): Added new extern
function declarations.
* common.opt (freorder-blocks-and-partition): Added new flag for this
optimization.
* dbxout.c (dbx_function_end): Added code to make sure scope labels
at the end of functions are written into the correct (hot or cold)
section.
* final.c (shorten_branches): Added #ifdef code that checks for the
definition of LONG_COND_BRANCH_SIZE, which is to be defined in the
machine-specific code for the compiler (for those architectures which
use "short" conditional branches, which may not be able to span the
distance between hot and cold sections in the .s or .o file). If this
size is defined, and if we are performing the partitioning
optimization, and if the current instruction is a jump instruction
that crosses between hot and cold sections, the size of the jump insn
is updated to be the size defined by LONG_COND_BRANCH_SIZE. This size
is then used later, in machine-specific code, to convert conditional
branches that are too short into appropriate unconditional branches.
(scan_ahead_for_unlikely_executed_note): New function.
(is_jump_table_basic_block): New function.
(final_scan_insn): Added code to check for the NOTE instruction
indicating whether the basic block belongs in the hot or cold section,
and to make sure the current basic block is being written to the
appropriate section. Also added code to ensure that jump table basic
blocks end up in the correct section.
* flags.h (flag_reorder_blocks_and_partition): New flag.
* opts.c (decode_options): Code to handle new flag,
flag_reorder_blocks_and_partition.
(common_handle_option): Code to handle new flag,
flag_reorder_blocks_and_partition.
* output.h (unlikely_text_section): New extern function declaration.
(in_unlikely_text_section): New extern function declaration.
(HOT_TEXT_SECTION_NAME): Make sure this macro is defined.
(UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Make sure this macro is defined.
(SECTION_FORMAT_STRING): Make sure this macro is defined.
* print-rtl.c (print_rtx): Added code for handling the new NOTE insns
introduced by this optimization.
* rtl.c (NOTE_INSN_UNLIKELY_EXECUTED_CODE): New note (see below).
(NOTE_INSN_DONT_SHORTEN_BRANCH): New note (see below).
(NOTE_INSN_SECTION_BOUNDARY): New note (see below).
* rtl.h (NOTE_INSN_UNLIKELY_EXECUTED_CODE): New note instruction,
indicating the basic block containing it belongs in the cold section.
(NOTE_INSN_DONT_SHORTEN_BRANCH): New note indicating basic block
containing it should not have jump/basic-block optimizations performed
on it.
(NOTE_INSN_SECTION_BOUNDARY): New note indicating the basic block
containing it contains a conditional jump that crosses between hot and
cold sections.
(insn_on_section_boundary): New extern function declaration.
* toplev.c (flag_reorder_blocks_and_partition): Added code to
initialize this flag, and to tie it to the command-line option
freorder-blocks-and-partition.
(rest_of_handle_stack_regs): Added flag_reorder_blocks_and_partition
as an 'or' condition for calling reorder_basic_blocks.
(rest_of_handle_reorder_blocks): Added
flag_reorder_blocks_and_partition as an 'or' condition for calling
reorder_basic_blocks.
(rest_of_compilation): Added call to partition_hot_cold_basic_blocks.
* varasm.c (cfglayout.h): Added new include statement.
(unlikely_section_label_printed): New global variable, used for
determining when to output section name labels for cold sections.
(in_section): Added in_unlikely_executed_text to enum data structure.
(text_section): Modified code to use SECTION_FORMAT_STRING and
HOT_TEXT_SECTION_NAME macros.
(unlikely_text_section): New function.
(in_unlikely_text_section): New function.
(function_section): Added code to make sure beginning of function is
written into correct section (hot or cold).
(assemble_start_function): Added code to make sure stuff is written
to the correct section.
(assemble_zeros): Added 'in_unlikely_text_section' as an 'or'
condition to an if statement that was checking 'in_text_section'.
(assemble_variable): Added 'in_unlikely_text_section' as an 'or'
condition to an if statement that was checking 'in_text_section'.
* config/rs6000/darwin.h (UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Changed
text string to something more informative.
(SECTION_FORMAT_STRING): Added new definition.
* config/rs6000/rs6000.c (rs6000_assemble_integer): Added
'!in_unlikely_text_section' as an 'and' condition to an if statement
that was already checking '!in_text_section'.
(output_cbranch): Modified 'need_longbranch' to be true if an insn
size is LONG_COND_BRANCH_SIZE.
* config/rs6000/rs6000.h (LONG_COND_BRANCH_SIZE): Added new
definition.
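The interaction between the final.c (shorten_branches) and rs6000.c
(output_cbranch) entries above may be easier to follow as code. The
following is only a sketch of the logic as the ChangeLog describes it,
not an excerpt from the patch. The struct, the function
adjusted_insn_length, and the value 8 for LONG_COND_BRANCH_SIZE are
assumptions for illustration; the real definition lives in
config/rs6000/rs6000.h and its value is not stated here.

```c
#include <stdbool.h>

/* Illustrative value only; the real one is target-defined. */
#define LONG_COND_BRANCH_SIZE 8

/* Hypothetical stand-in for the properties of an insn that matter. */
struct jump_insn {
    int length;             /* length computed by branch shortening */
    bool is_jump;           /* is this a jump instruction? */
    bool crosses_sections;  /* does it cross between hot and cold? */
};

/* Length to record for an insn: when the target defines
   LONG_COND_BRANCH_SIZE, partitioning is enabled, and the insn is a
   jump that crosses sections, force the long-branch size. */
int adjusted_insn_length(const struct jump_insn *insn,
                         bool partitioning_enabled)
{
#ifdef LONG_COND_BRANCH_SIZE
    if (partitioning_enabled && insn->is_jump && insn->crosses_sections)
        return LONG_COND_BRANCH_SIZE;
#endif
    return insn->length;
}

/* Later, machine-specific code treats that size as the signal that a
   long branch sequence is needed. */
bool need_longbranch(int insn_length)
{
    return insn_length == LONG_COND_BRANCH_SIZE;
}
```

The recorded length doubles as a flag here, which is why the chosen
size has to be one that the target's back end recognizes as meaning
"long branch".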
Attachment:
hot-cold-patch.txt
Description: Text document