[PATCH] New Optimization: Partitioning hot & cold basic blocks

Caroline Tice ctice@apple.com
Wed Oct 8 19:20:00 GMT 2003


The following patch implements an optimization that has been in the Apple
version of gcc for the past 6 months, and which we would like the FSF gcc
community to adopt.  This optimization builds on the basic block
reordering optimization and, like it, uses feedback profile information.
With this information it tags every basic block as either 'hot' or
'cold'.  When the assembly and/or .o files are written, the hot and cold
basic blocks are written into separate sections.  The idea behind this
optimization is to improve paging and cache locality.  Because basic
blocks that are adjacent in the CFG may end up far apart in the assembly
and .o files, there is also some code for cleaning up edges that cross
between the hot and cold sections.
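
As a rough illustration of the tagging step and of what a "crossing" edge
is, here is a minimal standalone sketch.  It is not code from the patch:
the struct layouts, the function names (partition_blocks,
mark_crossing_edges) and the "cold means never executed" threshold are
all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the compiler's CFG structures.  */
enum partition { PART_HOT, PART_COLD };

struct block
{
  unsigned long count;   /* execution count taken from the profile run */
  enum partition part;   /* filled in by partition_blocks */
};

struct edge
{
  size_t src, dest;      /* indices into the block array */
  int crossing;          /* nonzero if src and dest are in different sections */
};

/* Tag each block as hot or cold.  Assumption for this sketch: a block is
   cold only if the profile never saw it execute.  */
void
partition_blocks (struct block *blocks, size_t n_blocks)
{
  for (size_t i = 0; i < n_blocks; i++)
    blocks[i].part = blocks[i].count == 0 ? PART_COLD : PART_HOT;
}

/* Mark every edge whose endpoints lie in different partitions and return
   how many such edges there are.  These are the edges that need fixing up
   once the two partitions are emitted into separate sections.  */
size_t
mark_crossing_edges (const struct block *blocks, size_t n_blocks,
                     struct edge *edges, size_t n_edges)
{
  size_t n_crossing = 0;

  for (size_t i = 0; i < n_edges; i++)
    {
      assert (edges[i].src < n_blocks && edges[i].dest < n_blocks);
      edges[i].crossing
        = blocks[edges[i].src].part != blocks[edges[i].dest].part;
      if (edges[i].crossing)
        n_crossing++;
    }
  return n_crossing;
}
```

With three blocks whose counts are 100, 0 and 7, the middle block is
tagged cold, and any edge into or out of it is marked as crossing.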

This patch has been tested on Apple G4 and G5 machines, running both the
Jaguar and Panther operating systems.  It was tested by: 1) running it on
a test case specifically designed to test the hot/cold partitioning, and
verifying that it compiled and ran correctly (and did the partitioning);
2) running it on the SPECint 2000 tests that the FSF gcc 3.4 compiler
passes (gzip, vpr, mcf, parser, and twolf); 3) bootstrapping the compiler
with this patch; and 4) running the DejaGnu test suite with this patch.
The attached patch was generated with 'diff -c3p'.
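
For readers who want to try the first testing step themselves, the kind
of code that exercises the partitioning is a frequently executed loop
with a rarely taken error path.  The sketch below is only an
illustration of that shape; it is not the test case mentioned above, and
the names (sum_values, handle_bad_value, cold_path_hits) are invented for
the example.  It assumes the training input contains few or no negative
values, so that the profile marks the error path as rarely executed.

```c
#include <assert.h>
#include <stddef.h>

/* Counts executions of the rare path, so a harness can confirm whether
   the training run exercised it.  */
int cold_path_hits = 0;

/* Rarely executed: with profile feedback this code should get a zero or
   near-zero count, making it a candidate for the cold section.  */
int
handle_bad_value (int value)
{
  cold_path_hits++;
  return -value;         /* repair the entry by taking its magnitude */
}

/* Frequently executed loop: sums the array, treating negative entries
   as errors that are repaired before being added.  */
long
sum_values (const int *values, size_t n)
{
  long total = 0;

  assert (values != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      int v = values[i];
      if (v < 0)         /* expected to be rare in the training run */
        v = handle_bad_value (v);
      total += v;
    }
  return total;
}
```

A driver would call sum_values on a large, mostly non-negative array
during the profiling run, then the program would be rebuilt with the
profile data and -freorder-blocks-and-partition, and the assembly output
inspected to see which section handle_bad_value's call site landed in.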

Below is the ChangeLog entry for this patch:


2003-10-08  Caroline Tice  <ctice@apple.com>

	* basic-block.h (partition_hot_cold_basic_blocks): Add extern function
	declaration.
	* bb-reorder.c (function.h, obstack.h): Add two new include statements.
	(find_rarely_executed_basic_blocks): New function.
	(mark_bb_for_unlikely_executed_section): New function.
	(color_basic_blocks): New function.
	(find_all_crossing_edges): New function.
	(add_labels_and_missing_jumps): New function.
	(add_section_boundary_notes): New function.
	(fix_up_fall_thru_edges): New function.
	(fix_edges_for_rarely_executed_code): New function.
	(partition_hot_cold_basic_blocks): New function.
	* cfgcleanup.c (has_section_boundary_note): New function.
	(has_dont_shorten_note): New function.
	(try_simplify_cond_jump): Added a test to not perform this
	optimization on a basic block containing a jump that crosses between
	hot and cold sections.
	(try_forward_edges): Added a test to not perform this optimization on
	a basic block containing a jump that crosses between hot and cold
	sections.
	(merge_blocks_move_predecessor_nojumps): Added a test to not perform
	this optimization on a basic block containing a jump that crosses
	between hot and cold sections.
	(merge_blocks_move_successor_nojumps): Added a test to not perform
	this optimization on basic blocks containing a jump that crosses
	between hot and cold sections.
	(merge_blocks_move): Added a test to not perform this optimization on
	basic blocks containing a jump that crosses between hot and cold
	sections.
	(try_crossjump_bb): Added a test to not perform this optimization if
	the predecessor basic block contains a jump that crosses between hot
	and cold sections.
	(try_optimize_cfg): Added a test to avoid simplifying a jump if the
	basic block contains a jump that crosses between hot and cold
	sections.
	* cfglayout.c (update_unlikely_executed_notes): New function.
	(has_dont_shorten_branch): New function.
	(fixup_reorder_chain): Moved an ifdef to make it valid with this new
	optimization.  Added code so that when a new jumping basic block is
	added, it is given the appropriate notes and tags for this new
	optimization.  Added code to update basic block indices correctly for
	the new NOTE insns introduced by this optimization.
	(duplicate_insn_chain): Added code to correctly duplicate the new NOTE
	insns introduced by this optimization.
	* cfglayout.h (has_section_boundary_note, has_dont_shorten_note,
	scan_ahead_for_unlikely_executed_note): Added new extern function
	declarations.
	* common.opt (freorder-blocks-and-partition): Added new flag for this
	optimization.
	* dbxout.c (dbx_function_end): Added code to make sure scope labels at
	the end of functions are written into the correct (hot or cold)
	section.
	* final.c (shorten_branches): Added #ifdef code that checks for the
	definition of LONG_COND_BRANCH_SIZE, which is to be defined in the
	machine-specific code for the compiler (for those architectures which
	use "short" conditional branches, which may not be able to span the
	distance between hot and cold sections in the .s or .o file).  If this
	size is defined, and if we are performing the partitioning
	optimization, and if the current instruction is a jump instruction
	that crosses between hot and cold sections, the size of the jump insn
	is updated to be the size defined by LONG_COND_BRANCH_SIZE.  This size
	is then used later, in machine-specific code, to convert conditional
	branches that are too short into appropriate unconditional branches.
	(scan_ahead_for_unlikely_executed_note): New function.
	(is_jump_table_basic_block): New function.
	(final_scan_insn): Added code to check for NOTE instruction indicating
	whether basic block belongs in hot or cold section, and to make sure
	the current basic block is being written to the appropriate section.
	Also added code to ensure that jump table basic blocks end up in the
	correct section.
	* flags.h (flag_reorder_blocks_and_partition):  New flag.
	* opts.c (decode_options): Code to handle new flag,
	flag_reorder_blocks_and_partition.
	(common_handle_option): Code to handle new flag,
	flag_reorder_blocks_and_partition.
	* output.h (unlikely_text_section): New extern function declaration.
	(in_unlikely_text_section): New extern function declaration.
	(HOT_TEXT_SECTION_NAME): Make sure this macro is defined.
	(UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Make sure this macro is defined.
	(SECTION_FORMAT_STRING):  Make sure this macro is defined.
	* print-rtl.c (print_rtx): Added code for handling the new NOTE insns
	introduced by this optimization.
	* rtl.c (NOTE_INSN_UNLIKELY_EXECUTED_CODE): New note (see below).
	(NOTE_INSN_DONT_SHORTEN_BRANCH): New note (see below).
	(NOTE_INSN_SECTION_BOUNDARY): New note (see below).
	* rtl.h (NOTE_INSN_UNLIKELY_EXECUTED_CODE): New note instruction,
	indicating the basic block containing it belongs in the cold section.
	(NOTE_INSN_DONT_SHORTEN_BRANCH): New note indicating basic block
	containing it should not have jump/basic-block optimizations performed
	on it.
	(NOTE_INSN_SECTION_BOUNDARY): New note indicating the basic block
	containing it contains a conditional jump that crosses between hot and
	cold sections.
	(insn_on_section_boundary): New extern function declaration.
	* toplev.c (flag_reorder_blocks_and_partition): Added code to
	initialize this flag, and to tie it to the command-line option
	freorder-blocks-and-partition.
	(rest_of_handle_stack_regs): Added flag_reorder_blocks_and_partition
	as an 'or' condition for calling reorder_basic_blocks.
	(rest_of_handle_reorder_blocks): Added
	flag_reorder_blocks_and_partition as an 'or' condition for calling
	reorder_basic_blocks.
	(rest_of_compilation):  Added call to partition_hot_cold_basic_blocks.
	* varasm.c (cfglayout.h):  Added new include statement.
	(unlikely_section_label_printed): New global variable, used for
	determining when to output section name labels for cold sections.
	(in_section):  Added in_unlikely_executed_text to enum data structure.
	(text_section):  Modified code to use SECTION_FORMAT_STRING and
	HOT_TEXT_SECTION_NAME macros.
	(unlikely_text_section):  New function.
	(in_unlikely_text_section):  New function.
	(function_section): Added code to make sure beginning of function is
	written into correct section (hot or cold).
	(assemble_start_function): Added code to make sure stuff is written to
	the correct section.
	(assemble_zeros): Added in_unlikely_text_section as an 'or' condition
	to an if statement that was checking 'in_text_section'.
	(assemble_variable): Added 'in_unlikely_text_section' as an 'or'
	condition to an if statement that was checking 'in_text_section'.
	* config/rs6000/darwin.h (UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Changed
	text string to something more informative.
	(SECTION_FORMAT_STRING):  Added new definition.
	* config/rs6000/rs6000.c (rs6000_assemble_integer): Added
	'!in_unlikely_text_section' as an 'and' condition to an if statement
	that was already checking '!in_text_section'.
	(output_cbranch): Modified 'need_longbranch' to be true if an insn
	size is LONG_COND_BRANCH_SIZE.
	* config/rs6000/rs6000.h (LONG_COND_BRANCH_SIZE): Added new
	definition.
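
The interaction between the shorten_branches and output_cbranch entries
above can be modelled in a few lines.  This is a standalone sketch of the
logic as the ChangeLog describes it, not an excerpt from the patch: the
struct, the function names (adjusted_jump_length, needs_long_branch) and
the value 8 chosen for LONG_COND_BRANCH_SIZE are assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value for this sketch; the real definition is
   target-specific (the patch adds one to config/rs6000/rs6000.h).  */
#define LONG_COND_BRANCH_SIZE 8

/* Simplified stand-in for a jump insn as seen by branch shortening.  */
struct jump_insn
{
  int length;             /* size in bytes computed so far */
  int crosses_sections;   /* nonzero if the jump goes between hot and cold */
};

/* Return the length to record for JUMP.  If the target defines
   LONG_COND_BRANCH_SIZE, partitioning is enabled, and the jump crosses
   between sections, force the long size; otherwise keep the computed
   length.  */
int
adjusted_jump_length (const struct jump_insn *jump, int partitioning_enabled)
{
  assert (jump != NULL);
#ifdef LONG_COND_BRANCH_SIZE
  if (partitioning_enabled && jump->crosses_sections)
    return LONG_COND_BRANCH_SIZE;
#endif
  return jump->length;
}

/* Model of the machine-specific side: a conditional branch whose recorded
   length equals LONG_COND_BRANCH_SIZE must be emitted in its long form.  */
int
needs_long_branch (int recorded_length)
{
#ifdef LONG_COND_BRANCH_SIZE
  return recorded_length == LONG_COND_BRANCH_SIZE;
#else
  (void) recorded_length;
  return 0;
#endif
}
```

The point of routing the decision through the recorded insn length is
that the machine-specific output code does not need to know about hot and
cold sections at all; it only has to compare the length it is given.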



-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hot-cold-patch.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20031008/0bade300/attachment.txt>

