This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
- From: Christophe Lyon <christophe dot lyon at linaro dot org>
- To: Teresa Johnson <tejohnson at google dot com>
- Cc: reply at codereview dot appspotmail dot com, David Li <davidxl at google dot com>, Steven Bosscher <stevenb dot gcc at gmail dot com>, Matthew Gretton-Dann <matthew dot gretton-dann at linaro dot org>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 26 Nov 2012 17:25:10 +0100
- Subject: Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
- References: <20121115201013.D8E7E61423@tjsboxrox.mtv.corp.google.com> <CAAe5K+WvT67aLCZgwY9XJSW5YFoOizTzSg3Nu0=5v-Z2O1QqvA@mail.gmail.com>
Hi,
I have tested your patch on Spec2000 on ARM, and I can still see
several failures caused by:
"error: fallthru edge crosses section boundary", including the case
described in PR55121.
On 26 November 2012 16:55, Teresa Johnson <tejohnson@google.com> wrote:
> Ping.
> Teresa
>
> On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> Revised patch that fixes failures encountered when enabling
>> -freorder-blocks-and-partition, including the failure reported in PR 53743.
>>
>> This includes new verification code to ensure no cold blocks dominate hot
>> blocks contributed by Steven Bosscher.
>>
>> I attempted to make the handling of partition updates through the optimization
>> passes much more consistent, removing a number of partial fixes in the code
>> stream in the process. The code to fixup partitions (including the BB_PARTITION
>> assignement, region crossing jump notes, and switch text section notes) is
>> now handled in a few centralized locations. For example, inside
>> rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that callers
>> don't need to attempt the fixup themselves.
>>
>> For optimization passes that make adjustments to the cfg while in cfg layout
>> mode that are not easy to fix up incrementally, the new routine
>> fixup_partitions handles the cleanup globally. This does require calculation
>> of the dominance relation, however, as far as I can tell the routines which
>> now invoke this global fixup (try_optimize_cfg and commit_edge_insertions)
>> are invoked typically once (or a small number of times in the case of
>> try_optimize_cfg) per optimization pass. Additionally, I compared the
>> -ftime-report output for some large fdo compilations and saw only minimal
>> increases in the dominance computation times, which were only a tiny percent
>> of the overall compile time.
>>
>> Additionally, I added a flag to the rtl_data structure to indicate whether
>> any partitioning was actually performed, so that optimizations which were
>> conservatively disabled whenever the flag_reorder_blocks_and_partition
>> is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less
>> conservative for functions where no partitions were formed (e.g. they are
>> completely hot).
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu. Also tested with SPEC2006 int
>> benchmarks and internal google benchmarks using profile feedback and
>> -freorder-blocks-and-partition to get more coverage. Ok for trunk?
>>
>> Thanks,
>> Teresa
>>
>> 2012-11-14 Teresa Johnson <tejohnson@google.com>
>> Steven Bosscher <steven@gcc.gnu.org>
>>
>> * cfghooks.h (cfg_layout_finalize): New parameter.
>> * modulo-sched.c (rest_of_handle_sms): New cfg_layout_finalize
>> parameter.
>> * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>> as this is now done by redirect_edge_and_branch_force.
>> * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>> barriers, new cfg_layout_finalize parameter, and don't store exit
>> predecessor BB until after it is potentially split.
>> * function.h (struct rtl_data): New flag has_bb_partition.
>> * hw-doloop.c (reorder_loops): New cfg_layout_finalize parameter.
>> * cfgcleanup.c (try_crossjump_to_edge): Only skip optimization if
>> any blocks in function actually partitioned.
>> (try_optimize_cfg): If cfg changed, invoke fixup_partitions to clean
>> up partitioning.
>> * bb-reorder.c (connect_traces): Only look for partitions and skip
>> block copying if any blocks in function actually partitioned.
>> (emit_barrier_after_bb): Handle insertion in non-cfglayout mode.
>> (find_rarely_executed_basic_blocks_and_crossing_edges): Ensure
>> that no cold blocks dominate a hot block.
>> (fix_up_fall_thru_edges): Replace BB_COPY_PARTITION with assert
>> as this is now done by force_nonfallthru_and_redirect.
>> (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>> already be marked with region crossing note.
>> (reorder_basic_blocks): Only need to verify partitions if any
>> blocks in function actually partitioned.
>> (insert_section_boundary_note): Only need to insert note if any
>> blocks in function actually partitioned.
>> (rest_of_handle_reorder_blocks): New cfg_layout_finalize
>> parameter, and remove call to insert_section_boundary_note as this
>> is now called via cfg_layout_finalize/fixup_reorder_chain.
>> (duplicate_computed_gotos): New cfg_layout_finalize
>> parameter.
>> (partition_hot_cold_basic_blocks): Set flag indicating function
>> has bb partitions.
>> * bb-reorder.h: Declare insert_section_boundary_note and
>> emit_barrier_after_bb, which are no longer static.
>> * basic-block.h: Declare new function fixup_partitions.
>> * cfgrtl.c (try_redirect_by_replacing_jump): Remove unnecessary
>> check for region crossing note.
>> (fixup_partition_crossing): New function.
>> (fixup_bb_partition): Ditto.
>> (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>> (force_nonfallthru_and_redirect): Fixup partition boundaries,
>> remove old code that tried to do this. Emit barrier correctly
>> when we are in cfglayout mode.
>> (rtl_split_edge): Correctly fixup partition boundaries.
>> (commit_one_edge_insertion): Remove old code that tried to
>> fixup region crossing edge since this is now handled in
>> split_block, and set up insertion point correctly since
>> block may now end in a jump.
>> (commit_edge_insertions): Invoke fixup_partitions to sanitize partition
>> boundaries after optimizations that modify cfg and before trying to
>> verify the flow info.
>> (fixup_partitions): New function.
>> (rtl_verify_flow_info_1): Add verification that no cold bbs dominate
>> hot bbs.
>> (record_effective_endpoints): Remove region-crossing notes and set flag
>> indicating that they need to be reinserted on exit from cfglayout mode.
>> (outof_cfg_layout_mode): New cfg_layout_finalize parameter.
>> (fixup_reorder_chain): Call insert_section_boundary_note if necessary.
>> Remove old code that attempted to fixup region crossing note as
>> this is now handled in force_nonfallthru_and_redirect.
>> (duplicate_insn_chain): Don't duplicate switch section notes.
>> (cfg_layout_finalize): Pass new parameter to fixup_reorder_chain.
>> (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
>> note.
>>
>> Index: cfghooks.h
>> ===================================================================
>> --- cfghooks.h (revision 193376)
>> +++ cfghooks.h (working copy)
>> @@ -204,7 +204,7 @@ extern void copy_bbs (basic_block *, unsigned, bas
>> void account_profile_record (struct profile_record *, int);
>>
>> extern void cfg_layout_initialize (unsigned int);
>> -extern void cfg_layout_finalize (void);
>> +extern void cfg_layout_finalize (bool);
>>
>> /* Hooks containers. */
>> extern struct cfg_hooks gimple_cfg_hooks;
>> @@ -218,4 +218,3 @@ extern void cfg_layout_rtl_register_cfg_hooks (voi
>> extern void gimple_register_cfg_hooks (void);
>> extern struct cfg_hooks get_cfg_hooks (void);
>> extern void set_cfg_hooks (struct cfg_hooks);
>> -
>> Index: modulo-sched.c
>> ===================================================================
>> --- modulo-sched.c (revision 193376)
>> +++ modulo-sched.c (working copy)
>> @@ -3354,7 +3354,7 @@ rest_of_handle_sms (void)
>> if (bb->next_bb != EXIT_BLOCK_PTR)
>> bb->aux = bb->next_bb;
>> free_dominance_info (CDI_DOMINATORS);
>> - cfg_layout_finalize ();
>> + cfg_layout_finalize (false);
>> #endif /* INSN_SCHEDULING */
>> return 0;
>> }
>> Index: ifcvt.c
>> ===================================================================
>> --- ifcvt.c (revision 193376)
>> +++ ifcvt.c (working copy)
>> @@ -3900,10 +3900,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg
>> if (new_bb)
>> {
>> df_bb_replace (then_bb_index, new_bb);
>> - /* Since the fallthru edge was redirected from test_bb to new_bb,
>> - we need to ensure that new_bb is in the same partition as
>> - test bb (you can not fall through across section boundaries). */
>> - BB_COPY_PARTITION (new_bb, test_bb);
>> + /* This should have been done above via force_nonfallthru_and_redirect
>> + (possibly called from redirect_edge_and_branch_force). */
>> + gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
>> }
>>
>> num_true_changes++;
>> Index: function.c
>> ===================================================================
>> --- function.c (revision 193376)
>> +++ function.c (working copy)
>> @@ -6249,8 +6249,10 @@ thread_prologue_and_epilogue_insns (void)
>> break;
>> if (e)
>> {
>> - copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>> - NULL_RTX, e->src);
>> + /* Make sure we insert after any barriers. */
>> + rtx end = get_last_bb_insn (e->src);
>> + copy_bb = create_basic_block (NEXT_INSN (end),
>> + NULL_RTX, e->src);
>> BB_COPY_PARTITION (copy_bb, e->src);
>> }
>> else
>> @@ -6475,7 +6477,7 @@ thread_prologue_and_epilogue_insns (void)
>> if (cur_bb->index >= NUM_FIXED_BLOCKS
>> && cur_bb->next_bb->index >= NUM_FIXED_BLOCKS)
>> cur_bb->aux = cur_bb->next_bb;
>> - cfg_layout_finalize ();
>> + cfg_layout_finalize (false);
>> }
>>
>> epilogue_done:
>> @@ -6517,7 +6519,7 @@ epilogue_done:
>> basic_block simple_return_block_cold = NULL;
>> edge pending_edge_hot = NULL;
>> edge pending_edge_cold = NULL;
>> - basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
>> + basic_block exit_pred;
>> int i;
>>
>> gcc_assert (entry_edge != orig_entry_edge);
>> @@ -6545,6 +6547,12 @@ epilogue_done:
>> else
>> pending_edge_cold = e;
>> }
>> +
>> + /* Save a pointer to the exit's predecessor BB for use in
>> + inserting new BBs at the end of the function. Do this
>> + after the call to split_block above which may split
>> + the original exit pred. */
>> + exit_pred = EXIT_BLOCK_PTR->prev_bb;
>>
>> FOR_EACH_VEC_ELT (edge, unconverted_simple_returns, i, e)
>> {
>> Index: function.h
>> ===================================================================
>> --- function.h (revision 193376)
>> +++ function.h (working copy)
>> @@ -459,6 +459,11 @@ struct GTY(()) rtl_data {
>> sched2) and is useful only if the port defines LEAF_REGISTERS. */
>> bool uses_only_leaf_regs;
>>
>> + /* Nonzero if the function being compiled has undergone hot/cold partitioning
>> + (under flag_reorder_blocks_and_partition) and has at least one cold
>> + block. */
>> + bool has_bb_partition;
>> +
>> /* Like regs_ever_live, but 1 if a reg is set or clobbered from an
>> asm. Unlike regs_ever_live, elements of this array corresponding
>> to eliminable regs (like the frame pointer) are set if an asm
>> Index: hw-doloop.c
>> ===================================================================
>> --- hw-doloop.c (revision 193376)
>> +++ hw-doloop.c (working copy)
>> @@ -547,7 +547,7 @@ reorder_loops (hwloop_info loops)
>> else
>> bb->aux = NULL;
>> }
>> - cfg_layout_finalize ();
>> + cfg_layout_finalize (false);
>> clear_aux_for_blocks ();
>> df_analyze ();
>> }
>> Index: cfgcleanup.c
>> ===================================================================
>> --- cfgcleanup.c (revision 193376)
>> +++ cfgcleanup.c (working copy)
>> @@ -1824,7 +1824,7 @@ try_crossjump_to_edge (int mode, edge e1, edge e2,
>> partition boundaries). See the comments at the top of
>> bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */
>>
>> - if (flag_reorder_blocks_and_partition && reload_completed)
>> + if (crtl->has_bb_partition && reload_completed)
>> return false;
>>
>> /* Search backward through forwarder blocks. We don't need to worry
>> @@ -2767,10 +2767,21 @@ try_optimize_cfg (int mode)
>> df_analyze ();
>> }
>>
>> + if (changed)
>> + {
>> + /* Edge forwarding in particular can cause hot blocks previously
>> + reached by both hot and cold blocks to become dominated only
>> + by cold blocks. This will cause the verification below to fail,
>> + and lead to now cold code in the hot section. This is not easy
>> + to detect and fix during edge forwarding, and in some cases
>> + is only visible after newly unreachable blocks are deleted,
>> + which will be done in fixup_partitions. */
>> + fixup_partitions ();
>> +
>> #ifdef ENABLE_CHECKING
>> - if (changed)
>> - verify_flow_info ();
>> + verify_flow_info ();
>> #endif
>> + }
>>
>> changed_overall |= changed;
>> first_pass = false;
>> Index: bb-reorder.c
>> ===================================================================
>> --- bb-reorder.c (revision 193376)
>> +++ bb-reorder.c (working copy)
>> @@ -1054,7 +1054,7 @@ connect_traces (int n_traces, struct trace *traces
>> current_partition = BB_PARTITION (traces[0].first);
>> two_passes = false;
>>
>> - if (flag_reorder_blocks_and_partition)
>> + if (crtl->has_bb_partition)
>> for (i = 0; i < n_traces && !two_passes; i++)
>> if (BB_PARTITION (traces[0].first)
>> != BB_PARTITION (traces[i].first))
>> @@ -1263,7 +1263,7 @@ connect_traces (int n_traces, struct trace *traces
>> }
>> }
>>
>> - if (flag_reorder_blocks_and_partition)
>> + if (crtl->has_bb_partition)
>> try_copy = false;
>>
>> /* Copy tiny blocks always; copy larger blocks only when the
>> @@ -1381,13 +1381,14 @@ get_uncond_jump_length (void)
>> return length;
>> }
>>
>> -/* Emit a barrier into the footer of BB. */
>> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. */
>>
>> -static void
>> +void
>> emit_barrier_after_bb (basic_block bb)
>> {
>> rtx barrier = emit_barrier_after (BB_END (bb));
>> - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
>> + if (current_ir_type () == IR_RTL_CFGLAYOUT)
>> + BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
>> }
>>
>> /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
>> @@ -1463,18 +1464,109 @@ find_rarely_executed_basic_blocks_and_crossing_edg
>> {
>> VEC(edge, heap) *crossing_edges = NULL;
>> basic_block bb;
>> - edge e;
>> - edge_iterator ei;
>> + edge e, e2;
>> + edge_iterator ei, ei2;
>> + unsigned int cold_bb_count = 0;
>> + VEC (basic_block, heap) *bbs_in_hot_partition = NULL;
>> + VEC (basic_block, heap) *bbs_newly_hot = NULL;
>>
>> /* Mark which partition (hot/cold) each basic block belongs in. */
>> FOR_EACH_BB (bb)
>> {
>> if (probably_never_executed_bb_p (cfun, bb))
>> - BB_SET_PARTITION (bb, BB_COLD_PARTITION);
>> + {
>> + BB_SET_PARTITION (bb, BB_COLD_PARTITION);
>> + cold_bb_count++;
>> + }
>> else
>> - BB_SET_PARTITION (bb, BB_HOT_PARTITION);
>> + {
>> + BB_SET_PARTITION (bb, BB_HOT_PARTITION);
>> + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, bb);
>> + }
>> }
>>
>> + /* Ensure that no cold bbs dominate hot bbs. This could happen as a result of
>> + several different possibilities. One is that there are edge weight insanities
>> + due to optimization phases that do not properly update basic block profile
>> + counts. The second is that the entry of the function may not be hot, because
>> + it is entered fewer times than the number of profile training runs, but there
>> + is a loop inside the function that causes blocks within the function to be
>> + above the threshold for hotness. */
>> + if (cold_bb_count)
>> + {
>> + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS);
>> +
>> + if (dom_calculated_here)
>> + calculate_dominance_info (CDI_DOMINATORS);
>> +
>> + /* Keep examining hot bbs until we have either checked them all, or
>> + re-marked all cold bbs hot. */
>> + while (! VEC_empty (basic_block, bbs_in_hot_partition)
>> + && cold_bb_count)
>> + {
>> + basic_block dom_bb;
>> +
>> + bb = VEC_pop (basic_block, bbs_in_hot_partition);
>> + dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb);
>> +
>> + /* If bb's immediate dominator is also hot then it is ok. */
>> + if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION)
>> + continue;
>> +
>> + /* We have a hot bb with an immediate dominator that is cold.
>> + The dominator needs to be re-marked to hot. */
>> + BB_SET_PARTITION (dom_bb, BB_HOT_PARTITION);
>> + cold_bb_count--;
>> +
>> + /* Now we need to examine newly-hot dom_bb to see if it is also
>> + dominated by a cold bb. */
>> + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, dom_bb);
>> +
>> + /* We should also adjust any cold blocks that the newly-hot bb
>> + feeds and see if it makes sense to re-mark those as hot as
>> + well. */
>> + VEC_safe_push (basic_block, heap, bbs_newly_hot, dom_bb);
>> + while (! VEC_empty (basic_block, bbs_newly_hot))
>> + {
>> + basic_block new_hot_bb = VEC_pop (basic_block, bbs_newly_hot);
>> + /* Examine all successors of this newly-hot bb to see if they
>> + are cold and should be re-marked as hot. */
>> + FOR_EACH_EDGE (e, ei, new_hot_bb->succs)
>> + {
>> + bool any_cold_preds = false;
>> + basic_block succ = e->dest;
>> + if (BB_PARTITION (succ) != BB_COLD_PARTITION)
>> + continue;
>> + /* Does this block have any cold predecessors now? */
>> + FOR_EACH_EDGE (e2, ei2, succ->preds)
>> + {
>> + if (BB_PARTITION (e2->src) == BB_COLD_PARTITION)
>> + {
>> + any_cold_preds = true;
>> + break;
>> + }
>> + }
>> + if (any_cold_preds)
>> + continue;
>> +
>> + /* Here we have a successor of newly-hot bb that is cold
>> + but no longer has any cold precessessors. Since the original
>> + assignment of our newly-hot bb was incorrect, this successor's
>> + assignment as cold is also suspect. Go ahead and re-mark it
>> + as hot now too. Better heuristics may be in order here. */
>> + BB_SET_PARTITION (succ, BB_HOT_PARTITION);
>> + cold_bb_count--;
>> + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, succ);
>> + /* Examine this successor as a newly-hot bb. */
>> + VEC_safe_push (basic_block, heap, bbs_newly_hot, succ);
>> + }
>> + }
>> + }
>> +
>> + if (dom_calculated_here)
>> + free_dominance_info (CDI_DOMINATORS);
>> + }
>> +
>> /* The format of .gcc_except_table does not allow landing pads to
>> be in a different partition as the throw. Fix this by either
>> moving or duplicating the landing pads. */
>> @@ -1766,10 +1858,10 @@ fix_up_fall_thru_edges (void)
>> new_bb->aux = cur_bb->aux;
>> cur_bb->aux = new_bb;
>>
>> - /* Make sure new fall-through bb is in same
>> - partition as bb it's falling through from. */
>> + /* This is done by force_nonfallthru_and_redirect. */
>> + gcc_assert (BB_PARTITION (new_bb)
>> + == BB_PARTITION (cur_bb));
>>
>> - BB_COPY_PARTITION (new_bb, cur_bb);
>> single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
>> }
>> else
>> @@ -2067,7 +2159,10 @@ add_reg_crossing_jump_notes (void)
>> FOR_EACH_BB (bb)
>> FOR_EACH_EDGE (e, ei, bb->succs)
>> if ((e->flags & EDGE_CROSSING)
>> - && JUMP_P (BB_END (e->src)))
>> + && JUMP_P (BB_END (e->src))
>> + /* Some notes were added during fix_up_fall_thru_edges, via
>> + force_nonfallthru_and_redirect. */
>> + && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX))
>> add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
>> }
>>
>> @@ -2160,7 +2255,7 @@ reorder_basic_blocks (void)
>> dump_flow_info (dump_file, dump_flags);
>> }
>>
>> - if (flag_reorder_blocks_and_partition)
>> + if (crtl->has_bb_partition)
>> verify_hot_cold_block_grouping ();
>> }
>>
>> @@ -2172,14 +2267,14 @@ reorder_basic_blocks (void)
>> encountering this note will make the compiler switch between the
>> hot and cold text sections. */
>>
>> -static void
>> +void
>> insert_section_boundary_note (void)
>> {
>> basic_block bb;
>> rtx new_note;
>> int first_partition = 0;
>>
>> - if (!flag_reorder_blocks_and_partition)
>> + if (!crtl->has_bb_partition)
>> return;
>>
>> FOR_EACH_BB (bb)
>> @@ -2222,10 +2317,8 @@ rest_of_handle_reorder_blocks (void)
>> FOR_EACH_BB (bb)
>> if (bb->next_bb != EXIT_BLOCK_PTR)
>> bb->aux = bb->next_bb;
>> - cfg_layout_finalize ();
>> + cfg_layout_finalize (true);
>>
>> - /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */
>> - insert_section_boundary_note ();
>> return 0;
>> }
>>
>> @@ -2366,7 +2459,7 @@ duplicate_computed_gotos (void)
>> }
>>
>> done:
>> - cfg_layout_finalize ();
>> + cfg_layout_finalize (false);
>>
>> BITMAP_FREE (candidates);
>> return 0;
>> @@ -2511,6 +2604,8 @@ partition_hot_cold_basic_blocks (void)
>> if (crossing_edges == NULL)
>> return 0;
>>
>> + crtl->has_bb_partition = true;
>> +
>> /* Make sure the source of any crossing edge ends in a jump and the
>> destination of any crossing edge has a label. */
>> add_labels_and_missing_jumps (crossing_edges);
>> Index: bb-reorder.h
>> ===================================================================
>> --- bb-reorder.h (revision 193376)
>> +++ bb-reorder.h (working copy)
>> @@ -36,4 +36,8 @@ extern struct target_bb_reorder *this_target_bb_re
>>
>> extern int get_uncond_jump_length (void);
>>
>> +extern void insert_section_boundary_note (void);
>> +
>> +extern void emit_barrier_after_bb (basic_block bb);
>> +
>> #endif
>> Index: basic-block.h
>> ===================================================================
>> --- basic-block.h (revision 193376)
>> +++ basic-block.h (working copy)
>> @@ -806,6 +806,7 @@ extern basic_block force_nonfallthru_and_redirect
>> extern bool contains_no_active_insn_p (const_basic_block);
>> extern bool forwarder_block_p (const_basic_block);
>> extern bool can_fallthru (basic_block, basic_block);
>> +extern void fixup_partitions (void);
>>
>> /* In cfgbuild.c. */
>> extern void find_many_sub_basic_blocks (sbitmap);
>> Index: cfgrtl.c
>> ===================================================================
>> --- cfgrtl.c (revision 193376)
>> +++ cfgrtl.c (working copy)
>> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3. If not see
>> #include "tree.h"
>> #include "hard-reg-set.h"
>> #include "basic-block.h"
>> +#include "bb-reorder.h"
>> #include "regs.h"
>> #include "flags.h"
>> #include "function.h"
>> @@ -67,11 +68,12 @@ along with GCC; see the file COPYING3. If not see
>> Only applicable if the CFG is in cfglayout mode. */
>> static GTY(()) rtx cfg_layout_function_footer;
>> static GTY(()) rtx cfg_layout_function_header;
>> +static bool had_sec_boundary_notes;
>>
>> static rtx skip_insns_after_block (basic_block);
>> static void record_effective_endpoints (void);
>> static rtx label_for_bb (basic_block);
>> -static void fixup_reorder_chain (void);
>> +static void fixup_reorder_chain (bool finalize_reorder_blocks);
>>
>> void verify_insn_chain (void);
>> static void fixup_fallthru_exit_predecessor (void);
>> @@ -976,8 +978,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc
>> partition boundaries). See the comments at the top of
>> bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */
>>
>> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
>> - || BB_PARTITION (src) != BB_PARTITION (target))
>> + if (BB_PARTITION (src) != BB_PARTITION (target))
>> return NULL;
>>
>> /* We can replace or remove a complex jump only when we have exactly
>> @@ -1286,6 +1287,71 @@ redirect_branch_edge (edge e, basic_block target)
>> return e;
>> }
>>
>> +/* Called when edge E has been redirected to a new destination,
>> + in order to update the region crossing flag on the edge and
>> + jump. */
>> +
>> +static void
>> +fixup_partition_crossing (edge e, basic_block target)
>> +{
>> + rtx note;
>> +
>> + gcc_assert (e->dest == target);
>> +
>> + if (e->src == ENTRY_BLOCK_PTR || target == EXIT_BLOCK_PTR)
>> + return;
>> + /* If we redirected an existing edge, it may already be marked
>> + crossing, even though the new src is missing a reg crossing note.
>> + But make sure reg crossing note doesn't already exist before
>> + inserting. */
>> + if (BB_PARTITION (e->src) != BB_PARTITION (target))
>> + {
>> + e->flags |= EDGE_CROSSING;
>> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
>> + if (JUMP_P (BB_END (e->src))
>> + && !note)
>> + add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
>> + }
>> + else if (BB_PARTITION (e->src) == BB_PARTITION (target))
>> + {
>> + e->flags &= ~EDGE_CROSSING;
>> + /* Remove the region crossing note from jump at end of
>> + e->src if it exists. */
>> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
>> + if (note)
>> + remove_note (BB_END (e->src), note);
>> + }
>> +}
>> +
>> +/* Called when block BB has been reassigned to a different partition,
>> + to ensure that the region crossing attributes are updated. */
>> +
>> +static void
>> +fixup_bb_partition (basic_block bb)
>> +{
>> + edge e;
>> + edge_iterator ei;
>> +
>> + /* Now need to make bb's pred edges non-region crossing. */
>> + FOR_EACH_EDGE (e, ei, bb->preds)
>> + {
>> + fixup_partition_crossing (e, e->dest);
>> + }
>> +
>> + /* Possibly need to make bb's successor edges region crossing,
>> + or remove stale region crossing. */
>> + FOR_EACH_EDGE (e, ei, bb->succs)
>> + {
>> + if ((e->flags & EDGE_FALLTHRU)
>> + && BB_PARTITION (bb) != BB_PARTITION (e->dest)
>> + && e->dest != EXIT_BLOCK_PTR)
>> + /* force_nonfallthru_and_redirect calls fixup_partition_crossing. */
>> + force_nonfallthru (e);
>> + else
>> + fixup_partition_crossing (e, e->dest);
>> + }
>> +}
>> +
>> /* Attempt to change code to redirect edge E to TARGET. Don't do that on
>> expense of adding new instructions or reordering basic blocks.
>>
>> @@ -1302,16 +1368,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block
>> {
>> edge ret;
>> basic_block src = e->src;
>> + basic_block dest = e->dest;
>>
>> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
>> return NULL;
>>
>> - if (e->dest == target)
>> + if (dest == target)
>> return e;
>>
>> if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL)
>> {
>> df_set_bb_dirty (src);
>> + fixup_partition_crossing (ret, target);
>> return ret;
>> }
>>
>> @@ -1320,6 +1388,7 @@ rtl_redirect_edge_and_branch (edge e, basic_block
>> return NULL;
>>
>> df_set_bb_dirty (src);
>> + fixup_partition_crossing (ret, target);
>> return ret;
>> }
>>
>> @@ -1454,18 +1523,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
>> /* Make sure new block ends up in correct hot/cold section. */
>>
>> BB_COPY_PARTITION (jump_block, e->src);
>> - if (flag_reorder_blocks_and_partition
>> - && targetm_common.have_named_sections
>> - && JUMP_P (BB_END (jump_block))
>> - && !any_condjump_p (BB_END (jump_block))
>> - && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING))
>> - add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX);
>>
>> /* Wire edge in. */
>> new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU);
>> new_edge->probability = probability;
>> new_edge->count = count;
>>
>> + /* If e->src was previously region crossing, it no longer is
>> + and the reg crossing note should be removed. */
>> + fixup_partition_crossing (new_edge, jump_block);
>> +
>> /* Redirect old edge. */
>> redirect_edge_pred (e, jump_block);
>> e->probability = REG_BR_PROB_BASE;
>> @@ -1521,13 +1588,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
>> LABEL_NUSES (label)++;
>> }
>>
>> - emit_barrier_after (BB_END (jump_block));
>> + /* We might be in cfg layout mode, and if so, the following routine will
>> + insert the barrier correctly. */
>> + emit_barrier_after_bb (jump_block);
>> redirect_edge_succ_nodup (e, target);
>>
>> if (abnormal_edge_flags)
>> make_edge (src, target, abnormal_edge_flags);
>>
>> df_mark_solutions_dirty ();
>> + fixup_partition_crossing (e, target);
>> return new_bb;
>> }
>>
>> @@ -1626,7 +1696,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
>> static basic_block
>> rtl_split_edge (edge edge_in)
>> {
>> - basic_block bb;
>> + basic_block bb, new_bb;
>> rtx before;
>>
>> /* Abnormal edges cannot be split. */
>> @@ -1659,12 +1729,26 @@ rtl_split_edge (edge edge_in)
>> else
>> {
>> bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
>> - /* ??? Why not edge_in->dest->prev_bb here? */
>> - BB_COPY_PARTITION (bb, edge_in->dest);
>> + if (edge_in->src == ENTRY_BLOCK_PTR)
>> + BB_COPY_PARTITION (bb, edge_in->dest);
>> + else
>> + /* Put the split bb into the src partition, to avoid creating
>> + a situation where a cold bb dominates a hot bb, in the case
>> + where src is cold and dest is hot. The src will dominate
>> + the new bb (whereas it might not have dominated dest). */
>> + BB_COPY_PARTITION (bb, edge_in->src);
>> }
>>
>> make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU);
>>
>> + /* Can't allow a region crossing edge to be fallthrough. */
>> + if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest)
>> + && edge_in->dest != EXIT_BLOCK_PTR)
>> + {
>> + new_bb = force_nonfallthru (single_succ_edge (bb));
>> + gcc_assert (!new_bb);
>> + }
>> +
>> /* For non-fallthru edges, we must adjust the predecessor's
>> jump instruction to target our new block. */
>> if ((edge_in->flags & EDGE_FALLTHRU) == 0)
>> @@ -1777,17 +1861,13 @@ commit_one_edge_insertion (edge e)
>> else
>> {
>> bb = split_edge (e);
>> - after = BB_END (bb);
>>
>> - if (flag_reorder_blocks_and_partition
>> - && targetm_common.have_named_sections
>> - && e->src != ENTRY_BLOCK_PTR
>> - && BB_PARTITION (e->src) == BB_COLD_PARTITION
>> - && !(e->flags & EDGE_CROSSING)
>> - && JUMP_P (after)
>> - && !any_condjump_p (after)
>> - && (single_succ_edge (bb)->flags & EDGE_CROSSING))
>> - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX);
>> + /* If e crossed a partition boundary, we needed to make bb end in
>> + a region-crossing jump, even though it was originally fallthru. */
>> + if (JUMP_P (BB_END (bb)))
>> + before = BB_END (bb);
>> + else
>> + after = BB_END (bb);
>> }
>>
>> /* Now that we've found the spot, do the insertion. */
>> @@ -1827,6 +1907,14 @@ commit_edge_insertions (void)
>> {
>> basic_block bb;
>>
>> + /* Optimization passes that invoke this routine can cause hot blocks
>> + previously reached by both hot and cold blocks to become dominated only
>> + by cold blocks. This will cause the verification below to fail,
>> + and lead to now cold code in the hot section. In some cases this
>> + may only be visible after newly unreachable blocks are deleted,
>> + which will be done by fixup_partitions. */
>> + fixup_partitions ();
>> +
>> #ifdef ENABLE_CHECKING
>> verify_flow_info ();
>> #endif
>> @@ -2028,7 +2116,75 @@ get_last_bb_insn (basic_block bb)
>>
>> return end;
>> }
>> -
>> +
>> +/* Perform cleanup on the hot/cold bb partitioning after optimization
>> + passes that modify the cfg. */
>> +
>> +void
>> +fixup_partitions (void)
>> +{
>> + basic_block bb;
>> +
>> + if (!crtl->has_bb_partition)
>> + return;
>> +
>> + /* Delete any blocks that became unreachable and weren't
>> + already cleaned up, for example during edge forwarding
>> + and convert_jumps_to_returns. This will expose more
>> + opportunities for fixing the partition boundaries here.
>> + Also, the calculation of the dominance graph during verification
>> + will assert if there are unreachable nodes. */
>> + delete_unreachable_blocks ();
>> +
>> + /* If there are partitions, do a sanity check on them: A basic block in
>> + a cold partition cannot dominate a basic block in a hot partition.
>> + Fixup any that now violate this requirement, as a result of edge
>> + forwarding and unreachable block deletion. */
>> + VEC (basic_block, heap) *bbs_in_cold_partition = NULL;
>> + VEC (basic_block, heap) *bbs_to_fix = NULL;
>> + FOR_EACH_BB (bb)
>> + if ((BB_PARTITION (bb) == BB_COLD_PARTITION))
>> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb);
>> + if (! VEC_empty (basic_block, bbs_in_cold_partition))
>> + {
>> + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS);
>> + basic_block son;
>> +
>> + if (dom_calculated_here)
>> + calculate_dominance_info (CDI_DOMINATORS);
>> +
>> + while (! VEC_empty (basic_block, bbs_in_cold_partition))
>> + {
>> + bb = VEC_pop (basic_block, bbs_in_cold_partition);
>> + /* If bb is not yet cold (because it was added below as
>> + a block dominated by a cold bb) then mark it cold here. */
>> + if ((BB_PARTITION (bb) != BB_COLD_PARTITION))
>> + {
>> + BB_SET_PARTITION (bb, BB_COLD_PARTITION);
>> + VEC_safe_push (basic_block, heap, bbs_to_fix, bb);
>> + }
>> + /* Any blocks dominated by a block in the cold section
>> + must also be cold. */
>> + for (son = first_dom_son (CDI_DOMINATORS, bb);
>> + son;
>> + son = next_dom_son (CDI_DOMINATORS, son))
>> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son);
>> + }
>> +
>> + if (dom_calculated_here)
>> + free_dominance_info (CDI_DOMINATORS);
>> + }
>> +
>> + /* Do the partition fixup after all necessary blocks have been converted to
>> + cold, so that we only update the region crossings the minimum number of
>> + places, which can require forcing edges to be non fallthru. */
>> + while (! VEC_empty (basic_block, bbs_to_fix))
>> + {
>> + bb = VEC_pop (basic_block, bbs_to_fix);
>> + fixup_bb_partition (bb);
>> + }
>> +}
>> +
>> /* Verify the CFG and RTL consistency common for both underlying RTL and
>> cfglayout RTL.
>>
>> @@ -2052,6 +2208,7 @@ rtl_verify_flow_info_1 (void)
>> rtx x;
>> int err = 0;
>> basic_block bb;
>> + bool have_partitions = false;
>>
>> /* Check the general integrity of the basic blocks. */
>> FOR_EACH_BB_REVERSE (bb)
>> @@ -2169,6 +2326,8 @@ rtl_verify_flow_info_1 (void)
>>
>> if (e->flags & EDGE_ABNORMAL)
>> n_abnormal++;
>> +
>> + have_partitions |= is_crossing;
>> }
>>
>> if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX))
>> @@ -2293,6 +2452,40 @@ rtl_verify_flow_info_1 (void)
>> }
>> }
>>
>> + /* If there are partitions, do a sanity check on them: A basic block in
>> + a cold partition cannot dominate a basic block in a hot partition. */
>> + VEC (basic_block, heap) *bbs_in_cold_partition = NULL;
>> + if (have_partitions && !err)
>> + FOR_EACH_BB (bb)
>> + if ((BB_PARTITION (bb) == BB_COLD_PARTITION))
>> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb);
>> + if (! VEC_empty (basic_block, bbs_in_cold_partition))
>> + {
>> + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS);
>> + basic_block son;
>> +
>> + if (dom_calculated_here)
>> + calculate_dominance_info (CDI_DOMINATORS);
>> +
>> + while (! VEC_empty (basic_block, bbs_in_cold_partition))
>> + {
>> + bb = VEC_pop (basic_block, bbs_in_cold_partition);
>> + if ((BB_PARTITION (bb) != BB_COLD_PARTITION))
>> + {
>> + error ("non-cold basic block %d dominated "
>> + "by a block in the cold partition", bb->index);
>> + err = 1;
>> + }
>> + for (son = first_dom_son (CDI_DOMINATORS, bb);
>> + son;
>> + son = next_dom_son (CDI_DOMINATORS, son))
>> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son);
>> + }
>> +
>> + if (dom_calculated_here)
>> + free_dominance_info (CDI_DOMINATORS);
>> + }
>> +
>> /* Clean up. */
>> return err;
>> }
>> @@ -2965,14 +3158,41 @@ record_effective_endpoints (void)
>> else
>> cfg_layout_function_header = NULL_RTX;
>>
>> + had_sec_boundary_notes = false;
>> +
>> next_insn = get_insns ();
>> FOR_EACH_BB (bb)
>> {
>> rtx end;
>>
>> if (PREV_INSN (BB_HEAD (bb)) && next_insn != BB_HEAD (bb))
>> - BB_HEADER (bb) = unlink_insn_chain (next_insn,
>> - PREV_INSN (BB_HEAD (bb)));
>> + {
>> + /* Rather than try to keep section boundary notes incrementally
>> + up-to-date through cfg layout optimizations, simply remove them
>> + and flag that they should be re-inserted when exiting
>> + cfg layout mode. */
>> + rtx check_insn = next_insn;
>> + while (check_insn)
>> + {
>> + if (NOTE_P (check_insn)
>> + && NOTE_KIND (check_insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS)
>> + {
>> + had_sec_boundary_notes |= true;
>> + /* Remove note from chain. Grab new next_insn first. */
>> + if (next_insn == check_insn)
>> + next_insn = NEXT_INSN (check_insn);
>> + /* Delete note. */
>> + delete_insn (check_insn);
>> + /* There will only be one. */
>> + break;
>> + }
>> + check_insn = NEXT_INSN (check_insn);
>> + }
>> + /* If we still have header instructions left after above loop. */
>> + if (next_insn != BB_HEAD (bb))
>> + BB_HEADER (bb) = unlink_insn_chain (next_insn,
>> + PREV_INSN (BB_HEAD (bb)));
>> + }
>> end = skip_insns_after_block (bb);
>> if (NEXT_INSN (BB_END (bb)) && BB_END (bb) != end)
>> BB_FOOTER (bb) = unlink_insn_chain (NEXT_INSN (BB_END (bb)), end);
>> @@ -3000,7 +3220,7 @@ outof_cfg_layout_mode (void)
>> if (bb->next_bb != EXIT_BLOCK_PTR)
>> bb->aux = bb->next_bb;
>>
>> - cfg_layout_finalize ();
>> + cfg_layout_finalize (false);
>>
>> return 0;
>> }
>> @@ -3120,10 +3340,13 @@ relink_block_chain (bool stay_in_cfglayout_mode)
>> }
>>
>>
>> -/* Given a reorder chain, rearrange the code to match. */
>> +/* Given a reorder chain, rearrange the code to match. If
>> + this is called when we will FINALIZE_REORDER_BLOCKS, or when
>> + section boundary notes were removed on entry to cfg layout
>> + mode, insert section boundary notes here. */
>>
>> static void
>> -fixup_reorder_chain (void)
>> +fixup_reorder_chain (bool finalize_reorder_blocks)
>> {
>> basic_block bb;
>> rtx insn = NULL;
>> @@ -3150,7 +3373,7 @@ static void
>> PREV_INSN (BB_HEADER (bb)) = insn;
>> insn = BB_HEADER (bb);
>> while (NEXT_INSN (insn))
>> - insn = NEXT_INSN (insn);
>> + insn = NEXT_INSN (insn);
>> }
>> if (insn)
>> NEXT_INSN (insn) = BB_HEAD (bb);
>> @@ -3175,6 +3398,11 @@ static void
>> insn = NEXT_INSN (insn);
>>
>> set_last_insn (insn);
>> +
>> + /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */
>> + if (had_sec_boundary_notes || finalize_reorder_blocks)
>> + insert_section_boundary_note ();
>> +
>> #ifdef ENABLE_CHECKING
>> verify_insn_chain ();
>> #endif
>> @@ -3187,7 +3415,7 @@ static void
>> edge e_fall, e_taken, e;
>> rtx bb_end_insn;
>> rtx ret_label = NULL_RTX;
>> - basic_block nb, src_bb;
>> + basic_block nb;
>> edge_iterator ei;
>>
>> if (EDGE_COUNT (bb->succs) == 0)
>> @@ -3322,7 +3550,6 @@ static void
>> /* We got here if we need to add a new jump insn.
>> Note force_nonfallthru can delete E_FALL and thus we have to
>> save E_FALL->src prior to the call to force_nonfallthru. */
>> - src_bb = e_fall->src;
>> nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
>> if (nb)
>> {
>> @@ -3330,17 +3557,6 @@ static void
>> bb->aux = nb;
>> /* Don't process this new block. */
>> bb = nb;
>> -
>> - /* Make sure new bb is tagged for correct section (same as
>> - fall-thru source, since you cannot fall-thru across
>> - section boundaries). */
>> - BB_COPY_PARTITION (src_bb, single_pred (bb));
>> - if (flag_reorder_blocks_and_partition
>> - && targetm_common.have_named_sections
>> - && JUMP_P (BB_END (bb))
>> - && !any_condjump_p (BB_END (bb))
>> - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING))
>> - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX);
>> }
>> }
>>
>> @@ -3644,10 +3860,11 @@ duplicate_insn_chain (rtx from, rtx to)
>> case NOTE_INSN_FUNCTION_BEG:
>> /* There is always just single entry to function. */
>> case NOTE_INSN_BASIC_BLOCK:
>> + /* We should only switch text sections once. */
>> + case NOTE_INSN_SWITCH_TEXT_SECTIONS:
>> break;
>>
>> case NOTE_INSN_EPILOGUE_BEG:
>> - case NOTE_INSN_SWITCH_TEXT_SECTIONS:
>> emit_note_copy (insn);
>> break;
>>
>> @@ -3759,10 +3976,13 @@ break_superblocks (void)
>> }
>>
>> /* Finalize the changes: reorder insn list according to the sequence specified
>> - by aux pointers, enter compensation code, rebuild scope forest. */
>> + by aux pointers, enter compensation code, rebuild scope forest. If
>> + this is called when we will FINALIZE_REORDER_BLOCKS, indicate that
>> + to fixup_reorder_chain so that it can insert the proper switch text
>> + section notes. */
>>
>> void
>> -cfg_layout_finalize (void)
>> +cfg_layout_finalize (bool finalize_reorder_blocks)
>> {
>> #ifdef ENABLE_CHECKING
>> verify_flow_info ();
>> @@ -3775,7 +3995,7 @@ void
>> #endif
>> )
>> fixup_fallthru_exit_predecessor ();
>> - fixup_reorder_chain ();
>> + fixup_reorder_chain (finalize_reorder_blocks);
>>
>> rebuild_jump_labels (get_insns ());
>> delete_dead_jumptables ();
>> @@ -4454,8 +4674,7 @@ rtl_can_remove_branch_p (const_edge e)
>> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
>> return false;
>>
>> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
>> - || BB_PARTITION (src) != BB_PARTITION (target))
>> + if (BB_PARTITION (src) != BB_PARTITION (target))
>> return false;
>>
>> if (!onlyjump_p (insn)
>>
>> --
>> This patch is available for review at http://codereview.appspot.com/6823047
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413