Fix PR 53743 and other -freorder-blocks-and-partition failures
Teresa Johnson
tejohnson@google.com
Wed May 29 14:58:00 GMT 2013
On Thu, May 23, 2013 at 6:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Wed, May 22, 2013 at 2:05 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> Revised patch included below. The spacing of my pasted-in patch text
>> looks funky again; let me know if you want the patch as an attachment
>> instead.
>>
>> I addressed all of Steven's comments except the suggestion to use
>> gcc_assert instead of error() in verify_hot_cold_block_grouping(). I kept
>> error() there to stay consistent with the rest of the verify_flow_info
>> subroutines (let me know if this is ok).
>
> I fixed this issue too, which was actually in
> insert_section_boundary_note(), so that it gcc_asserts more
> efficiently as suggested. Retested, latest patch below.
>
> Honza, would you be able to review the patch?
Ping. Still needs a global maintainer to review and approve.
Also, I submitted a PR for the debug range issue:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57451
Thanks!
Teresa
>
> Thanks!
> Teresa
>
>>
>> The other main changes:
>> (1) Added several test cases, cloned from the torture subdirectories. I
>> manually built/ran those with FDO and -freorder-blocks-and-partition, using
>> both the current trunk and my fixed trunk compiler, and was able to expose
>> some failures that I then fixed.
>> (2) Changed the existing tree-prof tests that used
>> -freorder-blocks-and-partition to be built with -O2 instead of -O, so that
>> partitioning actually kicks in.
>> (3) Fixed a couple of failures in the new verify_hot_cold_block_grouping()
>> checks, exposed by the torture tests I ran manually with splitting (2 of
>> the tests cloned to tree-prof in this patch). One was in computed goto,
>> where we were too aggressive about cloning crossing edges. The other was in
>> rtl_split_edge, called from the "stack" pass, which was not inserting the
>> new bb in the correct partition since bb layout is complete at that point.
>>
>> Re-tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
>> builds and regression testing. Re-built/ran cpu2006int with profile
>> feedback and -freorder-blocks-and-partition enabled.
>>
>> Ok for trunk?
>>
>> Thanks!
>> Teresa
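
For readers following the thread: the layout invariant behind item (3) is
that, once bb reordering is complete, the block chain may change partition
at most once, and the section-switch note goes in front of the first block
of the second partition. A minimal standalone C sketch of that scan follows.
It is not GCC code: the enum and find_section_switch are hypothetical
stand-ins for BB_PARTITION and the loop in insert_section_boundary_note(),
and a second switch is reported through a flag where the patch uses
gcc_assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's partition values; the real
   definitions live in basic-block.h and are not reproduced here.  */
enum partition { PART_UNPARTITIONED = 0, PART_HOT = 1, PART_COLD = 2 };

/* Scan CHAIN (N blocks in layout order).  Return the index of the first
   block whose partition differs from its predecessor's, i.e. where the
   section-switch note would be emitted, or -1 if the chain never changes
   partition.  Set *OK to false if the chain changes partition more than
   once, which the patch treats as an error.  */
static int
find_section_switch (const enum partition *chain, size_t n, bool *ok)
{
  int switch_index = -1;
  enum partition current = PART_UNPARTITIONED;

  assert (chain != NULL || n == 0);
  *ok = true;
  for (size_t i = 0; i < n; i++)
    {
      /* The first block seen defines the starting partition.  */
      if (current == PART_UNPARTITIONED)
        current = chain[i];
      if (chain[i] != current)
        {
          if (switch_index != -1)
            *ok = false;          /* Second switch: hot and cold interleave.  */
          else
            switch_index = (int) i;
          current = chain[i];
        }
    }
  return switch_index;
}
```

A chain such as hot, hot, cold, cold yields index 2 with the flag left true;
hot, cold, hot yields index 1 with the flag cleared, which is the kind of
layout the rtl_split_edge fix is meant to prevent.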
>
> 2013-05-23 Teresa Johnson <tejohnson@google.com>
>
> * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
> as this is now done by redirect_edge_and_branch_force.
> * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
> barriers, and fix interaction with splitting.
> * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
> * cfgcleanup.c (try_forward_edges): Fix early return value to properly
> reflect changes made in the routine.
> * bb-reorder.c (emit_barrier_after_bb): Move to cfgrtl.c.
> (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
> since this is called in cfglayout mode, and replace partition fixup
> with assert as that is now done by force_nonfallthru_and_redirect.
> (add_reg_crossing_jump_notes): Handle the fact that some jumps may
> already be marked with region crossing note.
> (insert_section_boundary_note): Make non-static, gate on flag
> has_bb_partition, rewrite to also check for multiple partitions.
> (rest_of_handle_reorder_blocks): Remove call to
> insert_section_boundary_note, now done later during free_cfg.
> (duplicate_computed_gotos): Don't duplicate partition crossing edge.
> * bb-reorder.h (insert_section_boundary_note): Declare.
> * Makefile.in (cfgrtl.o): Depend on bb-reorder.h.
> * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
> invoke insert_section_boundary_note.
> (try_redirect_by_replacing_jump): Remove unnecessary
> check for region crossing note.
> (fixup_partition_crossing): New function.
> (rtl_redirect_edge_and_branch): Fixup partition boundaries.
> (emit_barrier_after_bb): Move here from bb-reorder.c, handle insertion
> in non-cfglayout mode.
> (force_nonfallthru_and_redirect): Fixup partition boundaries,
> remove old code that tried to do this. Emit barrier correctly
> when we are in cfglayout mode.
> (last_bb_in_partition): New function.
> (rtl_split_edge): Correctly fixup partition boundaries.
> (commit_one_edge_insertion): Remove old code that tried to
> fixup region crossing edge since this is now handled in
> split_block, and set up insertion point correctly since
> block may now end in a jump.
> (verify_hot_cold_block_grouping): Guard against checking when not in
> linearized RTL mode.
> (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
> notes.
> (rtl_verify_flow_info_1): Move verify_hot_cold_block_grouping to
> rtl_verify_flow_info, so not called in cfglayout mode.
> (rtl_verify_flow_info): Move verify_hot_cold_block_grouping here.
> (fixup_reorder_chain): Remove old code that attempted to fixup region
> crossing note as this is now handled in force_nonfallthru_and_redirect.
> (duplicate_insn_chain): Don't duplicate switch section notes.
> (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
> note.
> * basic-block.h (emit_barrier_after_bb): Declare.
> * testsuite/gcc.dg/tree-prof/va-arg-pack-1.c: Cloned from c-torture, made
> into -freorder-blocks-and-partition test.
> * testsuite/gcc.dg/tree-prof/comp-goto-1.c: Ditto.
> * testsuite/gcc.dg/tree-prof/20041218-1.c: Ditto.
> * testsuite/gcc.dg/tree-prof/pr52027.c: Use -O2.
> * testsuite/gcc.dg/tree-prof/pr50907.c: Ditto.
> * testsuite/gcc.dg/tree-prof/pr45354.c: Ditto.
> * testsuite/g++.dg/tree-prof/partition2.C: Ditto.
> * testsuite/g++.dg/tree-prof/partition3.C: Ditto.
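
The ChangeLog above adds fixup_partition_crossing to cfgrtl.c and calls it
from the edge redirection routines. The rule it applies once an edge has a
new destination can be modeled in plain C as below. This is a toy model
under stated assumptions, not the patch itself: the toy_* types and function
are hypothetical, partitions are plain ints, the REG_CROSSING_JUMP note is a
bool on the block, and the early return for the ENTRY/EXIT blocks in the
real function is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's edge and basic_block.  */
struct toy_edge
{
  int dest_partition;   /* Partition of the edge's destination block.  */
  bool crossing;        /* Models the EDGE_CROSSING flag.  */
};

struct toy_block
{
  int partition;        /* Partition of this (source) block.  */
  bool ends_in_jump;    /* Models JUMP_P (BB_END (bb)).  */
  bool crossing_note;   /* Models a REG_CROSSING_JUMP note on the jump.  */
  struct toy_edge *succs;
  size_t n_succs;
};

/* Update the crossing flag on successor WHICH of SRC, and the crossing
   note on SRC's jump, after that edge was given a new destination.  */
static void
toy_fixup_partition_crossing (struct toy_block *src, size_t which)
{
  assert (which < src->n_succs);
  struct toy_edge *e = &src->succs[which];

  if (src->partition != e->dest_partition)
    {
      /* The edge now crosses: flag it and make sure the jump is marked.  */
      e->crossing = true;
      if (src->ends_in_jump)
        src->crossing_note = true;
    }
  else
    {
      /* The edge no longer crosses.  Drop the note only if no other
         successor of SRC still crosses a partition boundary.  */
      e->crossing = false;
      if (src->crossing_note)
        {
          bool has_crossing_succ = false;
          for (size_t i = 0; i < src->n_succs; i++)
            if (src->succs[i].crossing)
              has_crossing_succ = true;
          if (!has_crossing_succ)
            src->crossing_note = false;
        }
    }
}
```

Keeping the flag and the note in sync in one place is what lets the patch
replace the scattered BB_COPY_PARTITION/add_reg_note fixups in the callers
with assertions.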
>
> Index: ifcvt.c
> ===================================================================
> --- ifcvt.c (revision 199014)
> +++ ifcvt.c (working copy)
> @@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg
> if (new_bb)
> {
> df_bb_replace (then_bb_index, new_bb);
> - /* Since the fallthru edge was redirected from test_bb to new_bb,
> - we need to ensure that new_bb is in the same partition as
> - test bb (you can not fall through across section boundaries). */
> - BB_COPY_PARTITION (new_bb, test_bb);
> + /* This should have been done above via force_nonfallthru_and_redirect
> + (possibly called from redirect_edge_and_branch_force). */
> + gcc_checking_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
> }
>
> num_true_changes++;
> Index: function.c
> ===================================================================
> --- function.c (revision 199014)
> +++ function.c (working copy)
> @@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void)
> break;
> if (e)
> {
> - copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
> - NULL_RTX, e->src);
> + /* Make sure we insert after any barriers. */
> + rtx end = get_last_bb_insn (e->src);
> + copy_bb = create_basic_block (NEXT_INSN (end),
> + NULL_RTX, e->src);
> BB_COPY_PARTITION (copy_bb, e->src);
> }
> else
> @@ -6538,7 +6540,7 @@ epilogue_done:
> basic_block simple_return_block_cold = NULL;
> edge pending_edge_hot = NULL;
> edge pending_edge_cold = NULL;
> - basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
> + basic_block exit_pred;
> int i;
>
> gcc_assert (entry_edge != orig_entry_edge);
> @@ -6566,6 +6568,12 @@ epilogue_done:
> else
> pending_edge_cold = e;
> }
> +
> + /* Save a pointer to the exit's predecessor BB for use in
> + inserting new BBs at the end of the function. Do this
> + after the call to split_block above which may split
> + the original exit pred. */
> + exit_pred = EXIT_BLOCK_PTR->prev_bb;
>
> FOR_EACH_VEC_ELT (unconverted_simple_returns, i, e)
> {
> Index: emit-rtl.c
> ===================================================================
> --- emit-rtl.c (revision 199014)
> +++ emit-rtl.c (working copy)
> @@ -3574,6 +3574,7 @@ try_split (rtx pat, rtx trial, int last)
> break;
>
> case REG_NON_LOCAL_GOTO:
> + case REG_CROSSING_JUMP:
> for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn))
> {
> if (JUMP_P (insn))
> Index: cfgcleanup.c
> ===================================================================
> --- cfgcleanup.c (revision 199014)
> +++ cfgcleanup.c (working copy)
> @@ -456,7 +456,7 @@ try_forward_edges (int mode, basic_block b)
>
> if (first != EXIT_BLOCK_PTR
> && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX))
> - return false;
> + return changed;
>
> while (counter < n_basic_blocks)
> {
> Index: bb-reorder.c
> ===================================================================
> --- bb-reorder.c (revision 199014)
> +++ bb-reorder.c (working copy)
> @@ -1380,15 +1380,6 @@ get_uncond_jump_length (void)
> return length;
> }
>
> -/* Emit a barrier into the footer of BB. */
> -
> -static void
> -emit_barrier_after_bb (basic_block bb)
> -{
> - rtx barrier = emit_barrier_after (BB_END (bb));
> - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
> -}
> -
> /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
> Duplicate the landing pad and split the edges so that no EH edge
> crosses partitions. */
> @@ -1720,8 +1711,7 @@ fix_up_fall_thru_edges (void)
> (i.e. fix it so the fall through does not cross and
> the cond jump does). */
>
> - if (!cond_jump_crosses
> - && cur_bb->aux == cond_jump->dest)
> + if (!cond_jump_crosses)
> {
> /* Find label in fall_thru block. We've already added
> any missing labels, so there must be one. */
> @@ -1765,10 +1755,10 @@ fix_up_fall_thru_edges (void)
> new_bb->aux = cur_bb->aux;
> cur_bb->aux = new_bb;
>
> - /* Make sure new fall-through bb is in same
> - partition as bb it's falling through from. */
> + /* This is done by force_nonfallthru_and_redirect. */
> + gcc_assert (BB_PARTITION (new_bb)
> + == BB_PARTITION (cur_bb));
>
> - BB_COPY_PARTITION (new_bb, cur_bb);
> single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
> }
> else
> @@ -2064,7 +2054,10 @@ add_reg_crossing_jump_notes (void)
> FOR_EACH_BB (bb)
> FOR_EACH_EDGE (e, ei, bb->succs)
> if ((e->flags & EDGE_CROSSING)
> - && JUMP_P (BB_END (e->src)))
> + && JUMP_P (BB_END (e->src))
> + /* Some notes were added during fix_up_fall_thru_edges, via
> + force_nonfallthru_and_redirect. */
> + && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX))
> add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> }
>
> @@ -2133,23 +2126,26 @@ reorder_basic_blocks (void)
> encountering this note will make the compiler switch between the
> hot and cold text sections. */
>
> -static void
> +void
> insert_section_boundary_note (void)
> {
> basic_block bb;
> - int first_partition = 0;
> + bool switched_sections = false;
> + int current_partition = 0;
>
> - if (!flag_reorder_blocks_and_partition)
> + if (!crtl->has_bb_partition)
> return;
>
> FOR_EACH_BB (bb)
> {
> - if (!first_partition)
> - first_partition = BB_PARTITION (bb);
> - if (BB_PARTITION (bb) != first_partition)
> + if (!current_partition)
> + current_partition = BB_PARTITION (bb);
> + if (BB_PARTITION (bb) != current_partition)
> {
> - emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
> - break;
> + gcc_assert (!switched_sections);
> + switched_sections = true;
> + emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
> + current_partition = BB_PARTITION (bb);
> }
> }
> }
> @@ -2180,8 +2176,6 @@ rest_of_handle_reorder_blocks (void)
> bb->aux = bb->next_bb;
> cfg_layout_finalize ();
>
> - /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */
> - insert_section_boundary_note ();
> return 0;
> }
>
> @@ -2315,6 +2309,11 @@ duplicate_computed_gotos (void)
> if (!bitmap_bit_p (candidates, single_succ (bb)->index))
> continue;
>
> + /* Don't duplicate a partition crossing edge, which requires difficult
> + fixup. */
> + if (find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
> + continue;
> +
> new_bb = duplicate_block (single_succ (bb), single_succ_edge (bb), bb);
> new_bb->aux = bb->aux;
> bb->aux = new_bb;
> Index: bb-reorder.h
> ===================================================================
> --- bb-reorder.h (revision 199014)
> +++ bb-reorder.h (working copy)
> @@ -35,4 +35,6 @@ extern struct target_bb_reorder *this_target_bb_re
>
> extern int get_uncond_jump_length (void);
>
> +extern void insert_section_boundary_note (void);
> +
> #endif
> Index: Makefile.in
> ===================================================================
> --- Makefile.in (revision 199014)
> +++ Makefile.in (working copy)
> @@ -3151,7 +3151,7 @@ cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) corety
> $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
> insn-config.h $(EXPR_H) \
> $(CFGLOOP_H) $(OBSTACK_H) $(TARGET_H) $(TREE_H) \
> - $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h
> + $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h bb-reorder.h
> cfganal.o : cfganal.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(BASIC_BLOCK_H) \
> $(TIMEVAR_H) sbitmap.h $(BITMAP_H)
> cfgbuild.o : cfgbuild.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
> Index: cfgrtl.c
> ===================================================================
> --- cfgrtl.c (revision 199014)
> +++ cfgrtl.c (working copy)
> @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see
> #include "tree.h"
> #include "hard-reg-set.h"
> #include "basic-block.h"
> +#include "bb-reorder.h"
> #include "regs.h"
> #include "flags.h"
> #include "function.h"
> @@ -451,6 +452,9 @@ rest_of_pass_free_cfg (void)
> }
> #endif
>
> + if (crtl->has_bb_partition)
> + insert_section_boundary_note ();
> +
> free_bb_for_insn ();
> return 0;
> }
> @@ -981,8 +985,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc
> partition boundaries). See the comments at the top of
> bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */
>
> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
> - || BB_PARTITION (src) != BB_PARTITION (target))
> + if (BB_PARTITION (src) != BB_PARTITION (target))
> return NULL;
>
> /* We can replace or remove a complex jump only when we have exactly
> @@ -1291,6 +1294,53 @@ redirect_branch_edge (edge e, basic_block target)
> return e;
> }
>
> +/* Called when edge E has been redirected to a new destination,
> + in order to update the region crossing flag on the edge and
> + jump. */
> +
> +static void
> +fixup_partition_crossing (edge e)
> +{
> + rtx note;
> +
> + if (e->src == ENTRY_BLOCK_PTR || e->dest == EXIT_BLOCK_PTR)
> + return;
> + /* If we redirected an existing edge, it may already be marked
> + crossing, even though the new src is missing a reg crossing note.
> + But make sure reg crossing note doesn't already exist before
> + inserting. */
> + if (BB_PARTITION (e->src) != BB_PARTITION (e->dest))
> + {
> + e->flags |= EDGE_CROSSING;
> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> + if (JUMP_P (BB_END (e->src))
> + && !note)
> + add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> + }
> + else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
> + {
> + e->flags &= ~EDGE_CROSSING;
> + /* Remove the section crossing note from jump at end of
> + src if it exists, and if no other successors are
> + still crossing. */
> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> + if (note)
> + {
> + bool has_crossing_succ = false;
> + edge e2;
> + edge_iterator ei;
> + FOR_EACH_EDGE (e2, ei, e->src->succs)
> + {
> + has_crossing_succ |= (e2->flags & EDGE_CROSSING);
> + if (has_crossing_succ)
> + break;
> + }
> + if (!has_crossing_succ)
> + remove_note (BB_END (e->src), note);
> + }
> + }
> +}
> +
> /* Attempt to change code to redirect edge E to TARGET. Don't do that on
> expense of adding new instructions or reordering basic blocks.
>
> @@ -1307,16 +1357,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block
> {
> edge ret;
> basic_block src = e->src;
> + basic_block dest = e->dest;
>
> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
> return NULL;
>
> - if (e->dest == target)
> + if (dest == target)
> return e;
>
> if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL)
> {
> df_set_bb_dirty (src);
> + fixup_partition_crossing (ret);
> return ret;
> }
>
> @@ -1325,9 +1377,22 @@ rtl_redirect_edge_and_branch (edge e, basic_block
> return NULL;
>
> df_set_bb_dirty (src);
> + fixup_partition_crossing (ret);
> return ret;
> }
>
> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. */
> +
> +void
> +emit_barrier_after_bb (basic_block bb)
> +{
> + rtx barrier = emit_barrier_after (BB_END (bb));
> + gcc_assert (current_ir_type() == IR_RTL_CFGRTL
> + || current_ir_type () == IR_RTL_CFGLAYOUT);
> + if (current_ir_type () == IR_RTL_CFGLAYOUT)
> + BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
> +}
> +
> /* Like force_nonfallthru below, but additionally performs redirection
> Used by redirect_edge_and_branch_force. JUMP_LABEL is used only
> when redirecting to the EXIT_BLOCK, it is either ret_rtx or
> @@ -1492,12 +1557,6 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
> /* Make sure new block ends up in correct hot/cold section. */
>
> BB_COPY_PARTITION (jump_block, e->src);
> - if (flag_reorder_blocks_and_partition
> - && targetm_common.have_named_sections
> - && JUMP_P (BB_END (jump_block))
> - && !any_condjump_p (BB_END (jump_block))
> - && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING))
> - add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX);
>
> /* Wire edge in. */
> new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU);
> @@ -1508,6 +1567,10 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
> redirect_edge_pred (e, jump_block);
> e->probability = REG_BR_PROB_BASE;
>
> + /* If e->src was previously region crossing, it no longer is
> + and the reg crossing note should be removed. */
> + fixup_partition_crossing (new_edge);
> +
> /* If asm goto has any label refs to target's label,
> add also edge from asm goto bb to target. */
> if (asm_goto_edge)
> @@ -1559,13 +1622,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
> LABEL_NUSES (label)++;
> }
>
> - emit_barrier_after (BB_END (jump_block));
> + /* We might be in cfg layout mode, and if so, the following routine will
> + insert the barrier correctly. */
> + emit_barrier_after_bb (jump_block);
> redirect_edge_succ_nodup (e, target);
>
> if (abnormal_edge_flags)
> make_edge (src, target, abnormal_edge_flags);
>
> df_mark_solutions_dirty ();
> + fixup_partition_crossing (e);
> return new_bb;
> }
>
> @@ -1654,6 +1720,21 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
> return false;
> }
>
> +/* Locate the last bb in the same partition as START_BB. */
> +
> +static basic_block
> +last_bb_in_partition (basic_block start_bb)
> +{
> + basic_block bb;
> + FOR_BB_BETWEEN (bb, start_bb, EXIT_BLOCK_PTR, next_bb)
> + {
> + if (BB_PARTITION (start_bb) != BB_PARTITION (bb->next_bb))
> + return bb;
> + }
> + /* Return bb before EXIT_BLOCK_PTR. */
> + return bb->prev_bb;
> +}
> +
> /* Split a (typically critical) edge. Return the new block.
> The edge must not be abnormal.
>
> @@ -1664,7 +1745,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
> static basic_block
> rtl_split_edge (edge edge_in)
> {
> - basic_block bb;
> + basic_block bb, new_bb;
> rtx before;
>
> /* Abnormal edges cannot be split. */
> @@ -1696,13 +1777,50 @@ rtl_split_edge (edge edge_in)
> }
> else
> {
> - bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
> - /* ??? Why not edge_in->dest->prev_bb here? */
> - BB_COPY_PARTITION (bb, edge_in->dest);
> + if (edge_in->src == ENTRY_BLOCK_PTR)
> + {
> + bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
> + BB_COPY_PARTITION (bb, edge_in->dest);
> + }
> + else
> + {
> + basic_block after = edge_in->dest->prev_bb;
> + /* If this is post-bb reordering, and the edge crosses a partition
> + boundary, the new block needs to be inserted in the bb chain
> + at the end of the src partition (since we put the new bb into
> + that partition, see below). Otherwise we may end up creating
> + an extra partition crossing in the chain, which is illegal.
> + It can't go after the src, because src may have a fall-through
> + to a different block. */
> + if (crtl->bb_reorder_complete
> + && (edge_in->flags & EDGE_CROSSING))
> + {
> + after = last_bb_in_partition (edge_in->src);
> + before = NEXT_INSN (BB_END (after));
> + /* The instruction following the last bb in partition should
> + be a barrier, since it cannot end in a fall-through. */
> + gcc_checking_assert (BARRIER_P (before));
> + before = NEXT_INSN (before);
> + }
> + bb = create_basic_block (before, NULL, after);
> + /* Put the split bb into the src partition, to avoid creating
> + a situation where a cold bb dominates a hot bb, in the case
> + where src is cold and dest is hot. The src will dominate
> + the new bb (whereas it might not have dominated dest). */
> + BB_COPY_PARTITION (bb, edge_in->src);
> + }
> }
>
> make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU);
>
> + /* Can't allow a region crossing edge to be fallthrough. */
> + if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest)
> + && edge_in->dest != EXIT_BLOCK_PTR)
> + {
> + new_bb = force_nonfallthru (single_succ_edge (bb));
> + gcc_assert (!new_bb);
> + }
> +
> /* For non-fallthru edges, we must adjust the predecessor's
> jump instruction to target our new block. */
> if ((edge_in->flags & EDGE_FALLTHRU) == 0)
> @@ -1815,17 +1933,13 @@ commit_one_edge_insertion (edge e)
> else
> {
> bb = split_edge (e);
> - after = BB_END (bb);
>
> - if (flag_reorder_blocks_and_partition
> - && targetm_common.have_named_sections
> - && e->src != ENTRY_BLOCK_PTR
> - && BB_PARTITION (e->src) == BB_COLD_PARTITION
> - && !(e->flags & EDGE_CROSSING)
> - && JUMP_P (after)
> - && !any_condjump_p (after)
> - && (single_succ_edge (bb)->flags & EDGE_CROSSING))
> - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX);
> + /* If E crossed a partition boundary, we needed to make bb end in
> + a region-crossing jump, even though it was originally fallthru. */
> + if (JUMP_P (BB_END (bb)))
> + before = BB_END (bb);
> + else
> + after = BB_END (bb);
> }
>
> /* Now that we've found the spot, do the insertion. */
> @@ -2071,7 +2185,11 @@ verify_hot_cold_block_grouping (void)
> bool switched_sections = false;
> int current_partition = BB_UNPARTITIONED;
>
> - if (!crtl->bb_reorder_complete)
> + /* Even after bb reordering is complete, we go into cfglayout mode
> + again (in compgoto). Ensure we don't call this before going back
> + into linearized RTL when any layout fixes would have been committed. */
> + if (!crtl->bb_reorder_complete
> + || current_ir_type() != IR_RTL_CFGRTL)
> return err;
>
> FOR_EACH_BB (bb)
> @@ -2116,6 +2234,7 @@ rtl_verify_edges (void)
> edge e, fallthru = NULL;
> edge_iterator ei;
> rtx note;
> + bool has_crossing_edge = false;
>
> if (JUMP_P (BB_END (bb))
> && (note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX))
> @@ -2141,6 +2260,7 @@ rtl_verify_edges (void)
> is_crossing = (BB_PARTITION (e->src) != BB_PARTITION (e->dest)
> && e->src != ENTRY_BLOCK_PTR
> && e->dest != EXIT_BLOCK_PTR);
> + has_crossing_edge |= is_crossing;
> if (e->flags & EDGE_CROSSING)
> {
> if (!is_crossing)
> @@ -2160,6 +2280,13 @@ rtl_verify_edges (void)
> e->src->index);
> err = 1;
> }
> + if (JUMP_P (BB_END (bb))
> + && !find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
> + {
> + error ("No region crossing jump at section boundary in bb %i",
> + bb->index);
> + err = 1;
> + }
> }
> else if (is_crossing)
> {
> @@ -2188,6 +2315,15 @@ rtl_verify_edges (void)
> n_abnormal++;
> }
>
> + if (!has_crossing_edge
> + && find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
> + {
> + print_rtl_with_bb (stderr, get_insns (), TDF_RTL | TDF_BLOCKS | TDF_DETAILS);
> + error ("Region crossing jump across same section in bb %i",
> + bb->index);
> + err = 1;
> + }
> +
> if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX))
> {
> error ("missing REG_EH_REGION note at the end of bb %i", bb->index);
> @@ -2395,8 +2531,6 @@ rtl_verify_flow_info_1 (void)
>
> err |= rtl_verify_edges ();
>
> - err |= verify_hot_cold_block_grouping();
> -
> return err;
> }
>
> @@ -2642,6 +2776,8 @@ rtl_verify_flow_info (void)
>
> err |= rtl_verify_bb_layout ();
>
> + err |= verify_hot_cold_block_grouping ();
> +
> return err;
> }
>
> @@ -3343,7 +3479,7 @@ fixup_reorder_chain (void)
> edge e_fall, e_taken, e;
> rtx bb_end_insn;
> rtx ret_label = NULL_RTX;
> - basic_block nb, src_bb;
> + basic_block nb;
> edge_iterator ei;
>
> if (EDGE_COUNT (bb->succs) == 0)
> @@ -3478,7 +3614,6 @@ fixup_reorder_chain (void)
> /* We got here if we need to add a new jump insn.
> Note force_nonfallthru can delete E_FALL and thus we have to
> save E_FALL->src prior to the call to force_nonfallthru. */
> - src_bb = e_fall->src;
> nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
> if (nb)
> {
> @@ -3486,17 +3621,6 @@ fixup_reorder_chain (void)
> bb->aux = nb;
> /* Don't process this new block. */
> bb = nb;
> -
> - /* Make sure new bb is tagged for correct section (same as
> - fall-thru source, since you cannot fall-thru across
> - section boundaries). */
> - BB_COPY_PARTITION (src_bb, single_pred (bb));
> - if (flag_reorder_blocks_and_partition
> - && targetm_common.have_named_sections
> - && JUMP_P (BB_END (bb))
> - && !any_condjump_p (BB_END (bb))
> - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING))
> - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX);
> }
> }
>
> @@ -3796,10 +3920,11 @@ duplicate_insn_chain (rtx from, rtx to)
> case NOTE_INSN_FUNCTION_BEG:
> /* There is always just single entry to function. */
> case NOTE_INSN_BASIC_BLOCK:
> + /* We should only switch text sections once. */
> + case NOTE_INSN_SWITCH_TEXT_SECTIONS:
> break;
>
> case NOTE_INSN_EPILOGUE_BEG:
> - case NOTE_INSN_SWITCH_TEXT_SECTIONS:
> emit_note_copy (insn);
> break;
>
> @@ -4611,8 +4736,7 @@ rtl_can_remove_branch_p (const_edge e)
> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
> return false;
>
> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
> - || BB_PARTITION (src) != BB_PARTITION (target))
> + if (BB_PARTITION (src) != BB_PARTITION (target))
> return false;
>
> if (!onlyjump_p (insn)
> Index: basic-block.h
> ===================================================================
> --- basic-block.h (revision 199014)
> +++ basic-block.h (working copy)
> @@ -796,6 +796,7 @@ extern basic_block force_nonfallthru_and_redirect
> extern bool contains_no_active_insn_p (const_basic_block);
> extern bool forwarder_block_p (const_basic_block);
> extern bool can_fallthru (basic_block, basic_block);
> +extern void emit_barrier_after_bb (basic_block bb);
>
> /* In cfgbuild.c. */
> extern void find_many_sub_basic_blocks (sbitmap);
> Index: testsuite/gcc.dg/tree-prof/va-arg-pack-1.c
> ===================================================================
> --- testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0)
> +++ testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0)
> @@ -0,0 +1,145 @@
> +/* __builtin_va_arg_pack () builtin tests. */
> +/* { dg-require-effective-target freorder } */
> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */
> +
> +#include <stdarg.h>
> +
> +extern void abort (void);
> +
> +int v1 = 8;
> +long int v2 = 3;
> +void *v3 = (void *) &v2;
> +struct A { char c[16]; } v4 = { "foo" };
> +long double v5 = 40;
> +char seen[20];
> +int cnt;
> +
> +__attribute__ ((noinline)) int
> +foo1 (int x, int y, ...)
> +{
> + int i;
> + long int l;
> + void *v;
> + struct A a;
> + long double ld;
> + va_list ap;
> +
> + va_start (ap, y);
> + if (x < 0 || x >= 20 || seen[x])
> + abort ();
> + seen[x] = ++cnt;
> + if (y != 6)
> + abort ();
> + i = va_arg (ap, int);
> + if (i != 5)
> + abort ();
> + switch (x)
> + {
> + case 0:
> + i = va_arg (ap, int);
> + if (i != 9 || v1 != 9)
> + abort ();
> + a = va_arg (ap, struct A);
> + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0)
> + abort ();
> + v = (void *) va_arg (ap, struct A *);
> + if (v != (void *) &v4)
> + abort ();
> + l = va_arg (ap, long int);
> + if (l != 3 || v2 != 4)
> + abort ();
> + break;
> + case 1:
> + ld = va_arg (ap, long double);
> + if (ld != 41 || v5 != ld)
> + abort ();
> + i = va_arg (ap, int);
> + if (i != 8)
> + abort ();
> + v = va_arg (ap, void *);
> + if (v != &v2)
> + abort ();
> + break;
> + case 2:
> + break;
> + default:
> + abort ();
> + }
> + va_end (ap);
> + return x;
> +}
> +
> +__attribute__ ((noinline)) int
> +foo2 (int x, int y, ...)
> +{
> + long long int ll;
> + void *v;
> + struct A a, b;
> + long double ld;
> + va_list ap;
> +
> + va_start (ap, y);
> + if (x < 0 || x >= 20 || seen[x])
> + abort ();
> + seen[x] = ++cnt | 64;
> + if (y != 10)
> + abort ();
> + switch (x)
> + {
> + case 11:
> + break;
> + case 12:
> + ld = va_arg (ap, long double);
> + if (ld != 41 || v5 != 40)
> + abort ();
> + a = va_arg (ap, struct A);
> + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0)
> + abort ();
> + b = va_arg (ap, struct A);
> + if (__builtin_memcmp (b.c, v4.c, sizeof (b.c)) != 0)
> + abort ();
> + v = va_arg (ap, void *);
> + if (v != &v2)
> + abort ();
> + ll = va_arg (ap, long long int);
> + if (ll != 16LL)
> + abort ();
> + break;
> + case 2:
> + break;
> + default:
> + abort ();
> + }
> + va_end (ap);
> + return x + 8;
> +}
> +
> +__attribute__ ((noinline)) int
> +foo3 (void)
> +{
> + return 6;
> +}
> +
> +extern inline __attribute__ ((always_inline, gnu_inline)) int
> +bar (int x, ...)
> +{
> + if (x < 10)
> + return foo1 (x, foo3 (), 5, __builtin_va_arg_pack ());
> + return foo2 (x, foo3 () + 4, __builtin_va_arg_pack ());
> +}
> +
> +int
> +main (void)
> +{
> + if (bar (0, ++v1, v4, &v4, v2++) != 0)
> + abort ();
> + if (bar (1, ++v5, 8, v3) != 1)
> + abort ();
> + if (bar (2) != 2)
> + abort ();
> + if (bar (v1 + 2) != 19)
> + abort ();
> + if (bar (v1 + 3, v5--, v4, v4, v3, 16LL) != 20)
> + abort ();
> + return 0;
> +}
> Index: testsuite/gcc.dg/tree-prof/comp-goto-1.c
> ===================================================================
> --- testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0)
> +++ testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0)
> @@ -0,0 +1,166 @@
> +/* { dg-require-effective-target freorder } */
> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */
> +#include <stdlib.h>
> +
> +#if !defined(NO_LABEL_VALUES) && (!defined(STACK_SIZE) || STACK_SIZE >= 4000) && __INT_MAX__ >= 2147483647
> +typedef unsigned int uint32;
> +typedef signed int sint32;
> +
> +typedef uint32 reg_t;
> +
> +typedef unsigned long int host_addr_t;
> +typedef uint32 target_addr_t;
> +typedef sint32 target_saddr_t;
> +
> +typedef union
> +{
> + struct
> + {
> + unsigned int offset:18;
> + unsigned int ignore:4;
> + unsigned int s1:8;
> + int :2;
> + signed int simm:14;
> + unsigned int s3:8;
> + unsigned int s2:8;
> + int pad2:2;
> + } f1;
> + long long ll;
> + double d;
> +} insn_t;
> +
> +typedef struct
> +{
> + target_addr_t vaddr_tag;
> + unsigned long int rigged_paddr;
> +} tlb_entry_t;
> +
> +typedef struct
> +{
> + insn_t *pc;
> + reg_t registers[256];
> + insn_t *program;
> + tlb_entry_t tlb_tab[0x100];
> +} environment_t;
> +
> +enum operations
> +{
> + LOAD32_RR,
> + METAOP_DONE
> +};
> +
> +host_addr_t
> +f ()
> +{
> + abort ();
> +}
> +
> +reg_t
> +simulator_kernel (int what, environment_t *env)
> +{
> + register insn_t *pc = env->pc;
> + register reg_t *regs = env->registers;
> + register insn_t insn;
> + register int s1;
> + register reg_t r2;
> + register void *base_addr = &&sim_base_addr;
> + register tlb_entry_t *tlb = env->tlb_tab;
> +
> + if (what != 0)
> + {
> + int i;
> + static void *op_map[] =
> + {
> + &&L_LOAD32_RR,
> + &&L_METAOP_DONE,
> + };
> + insn_t *program = env->program;
> + for (i = 0; i < what; i++)
> + program[i].f1.offset = op_map[program[i].f1.offset] - base_addr;
> + }
> +
> + sim_base_addr:;
> +
> + insn = *pc++;
> + r2 = (*(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)));
> + s1 = (insn.f1.s1 << 2);
> + goto *(base_addr + insn.f1.offset);
> +
> + L_LOAD32_RR:
> + {
> + target_addr_t vaddr_page = r2 / 4096;
> + unsigned int x = vaddr_page % 0x100;
> + insn = *pc++;
> +
> + for (;;)
> + {
> + target_addr_t tag = tlb[x].vaddr_tag;
> + host_addr_t rigged_paddr = tlb[x].rigged_paddr;
> +
> + if (tag == vaddr_page)
> + {
> + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) (rigged_paddr + r2);
> + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2));
> + s1 = insn.f1.s1 << 2;
> + goto *(base_addr + insn.f1.offset);
> + }
> +
> + if (((target_saddr_t) tag < 0))
> + {
> + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) f ();
> + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2));
> + s1 = insn.f1.s1 << 2;
> + goto *(base_addr + insn.f1.offset);
> + }
> +
> + x = (x - 1) % 0x100;
> + }
> +
> + L_METAOP_DONE:
> + return (*(reg_t *) (((char *) regs) + s1));
> + }
> +}
> +
> +insn_t program[2 + 1];
> +
> +void *malloc ();
> +
> +int
> +main ()
> +{
> + environment_t env;
> + insn_t insn;
> + int i, res;
> + host_addr_t a_page = (host_addr_t) malloc (2 * 4096);
> + target_addr_t a_vaddr = 0x123450;
> + target_addr_t vaddr_page = a_vaddr / 4096;
> + a_page = (a_page + 4096 - 1) & -4096;
> +
> + env.tlb_tab[((vaddr_page) % 0x100)].vaddr_tag = vaddr_page;
> + env.tlb_tab[((vaddr_page) % 0x100)].rigged_paddr = a_page - vaddr_page * 4096;
> + insn.f1.offset = LOAD32_RR;
> + env.registers[0] = 0;
> + env.registers[2] = a_vaddr;
> + *(sint32 *) (a_page + a_vaddr % 4096) = 88;
> + insn.f1.s1 = 0;
> + insn.f1.s2 = 2;
> +
> + for (i = 0; i < 2; i++)
> + program[i] = insn;
> +
> + insn.f1.offset = METAOP_DONE;
> + insn.f1.s1 = 0;
> + program[2] = insn;
> +
> + env.pc = program;
> + env.program = program;
> +
> + res = simulator_kernel (2 + 1, &env);
> +
> + if (res != 88)
> + abort ();
> + exit (0);
> +}
> +#else
> +main(){ exit (0); }
> +#endif
> Index: testsuite/gcc.dg/tree-prof/pr52027.c
> ===================================================================
> --- testsuite/gcc.dg/tree-prof/pr52027.c (revision 199014)
> +++ testsuite/gcc.dg/tree-prof/pr52027.c (working copy)
> @@ -1,6 +1,6 @@
> /* PR debug/52027 */
> /* { dg-require-effective-target freorder } */
> -/* { dg-options "-O -freorder-blocks-and-partition -fno-reorder-functions" } */
> +/* { dg-options "-O2 -freorder-blocks-and-partition -fno-reorder-functions" } */
>
> void
> foo (int len)
> Index: testsuite/gcc.dg/tree-prof/pr50907.c
> ===================================================================
> --- testsuite/gcc.dg/tree-prof/pr50907.c (revision 199014)
> +++ testsuite/gcc.dg/tree-prof/pr50907.c (working copy)
> @@ -1,5 +1,5 @@
> /* PR middle-end/50907 */
> /* { dg-require-effective-target freorder } */
> -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* x86_64-*-* } && fpic } } } */
> +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* x86_64-*-* } && fpic } } } */
>
> #include "pr45354.c"
> Index: testsuite/gcc.dg/tree-prof/pr45354.c
> ===================================================================
> --- testsuite/gcc.dg/tree-prof/pr45354.c (revision 199014)
> +++ testsuite/gcc.dg/tree-prof/pr45354.c (working copy)
> @@ -1,5 +1,5 @@
> /* { dg-require-effective-target freorder } */
> -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
> +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
>
> extern void abort (void);
>
> Index: testsuite/gcc.dg/tree-prof/20041218-1.c
> ===================================================================
> --- testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0)
> +++ testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0)
> @@ -0,0 +1,119 @@
> +/* PR rtl-optimization/16968 */
> +/* Testcase by Jakub Jelinek <jakub@redhat.com> */
> +/* { dg-require-effective-target freorder } */
> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */
> +
> +struct T
> +{
> + unsigned int b, c, *d;
> + unsigned char e;
> +};
> +struct S
> +{
> + unsigned int a;
> + struct T f;
> +};
> +struct U
> +{
> + struct S g, h;
> +};
> +struct V
> +{
> + unsigned int i;
> + struct U j;
> +};
> +
> +extern void exit (int);
> +extern void abort (void);
> +
> +void *
> +dummy1 (void *x)
> +{
> + return "";
> +}
> +
> +void *
> +dummy2 (void *x, void *y)
> +{
> + exit (0);
> +}
> +
> +struct V *
> +baz (unsigned int x)
> +{
> + static struct V v;
> + __builtin_memset (&v, 0x55, sizeof (v));
> + return &v;
> +}
> +
> +int
> +check (void *x, struct S *y)
> +{
> + if (y->a || y->f.b || y->f.c || y->f.d || y->f.e)
> + abort ();
> + return 1;
> +}
> +
> +static struct V *
> +bar (unsigned int x, void *y)
> +{
> + const struct T t = { 0, 0, (void *) 0, 0 };
> + struct V *u;
> + void *v;
> + v = dummy1 (y);
> + if (!v)
> + return (void *) 0;
> +
> + u = baz (sizeof (struct V));
> + u->i = x;
> + u->j.g.a = 0;
> + u->j.g.f = t;
> + u->j.h.a = 0;
> + u->j.h.f = t;
> +
> + if (!check (v, &u->j.g) || !check (v, &u->j.h))
> + return (void *) 0;
> + return u;
> +}
> +
> +int
> +foo (unsigned int *x, unsigned int y, void **z)
> +{
> + void *v;
> + unsigned int i, j;
> +
> + *z = v = (void *) 0;
> +
> + for (i = 0; i < y; i++)
> + {
> + struct V *c;
> +
> + j = *x;
> +
> + switch (j)
> + {
> + case 1:
> + c = bar (j, x);
> + break;
> + default:
> + c = 0;
> + break;
> + }
> + if (c)
> + v = dummy2 (v, c);
> + else
> + return 1;
> + }
> +
> + *z = v;
> + return 0;
> +}
> +
> +int
> +main (void)
> +{
> + unsigned int one = 1;
> + void *p;
> + foo (&one, 1, &p);
> + abort ();
> +}
> Index: testsuite/g++.dg/tree-prof/partition2.C
> ===================================================================
> --- testsuite/g++.dg/tree-prof/partition2.C (revision 199014)
> +++ testsuite/g++.dg/tree-prof/partition2.C (working copy)
> @@ -1,6 +1,6 @@
> // PR middle-end/45458
> // { dg-require-effective-target freorder }
> -// { dg-options "-fnon-call-exceptions -freorder-blocks-and-partition" }
> +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" }
>
> int
> main ()
> Index: testsuite/g++.dg/tree-prof/partition3.C
> ===================================================================
> --- testsuite/g++.dg/tree-prof/partition3.C (revision 199014)
> +++ testsuite/g++.dg/tree-prof/partition3.C (working copy)
> @@ -1,6 +1,6 @@
> // PR middle-end/45566
> // { dg-require-effective-target freorder }
> -// { dg-options "-O -fnon-call-exceptions -freorder-blocks-and-partition" }
> +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" }
>
> int k;
>
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413