This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH, Mips] Compact branch/delay slot optimization.
- From: Simon Dardis <Simon dot Dardis at imgtec dot com>
- To: "Moore, Catherine" <Catherine_Moore at mentor dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 25 Sep 2015 14:55:57 +0000
- Subject: [PATCH, Mips] Compact branch/delay slot optimization.
- Authentication-results: sourceware.org; auth=none
Hello,
The following patch adds three small optimizations related to compact branches for MIPSR6:
When the result of a load is used by a delay slot branch immediately afterwards, undo the
delay slot branch scheduling to hide the pipeline bubble if safe and use a compact branch
instead.
Undo delay slot scheduling if an orphaned high-part relocation is in a delay slot and use a
compact branch is used instead.
Undo delay slot scheduling in the case where a forbidden slot hazard is immediately followed
by a delay slot branch. This would cause a nop to be inserted otherwise.
No regressions. OK to apply?
Thanks,
Simon
gcc/
* config/mips/mips.c: (mips_break_sequence): New function.
(mips_reorg_process_insns) Use it. Use compact branches in selected
situations.
gcc/testsuite/
* gcc.target/mips/split-ds-sequence.c: Test for the above.
Index: config/mips/mips.c
===================================================================
--- config/mips/mips.c (revision 227676)
+++ config/mips/mips.c (working copy)
@@ -16973,6 +16973,23 @@
}
}
+/* Remove a SEQUENCE and replace it with the delay slot instruction
+ followed by the branch and return the instruction in the delay slot.
+ Return the first of the two new instructions.
+ Subroutine of mips_reorg_process_insns. */
+
+static rtx_insn *
+mips_break_sequence (rtx_insn * insn)
+{
+ rtx_insn * before = PREV_INSN (insn);
+ rtx_insn * branch = SEQ_BEGIN (insn);
+ rtx_insn * ds = SEQ_END (insn);
+ remove_insn (insn);
+ add_insn_after (ds, before, NULL);
+ add_insn_after (branch, ds, NULL);
+ return ds;
+}
+
/* Go through the instruction stream and insert nops where necessary.
Also delete any high-part relocations whose partnering low parts
are now all dead. See if the whole function can then be put into
@@ -17065,6 +17082,66 @@
{
if (GET_CODE (PATTERN (insn)) == SEQUENCE)
{
+ rtx_insn * next_active = next_active_insn (insn);
+ /* Undo delay slots to avoid bubbles if the next instruction can
+ be placed in a forbidden slot or the cost of adding an
+ explicit NOP in a forbidden slot is OK. */
+ if (TARGET_CB_MAYBE
+ && INSN_P (SEQ_BEGIN (insn))
+ && INSN_P (SEQ_END (insn))
+ && ((next_active
+ && INSN_P (next_active)
+ && GET_CODE (PATTERN (next_active)) != SEQUENCE
+ && get_attr_can_delay (next_active) == CAN_DELAY_YES)
+ || !optimize_size))
+ {
+ /* To hide a potential pipeline bubble, if we scan backwards
+ from the current SEQUENCE and find that there is a load
+ of a value that is used in the CTI and there are no
+ dependencies between the CTI and instruction in the delay
+ slot, break the sequence so the load delay is hidden. */
+ HARD_REG_SET uses;
+ CLEAR_HARD_REG_SET (uses);
+ note_uses (&PATTERN (SEQ_BEGIN (insn)), record_hard_reg_uses,
+ &uses);
+ HARD_REG_SET delay_sets;
+ CLEAR_HARD_REG_SET (delay_sets);
+ note_stores (PATTERN (SEQ_END (insn)), record_hard_reg_sets,
+ &delay_sets);
+
+ rtx prev = prev_active_insn (insn);
+ if (prev
+ && GET_CODE (PATTERN (prev)) == SET
+ && MEM_P (SET_SRC (PATTERN (prev))))
+ {
+ HARD_REG_SET sets;
+ CLEAR_HARD_REG_SET (sets);
+ note_stores (PATTERN (prev), record_hard_reg_sets,
+ &sets);
+
+ /* Re-order if safe. */
+ if (!hard_reg_set_intersect_p (delay_sets, uses)
+ && hard_reg_set_intersect_p (uses, sets))
+ {
+ next_insn = mips_break_sequence (insn);
+ /* Need to process the hazards of the newly
+ introduced instructions. */
+ continue;
+ }
+ }
+
+ /* If we find an orphaned high-part relocation in a delay
+ slot then we can convert to a compact branch and get
+ the orphaned high part deleted. */
+ if (mips_orphaned_high_part_p (&htab, SEQ_END (insn)))
+ {
+ next_insn = mips_break_sequence (insn);
+ /* Need to process the hazards of the newly
+ introduced instructions. */
+ continue;
+ }
+ }
+
/* If we find an orphaned high-part relocation in a delay
slot, it's easier to turn that instruction into a NOP than
to delete it. The delay slot will be a NOP either way. */
@@ -17099,6 +17176,32 @@
{
mips_avoid_hazard (last_insn, insn, &hilo_delay,
&delayed_reg, lo_reg, &fs_delay);
+ /* When a compact branch introduces a forbidden slot hazard
+ and the next useful instruction is a SEQUENCE of a jump
+ and a non-nop instruction in the delay slot, remove the
+ sequence and replace it with the delay slot instruction
+ then the jump to clear the forbidden slot hazard. */
+
+ if (fs_delay)
+ {
+ /* Search onwards from the current position looking for
+ a SEQUENCE. We are looking for pipeline hazards here
+ and do not need to worry about labels or barriers as
+ the optimization only undoes delay slot filling which
+ only affects the order of the branch and its delay
+ slot. */
+ rtx_insn * next = next_active_insn (insn);
+ if (next
+ && USEFUL_INSN_P (next)
+ && GET_CODE (PATTERN (next)) == SEQUENCE)
+ {
+ last_insn = insn;
+ next_insn = mips_break_sequence (next);
+ /* Need to process the hazards of the newly
+ introduced instructions. */
+ continue;
+ }
+ }
last_insn = insn;
}
}