This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [cft] fix 24160
- From: Joern RENNECKE <joern dot rennecke at st dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Ulrich Weigand <uweigand at de dot ibm dot com>
- Date: Tue, 08 Nov 2005 20:56:58 +0000
- Subject: Re: [cft] fix 24160
- References: <20051106212222.GA26792@redhat.com>
Richard Henderson wrote:
* reload1.c (reg_equiv_invariant): New.
(reload): Allocate, initialize, and free it.
(eliminate_regs_1): Rename from eliminate_regs. New argument
may_use_invariant; only substitute from reg_equiv_invariant
when true.
(eliminate_regs): New.
(eliminate_regs_in_insn): Use eliminate_regs_1 directly; pass
true when operand is rhs of single_set.
I've found the following issues with your patch:
- We were generating reloads for equivalence-setting insns
- We allocated stack slots for pseudos that are to be replaced by equivalences.
- When the operands are actually inside a PLUS, we'd not consider that as
a PLUS context.
- We didn't consider REG_NOTES as a suitable context for replacement.
After fixing these issues, the EEMBC aifftr01 asembly was identical to what we
got before the patch.
While looking into these issues, I also foun what I considered to be a
missed optimization opportunity - when we set a register to an address
in the frame with a two-insn sequence, reload ignores the REG_EQUAL note
and we end up with:
(insn 10467 215 10468 12 (set (reg:SI 4 r4)
(const_int 2504 [0x9c8])) 172 {movsi_ie} (nil)
(nil))
(insn 10468 10467 2612 12 (set (reg:SI 4 r4)
(plus:SI (reg:SI 4 r4)
(reg/f:SI 15 r15))) 39 {*addsi3_compact} (nil)
(expr_list:REG_EQUIV (plus:SI (reg/f:SI 15 r15)
(const_int 2504 [0x9c8]))
(nil)))
(insn:HI 2612 10468 2131 12 (set (reg/f:SI 7 r7 [1200])
(plus:SI (reg/f:SI 7 r7 [1200])
(reg:SI 4 r4))) 39 {*addsi3_compact} (insn_list:REG_DEP_TRUE 2611 (n
il))
(expr_list:REG_EQUAL (plus:SI (reg/f:SI 15 r15)
(const_int 384 [0x180]))
(nil)))
That was easy enough to fix, but somehow there is little or no change in
perfromance (the EEMBC testsuites are up or down by 0.1% or less, with only a
hair-thin trend for s speed up), but a whopping 3% size increase.
My best guess at this time is that we end up with less constant sharing
and more constant tables.
With the rest of the patch in place, this change is only a one-liner, so
I left it in place conditionalized on optimize > 3 so that maybe other people
can investigate what the effect of this change is on their targets / codes,
or maybe analyze what is going on for the SH.
--- reload1.c-rth 2005-11-07 17:22:11.000000000 +0000
+++ reload1.c 2005-11-08 20:28:18.000000000 +0000
@@ -1462,7 +1462,9 @@ calculate_needs_all_insns (int global)
/* Skip insns that only set an equivalence. */
if (set && REG_P (SET_DEST (set))
&& reg_renumber[REGNO (SET_DEST (set))] < 0
- && reg_equiv_constant[REGNO (SET_DEST (set))])
+ && (reg_equiv_constant[REGNO (SET_DEST (set))]
+ || (reg_equiv_invariant[REGNO (SET_DEST (set))]))
+ && reg_equiv_init[REGNO (SET_DEST (set))])
continue;
/* If needed, eliminate any eliminable registers. */
@@ -1952,6 +1954,7 @@ alter_reg (int i, int from_reg)
if (reg_renumber[i] < 0
&& REG_N_REFS (i) > 0
&& reg_equiv_constant[i] == 0
+ && (reg_equiv_invariant[i] == 0 || reg_equiv_init[i] == 0)
&& reg_equiv_memory_loc[i] == 0)
{
rtx x;
@@ -2890,7 +2893,7 @@ eliminate_regs_in_insn (rtx insn, int re
rtx substed_operand[MAX_RECOG_OPERANDS];
rtx orig_operand[MAX_RECOG_OPERANDS];
struct elim_table *ep;
- rtx plus_src;
+ rtx plus_src, plus_cst_src;
if (! insn_is_asm && icode < 0)
{
@@ -2995,16 +2998,21 @@ eliminate_regs_in_insn (rtx insn, int re
/* We allow one special case which happens to work on all machines we
currently support: a single set with the source or a REG_EQUAL
note being a PLUS of an eliminable register and a constant. */
- plus_src = 0;
+ plus_src = plus_cst_src = 0;
if (old_set && REG_P (SET_DEST (old_set)))
{
- /* First see if the source is of the form (plus (reg) CST). */
- if (GET_CODE (SET_SRC (old_set)) == PLUS
- && REG_P (XEXP (SET_SRC (old_set), 0))
- && GET_CODE (XEXP (SET_SRC (old_set), 1)) == CONST_INT
- && REGNO (XEXP (SET_SRC (old_set), 0)) < FIRST_PSEUDO_REGISTER)
+ if (GET_CODE (SET_SRC (old_set)) == PLUS)
plus_src = SET_SRC (old_set);
- else if (REG_P (SET_SRC (old_set)))
+ /* First see if the source is of the form (plus (reg) CST). */
+ if (plus_src
+ && REG_P (XEXP (plus_src, 0))
+ && GET_CODE (XEXP (plus_src, 1)) == CONST_INT
+ && REGNO (XEXP (plus_src, 0)) < FIRST_PSEUDO_REGISTER)
+ plus_cst_src = plus_src;
+ else if (REG_P (SET_SRC (old_set))
+ /* ??? this seems to increase code size with little
+ or no speed gain. */
+ || (optimize > 3 && plus_src))
{
/* Otherwise, see if we have a REG_EQUAL note of the form
(plus (reg) CST). */
@@ -3017,16 +3025,16 @@ eliminate_regs_in_insn (rtx insn, int re
&& GET_CODE (XEXP (XEXP (links, 0), 1)) == CONST_INT
&& REGNO (XEXP (XEXP (links, 0), 0)) < FIRST_PSEUDO_REGISTER)
{
- plus_src = XEXP (links, 0);
+ plus_cst_src = XEXP (links, 0);
break;
}
}
}
}
- if (plus_src)
+ if (plus_cst_src)
{
- rtx reg = XEXP (plus_src, 0);
- HOST_WIDE_INT offset = INTVAL (XEXP (plus_src, 1));
+ rtx reg = XEXP (plus_cst_src, 0);
+ HOST_WIDE_INT offset = INTVAL (XEXP (plus_cst_src, 1));
for (ep = reg_eliminate; ep < ®_eliminate[NUM_ELIMINABLE_REGS]; ep++)
if (ep->from_rtx == reg && ep->can_eliminate)
@@ -3058,9 +3066,9 @@ eliminate_regs_in_insn (rtx insn, int re
/* If we have a nonzero offset, and the source is already
a simple REG, the following transformation would
increase the cost of the insn by replacing a simple REG
- with (plus (reg sp) CST). So try only when plus_src
- comes from old_set proper, not REG_NOTES. */
- else if (SET_SRC (old_set) == plus_src)
+ with (plus (reg sp) CST). So try only when we already
+ had a PLUS before. */
+ else if (plus_src)
{
new_body = old_body;
if (! replace)
@@ -3099,7 +3107,7 @@ eliminate_regs_in_insn (rtx insn, int re
/* For an asm statement, every operand is eliminable. */
if (insn_is_asm || insn_data[icode].operand[i].eliminable)
{
- bool is_set_src;
+ bool is_set_src, in_plus;
/* Check for setting a register that we know about. */
if (recog_data.operand_type[i] != OP_IN
@@ -3120,10 +3128,16 @@ eliminate_regs_in_insn (rtx insn, int re
is_set_src = false;
if (old_set && recog_data.operand_loc[i] == &SET_SRC (old_set))
is_set_src = true;
+ in_plus = false;
+ if (plus_src
+ && (recog_data.operand_loc[i] == &XEXP (plus_src, 0)
+ || recog_data.operand_loc[i] == &XEXP (plus_src, 1)))
+ in_plus = true;
substed_operand[i]
= eliminate_regs_1 (recog_data.operand[i], 0,
- replace ? insn : NULL_RTX, is_set_src);
+ replace ? insn : NULL_RTX,
+ is_set_src || in_plus);
if (substed_operand[i] != orig_operand[i])
val = 1;
/* Terminate the search in check_eliminable_occurrences at
@@ -3251,7 +3265,8 @@ eliminate_regs_in_insn (rtx insn, int re
of spill registers to be needed in the final reload pass than in
the pre-passes. */
if (val && REG_NOTES (insn) != 0)
- REG_NOTES (insn) = eliminate_regs (REG_NOTES (insn), 0, REG_NOTES (insn));
+ REG_NOTES (insn)
+ = eliminate_regs_1 (REG_NOTES (insn), 0, REG_NOTES (insn), true);
return val;
}