This is the mail archive of the gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[rfc] slightly better reload constant rematerialization
- From: Richard Henderson <rth at redhat dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Wed, 19 Jan 2005 19:45:38 -0800
- Subject: [rfc] slightly better reload constant rematerialization
Consider the following excerpt from process_gc_options in gengtype.c:
ldah $1,$LC310($29) !gprelhigh
lda $17,$LC310($1) !gprellow
First, know that the gprelhigh relocation is expressed within the
compiler as HIGH -- the fact that it references $29 is not exposed
until after reload. So as far as the register allocator is concerned,
it is a constant.
Second, know that the pseudo that is assigned this value is set
once and used twice, all in different basic blocks. Indeed, the
uses are within a loop, and the set is outside -- exactly what we
expect from a loop optimizer. Also, this means that the code in
local-alloc which tries to reduce register lifetimes by moving sets
adjacent to uses is ineffective, since it doesn't apply across blocks.
Third, know that (for alphaev67) selecting a register of FLOAT_REGS
is in fact a teeny bit better than using memory. The ftoit insn has
a fixed latency of 3, whereas a load from memory has a latency of 3
iff we hit in L1 cache, and higher otherwise. This is reflected by
the tiny advantage given by the relevant cost macros.
Except in this case, recomputing the constant would have a latency of
just 1, noticeably better than the move from the fp register.
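
As a rough illustration of the latency argument above, here is a toy
model in C. It uses only the three latencies quoted in the text; the
names (value_source, source_latency, cheapest_source) and the
l1_miss_penalty parameter are hypothetical and are not GCC cost macros.

```c
#include <stdbool.h>

/* Latencies in cycles, as quoted in the text for alphaev67.  */
enum
{
  LATENCY_FTOIT = 3,        /* move from an fp register via ftoit */
  LATENCY_LOAD_L1_HIT = 3,  /* load from memory that hits in L1 */
  LATENCY_REMAT = 1         /* recompute the constant from scratch */
};

/* Hypothetical names for the three ways to obtain the value.  */
enum value_source { SRC_FP_MOVE, SRC_MEMORY_LOAD, SRC_REMATERIALIZE };

/* L1_MISS_PENALTY is an assumed number of extra cycles on a cache
   miss; the text only says the latency is "higher otherwise".  */
static int
source_latency (enum value_source src, int l1_miss_penalty)
{
  switch (src)
    {
    case SRC_FP_MOVE:
      return LATENCY_FTOIT;
    case SRC_MEMORY_LOAD:
      return LATENCY_LOAD_L1_HIT + l1_miss_penalty;
    case SRC_REMATERIALIZE:
      return LATENCY_REMAT;
    }
  return LATENCY_FTOIT;
}

/* Pick the lowest-latency source.  Ties stay with the fp move,
   matching the "teeny bit better" preference described above.
   Rematerialization is only a candidate for constants.  */
static enum value_source
cheapest_source (bool is_constant, int l1_miss_penalty)
{
  enum value_source best = SRC_FP_MOVE;

  if (source_latency (SRC_MEMORY_LOAD, l1_miss_penalty)
      < source_latency (best, l1_miss_penalty))
    best = SRC_MEMORY_LOAD;
  if (is_constant
      && source_latency (SRC_REMATERIALIZE, l1_miss_penalty)
         < source_latency (best, l1_miss_penalty))
    best = SRC_REMATERIALIZE;
  return best;
}
```

With these numbers the fp move is preferred for a non-constant value,
and rematerialization wins as soon as the value is known to be constant.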
We make a couple of decisions leading up to this that in isolation
are defensible, but which lead to the undesirable result above.
First, we haven't collected reg_equiv_constant at the time we run
regclass. The job of regclass is to choose a class to use under
the assumption that a register really is required. We only set
the alternate class to NO_REGS if other register alternatives
really are worse than loading the value from memory. From that
alone, we haven't done anything wrong. What we should be realizing
though, is that all of the actual uses are in a particular class,
and that recomputing the value from scratch in that class is less
expensive than moving from some other class. But we don't have
the information collected that would allow us to make that decision.
Second, we only do constant rematerialization in the case where we
haven't allocated a hard register to a pseudo at all, which is not
true here -- we picked $f4.
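
To make the second point concrete, here is a minimal stand-alone sketch
of the existing condition. The struct and function names are
hypothetical stand-ins: the real reg_renumber and reg_equiv_constant
are tables indexed by register number, and the equivalence is an rtx,
not a string.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-pseudo state reload consults.  */
struct toy_pseudo
{
  int hard_reg;             /* like reg_renumber[]: < 0 means no hard reg */
  const char *equiv_const;  /* like reg_equiv_constant[]: NULL means none */
};

/* Mirrors the existing test: the constant is substituted only when
   the pseudo got no hard register at all.  */
static bool
old_check_rematerializes (const struct toy_pseudo *p)
{
  return p->hard_reg < 0 && p->equiv_const != NULL;
}
```

A pseudo that was left unallocated passes this test, but one that was
given any hard register (as with $f4 here) does not, even though its
value is still equivalent to a constant.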
The following patch hacks at the problem in push_reload. It does
not take into account the cost of the constant, which arguably
could be high in some cases and less desirable than just a copy.
It's good enough for Alpha, though, and I'm wondering what other
folks think of this for other platforms. If we need cost metrics,
what form should they take?
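
For discussion only, one possible shape for such a metric is sketched
below as a toy model. This is not part of the patch: const_cost and
move_cost are hypothetical numbers that a target would have to supply,
and the struct and function names are invented for illustration.

```c
#include <stdbool.h>

/* Toy summary of what the new test looks at, plus two assumed costs.  */
struct toy_reload_input
{
  bool from_pseudo;        /* ORIGINAL_REGNO (in) >= FIRST_PSEUDO_REGISTER */
  bool has_equiv_const;    /* reg_equiv_constant[...] != 0 */
  bool equiv_is_constant;  /* CONSTANT_P (reg_equiv_constant[...]) */
  int const_cost;          /* hypothetical: rebuild the constant in place */
  int move_cost;           /* hypothetical: copy from the allocated class */
};

/* The unconditional decision: use the constant whenever the hard
   register came from a pseudo with a true-constant equivalence.  */
static bool
patch_uses_constant (const struct toy_reload_input *r)
{
  return r->from_pseudo && r->has_equiv_const && r->equiv_is_constant;
}

/* A cost-gated variant: additionally require that rebuilding the
   constant is no more expensive than the copy.  Ties go to the
   constant, which is an arbitrary choice.  */
static bool
cost_gated_uses_constant (const struct toy_reload_input *r)
{
  return patch_uses_constant (r) && r->const_cost <= r->move_cost;
}
```

With the Alpha latencies above (1 for the constant, 3 for the copy)
both variants agree; they differ only when the constant is expensive
to rebuild.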
RCS file: /cvs/gcc/gcc/gcc/reload.c,v
retrieving revision 1.265
diff -u -p -d -r1.265 reload.c
--- reload.c 15 Jan 2005 16:06:15 -0000 1.265
+++ reload.c 20 Jan 2005 03:11:00 -0000
@@ -951,9 +951,25 @@ push_reload (rtx in, rtx out, rtx *inloc
int regno = REGNO (in);
- if (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] < 0
- && reg_equiv_constant[regno] != 0)
- in = reg_equiv_constant[regno];
+ if (regno >= FIRST_PSEUDO_REGISTER)
+ if (reg_renumber[regno] < 0 && reg_equiv_constant[regno] != 0)
+ in = reg_equiv_constant[regno];
+ /* If IN is a hard register containing a value that is
+ everywhere-equivalent to a constant, and we're pushing a
+ reload for it, then we made a stupid decision about what
+ register class to give the pseudo. Use the constant. */
+ if (ORIGINAL_REGNO (in) >= FIRST_PSEUDO_REGISTER
+ && reg_equiv_constant[ORIGINAL_REGNO (in)] != 0
+ /* ??? We only update eliminables for pseudos that were
+ not allocated hard registers. So we must filter for
+ true constants here. */
+ && CONSTANT_P (reg_equiv_constant[ORIGINAL_REGNO (in)]))
+ in = reg_equiv_constant[ORIGINAL_REGNO (in)];
/* Likewise for OUT. Of course, OUT will never be equivalent to