This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Move reg_equiv_* into a single structure
On 05/14/10 05:59, Paolo Bonzini wrote:
On 05/14/2010 01:15 PM, Richard Guenther wrote:
I wonder how this patch affects cache locality of accesses?
Some access will be better (due to accessing different pieces of
information for the same pseudo), some will be worse (due to
information for different pseudos always being in different cache line).
Quickly looking at "for" in Jeff's patch makes me think that the
former would win. find_reg_equiv_invariant_const is the only place
that clearly would lose after the patch, and it's called once per
function.
It's a mixed bag -- I think there are probably three cases to consider.
The first are loops which iterate through all the pseudos and only hit a
single field in the structure. find_reg_equiv_invariant_const would be a
great example. Those loops are clearly going to be less cache
friendly. There are only a few of these and they are typically run once
per compiled function.
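To put a rough number on that first case, here is a minimal sketch that counts how many cache lines a single-field scan spans under each layout. It is not code from the patch: struct equiv_sketch, SKETCH_LINE_SIZE and the helper names are hypothetical, rtx is modeled as void *, the line size is assumed to be 64 bytes, and arrays are assumed to start on a line boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the combined structure.  Only the four
   fields named in this thread are modeled, and rtx becomes void *.  */
struct equiv_sketch
{
  void *equiv_memory_loc;
  void *equiv_mem;
  void *equiv_address;
  void *equiv_init;
};

/* Assumed cache line size in bytes; real hardware varies.  */
#define SKETCH_LINE_SIZE 64

/* Number of cache lines spanned when reading one pointer-sized field
   from each of N consecutive elements placed STRIDE bytes apart,
   starting at a line boundary.  With STRIDE no larger than a line,
   every spanned line is actually touched.  */
static size_t
lines_touched (size_t n, size_t stride)
{
  assert (stride <= SKETCH_LINE_SIZE);
  if (n == 0)
    return 0;
  size_t last_byte = (n - 1) * stride + sizeof (void *) - 1;
  return last_byte / SKETCH_LINE_SIZE + 1;
}

/* Old layout: one array per field, so the stride is one pointer.  */
static size_t
lines_separate_arrays (size_t n)
{
  return lines_touched (n, sizeof (void *));
}

/* New layout: one array of structures, so the stride is the struct.  */
static size_t
lines_single_struct (size_t n)
{
  return lines_touched (n, sizeof (struct equiv_sketch));
}
```

With four pointer-sized fields the single-field scan spans about four times as many lines in the struct layout, and the factor grows with the number of fields in the real structure.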
The second case is iterating through the pseudos, but hitting multiple
fields in the structure. I would tend to think these are going to be
less cache friendly after my patch, but not as much so as the first
case. The loop I was most concerned about is this one in reload1.c,
which is executed once per iteration of the main reload loop:
  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
    if (reg_renumber[i] < 0 && reg_equivs[i].equiv_memory_loc)
      {
        rtx x = eliminate_regs (reg_equivs[i].equiv_memory_loc,
                                VOIDmode,
                                NULL_RTX);
        if (strict_memory_address_addr_space_p
              (GET_MODE (regno_reg_rtx[i]), XEXP (x, 0),
               MEM_ADDR_SPACE (x)))
          reg_equivs[i].equiv_mem = x, reg_equivs[i].equiv_address = 0;
        else if (CONSTANT_P (XEXP (x, 0))
                 || (REG_P (XEXP (x, 0))
                     && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER)
                 || (GET_CODE (XEXP (x, 0)) == PLUS
                     && REG_P (XEXP (XEXP (x, 0), 0))
                     && (REGNO (XEXP (XEXP (x, 0), 0))
                         < FIRST_PSEUDO_REGISTER)
                     && CONSTANT_P (XEXP (XEXP (x, 0), 1))))
          reg_equivs[i].equiv_address = XEXP (x, 0),
            reg_equivs[i].equiv_mem = 0;
        else
          {
            /* Make a new stack slot.  Then indicate that something
               changed so we go back and recompute offsets for
               eliminable registers because the allocation of memory
               below might change some offset.  reg_equiv_{mem,address}
               will be set up for this pseudo on the next pass around
               the loop.  */
            reg_equivs[i].equiv_memory_loc = 0;
            reg_equivs[i].equiv_init = 0;
            alter_reg (i, -1, true);
          }
      }
Luckily, we guard everything on reg_renumber[i] < 0. So the majority of
the time we don't hit the reg_equivs structure. In fact, with that
guard, the cache locality of memory accesses in this loop probably sucks
both before and after my change. Once we do hit the reg_equivs
structure, we typically hit three fields (memory_loc, mem, address) and
occasionally hit just two (memory_loc, init) which should be a trivial win.
The third case occurs when we scan an insn and hit various fields in the
reg_equivs structure based on the operands we find. Indices are
effectively random and thus the old code was extremely cache
unfriendly. In most of these cases we're going to hit multiple fields,
which should somewhat help cache locality.
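The multi-field effect for one arbitrary pseudo can be sketched the same way. This is only an illustration under assumptions: struct equiv_sketch, SKETCH_LINE_SIZE and the helper names are hypothetical, rtx is modeled as void *, lines are assumed to be 64 bytes, and the old per-field arrays are modeled as laid out back to back from offset 0.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size in bytes; real hardware varies.  */
#define SKETCH_LINE_SIZE 64

/* Hypothetical stand-in for the combined structure (rtx as void *).  */
struct equiv_sketch
{
  void *equiv_memory_loc;
  void *equiv_mem;
  void *equiv_address;
  void *equiv_init;
};

/* Count the distinct values among the K entries of LINES.  */
static size_t
distinct_lines (const size_t *lines, size_t k)
{
  size_t count = 0;
  for (size_t a = 0; a < k; a++)
    {
      int seen = 0;
      for (size_t b = 0; b < a; b++)
        if (lines[b] == lines[a])
          seen = 1;
      if (!seen)
        count++;
    }
  return count;
}

/* Old layout: three parallel arrays of N pointers each.  Returns the
   number of cache lines touched when reading all three entries for
   pseudo I.  */
static size_t
lines_for_pseudo_separate (size_t i, size_t n)
{
  assert (i < n);
  size_t lines[3];
  for (size_t f = 0; f < 3; f++)
    lines[f] = ((f * n + i) * sizeof (void *)) / SKETCH_LINE_SIZE;
  return distinct_lines (lines, 3);
}

/* New layout: one array of structures.  Returns the number of cache
   lines touched when reading memory_loc, mem and address for pseudo I.  */
static size_t
lines_for_pseudo_struct (size_t i)
{
  size_t base = i * sizeof (struct equiv_sketch);
  size_t lines[3] = {
    (base + offsetof (struct equiv_sketch, equiv_memory_loc))
      / SKETCH_LINE_SIZE,
    (base + offsetof (struct equiv_sketch, equiv_mem)) / SKETCH_LINE_SIZE,
    (base + offsetof (struct equiv_sketch, equiv_address))
      / SKETCH_LINE_SIZE
  };
  return distinct_lines (lines, 3);
}
```

Under these assumptions, reading three fields for one pseudo touches three lines with separate arrays and one line with the struct. That is the gain that offsets the loss in the single-field scans.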
All in all, I suspect this is a complete wash.
Jeff
However, the vast majority of accesses are