This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Move reg_equiv_* into a single structure


On 05/14/10 05:59, Paolo Bonzini wrote:
> On 05/14/2010 01:15 PM, Richard Guenther wrote:
>> I wonder how this patch affects cache locality of accesses?
>
> Some accesses will be better (due to accessing different pieces of information for the same pseudo), some will be worse (due to information for different pseudos always being in different cache lines).
>
> Quickly looking at "for" in Jeff's patch makes me think that the former would win. find_reg_equiv_invariant_const is the only place that clearly would lose after the patch, and it's called once per function.

It's a mixed bag -- I think there are probably three cases to consider.

The first case is loops which iterate through all the pseudos and only hit a single field in the structure. find_reg_equiv_invariant_const would be a great example. Those loops are clearly going to be less cache friendly. There are only a few of these, and they typically run once per compiled function.
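To put a rough number on this first case, here is a minimal sketch that counts how many cache lines a single-field scan touches under each layout. It is a model, not GCC code: the struct holds only the four fields named in the loop quoted in this message (the real structure may have more), the rtx fields are modeled as void *, the 64-byte line size is an assumption, and lines_touched is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64	/* assumed cache line size */

/* Hypothetical stand-in for the combined structure; only the fields
   mentioned in this message, with rtx modeled as void *.  */
struct reg_equivs_sketch
{
  void *equiv_memory_loc;
  void *equiv_mem;
  void *equiv_address;
  void *equiv_init;
};

/* Count the cache lines touched by a loop that reads one pointer-sized
   field from each of N consecutive pseudos, where successive reads are
   STRIDE bytes apart and the array starts on a line boundary.  */
static size_t
lines_touched (size_t n, size_t stride)
{
  size_t count = 0;
  size_t prev_line = (size_t) -1;

  assert (stride > 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t first = (i * stride) / CACHE_LINE_BYTES;
      size_t last = (i * stride + sizeof (void *) - 1) / CACHE_LINE_BYTES;

      /* Line numbers never decrease, so remembering the previous one
	 is enough to avoid counting a line twice.  */
      for (size_t line = first; line <= last; line++)
	if (line != prev_line)
	  {
	    count++;
	    prev_line = line;
	  }
    }
  return count;
}
```

Calling lines_touched (1024, sizeof (void *)) models the old per-field array, and lines_touched (1024, sizeof (struct reg_equivs_sketch)) models the same scan over the combined structure. With 8-byte pointers and the assumed 64-byte lines, that is 128 lines versus 512 in this model, so the penalty scales with the number of fields in the structure.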

The second case is iterating through the pseudos, but hitting multiple fields in the structure. I would tend to think these are going to be less cache friendly after my patch, but not as much so as the first case. The loop I was most concerned about is this one in reload1.c, which is executed once per iteration of the main reload loop:


  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
    if (reg_renumber[i] < 0 && reg_equivs[i].equiv_memory_loc)
      {
        rtx x = eliminate_regs (reg_equivs[i].equiv_memory_loc, VOIDmode,
                                NULL_RTX);

        if (strict_memory_address_addr_space_p
              (GET_MODE (regno_reg_rtx[i]), XEXP (x, 0),
               MEM_ADDR_SPACE (x)))
          reg_equivs[i].equiv_mem = x, reg_equivs[i].equiv_address = 0;
        else if (CONSTANT_P (XEXP (x, 0))
                 || (REG_P (XEXP (x, 0))
                     && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER)
                 || (GET_CODE (XEXP (x, 0)) == PLUS
                     && REG_P (XEXP (XEXP (x, 0), 0))
                     && (REGNO (XEXP (XEXP (x, 0), 0))
                         < FIRST_PSEUDO_REGISTER)
                     && CONSTANT_P (XEXP (XEXP (x, 0), 1))))
          reg_equivs[i].equiv_address = XEXP (x, 0), reg_equivs[i].equiv_mem = 0;
        else
          {
            /* Make a new stack slot.  Then indicate that something
               changed so we go back and recompute offsets for
               eliminable registers because the allocation of memory
               below might change some offset.  reg_equiv_{mem,address}
               will be set up for this pseudo on the next pass around
               the loop.  */
            reg_equivs[i].equiv_memory_loc = 0;
            reg_equivs[i].equiv_init = 0;
            alter_reg (i, -1, true);
          }
      }


Luckily, we guard everything on reg_renumber[i] < 0. So the majority of the time we don't hit the reg_equivs structure. In fact, with that guard, the cache locality of memory accesses in this loop probably sucks both before and after my change. Once we do hit the reg_equivs structure, we typically hit three fields (memory_loc, mem, address) and occasionally hit just two (memory_loc, init) which should be a trivial win.


The third case occurs when we scan an insn and hit various fields in the reg_equivs structure based on the operands we find. Indices are effectively random and thus the old code was extremely cache unfriendly. In most of these cases we're going to hit multiple fields for the same pseudo, which should somewhat help cache locality.
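The effect in this third case can be sketched by counting the distinct cache lines needed to read three equivalences for one pseudo under each layout. Everything here is illustrative: the old_* arrays and new_equivs are hypothetical stand-ins for the separate reg_equiv_* arrays and the combined array, rtx is modeled as void *, and NUM_PSEUDOS and the 64-byte line size are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64	/* assumed cache line size */
#define NUM_PSEUDOS 4096	/* hypothetical number of pseudos */

/* Old layout (sketch): one array per kind of equivalence.  */
static void *old_memory_loc[NUM_PSEUDOS];
static void *old_mem[NUM_PSEUDOS];
static void *old_address[NUM_PSEUDOS];

/* New layout (sketch): one structure per pseudo.  */
struct reg_equivs_sketch
{
  void *equiv_memory_loc;
  void *equiv_mem;
  void *equiv_address;
  void *equiv_init;
};
static struct reg_equivs_sketch new_equivs[NUM_PSEUDOS];

/* Number of distinct cache lines containing the three addresses.  */
static size_t
distinct_lines (const void *a, const void *b, const void *c)
{
  uintptr_t la = (uintptr_t) a / CACHE_LINE_BYTES;
  uintptr_t lb = (uintptr_t) b / CACHE_LINE_BYTES;
  uintptr_t lc = (uintptr_t) c / CACHE_LINE_BYTES;
  size_t n = 1;

  if (lb != la)
    n++;
  if (lc != la && lc != lb)
    n++;
  return n;
}

/* Lines needed to read memory_loc, mem and address for REGNO
   when each lives in its own array.  */
static size_t
old_lines_for (size_t regno)
{
  assert (regno < NUM_PSEUDOS);
  return distinct_lines (&old_memory_loc[regno], &old_mem[regno],
			 &old_address[regno]);
}

/* The same three reads against the combined structure.  */
static size_t
new_lines_for (size_t regno)
{
  assert (regno < NUM_PSEUDOS);
  return distinct_lines (&new_equivs[regno].equiv_memory_loc,
			 &new_equivs[regno].equiv_mem,
			 &new_equivs[regno].equiv_address);
}
```

Because each old_* array is far larger than a cache line, the three reads for a given pseudo always land on three different lines, whereas the three adjacent structure fields span at most two. Whether that shows up in practice depends on how often an insn scan really touches several fields for the same pseudo.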


All in all, I suspect this is a complete wash.

Jeff





However, the vast majority of accesses are

