This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: One more global.c speedup


In article <11143.942372278@upchuck> you wrote:

: I happened to be reviewing some literature on building conflict matrices
: and came across an interesting observation.

:                 3. Either X or Y is not evaluated on the path to P
:                    (i.e. it is used uninitialized) and thus the
:                    conflict can be ignored.

This opens a whole can of worms.  When a pseudo dies, we assume that we can
re-use its hard register for spills.  With your patch, a hard register might
be allocated to multiple pseudos at the same time, and we have to consider
it live as long as any pseudo is allocated to it.

While in general using uninitialized pseudos has undefined behaviour, we
should keep 'traditional' compilers in mind and, more importantly, the
fact that the optimizers might generate such code.  Consider an AND where,
on one path, one of the operands is known to be zero.  On this path, it is
not necessary to initialize the other operand.
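
To make the AND case concrete, here is a small sketch in plain C.  It is
only a model, not GCC code: the function names are made up, and the
'garbage' parameter stands in for whatever a never-initialized pseudo's
hard register happens to hold, so that the example itself stays well
defined.

```c
#include <assert.h>

/* Source-level form: y is computed on every path. */
unsigned and_source(int cond, unsigned a, unsigned b)
{
    unsigned x = cond ? 0u : a;
    unsigned y = b;
    return x & y;
}

/* Model of the code after optimization: on the path where x is known
   to be zero, y is never set.  'garbage' is a hypothetical stand-in
   for the stale contents of y's register on that path. */
unsigned and_optimized(int cond, unsigned a, unsigned b, unsigned garbage)
{
    unsigned x, y;
    if (cond) {
        x = 0u;
        y = garbage;    /* y is not initialized on this path */
    } else {
        x = a;
        y = b;
    }
    return x & y;
}

/* The result never depends on the stale value, which is why skipping
   the initialization is a valid transformation. */
void check_and_paths(void)
{
    for (unsigned g = 0; g < 256; g++) {
        assert(and_optimized(1, 6u, 3u, g) == and_source(1, 6u, 3u));
        assert(and_optimized(0, 6u, 3u, g) == and_source(0, 6u, 3u));
    }
}
```

The point is that y is still 'used' by the AND on the zero path, so it
needs some register there even though its value is irrelevant.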

I suppose we can just delete combine_reloads, and good riddance.

However, find_dummy_reload is generally useful, yet vulnerable to your
change.  When it tries to use IN for the output reload, IN might really be
uninitialized, and (some of) its hard register(s) might be used by another
pseudo.
We even have to consider the possibility that this third pseudo might be
dying in the same insn; fortunately, find_dummy_reload is already mostly
immune to this possibility: if the output is earlyclobber, it will
check whether (a part of) the input is mentioned anywhere else in the insn.
Since at this stage pseudos have been renumbered to hard regs, we'd find
a match with the other input pseudo.

I think find_dummy_reload already has a vulnerability when there are
multiple outputs and the input it does consider is used in the output
that it does not consider (e.g. as part of an address at which the other
result is stored).  I think the register allocators are also vulnerable to
this problem now (even without matching constraints).
I suppose it doesn't show because no machine description has a pattern
with multiple outputs where one allows a mem?
But we might still be hit if there is a reg_equiv_memory_address that involves
a pseudo, though only if there is a REG_DEAD note for the pseudo at that
point, which AFAIK is not possible now.

Well, to come back to the new problem for find_dummy_reload, it can be
handled by checking all hard regs in IN against what you get when
running compute_use_by_pseudos on chain->live_after.
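
As a rough sketch of that check (a toy model, not the reload code): assume
the result of the compute_use_by_pseudos pass has been flattened into a
32-bit mask, and IN occupies in_nregs consecutive hard registers starting
at in_first.  The type and function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a hard register set: bit N set means hard reg N. */
typedef uint32_t hard_reg_set;

/* Mask covering NREGS consecutive hard registers starting at FIRST. */
static hard_reg_set reg_range(unsigned first, unsigned nregs)
{
    assert(nregs >= 1 && first + nregs <= 32);
    hard_reg_set ones =
        (nregs == 32) ? UINT32_MAX : ((UINT32_C(1) << nregs) - 1);
    return ones << first;
}

/* True if none of IN's hard registers is used by a pseudo that is
   still live after the insn.  USED_AFTER models the set obtained from
   the live-after information. */
bool in_usable_for_output_reload(unsigned in_first, unsigned in_nregs,
                                 hard_reg_set used_after)
{
    return (reg_range(in_first, in_nregs) & used_after) == 0;
}
```

For example, with hard reg 3 still in use after the insn, an IN occupying
regs 2-3 would be rejected, while one occupying regs 4-5 would pass.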

In my upcoming reload patch, it can be checked by testing chain->spill_regs,
which will already contain all hard registers that are set or die and hence
can likely be used for something else for some time interval.
But to allow for multiply-allocated hard registers, reload will need to
run compute_use_by_pseudos on chain->live_throughout for every insn before
calling find_reloads to find out which hard registers need to be excluded
from chain->spill_regs.
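
In the same toy bitmask model (again hypothetical names, not the actual
patch), the exclusion step amounts to masking the spill candidates with
the registers that some pseudo holds throughout the insn:

```c
#include <stdint.h>

/* Toy stand-in for a hard register set: bit N set means hard reg N. */
typedef uint32_t hard_reg_set;

/* SPILL_REGS models the registers that are set or die in the insn.
   USED_THROUGHOUT models the registers held by pseudos that are live
   throughout the insn, which can include a register that one pseudo
   releases while another pseudo still occupies it.  Such registers
   must not be handed out as spill registers. */
hard_reg_set usable_spill_regs(hard_reg_set spill_regs,
                               hard_reg_set used_throughout)
{
    return spill_regs & ~used_throughout;
}
```

Without multiple allocation the two sets would not overlap and the mask
would be a no-op; the extra pass only matters for shared registers.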

I suppose it's still a win for large programs, since there can be no more
continuously live pseudos than there are hard registers - except for
aberrations due to multiple allocations, of course.  Although we'll
need to take some steps to protect against generating incorrect code
due to multiple allocations, I don't think they will be common enough
to have a compile time performance impact through a higher number of
simultaneously live pseudos.

