This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Progress report on fixup_var_refs bottleneck


Exec summary: I'm done with the first 90% of fixup_var_refs.
Unfortunately this is definitely a case where there's a second 90%.

Background: We can discover during tree->RTL expansion that a variable
was given a pseudo register, but needs to be in a stack slot
(e.g. because its address was taken).  There are things we can do to
avoid having to spill it everywhere, such as the ADDRESSOF hack, but
they don't change the overall contour of what happens.  Right now what
happens is that expand_expr calls put_var_into_stack, which generates
a stack slot, clobbers the existing REG rtx with a MEM rtx referring
to the slot, and calls fixup_var_refs.  fixup_var_refs has to scan the
entire insn chain generated to date to repair the damage.  Subsequent
expansion sees that the variable lives in the stack and generates
correct RTL to begin with.

There is an existing place where these fixups can be queued for later
processing.  Presently it is only used when a nested function does
something that forces a variable in the outer function into the
stack.  When we come back to the outer function, we immediately do all
the queued fixups.

The same routines are used much later to eliminate ADDRESSOF
expressions once we've got all the good out of them that we can.  This
complicates matters but only slightly.

My changes will cause put_var_into_stack not to smash the existing REG
expression.  Instead it will generate a brand new MEM and record the
(reg, mem) pair in the fixups queue - unconditionally.  We'll then
carry on generating RTL as if we hadn't needed to stack the variable.
At the very beginning of rest_of_compilation we will come back and
make all the needed changes at once.  The main performance win comes
because we can scan the insn chain once, write down which insns use
which REGs, and then examine just those insns in detail.  There's also
a major comprehensibility win due to not destructively converting REGs
into MEMs.  We had several of those horrible 50-line if statements,
containing tests like

	reg == var	/* if var is the old value */
	&& rtx_equal_p (mem, var)  /* and the new value */
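
To make the scan-once idea concrete, here is a toy sketch in C.  It is
not the patch and it uses none of GCC's real data structures: toy_insn,
toy_fixup, toy_apply_fixups and the TOY_* limits are all hypothetical
names invented for illustration, and an "insn" is reduced to a short
array of operands.  It only shows the shape of the algorithm: queue
(reg, slot) pairs, scan the chain once to note which insns use which
registers, then look in detail only at the insns on those lists.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_REGS  16   /* hypothetical limits for this sketch */
#define TOY_MAX_INSNS 64
#define TOY_MAX_OPS    4

/* An operand >= 0 names a pseudo register; an operand < 0 names a
   stack slot, encoded as -(slot + 1).  */
struct toy_insn {
  int ops[TOY_MAX_OPS];
  int n_ops;
};

/* One queued (reg, mem) pair: pseudo REGNO must live in stack SLOT.  */
struct toy_fixup {
  int regno;
  int slot;
};

/* Apply all queued fixups at once.  Returns the number of insns that
   had to be examined in detail.  */
int
toy_apply_fixups (struct toy_insn *insns, int n_insns,
                  const struct toy_fixup *fixups, int n_fixups)
{
  static int users[TOY_MAX_REGS][TOY_MAX_INSNS];
  int n_users[TOY_MAX_REGS] = { 0 };
  int examined = 0;
  int i, j, k;

  assert (n_insns <= TOY_MAX_INSNS);

  /* Pass 1: one scan of the whole chain, noting which insns use
     which registers.  */
  for (i = 0; i < n_insns; i++)
    {
      assert (insns[i].n_ops <= TOY_MAX_OPS);
      for (j = 0; j < insns[i].n_ops; j++)
        {
          int r = insns[i].ops[j];
          if (r < 0)
            continue;           /* already a stack slot */
          assert (r < TOY_MAX_REGS);
          /* Record each insn at most once per register.  */
          if (n_users[r] == 0 || users[r][n_users[r] - 1] != i)
            users[r][n_users[r]++] = i;
        }
    }

  /* Pass 2: for each queued pair, visit only the recorded insns.  */
  for (k = 0; k < n_fixups; k++)
    {
      int r = fixups[k].regno;
      assert (r >= 0 && r < TOY_MAX_REGS);
      for (i = 0; i < n_users[r]; i++)
        {
          struct toy_insn *insn = &insns[users[r][i]];
          examined++;
          for (j = 0; j < insn->n_ops; j++)
            if (insn->ops[j] == r)
              insn->ops[j] = -(fixups[k].slot + 1);
        }
    }
  return examined;
}
```

In this toy model the old scheme would rescan every insn once per
stacked variable, whereas here the full scan happens once and each
fixup touches only the insns recorded for its register.  The real
work, of course, is in the detailed examination, which this sketch
reduces to an integer comparison.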

So far I have converted everything except fixup_var_refs_1 and
purge_addressof_1 to the new form.  Those two do the messy RTL
walking.  They're going to be hard.  Also, this is a major invasive
change to code which has not been modified significantly since the
1980s; nor can I test all of its ramifications.  I anticipate serious
destabilization due to the patch.

zw
