This is the mail archive of the gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, PR 10474] Schedule pass_cprop_hardreg before pass_thread_prologue_and_epilogue
- From: Jeff Law <law at redhat dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 17 Apr 2013 12:43:59 -0600
- Subject: Re: [PATCH, PR 10474] Schedule pass_cprop_hardreg before pass_thread_prologue_and_epilogue
- References: <20130417154935 dot GC3656 at virgil dot suse>
On 04/17/2013 09:49 AM, Martin Jambor wrote:
I noticed similar effects when looking at range splitting. Being able
to move those calls into a deeper control level in the CFG would
definitely be an improvement.
> The reason why it helps so much is that before register allocation
> there are instructions moving the values of actual arguments from
> their "originally hard" registers (e.g. SI, DI, etc.) to pseudos at the
> beginning of each function. When an argument is live across a
> function call, its pseudo is likely to be assigned to a callee-saved
> register and then also accessed from that register, even in the first
> BB, so that block requires the prologue, though the value could be
> fetched from the original register. When we convert all uses (at
> least in the first BB) to the original register, the preparatory
> stage of shrink wrapping is often capable of moving the register
> moves to a later BB, thus creating fast paths which do not require a
> prologue and epilogue.
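To make the effect described above concrete, here is a minimal C sketch of the kind of function that stands to benefit. It is not taken from the patch or from any testcase in this thread; the names `lookup` and `slow_lookup` are made up for illustration, and whether a particular GCC version and target actually shrink-wraps it is an assumption that would need checking in the generated assembly (e.g. with -O2 -S).

```c
#include <stddef.h>

/* Hypothetical out-of-line helper. The noinline attribute (a GCC
   extension) keeps the call in lookup() a real call.  */
__attribute__((noinline))
static int slow_lookup(const int *table, int key)
{
    return table[key] * 2;
}

/* Illustrative function with an early-exit fast path.  */
int lookup(const int *table, int key)
{
    /* Fast path: this test only needs the value of key as it arrived
       in its argument register.  If the compare reads that register
       directly, nothing here needs a callee-saved register, so this
       path can in principle run without a prologue or epilogue.  */
    if (key < 0)
        return -1;

    /* Slow path: key is used again after the call, so it is live
       across the call and is likely to end up in a callee-saved
       register, which must be saved and restored.  */
    int scaled = slow_lookup(table, key);
    return scaled + key;
}
```

If the compare in the first block reads the callee-saved copy instead of the incoming argument register, the save of that register has to happen before the branch, and the early return pays for a prologue it does not otherwise need.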
Did anyone ponder doing the hard register propagation on just the
argument registers prior to the prologue/epilogue handling, and then
running the full-blown propagation pass in its current location in the
pipeline?
> We believe this change in the pipeline should not bring about any
> negative effects. During gcc bootstrap, the number of instructions
> changed by pass_cprop_hardreg dropped, but only by 1.2%. We have also
> run SPEC 2006 CPU benchmarks on recent Intel and AMD hardware and all
> run time differences could be attributed to noise. The changes in
> binary sizes were also small:
That would get you the benefit you're seeking and minimize other
effects. Of course if you try that and get effectively the same results
as moving the full propagation pass before prologue/epilogue handling
then the complexity of only propagating argument registers early is
clearly not needed and we'd probably want to go with your patch as-is.