This patch was an experiment to see if we can get the same improvement
with modifications to IRA, making it more tolerant to over-aggressive
scheduling. THe idea is that if an instruction sets a register A, and
all its inputs are live and unmodified for the lifetime of A, then
moving the instruction downwards towards its first use is going to be
beneficial from a register pressure point of view.
That alone, however, turns out to be too aggressive, performance drops
presumably because we undo too many scheduling decisions. So, the patch
detects such situations, and splits the pseudo; a new pseudo is
introduced in the original setting instruction, and a copy is added
before the first use. If the new pseudo does not get a hard register, it
is removed again and instead the setting instruction is moved to the
point of the copy.
This gets up to 6.5% on 456.hmmer on the mips target I was working on;
an embedded benchmark suite also seems to have a (small) geomean
improvement. On x86_64, I've tested spec2k, where specint is unchanged
and specfp has a tiny performance regression. All these tests were done
with a gcc-4.6 based tree.
Thoughts? Currently the patch feels somewhat bolted on to the side of
IRA, maybe there's a nicer way to achieve this?