This is the mail archive of the
mailing list for the GCC project.
Re: What happened to the IRA interprocedural reg-alloc work? (function_used_regs and friends)
On 12-10-16 5:49 PM, Steven Bosscher wrote:
On Tue, Oct 16, 2012 at 7:20 PM, Andi Kleen wrote:
Vladimir Makarov <email@example.com> writes:
... especially if you could make it work with LTO.
As I remember, the performance improvement from this optimization was
very small. There were problems in reviewing IRA and I decided to
simplify this task.
May be it is worth to return to this work.
I'm going to (or more accurately: have started looking at) porting
Vlad's patch to the trunk. I'll probably have this finished next
By the way, although classic optimal RA is NP-complete task there is
polynomial algorithm for interprocedural optimal RA (for some problem
formulation). But still, I guess it is too expensive to use it in a
production compiler. If you are interesting, here is the classical article:
I've been thinking a lot about how this IPA-RA could be made more
effective with LTO, but I don't see any easy ways to do this.
Vlad's patch basically looks at what registers are really call-used
and call-clobbered in the 'final' pass where assembly is emitted. This
is the only way to know which registers will be in that set for each
function, you can't already know that when the function is still
represented in GIMPLE or non-strict RTL. LTO works on GIMPLE function
bodies and summaries are written well before RTL is generated,
registers have not been allocated.
With "basic" LTO there's no problem. Functions have to be compiled to
RTL in topological order to make things work, but as far as I know
this already happens (not sure, tho'). With WHOPR (which is IMHO the
only useful LTO mode in general) , the set of call used/clobbered regs
cannot be known and streamed out as a summary for WHOPR because GCC
streams GIMPLE bodies, performs its WHOPR magic on GIMPLE, and
generates RTL only after WHOPR is done.
I suppose it's theoretically possible to make a good initial guess of
what registers might be not-clobbered by a function even if the ABI
says so. For instance, perhaps it's possible to assume that a function
that doesn't touch any variables in a floating point mode also doesn't
use/clobber any floating point registers. This assumption could be
propagated via LTO/WHOPR. If the function turns out to clobber
registers that were assumed to be untouched, you could just save and
restore them in the function ("callee saved" so to speak). But I don't
know how useful that would be.
Another, IMHO more interesting, thing to investigate would be
allocating global variables to registers. This is not part of Vlad's
original patch and I have no real ideas right now how to do that, but
it would be an interesting optimization for single-thread programs
(like GCC itself).
Minimum Cost Interprocedural Register Allocation (1996)
by Steven M. Kurlander , Charles N. Fischer