This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Register Allocation issues with microblaze-elf

From: Michael Veksler <mveksler at tx dot technion dot ac dot il>
To: Vladimir Makarov <vmakarov at redhat dot com>
Cc: Michael Eager <eager at eagerm dot com>, GCC <gcc at gcc dot gnu dot org>
Date: Thu, 14 Feb 2013 21:41:17 +0200
Subject: Re: Register Allocation issues with microblaze-elf
References: <511B3460.8050100@eagerm.com> <511C15EB.3000801@redhat.com> <511C238E.6060207@eagerm.com> <511C3DAD.4050703@redhat.com> <511CA466.9020702@tx.technion.ac.il> <511D115A.5070205@redhat.com>

On 02/14/2013 06:31 PM, Vladimir Makarov wrote:

On 02/14/2013 03:46 AM, Michael Veksler wrote:

[snip]

I am reading this thread and getting more and more puzzled. The RA stuff is very complicated, having many constraints and many dependencies with other passes. Taking this into account, it seems that no heuristic algorithm can even get close to an optimal register allocation. A heuristic algorithm can't take all effects and side-effects into account simultaneously.

Considering all that, why GCC does not use generic optimization algorithms for RA? A generic optimization can take all issues into account, simultaneously. I am talking about ILP/MIP (Integer Linear Programming/Mixed IP), SAT and CSP [1] (Constraint Satisfaction Problem). There has been a lot of progress in these areas, and the solvers are much faster (orders of magnitude) than they were 10 years ago.

Actually, I tried ILP solver first time in 2003. I am returning to this about every 3 years and every time I fail to produce something useful (at least an acceptable in some cases solution which makes compiler only ten times slower). I've tried a lot of models. From more sophisticated ones to simpler ones, pure solvers or some hybrids of heuristics and ILP solutions. All solutions were too slow or generated not better code.

In many cases ILP is faster than CP (Constraint Programming), but there are quite a few cases when CP and SAT are much faster. Speed issues with modeling RA as ILP do not rule out neither SAT nor CP.

Last time, I had hope to use massive parallelism of GPUs for this. After investing of much my time to learning simplex algorithm theory (a base for ILP solvers), I found that simple tableu-based algorithm is parallelized very well but for RA tasks even using 1000 GPU elements is still behind revised simplex algorithm which is not parallelized. GPUs are not good for very sparse matrix problems which is the case of RA problems.

More and more it sounds like a CP or SAT.

Still some important RA problems (like rematerialization) is very hard to describe by ILP.

It sounds that your model requires many 0,1 variables. If it is so, then maybe SAT is a better choice than ILP.

I don't see any progress, ILP solver may be ten times faster but they still have exponential complexity. I don't know any industrial compiler which uses exact solver for RA (RA in GCC is even more complicated as it requires code selection too).

But the solver need not be exact. ILP uses branch and bound, which can be interrupted after some time giving the best solution so far. The best solution, so far, could be good enough for RA. I have seen huge SAT instances solved in a fraction of a second. These problems have a very clear structure that the solver takes advantage of, but why the RA problem should be different?

So why ILP/SAT/CSP aren't used in RA? I don't think they will work much slower than what RA does today ( they will be somewhat slower but not much). I do believe that there is a good chance to get much better results from RA with these technologies.

You know, some LLVM guys had the same thoughts and as remember they had PBQP (I tried this too) based RA (they had too many different RAs) and now they have finished with just a good heursitics RA which works best.

I don't how PBQP works with RA, but if it uses branch & bound over integer variables then it may have performance issues there.

I'd like to invest some time into a feasibility check, if someone is willing to work with me on modeling the RA problem (or at least several examples, such as the above).

Good luck with that. I am serious.

I need some help to get a basic model, even a hand-written model for a given input to RA just as example (before diving deeper into it). I'd be happy if you or somebody else can guide me through. At least, where should I look in the source of GCC to figure some of these things out, or where can I find some documentation/papers/presentations on RA and its input data-structure (and output requirements).

Michael

Follow-Ups:
- Re: Register Allocation issues with microblaze-elf
  - From: Vladimir Makarov

References:
- Register Allocation issues with microblaze-elf
  - From: Michael Eager
- Re: Register Allocation issues with microblaze-elf
  - From: Vladimir Makarov
- Re: Register Allocation issues with microblaze-elf
  - From: Michael Eager
- Re: Register Allocation issues with microblaze-elf
  - From: Vladimir Makarov
- Re: Register Allocation issues with microblaze-elf
  - From: Michael Veksler
- Re: Register Allocation issues with microblaze-elf
  - From: Vladimir Makarov

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]