GBE: A Gimple Back End for GCC
The Gimple Back End (GBE) project is going to build an alternative to the Register Transfer Language (RTL) backends that is currently used in GCC. The plan is to leverage off the extensive, modern, Gimple infrastructure that has been developed and extend that technology at least as far as the register allocation pass.
The technology used in optimizing compilers has improved greatly in the last 15 years. While GCC has continued to improve, the RTL back ends have been slow to adopt new technology. Meanwhile, the Gimple based middle end of GCC has kept pace with technology developments.
There have been many attempts to upgrade the technology in the RTL back ends. What progress that has been made has been very difficult and many of the projects have simply died trying. Part of the reason for this is complexity of writing an RTL back end. While the rest of the world uses mostly declarative descriptions to describe an architecture, the RTL back ends are filled with ad hoc C code. This makes the RTL back ends very brittle: virtually nothing can be changed or upgraded in the common part of the RTL back ends without invalidating some assumption made in some back end.
In the commercial world, compilers have a fairly short life cycle, which means that there is only a short time to accumulate the cruft that the RTL backends currently have. Contrast this with the Gimple middle end which is only a few years old, and in the short period of time has gone thru a least two generations of fundamental data structures.
Compilation for modern targets requires more than good instruction selection. While the ways of defining an RTL back end are complex, in the end, the RTL back ends do a good job of instruction selection. Where they fall down is in the area of resource management. It is virtually impossible in the current software stack to add passes that are aware of the exact number and kind of resources that are available on the target until after all of the final assignments have been made.
In other modern back ends, the loop and scheduling passes are aware of the number of registers that are not only available but are actually being used in the region being transformed. It makes no sense to unroll a loop or to move an item out of a loop if all that it going to do is add spill code. In the GCC software stack many of the optimization are now done in the middle end because of the availability of good aliasing and SSA form.
The Gimple Back End project's goal is to build an alternative path from the last of the Gimple middle end passes to the to the assembly language generation pass that reuses as much of the Gimple infrastructure as possible and incorporates modern code generation, scheduling, loop transformations and register allocation.
Building a backend for GCC using the Gimple middle end infrastructure is not a new idea. It was proposed about 6 years ago. The difference between then and now is that Gimple itself as well as the SSA form and aliasing are now mature and as good as similar implementations in any other production compilers. Furthermore, a large portion of the active GCC development community is familiar with the internals of these structures.
Building on this infrastructure is what makes this project reasonable. It is our intention to use whatever resources are available rather than reinventing the wheel.
To this end, the CGEN project is the current plan for being the back end description language. While it currently is not industrial strength, it will be a lot easier to fill in the missing parts than it would be to start from scratch. Doug Evans has already volunteered to to help with this and that may be enough to get the more of the original developers interested in finishing off the project.
We also have an implementation of the IBURG code generator contributed by Preston Briggs at Google. This has been put in a special directory pending modification to make it work with Gimple and take input from CGEN. This is currently in the iburg/briggs directory of the branch.
What about the Old RTL Back Ends
The plan is to build a parallel path to the existing RTL path. There is no reason or plan to fork the compiler. If we can demonstrate that this new path provides benefit, then at least the active backends will be encouraged to move. Unfortunately, moving any port will require starting from scratch; the are no plans to try to automatically convert a port from RTL to Gimple. However, if we are successful, doing a port will be much simpler that doing an RTL port. One of the specific goals is to avoid much of the ad hoc C code that is required for today's RTL ports and to rely on better analysis algorithms that can analyze high level descriptions of the target architecture.
Scope of Work
By simply lowering Gimple, we will automatically be able to rerun or move many passes from the Gimple middle ends with little or no changes. However, a large number of pieces of the back end will require building from scratch.
The first phase of the project will be to build the simplest shortest path thru the compiler that works. The current view is that that minimal path is:
- prologue/epilogue generation
- stack assignment/base register allocation
- IBURG code generation
- register assignment/coalescing
Later work will embellish this stack with existing Gimple passes or new passes to include scheduling, dag combining, tree balancing ...
One thing to note about this stack is that code generation, spilling and register assignment are all separate passes and that new passes may be added to this stack by inserting them between these passes. This represents a complete shift of thinking in how back ends are managed from the time when the RTL back end was first developed. It is based on two fundamental assumptions:
- If the program has no uninitialized variables, then register pressure is an exact predictor of the number or registers that is needed.
- If the program is in SSA form, the register assigner will only need to coalesce, it will not even need to split. This assumption is not true in the presences of preassigned hardware registers, which most architectures have. But it does mean that the amount of spilling can be localized to the areas around that usage.
These two facts mean that it is both reasonable and desirable to build a back end that is based on small simple passes, rather than the large monolithic passes like reload and combine that dominate the current RTL back end.
Short Term Projects
- Machine Description. We are currently using CGEN. It needs a lot of enhancement in both being broad enough to describe all of the GCC targets as well as being able to generate the descriptions that we will need to drive the optimization passes.
- Lowering. The new Gimple back end will most likely be spliced into the current stack of optimizations before the cfgexpand pass. However, cfgexpand is a complex pass and some of the functions inside of it will also be necessary for the Gimple back end.
Factoring cfgexpand into a series of separate passes is a necessary first step. Diego has indicated that doing this splitting is a generally good idea and so these patches are most likely to accepted onto the GCC trunk long before this branch is ready to be merged.
- Prologue/Epilogue generation. At the high level, this will not be that much different from current pass except that it will generate Gimple rather than RTL. There is still some discussion as to where this pass should be in the stack.
- Stack Assignment. There is an implementation of this bundled in the cfgexpand pass. This is one of the pieces of technology that should be unbundled. However, this pass is currently being run too early and assigns stack locations assuming that the live ranges of variables are known at that time. Any subsequent code motion could invalidate that assumption causing wrong code. The GBE version of this pass must be run all code motion is complete.
- IBURG Code Generation. We have a partial implementations of IBURG. The first step will be that the input language will need to come from CGEN and the code to pattern match against will be the Lowered Gimple form, not the some messaged version of RTL.
- Register Assignment/Coalescing. There has been a lot of recent work in this area. At one extreme is the algorithm of Hack that can quickly coalesce and assign the registers in a program in SSA form that has no uninitialized variables. The problem with this algorithm is that it does not handle register pairs or preassigned registers, both of which are common on real machines with real register sets. There have been several heuristic extensions to this.
The interesting part of register assignment is that if it does not have to ever spill, it is unlikely to ever grow the complexity of the current reload stack in our back ends. This will allow many different register assignment passes to be tried, and the ultimate solution may be that there are several to be chosen from when the compiler is generated. There is no particular reason to assume that the best possible choice for an X86 would look anything like the best possible choice for the PowerPC. If that is the case, keeping the assignment passes simple and configurable has a lot of appeal. The same will likely be true for scheduling (which is not one of the "need to do to get code working" passes).
You Can Help
Nothing here is carved in stone. This is a project that will involve people with all levels of GCC experience and expertise. All are welcome to join in.
On a project like this it is important to be both flexible and have a broad vision. By definition, for a project like this to be successful requires that a large number of different parties each gets what they need out of it.
Each of the people who have expressed an interest in doing some part or GBE are doing so either because they see upside in a platform that they have an interest or that they are just interested in doing some part of a project where they can have influence on BOTH the design and implementation. Many of the passes that one can envision in this framework will only be useful on a subset of the platforms, and it will be important to make sure that we develop a flexible framework that easily accommodates the sharing of passes between ports.
The initial stack of passes contains problems that require a broad range of expertise. Some of the passes just require a literature search to find and implement the best algorithms. Others will require algorithmic work from scratch.
In some ways, GCC has fallen behind other compilers largely because it has been difficult to make the loop optimizations, vectorization and scheduling as good as in other compilers. This project has the potential to solve many of those issues.
Where Are the Bits
So far there is not that much visible.
There is a "gimple-back-end" branch. This has the contributed copy of Preston Briggs IBURG implementation.
Status So Far
Ken Zadeck had been working on a private port and is writing a machine description in CGEN. When that is finished, he will then update the PowerPC CGEN description and make that public. Doug Evans has been working on a X86-32/64 CGEN port as well as extending CGEN. It is hoped that the PowerPC and the X86-32 represent different enough architectures to demonstrate the effectiveness and coverage of this project.
The changes to CGEN to support these ports are beginning to show up in the CGEN trunk on sourceware, see http://sourceware.org/cgen/.
Who Do I Contact?
Kenneth Zadeck ( firstname.lastname@example.org ) (zadeck@irc) is the prime cheerleader of this project.
I generally respond to email and irc very quickly. I am much slower with the news groups.
Doug Evans ( email@example.com ) is the prime cheerleader of CGEN.