LLVM/GCC Integration Proposal

Sat Nov 19 01:51:00 GMT 2005

Hi Everyone,

At the request of several members of the GCC community, I'm writing  
this email to describe some of my short-term plans with GCC and  
describe an alternative to the recent link-time optimization [1] and  
code generator rewrite [2] proposals.

For those who are not familiar with me, I'm one of the main  
developers working on the LLVM project (http://llvm.org/).  One  
important way that LLVM is currently used is as a back-end for GCC.   
In this role, it provides a static optimizer, interprocedural link- 
time optimizer, JIT support, and several other features.  Until  
recently, LLVM has only been loosely integrated with an old version  
of GCC (a 3.4 prerelease), which limited its effectiveness.

Recently, at Apple, I have been working on a new version of the llvm- 
gcc translation layer, built on GCC 4.  This implementation links the  
LLVM optimizers and code generator directly into the GCC process,  
replacing the tree-ssa optimizers and the RTL code generator with the  
corresponding LLVM components when enabled.  The end result is a  
compiler that is command line compatible with GCC: 'gcc -S t.c -o  
t.s' does exactly what you'd expect, and most standard command line  
options are supported (those that aren't are very specific to the  
design of the RTL backend, which we just ignore).  I plan to have  
this work committed to the Apple branch within a month.

Though not yet implemented, we intend to support link-time  
optimization with many design constraints that match the recent  
proposal [1].  Because LLVM already currently supports link-time  
optimization and has an architecture that makes it straight-forward,  
this work mainly amounts to changes in the GCC compiler-driver.  If  
you're interested in the link-time IPO architecture, there are  
documents that describe the high level ideas [10,11] with some  
(potentially out of date) implementation information.

In this email, I want to briefly talk about the strengths that LLVM  
brings to the table, some of the implementation details of my  
integration work, some of the important ongoing work that we are  
working on, and answer some of the hot questions that will inevitably  
come up. :)

==== Strengths of LLVM

LLVM is a modern and efficient compiler framework built out of  
libraries with well defined APIs.  As I mentioned above, LLVM  
provides an optimizer and code generator.  It also provides several  
other components I won't discuss here.  If you are interested, please  
see LLVM's extensive documentation [9] for more information.

The LLVM mid-level and interprocedural optimizer work on a common  
representation (the LLVM IR), which is a three-address SSA-based  
representation that is somewhat similar to GIMPLE.  It is fully  
specified [3], is easy to analyze and manipulate [4], and is very  
memory efficient (for example, it takes about 50M of memory on a 32- 
bit host to hold the IR for all of 176.gcc, which is about 230K  
LOC).  The IR has a text form (suitable for compiler dumps and fine- 
grained regression testing) and a 100% equivalent compressed binary  
form (suitable for interchange between compile-time and link-time  
optimization steps), both of which correspond to the in-memory IR.

The IR supports several features that are useful to various  
communities, including true tail calls, accurate garbage collection,  
etc.  The IR is continuously evolving to add new features, and we  
plan several extensions in the future.

The optimizer itself has a full suite of standard scalar  
optimizations and also includes a collection of interprocedural  
optimizations and interprocedural analyses, some of which are quite  
aggressive [5] (though this particular set is not enabled by  
default).  The optimizer is fully modular, which allows us to have  
nice tools for working with the IR and optimizer, including an  
automated bug finding tool [6] which makes tracking down  
miscompilations and ICEs really easy.

The LLVM code generator is built on modern techniques.  For example,  
it uses pattern-matching DAG-based instruction selection, maintains  
SSA form up until register allocation, represents code in a form  
quite similar to the "compressed RTL" proposal [7], and supports  
dynamically loaded targets.  We currently have stable code generators  
for PowerPC and X86, with targets for Alpha, IA-64, and Sparc also  
available, but less stable.  LLVM can also emit C code, which allows  
it to support systems for which we do not have a native code  
generator implemented.

The design of the register allocation components is very close to  
that proposed in Andrew MacLeod's recent proposal [2].  We have  
separate instruction selection, global coalescing, and register  
allocation stages, have a spiller that does simple local register  
allocation, etc.  We currently support multiple pluggable register  
allocators, to (for example) support quick -O0 compiles, and to  
support targets who want to have completely custom allocators.  The  
spiller is currently capable of folding spill code into instructions,  
but does not support rematerialization yet.

==== Implementation Details

My current work involves integrating the LLVM optimizer and code  
generator into GCC as mentioned above.  Here is a basic sketch for  
the major pieces this entails:

1. The build system is taught about C++ code.
2. A configure option (--enable-llvm) has been added to GCC's configure
     script.  This enables an ENABLE_LLVM #define.  Every LLVM-specific
     piece of code is protected by this macro.  Compiling without -- 
enable-llvm
     results in no behavior change.
3. Main is compiled as a C++ function, if ENABLE_LLVM is set.
4. A new header file, llvm.h, defines a small number of C functions  
which
     the extant GCC code calls into.  For example,  
tree_rest_of_compilation
     calls through this interface to compile a function body.  The  
GCC C code
     is all still compiled with a C compiler.  These calls are not  
particularly
     invasive, and are all protected with ENABLE_LLVM as described  
above.
5. The LLVM interface is implemented with 4 C++ files, which are  
currently
      about 4000 lines of code.  This code basically translates  
GIMPLE into LLVM.
6. The binary links in the standard LLVM libraries, including  
optimizations
      and code generators taken from an LLVM build tree.

This design point makes the work a completely optional opt-in  
component with a low impact on the existing GCC source base.  We  
clearly do not support all of the targets that GCC does, so those  
unsupported ones will not be able to use LLVM (currently, see below  
for more information).

====  Ongoing Work

While it has many virtues, LLVM is still not complete, and we are  
investing in adding the missing features.  In particular, this system  
is missing two key features to be truly useful: debug info support  
and inline asm support.  Apple is currently investing in adding both  
these features, as well as adding vector support.

In the future, I anticipate work to extend the LLVM to be capable of  
capturing higher-level information (e.g. that needed for dynamic  
dispatch optimizations and more alias information), as well as  
supporting new targets and the usual collection of miscellaneous  
improvements to the current system.

The project is in early stages of development, but code performance  
and compile times are currently comparable to an unmodified GCC  
compiler on PowerPC.

==== Answers to some hot questions

* What about licensing and copyright assignment issues?

The patch I'm working on is GPL licensed and copyright will be  
assigned to the FSF under the standard Apple copyright assignment.   
Initially, I intend to link the LLVM libraries in from the existing  
LLVM distribution, mainly to simplify my work.  This code is licensed  
under a BSD-like license [8], and LLVM itself will not initially be  
assigned to the FSF.  If people are seriously in favor of LLVM being  
a long-term part of GCC, I personally believe that the LLVM community  
would agree to assign the copyright of LLVM itself to the FSF and we  
can work through these details.

* LLVM is written in C++, eww yuck!

It is true.  However, the C++ness only exists in the LLVM portions of  
the resultant compiler, no other parts have to be converted to C++  
(well, except for main).  Though there is a lot of potential to  
misuse C++ in horrible ways (it is a sharp tool after all) LLVM has  
been quite successful at using it for good, not evil.  While I don't  
expect everyone to like use of C++, others will hopefully respect  
that it allows us to build good APIs, increase modularity, and get  
more done in less time.

* LLVM is missing support for representing high-level information!

First, this isn't true.  LLVM does capture some important pieces of  
high-level information, and is continuing to add more.  There is a  
fundamental design tension here between building a compiler that  
supports multiple languages (in a link-time context, that means it  
can link translation units from multiple languages together and  
optimize across the boundary) and building a compiler that supports  
only one.

Specifically, advocates of the recent link-time-optimization proposal  
[1] may claim that exposing all of the information that DWARF  
captures (e.g. about the type system) at link-time is a good thing:  
it makes it easier to do specific high-level optimizations, because  
all of the information is trivially available.  However, they are  
ignoring the issue of handling multiple languages at the same time:  
an optimizer with this design has to be aware of (and be able to  
update) all possible source-language information and must be extended  
in the future as new languages are added (this is the problem with  
universal compilers, dating back to the UNCOL design).  Also, the  
linker needs to know how to combine declarations defined in different  
languages, because an interprocedural optimizer only want to see one  
declaration for each logical variable (for example).  This quickly  
becomes difficult and messy, which is presumably why the link-time  
proposal allows the linker to "give up" linking two translation units.

In contrast, LLVM has historically leaned the other way: it exposes  
only the *important* high-level information in ways that are useful  
to the optimizers.  As we find that we need more information, we  
extend it to capture this information as needed.  This allows the  
optimizers to be simple and forces us to think about the key aspects  
of the information we need to capture before we add it.  LLVM will  
continue to evolve this way, being extended to support new  
information as it is needed.

* LLVM is missing X, Y, Z feature!

We are continuing to improve and extend LLVM.  As I mention above,  
three of the most important ones for Apple (debug info, inline asm,  
and vector support) are on the short-term todo list.  Others will be  
tackled in time (and help is always appreciated!).

* This will not support <some target>!

As describe above, we won't support every target that GCC currently  
does.  Three options are possible:

1. We (the GCC community) could build an LLVM to GIMPLE translator.   
This would probably take about as much work as the GIMPLE -> LLVM  
translator (about  4000 LOC), which is not a huge project.
2. We could build an LLVM to RTL translator.
3. Target maintainers can build an LLVM port for their target.

In any case, it is certain that GCC will not drop its existing RTL  
backend any time in the near future.

==== Conclusion

As I mentioned above, this work is currently underway.  I anticipate  
committing this patch to the Apple branch within a month, and am  
wondering if people are interested in having this functionality on  
the mainline GCC branch.

This work clearly overlaps with the two recent proposals [1,2].  I  
suggest that improving LLVM to address deficiencies in it would be  
easier and more man-hour-efficient than building a whole new  
interprocedural optimization component and rewriting large portions  
of the GCC backend.  Further, this proposal differs from the other  
two in that it is already largely implemented and works.  Finally,  
there is no reason that multiple different approaches cannot be  
developed in parallel, if people desire such an approach.

Thoughtful feedback appreciated,

-Chris

==== References

[1] http://gcc.gnu.org/ml/gcc/2005-11/msg00735.html
[2] http://gcc.gnu.org/ml/gcc/2005-11/msg00783.html
[3] http://llvm.org/docs/LangRef.html
[4] http://llvm.org/pubs/2004-09-22-LCPCLLVMTutorial.html
[5] http://llvm.org/pubs/2005-05-04-LattnerPHDThesis.html
[6] http://llvm.org/docs/Bugpoint.html
[7] http://gcc.gnu.org/wiki/Speedup%20areas
[8] http://llvm.org/releases/1.6/LICENSE.TXT
[9] http://llvm.org/docs/
[10] http://llvm.org/pubs/2004-01-30-CGO-LLVM.html
[11] http://llvm.org/pubs/2003-05-01-GCCSummit2003.html