Status Update 04/2012
GCC has been building stages 2 and 3 in C++ mode for a while. We are now working on switching the default to build in C++ mode from stage 1. This means testing that all the commonly used targets have C++ compilers able to handle GCC.
If you regularly build GCC, please see the C++ build status page for a list of targets that need to be validated. You can follow the instructions on that page to build your favourite target and report its status.
Additionally, the following efforts are underway:
Re-implement vec.h in C++.
Produce an initial version of C++ coding conventions.
Status Update 06/2010
Use of C++ in gcc has now been approved by the Steering Committee. Please see the following links for additional details:
The gcc-in-cxx branch
This page is meant to eventually help document the ongoing effort in the "gcc-in-cxx" branch to make gcc compile in C++ mode, i.e. as C++ source code. So, the goal of this branch is to facilitate switching GCC's implementation language to C++.
The initial goal of the gcc-in-cxx branch will be to produce code which is quite close to mainline, but compiles gcc with C++ (as of 05/2009, this reflects the current status and scope of this effort).
Background: What matters for gcc going forward is that it continue to be comprehensible and maintainable. That is a struggle that gcc has faced for its entire existence as a free software project. It is certainly true that using C++ unwisely can make that struggle more difficult. But this issue is not qualitatively different from the issues we face today.
Whether we use C or C++, we need to try to ensure that interfaces are easy to understand, that the code is reasonably modular, that the internal documentation corresponds to the code, that it is possible for new developers to write new passes and to fix bugs. Those are the important issues for us to consider. The C++ features which are not present in C--features which are well documented in many books and many web sites--are not an important issue.
For additional background information on this effort and its scope, see http://airs.com/ian/cxx-slides.pdf. The branch can be browsed online at http://gcc.gnu.org/viewcvs/branches/gcc-in-cxx/
Rationale
Migrating gcc to C++ as implementation language:
- C++ is a standardized, well known, popular language.
- C++ is nearly a superset of C90 used in gcc.
- The C subset of C++ is just as efficient as C.
- C++ supports cleaner code in several significant cases.
- C++ makes it easier to write cleaner interfaces by making it harder to break interface boundaries.
- C++ never requires uglier code.
- C++ is not a panacea but it is an improvement.
Approach
One possible approach would be to compile code on the branch. Where it fails, fix it so that it compiles. Then, if appropriate, move the patch back to mainline, test the patch there, and submit it for mainline.
Possibly useful Tools
Among Linux kernel hackers, the coccinelle tool is increasingly used for automated source code transformations based on semantic patching. The tool is admittedly sparsely documented at the moment, but it is powerful and looks like a good candidate for an effort like the gcc-in-cxx branch, which naturally involves many source code transformations that would be much less tedious if they could be largely automated. Here are some related pointers:
Coccinelle: A Language-Based Approach to Managing the Collateral Evolution of Linux Device Drivers
Semantic Patches for Documenting and Automating Collateral Evolutions in Linux Device Drivers
The Semantics of “Semantic Patches” in Coccinelle: Program Transformation for the Working Programmer
Contributing
This development branch follows the usual gcc maintainership rules, except that any non-algorithmic maintainer may additionally approve or commit patches which permit compilation with C++.
Project Status (last updated 06/2009)
Phase 1 of gcc-in-cxx now complete (06/2009)
- gcc-in-cxx completes bootstrap as of 03/2009
Development Plan (last updated 06/2009)
- It would probably make sense to encourage new gcc contributions/patches to be valid C++, in order to reduce the maintenance burden of keeping the gcc-in-cxx branch up to date with new patches committed to HEAD.
- For each difference between trunk and gcc-in-cxx:
- Try to implement a -Wc++-compat warning which detects the change. If it is possible, implement the warning, and make the changes to let gcc bootstrap with the warning. If a warning is not possible for some reason, I will simply propose the change by itself. I expect this will be a small subset of the changes, mostly related to the build system and to low-level configuration like ansidecl.h.
- This process will eventually eliminate all differences between trunk and gcc-in-cxx, at which point gcc-in-cxx can be retired.
- Implement a configure option, --enable-c++-build or something like that, which builds gcc with a C++ compiler (available in the form of --enable-build-with-cxx). Currently the generator programs (genattrtab, etc.) and libcpp are still compiled as C.
- Begin the lobbying process for changing the default value of the configure option.
- "it would be a good thing to try forcing the C++ host compiler requirement for GCC 4.5 with just building stage1 with C++ and stage2/3 with the stage1 C compiler. --disable-build-with-cxx would be a workaround for a missing C++ host compiler."
- Start running regular builds with that option, to avoid any regressions in C++ buildability for cases for which there is no -Wc++-compat warning.
- Develop some trial patches which require C++, e.g., convert VEC to std::vector.
- Test starting the bootstrap with earlier versions of the compiler to see which C++ compiler version is required, and document that.
- Petition the steering committee for formal approval to switch to C++.
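As an illustration of the kind of difference that -Wc++-compat is meant to catch, consider the implicit conversion from void * that C allows and C++ rejects. The sketch below is not taken from the gcc sources; duplicate_string is a hypothetical helper written only to show a form that compiles in both languages.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper, not part of gcc.
// In C, "char *copy = malloc (len + 1);" compiles because void * converts
// implicitly to any object pointer type. C++ rejects that conversion, so
// the explicit cast below keeps the line valid in both languages.
char *
duplicate_string (const char *src)
{
  std::size_t len = std::strlen (src);
  char *copy = (char *) std::malloc (len + 1);
  if (copy != NULL)
    std::memcpy (copy, src, len + 1);  // copies the terminating NUL as well
  return copy;
}
```

Compiling the uncast C form with -Wc++-compat reports the implicit conversion, which is how such differences can be found on trunk before they reach a C++ build.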
Starting Points
- Converting VEC to std::vector is a good starting point. This is the interface in vec.h.
- Another easy starting point would be converting uses of htab_t to type-safe C++ hash tables, e.g., std::tr1::unordered_map. Here portability suggests the ability to switch to different hash table implementations; see gold/gold.h in the GNU binutils for one way to approach that.
- Another easy starting point is finding calls to qsort and converting them to std::sort, which typically leads to code which is larger but runs faster.
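The qsort conversion mentioned in the list above can be sketched as follows. The edge_info type and the function names are hypothetical and exist only for this illustration; they do not correspond to anything in the gcc sources.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical record type used only for this illustration.
struct edge_info
{
  int weight;
  int index;
};

// C style: the comparator receives untyped pointers and must cast them.
static int
compare_edges_c (const void *a, const void *b)
{
  const edge_info *ea = (const edge_info *) a;
  const edge_info *eb = (const edge_info *) b;
  return (ea->weight > eb->weight) - (ea->weight < eb->weight);
}

void
sort_edges_c (std::vector<edge_info> &edges)
{
  if (!edges.empty ())
    std::qsort (&edges[0], edges.size (), sizeof (edge_info),
                compare_edges_c);
}

// C++ style: the comparator is type checked and can be inlined into the
// instantiated sort, which is where the extra code size and speed come from.
static bool
edge_less (const edge_info &a, const edge_info &b)
{
  return a.weight < b.weight;
}

void
sort_edges_cxx (std::vector<edge_info> &edges)
{
  std::sort (edges.begin (), edges.end (), edge_less);
}
```

Neither qsort nor std::sort guarantees a stable order for elements that compare equal, so a conversion like this may change the relative order of such elements. Code whose output depends on that order would need a comparator that breaks ties explicitly.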
Quoting (1): "Longer term, we know that memory usage is an issue in gcc. In the old obstack days, we had a range of obstacks with different lifespans, so we could create RTL with a temporary lifetime which was given a longer lifetime when needed. We got away from that because we spent far too much time chasing bugs in which RTL should have been saved to a longer lifetime but wasn't. However, that model should permit us to run with significantly less memory, which would translate to less compile time. I think we might be able to do it by implementing a custom allocator, such as a pool allocator which permits allocating different sizes of memory, and never frees memory. Then the tree class could take an allocator as a template parameter. Then we would provide convertors which copied the tree class to a different allocation style. Then, for example, fold-const.c could use a temporary pool which lived only for the length of the call to fold. If it returned a new value, the convertor would force a copy out of the temporary pool. If this works out, we can use type safety to enforce memory discipline, use significantly less memory during compilation, and take a big step toward getting rid of the garbage collector."
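A very rough sketch of the temporary-pool idea from the quote, under several assumptions: temp_pool, expr_node, make_node and copy_to_pool are hypothetical names, the node type is a plain struct rather than gcc's tree, and the pool is passed as a template parameter to the helper functions instead of being encoded in the node type. It only illustrates a pool that never frees individual objects and a "convertor" that copies a result into a longer-lived pool.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical pool: hands out memory of any size, never frees individual
// objects, and releases everything at once when the pool is destroyed.
class temp_pool
{
public:
  temp_pool () {}
  ~temp_pool ()
  {
    for (std::size_t i = 0; i < blocks_.size (); ++i)
      ::operator delete (blocks_[i]);
  }

  void *allocate (std::size_t size)
  {
    void *p = ::operator new (size);
    blocks_.push_back (p);
    return p;
  }

private:
  temp_pool (const temp_pool &);             // not copyable
  temp_pool &operator= (const temp_pool &);  // not assignable
  std::vector<void *> blocks_;
};

// Hypothetical stand-in for a tree node. It has no destructor because the
// pool never runs destructors.
struct expr_node
{
  int value;
  expr_node *left;
  expr_node *right;
};

// Creates a node inside whichever pool the caller supplies.
template <typename Pool>
expr_node *
make_node (Pool &pool, int value, expr_node *left, expr_node *right)
{
  expr_node *n = new (pool.allocate (sizeof (expr_node))) expr_node;
  n->value = value;
  n->left = left;
  n->right = right;
  return n;
}

// The "convertor": deep-copies a node and its children into another pool,
// so the result survives the destruction of the source pool.
template <typename Pool>
expr_node *
copy_to_pool (Pool &dest, const expr_node *src)
{
  if (src == NULL)
    return NULL;
  return make_node (dest, src->value,
                    copy_to_pool (dest, src->left),
                    copy_to_pool (dest, src->right));
}
```

A caller in the style of fold would build intermediate nodes in a temp_pool local to the call and use copy_to_pool on the value it returns. Making the pool part of the node's type, as the quote suggests, would let the compiler reject code that stores a temporary node in a long-lived structure; this sketch does not attempt that.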
Remaining Issues (last updated 2009-05-14)
- In C an externally visible "inline" function (i.e., not "extern", not "static") is assumed to be accessible outside of the current translation unit (e.g., vn_nary_op_compute_hash in gcc/tree-ssa-sccvn.c). In C++ this is not the case. It is also not the case in C99, so this should be addressed anyhow, somehow, although it doesn't seem to be a good fit for -Wc++-compat.
- In C a const variable which is neither "extern" nor "static" is visible outside of the current translation unit. In C++ it is not, without an explicit "extern" declaration. I'm not sure how best to handle this with -Wc++-compat, since C does not permit initializing an "extern const" variable.
- The C++ frontend does not support __attribute__ ((unused)) on labels. The generator programs produce a lot of unused labels. Fixing this in the C++ frontend may be awkward because C++ syntax permits a declaration to follow a label, so it may not be clear which one gets the attribute.
- The C++ frontend emits some warnings on code which is known to be never executed, which the C frontend does not. This leads to some warnings compiling code in gcc. I think it is reasonable to fix this in the C++ frontend.
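For the const linkage issue in the list above, one form that behaves the same way in C and C++ is to declare the constant "extern" first and define it afterwards without the keyword. The names below are hypothetical and the snippet is only a sketch of that pattern, not a description of how gcc resolved the issue.

```cpp
// Normally this declaration would live in a shared header.
extern const int max_operand_count;

// Because of the earlier extern declaration, this definition has external
// linkage in C++ as well as in C. Without that declaration, a C++ compiler
// would give the constant internal linkage, and other translation units
// could not link against it.
const int max_operand_count = 30;

// Hypothetical user of the constant.
int
clamp_operand_count (int n)
{
  return n > max_operand_count ? max_operand_count : n;
}
```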
Major TODO
Work out the details of using STL containers with GC allocated objects. This means teaching gengtype how to generate code to traverse STL containers, which would then be used during GC. This is not a task for the faint-hearted. But see also Tom Tromey's hint.
Resources
Related mailing list discussions include (newest first):
replacing qsort with std::sort (08/2009)
Trunk fails to bootstrap with --enable-build-with-cxx (08/2009)
zlib? (07/2009)