Alternatives to GC
Original proposal and discussion: http://gcc.gnu.org/ml/gcc/2012-11/msg00238.html
Contents
Introduction
As we continue adding new C++ features in the compiler, gengtype is becoming an increasing source of pain. In this proposal, we want to explore different approaches to GC that we could implement.
gengtype is a poor C parser. It is a long way from handling all of the C++ constructs what are used by the C++ standard library. We have a few potential solutions.
Current Implementation Plan
After the discussion from the original proposal, we will proceed with a two stage plan: a transitional plan to quickly reduce our reliance on gengtype, and a long term plan where we will attempt to introduce memory pools to replace GC. Below is a more detailed description (taken from http://gcc.gnu.org/ml/gcc/2012-11/msg00475.html)
- A transition plan to quickly remove gengtype out of the picture. This has become the main blocker for several C++ cleanups. The transition here is to move all the GTY() structures to use user-provided markers. With this, we are pretty sure that we can minimize the role of gengtype to the point that it doesn't need to understand C++. We would keep our current GC and PCH implementations unchanged. This will be ready for the 4.9 release.
- Start implementing memory pools for data structures that do not need to be in PCH images. It is still not clear what types of memory pools we will need, but at a minimum we are thinking of a permanent pool, per-pass pools and at least one or two stage-based pools (e.g., front ends). We may be able to have some implemented for the 4.9 release.
As far as PCH is concerned, we are considering 2 alternatives: (a) move to a type streamer based on the work in the PPH branch, and (b) get rid of it completely, as this feature will be obsoleted by C++ modules. We do not really think that option (b) is advisable until C++ modules are a more concrete reality.
We will attempt to re-implement PCH to use the streaming infrastructure from the PPH branch. This will allow us to share most of the streaming code used by LTO.
Options
The table below summarizes the different approaches that we are considering. These approaches are described in more detail further down. Together with each approach, we include a popularity indicator, based on the discussion in the original proposal.
The popularity column is a tally of the "votes" expressed by the participants of the original thread. People voted for more than one approach, so there are more votes than participants.
This does not necessarily mean that the most popular choice will be the one we end up implementing. We may experiment with other options, and may even start with short/medium term solutions, in preparation of a longer term one.
Approach to memory management |
Popularity |
Replace GC with pool allocation |
8 |
Implement GTY using manually-implemented markers |
4 |
Replace GC with reference-counted pointers |
4 |
Use conservative GC with boehm-gc |
2 |
Upgrade gengtype to be a full C++ parser |
2 |
Limit the language used so gengtype understands it |
2 |
Move GTY marker processing to cc1plus |
1 |
Continue using gengtype unchanged |
0 |
Since the choice of a GC strategy affects PCH, we also need to consider approaches to PCH:
Approach to PCH |
Popularity |
Permanent addresses for PCH |
1 |
PCH via streamable types |
1 |
Approaches to memory management
Continue using gengtype unchanged
This is not a viable long term approach. As gengtype currently does not handle what we need, something must change. If we continue using gengtype we must make changes to it or the language constructs we use for it to continue working.
Limit the language used
We could avoid the problem by limiting the language we use to near that which gengtype currently understands. This approach has significant consequences. It will make the standard library incompatible with GTY. It will prevent the use of common idioms, such as iterators, within GTY types. These limitations are significant and not desirable.
Upgrade gengtype to be a full C++ parser
Full C++ support would essentially require building a new C++ parser, or using another one (e.g., Clang).
Both limit and upgrade gengtype
We can try upgrading gengtype to handle a more extensive subset of C++, without trying to handle the full language. See Thoughts on Gengtype and Single Inheritance. Taking this approach would likely mean we would be unable to use the C++ standard library within GCC.
Move GTY to cc1plus
Instead of a separate weak parser, we would make cc1plus understand GTY attributes. The compiler would emit IL in the object files instead of generating source.
This solution would require a first bootstrap stage that did not support PCH, because we cannot rely on the seed compiler supporting GTY. We would probably need to use the Boehm collector during the first stage as well.
Because the first stage would be fundamentally different from the second stage, we may need to add an additional stage for correct object file comparisons.
Additionally, this approach would not work with cross compilation or canadian crosses. We do not consider this a viable approach.
Do GC marking manually
In this approach we would be converting GTY from a declarative form into a procedural one using the C++ type system and manually-implemented structure field handlers. At the highest level, this roughly means that all the structures using GTY(()) markers will start using GTY((user)).
The marking code dealing with the semantics of the marked data structure is co-located with the data structure. When GC problems occur, this simplifies debugging. The user is no longer faced with a mass of auto generated code that is unfamiliar and hard to trace. The code to implement is straightforward. For every pointer field in the given instance pointer, a single "mark" function needs to be called.
Common facilities to mark arrays will be provided via a base GC class. No generated header files to #include at the end of the source file. No new dependencies to add to the Makefile. No need to parse C++.
The problem with this approach is that the declarative approach puts the field additions and the GTY annotations in the same place, as opposed to in separate code. While it is possible to use standard library containers with GC pointers in them, we would not be able to write these containers to PCH images.
Replace GC with pool allocation
This approach engenders more choices. The pre-compiled header implementation uses gengtype, so this approach is not viable unless we have an alternate implementation for PCH. We will also need another approach to memory management.
In this approach, we would create different memory pools from which allocation occurs. This essentially means using obstacks for all memory management in the compiler, but we would take full advantage of C++
Use the Boehm collector
The general approach is to define allocation and deallocation functions to the Boehm collector that handle memory within the PCH range. We would also provide a class to register global roots at the point of declaration of those roots. We would likely configure the collector to collect only on command, not automatically. Laurynas says previous efforts showed that the peak memory usage was slightly higher that with the existing collectors due to Boehm's conservativeness. The run time was comparable.
Replace GC with reference-counted pointers
We need to identify all pointers that result in circular structures and not use smart pointers for them. We would probably still have some circular structures resulting from the compilation of mutually referencing structs.
In this approach we would make use of C++'s shared_ptr and weak_ptr.
Approaches to PCH
Permanent addresses for PCH
In this approach we would create an alternate implementation of PCH using mmap and a region allocator. The essential difference is that we need to allocate the data structures in their final location, rather than the current approach of allocating someplace and then walking the trees to move them to the desired and fixed mmap address. Because this implementation would not do a final copy, it may leave allocation holes in the memory region. To reduce this cost, we can zero all unallocated memory in the region and then compress it before writing the PCH file. Compressing the file has already proven effective on existing PCH files, and so the compression work would not be a problem.
We still need to indicate which objects go into PCH memory. For compiler-specific types that always go into PCH memory, we can create a base class that provides class-member operators new and delete. These would allocate from the region allocator. For C++ standard library structures parameterized by allocators, we could provide an allocator that does the same thing. For other types, we would require more explicit allocation and deallocation.
PCH based on streamable types
Every type knows how to build a bytecode vector for its contents. A stream manager asks all the instances to build their bytecode vectors and writes them to disk. This build on the streaming support implemented for LTO.