C++ Modules

A module system is coming to C++. This page describes the state of the GCC implementation.

The goal of the module system is to avoid huge header files, thus speeding up compilation. What distinguishes it from things like precompiled headers are:

Implementation State

Development branch: 'c++-modules' (svn://gcc.gnu.org/svn/gcc/branches/c++-modules).

The branch was created by Nathan Sidwell in January 2017, so it is very early days, and I expect it to be several months before there's something of interest.

Random Cleanups

I've been making some random cleanups to the code base. Now stage 1 is open, I'm pushing these to trunk:

Design

I aim to reuse (with suitable abstraction) as much LTO machinery as possible. LTO currently writes out both type trees and gimple instructions, encapsulating the information into additional sections of the output files. Modules needs to write out both type trees and FE AST (before genericization), but not gimple. It also needs to read that information back into the FE. The data will probably be emitted into not-the-object-file, which will be similar to PCH behaviour (PCH machinery will not be used).

It turns out that the overlap with LTO is 'not much'. Just types, and not the same bits of types. So I'm implementing a separate streamer. Oh well.

Mangling

Name mangling needs to be adjusted to deal with module-linkage. This is a compiler-interoperability and toolchain issue, as we want objects from different compilers to be link-compatible, and the debugger to be able to understand module symbols.

Current thoughts are described in module-linkage.pdf.

Interface Designation

The current specification for modules shows no special syntax for denoting the interface TU of a module. However, implementations need to know immediately after seeing the module declaration whether the TU is the interface or one of the implementation TUs (the latter need to effectively import the interface). It is not until an export declaration is seen that it can be positively determined that the TU is the interface. If there is no export declaration, then the TU is either an implementation, or perhaps an interface that doesn't export anything.

One way of solving this would be a special compilation flag, or file suffix, to denote interface compilation. Either approach would require changes to development workflows, such as additional makefile rules and editor mode selection, which is not ideal and detracts from a 'just drops in' feature set.

I have taken the approach of requiring an [[interface]] attribute on the module-declaration for the interface TU. C++Kona'17 update: Jason & I are working with Gaby dos Reis on standardizing a way of distinguishing interface from implementation.

module foo [[interface]]; // foo's interface TU

module foo; // one of foo's implementation TUs

Module Linkage

I'm using the inline namespace capability to wedge in an invisible namespace for all things with module linkage (with some internal compiler magicness). This namespace is put just inside the innermost regular namespace, so there's one of these per regular namespace. The namespace name is a flat concatenation of the module name with a prefix to put it out of the way of regular namespaces. I.e. 'module foo.bar' will get a namespace '_Mfoo.bar'. Exported objects are in the parent namespace. Thus mangling just drops out. We may well want to do something about demangling and debugging, so that the module name is shown at the start of the symbol, rather than right in the middle of it.
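The effect can be approximated in ordinary C++ today. The following is only a sketch of the idea, not what the branch does: the names app, Mfoo_bar, Widget, helper and exported are all made up for illustration, and Mfoo_bar stands in for the compiler-generated '_Mfoo.bar', which user code cannot spell.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

namespace app {
  // Hypothetical stand-in for the hidden namespace of 'module foo.bar'.
  // A compiler could name it '_Mfoo.bar'; this sketch uses a plain identifier.
  inline namespace Mfoo_bar {
    struct Widget {};                     // would have module linkage
    inline int helper () { return 42; }   // would have module linkage
  }

  // An "exported" entity lives directly in the parent namespace.
  inline int exported () { return helper () + 1; }
}

// Members of an inline namespace are found through the enclosing namespace,
// so code inside 'app' (and qualified lookups into it) need no extra syntax.
inline void check_lookup () {
  assert (app::helper () == 42);
  assert (app::exported () == 43);
}

// The implementation-defined type name; with the Itanium ABI this is the
// mangled name, which includes the inline namespace between 'app' and 'Widget'.
inline std::string widget_type_name () {
  return typeid (app::Widget).name ();
}
```

Calling widget_type_name () with GCC should give a string in which Mfoo_bar sits in the middle of the symbol, which is the demangling and debugging wrinkle mentioned above.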

I'm going to change this approach. While making this particular piece trivial, it complicates a number of other pieces in awkward ways. I have a better approach.

Global Module

Declarations before the module-declaration are in the global module. While this is clear enough, it has a complicated interaction with a module interface:

void Foo ();
module Quux [[interface]];
export void Bar ();
void Baz ();

module Quux; // implementation of Quux
void Bar () {
  Baz (); // OK: Baz's declaration is visible from the purview of Quux's interface
  Foo (); // ERROR: global module decls from the interface are NOT visible
}

import Quux; // user of Quux
void User ()
{
  Bar (); // Quux's Bar
  Baz (); // ERROR: Quux's non-exported Baz not visible.
  Foo (); // ERROR: Foo not visible from Quux interface
}

I have not yet got a good handle on how to approach this.

This is probably the same problem as an interface TU importing (but not exporting) some other random module:

module Bob [[interface]];
import Baz; // Baz visible to module Bob, but not to importers of Bob
export inline int Frob (int x) {
  return SomeBazFunction (x);
}

Module Files

While the design is not yet fixed, I think it is possible to write the module file at the end of compiling the interface TU. It's not necessary to write it incrementally. As the data will necessarily be a self-referential recursive structure, writing at the end is probably the better approach, and the usual hashing techniques can be used to remember back-references.
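As a rough illustration of the back-reference idea (this is not the branch's streamer; Node, Writer and the textual output format are all invented for the sketch), a writer can hash each node's address to an index the first time it is seen, and emit just that index on any later visit:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical AST-like node that may refer to other nodes, including cyclically.
struct Node {
  std::string name;
  std::vector<const Node *> refs;
};

// Hypothetical writer: the first visit emits the node body, later visits
// emit '@index', where index is the order in which the node was first seen.
class Writer {
  std::unordered_map<const Node *, std::size_t> seen_;
  std::string out_;

public:
  void write (const Node *n) {
    assert (n != nullptr);
    // Record the node before walking its references, so cycles terminate.
    auto res = seen_.emplace (n, seen_.size ());
    if (!res.second) {
      out_ += "@" + std::to_string (res.first->second) + " ";
      return;
    }
    out_ += "(" + n->name + " ";
    for (const Node *r : n->refs)
      write (r);
    out_ += ") ";
  }

  const std::string &str () const { return out_; }
};
```

For two nodes a and b that refer to each other, writing a produces "(a (b @0 ) ) ": b's reference back to a is written as the index 0 rather than recursing forever.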

Theorem: when reading a compiled module, it is always safe to reorder the reading of its imports. Specifically, an interface that imports some modules may be written out such that all the imports are processed before the body of the interface itself:

module bob [[interface]];
... whatever
import baz;
... more stuff
export module bar;
... and finally

is equivalent to:

module bob [[interface]];
import baz;
export module bar;
... whatever
... more stuff
... and finally

This theorem must be correct; otherwise it would not be possible to import two modules that themselves import a third module.

Module re-exporting must be done by reference. Again, this is necessary so that a module may import a module and also import a second module that re-exports that first module.
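To make the 'imports first, by reference' point concrete, here is a minimal sketch of a reader. Everything in it is an assumption for illustration: ModuleFile, Loader and the in-memory map are made up, and a real module file would of course carry much more than a list of import names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical compiled-module record: only the names of the modules it
// imports or re-exports, i.e. references rather than copies.
struct ModuleFile {
  std::vector<std::string> imports;
};

// Hypothetical loader: processes a module's imports before its body and
// reads each module at most once, however many paths lead to it.
class Loader {
  const std::map<std::string, ModuleFile> &files_;
  std::set<std::string> loaded_;
  std::vector<std::string> order_;

public:
  explicit Loader (const std::map<std::string, ModuleFile> &files)
    : files_ (files) {}

  void load (const std::string &name) {
    if (!loaded_.insert (name).second)
      return;                        // already read via another importer
    auto it = files_.find (name);
    assert (it != files_.end ());    // sketch only: no real error handling
    for (const std::string &imp : it->second.imports)
      load (imp);                    // imports first ...
    order_.push_back (name);         // ... then the module's own body
  }

  const std::vector<std::string> &order () const { return order_; }
};
```

With a 'user' module importing 'one' and 'two', both of which import 'third', load ("user") reads 'third' exactly once and gives the order third, one, two, user. If imports were stored by value instead of by name, 'third' would arrive twice.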

Documentation