C++ Modules

A module system is coming to C++. This page describes the state of the GCC implementation.

The goal of the module system is to avoid huge header files, thus speeding up compilation. What distinguishes it from things like precompiled headers are:

Implementation State

Development branch: 'c++-modules' (svn://gcc.gnu.org/svn/gcc/branches/c++-modules). Reporting bugs

The branch was created by Nathan Sidwell in January 2017, so it is very early days. I expect it to be several months before there's something of interest.

Random Cleanups

I've been making some random cleanups to the code base. Now that stage 1 is open, I'm pushing these to trunk:


Due to the experimental nature of the implementation, I'm not very interested in bug reports just yet. If you're directly working with me, you'll already know how to get my attention.

Here's a list of known not-working significant features:


There are two main pieces of work, (a) streaming to disk, (b) name lookup.

The original plan was to try to reuse LTO's streaming technology for the former. But that turned out to be impractical, as there is not much overlap. LTO streams GIMPLE and language-agnostic type information. Modules need the AST representation and front-end (FE) type information. So I went the hand-written, auto-numbering streaming route.

Name lookup started by abusing inline namespaces, but that too proved impractical. We'd need the ability to turn these namespaces on and off, and doing that requires changes to name lookup. Once you're making that kind of change, you may as well do it properly. As a benefit, name lookup has gotten a lot cleaner.


Name mangling needs to be adjusted to deal with module-linkage. This is a compiler-interoperability and toolchain issue, as we want objects from different compilers to be link-compatible, and the debugger able to understand module symbols.

Current thoughts are described in module-abi-2017-09-01.pdf.

Interface Designation

At the start of implementation, there was no special syntax for denoting the interface TU of a module. But implementations need to know immediately after seeing the module declaration whether the TU is the interface or one of the implementation TUs -- they cannot defer that decision. This has now been resolved with the 'export module foo;' syntax designating the interface TU.

Compiling the interface TU generates a Binary Module Interface (BMI). This BMI is read in by each implementation TU and each importer of the module. That creates a dependency which differs from the header-file case, because the compiler must be invoked to generate the BMI. I have now implemented a hook in the compiler that determines what to do if a BMI is not found. The hook runs a wrapper script, and the default implementation of that script invokes the compiler to generate the BMI.

The BMI is not a distributable artifact. Think of it as a cache that can be recreated as needed.

Module Linkage

I am not presuming any new linker technology. Module ownership is a new concept, and at least for module-linkage names, must be reflected in the name mangling. Exported names need not reflect this ownership.

I am working with the Clang developers to define interoperable changes here. To facilitate migration of code, mangling of exported entities does not change.

Invoking the Compiler

There are several new options for modules:

BMI Search Path

When a BMI file is needed (i.e. from an import declaration), the module path is searched. This is exactly like the header include path, but without the distinction of user & system portions. The module path is generated as follows:

Duplicates are removed in the same manner as the header path. The multilib suffix (if any) is appended to each entry added to the module path.

Wrapper Script

If a BMI file cannot be found, a wrapper script is used. The wrapper script is found by

To disable the wrapper, either set CXX_MODULE_WRAPPER to 'false' or specify -fmodule-wrapper=false (this invokes /bin/false, which always fails).

The wrapper is invoked as:

  wrapper <module-name> <bmi-file> <source-file> <current-file>

I'm not sure if there should be trailing arguments showing the complete import path -- an import can itself need a BMI.

The wrapper should generate the BMI such that it is located on the module path. If the wrapper is successful (returning a zero exit code), the search path will be searched again. If the BMI is not found this time compilation will fail.
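The search, wrapper, search-again sequence described above can be sketched roughly as follows. This is only an illustration of the control flow, not the code on the branch: locate_bmi and its two callback parameters are hypothetical names, and the callbacks stand in for the real module-path search and the real wrapper invocation.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical sketch of the lookup sequence, not GCC's actual code.
// search_path returns the BMI's location, or an empty string if the BMI
// is not on the module path.  run_wrapper returns the wrapper's exit code.
std::string
locate_bmi (const std::string &module_name,
            const std::function<std::string (const std::string &)> &search_path,
            const std::function<int (const std::string &)> &run_wrapper)
{
  assert (!module_name.empty ());

  // First attempt: the BMI may already be on the module path.
  std::string found = search_path (module_name);
  if (!found.empty ())
    return found;

  // Not found, so ask the wrapper to generate it.  A non-zero exit code
  // means the wrapper failed and the import cannot be satisfied.
  if (run_wrapper (module_name) != 0)
    return std::string ();

  // Second and final attempt.  An empty result here means the wrapper
  // claimed success but the BMI is still missing, so compilation fails.
  return search_path (module_name);
}
```

An empty return value corresponds to the failing cases in the text: either the wrapper returned non-zero, or it returned zero without placing the BMI on the module path.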

The default wrapper is a bash script that looks for a C++ source file in $CXX_MODULE_PATH that matches the bmi-file name with the suffix replaced by a recognized C++ source suffix. It then reinvokes the compiler using $COLLECT_GCC and $COLLECT_GCC_OPTIONS.
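The source-file lookup the default wrapper performs might look something like this when written in C++ rather than bash. Several things here are assumptions of mine for the sake of the sketch: the function name find_module_source, the list of source suffixes, the order in which directories and suffixes are tried, and the 'hello.bmi' file name in the comments (the real BMI suffix is not given above). File existence is checked through a caller-supplied predicate so the logic can be exercised without touching the disk.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: map a BMI file name to a C++ source file found in
// one of the module-path directories.  Returns an empty string if no
// candidate exists.
std::string
find_module_source (const std::string &bmi_file,
                    const std::vector<std::string> &module_path,
                    const std::function<bool (const std::string &)> &exists)
{
  // Keep only the file name, then drop its suffix:
  // "dir/hello.bmi" -> "hello.bmi" -> "hello".
  std::size_t slash = bmi_file.find_last_of ('/');
  std::string stem
    = slash == std::string::npos ? bmi_file : bmi_file.substr (slash + 1);
  std::size_t dot = stem.find_last_of ('.');
  if (dot != std::string::npos)
    stem.erase (dot);
  if (stem.empty ())
    return std::string ();

  // Assumed set of recognized C++ source suffixes.
  static const char *const suffixes[] = { ".cc", ".cpp", ".cxx", ".C" };

  // Try every suffix in every directory, earlier directories first.
  for (const std::string &dir : module_path)
    for (const char *suffix : suffixes)
      {
        std::string candidate = dir + "/" + stem + suffix;
        if (exists (candidate))
          return candidate;
      }
  return std::string ();
}
```

With a module path of 'include' and 'src' and a file 'src/hello.cc' on disk, looking up 'hello.bmi' would yield 'src/hello.cc', which is the file the wrapper would then hand back to the compiler.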

The wrapper machinery is not fully generalized yet, and the default is a bash script.


Put the following in hello.cc:

#include <stdio.h>
export module hello;
export void greeter (const char *name)
{
  printf ("Hello %s!\n", name);
}

and put the following in main.cc:

import hello;
int main (void)
{
  greeter ("world");
  return 0;
}

Now compile with:

   g++ -fmodules main.cc hello.cc

You should see the following output:

cc1plus: note: invoking module wrapper to install 'hello'
Compiling module interface hello (hello.cc)
cc1plus: note: completed module wrapper to install 'hello'

Because I specified the input files with main.cc first, the compiler met the 'import hello;' declaration before it had compiled 'hello.cc', so the wrapper script was invoked to build hello.cc. Note, though, that the outer compilation is unaware of that, and will rebuild hello.cc because it was named on the command line.

You can run the a.out:

Hello world!

WARNING: the wrapper script is not a build system. If you change hello.cc but don't delete the BMI file, the compiler will be unaware of the change when importing the BMI, and consequently you'll get version skew. The intent of the wrapper script is (a) to make simple cases like this 'just work' and (b) to allow build systems to hook into the compilation process so they do not have to preprocess everything before the first build. Also, if you parallelize your build, you must provide your own synchronization in the wrapper program.

Binary Module Interface Files

As mentioned above, a BMI is generated during the compilation of a module interface unit. For GCC I'm generating it as an on-the-side entity, but it could be stashed as a special section in the output assembly file, or even be a new stage of compilation. (Clang is taking this last approach.) It's a simple <tag> <contents> representation that uses automatic node numbering to deal with back-references. That means only a single pass is needed to write it out at the end of parsing.
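To give a feel for how automatic node numbering handles back-references in one pass, here is a toy writer. It is a sketch of the general technique only: the Node and Writer types, the textual output, and the '@N' back-reference notation are all made up for illustration and say nothing about the actual BMI encoding.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical AST node: a tag, some contents, and edges to other nodes.
struct Node
{
  std::string tag;
  std::string contents;
  std::vector<const Node *> refs;
};

// Single-pass writer.  A node receives the next free number the first
// time it is written; any later mention emits just "@<number>".
class Writer
{
public:
  void write (const Node *node)
  {
    assert (node != nullptr);
    auto found = numbers_.find (node);
    if (found != numbers_.end ())
      {
        // Already streamed: emit a back-reference instead of the body.
        out_ += "@" + std::to_string (found->second) + " ";
        return;
      }

    // Number the node before walking its edges, so that cycles resolve
    // to a back-reference rather than recursing forever.
    numbers_.emplace (node, next_++);
    out_ += "<" + node->tag + "> " + node->contents + " ( ";
    for (const Node *ref : node->refs)
      write (ref);
    out_ += ") ";
  }

  const std::string &text () const { return out_; }

private:
  std::unordered_map<const Node *, unsigned> numbers_;
  unsigned next_ = 0;
  std::string out_;
};
```

Writing two variables that share one type node streams the type's body once and refers to it by number the second time. A reader that assigns numbers in the same order as it decodes can resolve each back-reference without the numbers ever being stored in the file.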

The BMI does not contain timestamps. Thus recompiling a TU with exactly the same options will produce an identical BMI -- that's what you want with a cache. It does contain CRCs, which are used to detect corruption. I've not made corruption detection cryptographically strong or anything. If we detect corruption, you should get an error and then compilation terminates with a fatal error -- the likelihood of any further diagnostics being meaningful is negligible.
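As a rough illustration of CRC-based corruption detection, the sketch below computes a checksum over a section's bytes and compares it with a stored value. Which CRC the BMI actually uses is not stated here, so the common reflected CRC-32 (polynomial 0xEDB88320) is an assumption, as are the names crc32_of and section_intact.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected, polynomial 0xEDB88320).  Assumed variant;
// the BMI format may use a different CRC.
std::uint32_t
crc32_of (const std::string &bytes)
{
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : bytes)
    {
      crc ^= byte;
      for (int bit = 0; bit < 8; ++bit)
        // Shift right one bit; XOR in the polynomial if a 1 fell off.
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical check a reader could make before trusting a section.
bool
section_intact (const std::string &contents, std::uint32_t stored_crc)
{
  return crc32_of (contents) == stored_crc;
}
```

Because the checksum depends only on the bytes, it fits the no-timestamps property: identical input gives an identical CRC. It guards against accidental damage only, which matches the remark that the detection is not cryptographically strong.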

I have not thought about optimizing the format to allow lazy loading. That may become necessary.

Global Module

Declarations before the module-declaration are in the global module. While this is clear enough, it has a complicated interaction with a module interface:

void Foo ();
export module Quux;
export void Bar ();
void Baz ();

module Quux; // implementation of Quux
void Bar () {
  Baz (); // Baz's declaration visible from the purview of Quux's interface
  Foo (); // ERROR: global module decls from interface NOT visible
}

import Quux; // user of Quux
void Baz ()
{
  Bar (); // Quux's Bar
  Baz (); // ERROR: Quux's non-exported Baz not visible.
  Foo (); // ERROR: Foo not visible from Quux interface
}

I have not yet got a good handle on how to approach this.


None: cxx-modules (last edited 2018-01-22 14:09:47 by NathanSidwell)