CMIs are an additional artifact when compiling named module interfaces, partitions or header units. These are read when importing. CMI contents are implementation-specific, and in GCC’s case tied to the compiler version. Consider them a rebuildable cache artifact, not a distributable object.
When creating an output CMI, any missing directory components are created in a manner that is safe for concurrent builds creating multiple, different, CMIs within a common subdirectory tree.
CMI contents are written to a temporary file, which is then atomically renamed. Observers either see old contents (if there is an existing file), or complete new contents. They do not observe the CMI during its creation. This is unlike object file writing, which may be observed by an external process.
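The directory creation and write-then-rename behavior described above can be sketched as follows. This is a minimal Python illustration of the general technique, not GCC's implementation. The function name `write_atomically` is hypothetical, and the sketch assumes the temporary file and the destination are on the same filesystem so that the rename is atomic.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write data to path so readers see the old file or the complete new one."""
    directory = os.path.dirname(path) or "."
    # exist_ok=True tolerates another process creating the same directories first.
    os.makedirs(directory, exist_ok=True)
    # Create the temporary file beside the destination so the rename stays
    # within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Atomically replace any existing file with the finished contents.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the partial temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A reader that opens the destination at any moment gets either the previous contents or the full new contents, because the destination name only ever points at a completed file.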
CMIs are read in lazily, if the host OS provides mmap functionality. Generally, blocks are read when name lookup or template instantiation occurs. To inhibit this, the -fno-module-lazy option may be used.
The --param lazy-modules=n parameter controls the limit on the number of concurrently open module files during lazy loading. Should more modules be imported, an LRU algorithm is used to determine which files to close; a closed file is reopened when it is needed again. This limit may be exceeded with deep module dependency hierarchies. With large code bases there may be more imports than the process limit of file descriptors. By default, the limit is a few less than the per-process file descriptor hard limit, if that is determinable.[3]
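The eviction behavior described above resembles a bounded least-recently-used cache of open files. The sketch below illustrates that idea only. It is not GCC's code, and the class name `OpenFileLRU` and its methods are hypothetical.

```python
from collections import OrderedDict


class OpenFileLRU:
    """Keep at most `limit` files open, closing the least recently used one."""

    def __init__(self, limit: int) -> None:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self._open = OrderedDict()  # path -> file object, oldest first

    def get(self, path: str):
        """Return an open handle for path, evicting the oldest one if needed."""
        if path in self._open:
            self._open.move_to_end(path)  # mark as most recently used
            return self._open[path]
        if len(self._open) >= self.limit:
            _, victim = self._open.popitem(last=False)  # least recently used
            victim.close()
        handle = open(path, "rb")  # a previously evicted file is reopened here
        self._open[path] = handle
        return handle

    def open_paths(self):
        """List currently open paths, least recently used first."""
        return list(self._open)

    def close_all(self) -> None:
        for handle in self._open.values():
            handle.close()
        self._open.clear()
```

With a limit of 2, requesting files a, b, a, c in that order closes b, since a was touched more recently when c arrived.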
GCC CMIs use ELF32 as an architecture-neutral encapsulation mechanism. You may use readelf to inspect them, although section contents are largely undecipherable. There is a section named .gnu.c++.README, which contains human-readable text. Other than the first line, each line consists of tag: value tuples.
> readelf -p.gnu.c++.README gcm.cache/foo.gcm
String dump of section '.gnu.c++.README':
[ 0] GNU C++ primary module interface
[ 21] compiler: 11.0.0 20201116 (experimental) [c++-modules revision 20201116-0454]
[ 6f] version: 2020/11/16-04:54
[ 89] module: foo
[ 95] source: c_b.ii
[ a4] dialect: C++20/coroutines
[ be] cwd: /data/users/nathans/modules/obj/x86_64/gcc
[ ee] repository: gcm.cache
[ 104] buildtime: 2020/11/16 15:03:21 UTC
[ 127] localtime: 2020/11/16 07:03:21 PST
[ 14a] export: foo:part1 foo-part1.gcm
Amongst other things, this lists the source that was built, the C++ dialect used, and the imports of the module.[4] The timestamp is the same value as that provided by the __DATE__ and __TIME__ macros, and may be explicitly specified with the environment variable SOURCE_DATE_EPOCH. For further details see Environment Variables Affecting GCC.
A set of related CMIs may be copied, provided the relative pathnames are preserved.
The .gnu.c++.README contents do not affect CMI integrity, and the section may be removed or altered. The numbering of sections that are neither named with a .gnu.c++. prefix nor the string section is significant and must not be altered.
[3] Where applicable the soft limit is incremented as needed towards the hard limit.
[4] The precise contents of this output may change.