[PATCH v5 4/5] c++modules: report imported CMI files as dependencies

Ben Boeckel ben.boeckel@kitware.com
Wed Jul 19 00:01:44 GMT 2023


On Tue, Jul 18, 2023 at 16:52:44 -0400, Jason Merrill wrote:
> On 6/25/23 12:36, Ben Boeckel wrote:
> > On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote:
> >> On 6/22/23 22:45, Ben Boeckel wrote:
> >>> On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote:
> >>>> On 1/25/23 16:06, Ben Boeckel wrote:
> >>>>> They affect the build, so report them via `-MF` mechanisms.
> >>>>
> >>>> Why isn't this covered by the existing code in preprocessed_module?
> >>>
> >>> It appears as though it is neutered in patch 3 where
> >>> `write_make_modules_deps` is used in `make_write` (or will use that name
> >>
> >> Why do you want to record the transitive modules? I would expect just noting the
> >> ones with imports directly in the TU would suffice (i.e check the 'outermost' arg)
> > 
> > FWIW, only GCC has "fat" modules. MSVC and Clang both require the
> > transitive closure to be passed. The idea there is to minimize the size
> > of individual module files.
> > 
> > If GCC only reads the "fat" modules, then only those should be recorded.
> > If it reads other modules, they should be recorded as well.

For clarification, given:

* a.cppm
```
export module a;
```

* b.cppm
```
export module b;
import a;
```

* use.cppm
```
import b;
```

in a "fat" module setup, `use.cppm` only needs to be told about
`b.cmi` because it contains everything that an importer needs to know
about the `a` module (reachable types, re-exported bits, whatever). With
the "thin" modules, `a.cmi` must be specified when compiling `use.cppm`
to satisfy anything that may be required transitively (e.g., a return
type of some `b` symbol is from `a`). MSVC and Clang (17+) use this
model. I don't know MSVC's rationale, but Clang's is to make CMIs
relocatable by not embedding paths to dependent modules in CMIs. This
should help caching and network transfer sizes in distributed builds.
LLVM/Clang discussion:

    https://discourse.llvm.org/t/c-20-modules-should-the-bmis-contain-paths-to-their-dependent-bmis/70422
    https://github.com/llvm/llvm-project/issues/62707

Maybe I'm missing how this *actually* works in GCC as I've really only
interacted with it through the command line, but I've not needed to
mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and
read through some embedded information in `b.cmi` or does `b.cmi`
include enough information to not need to read it at all? If the former,
distributed builds are going to have a problem knowing what files to
send just from the command line (I'll call this "implicit thin"). If the
latter, that is the "fat" CMI that I'm thinking of.

> But wouldn't the transitive modules be dependencies of the direct 
> imports, so (re)building the direct imports would first require building 
> the transitive modules anyway?  Expressing the transitive closure of 
> dependencies for each importer seems redundant when it can be easily 
> derived from the direct dependencies of each module.

I'm not concerned whether it is transitive or not, really. If a file is
read, it should be reported here regardless of the reason. Note that
caching mechanisms may skip actually *doing* the reading, but the
dependencies should still be reported from the cached results as-if the
real work had been performed.

--Ben


More information about the Gcc mailing list