[PATCH v5 4/5] c++modules: report imported CMI files as dependencies

Nathan Sidwell nathan@acm.org
Wed Jul 19 21:11:08 GMT 2023


On 7/18/23 20:01, Ben Boeckel wrote:
> On Tue, Jul 18, 2023 at 16:52:44 -0400, Jason Merrill wrote:
>> On 6/25/23 12:36, Ben Boeckel wrote:
>>> On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote:
>>>> On 6/22/23 22:45, Ben Boeckel wrote:
>>>>> On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote:
>>>>>> On 1/25/23 16:06, Ben Boeckel wrote:
>>>>>>> They affect the build, so report them via `-MF` mechanisms.
>>>>>>
>>>>>> Why isn't this covered by the existing code in preprocessed_module?
>>>>>
>>>>> It appears as though it is neutered in patch 3 where
>>>>> `write_make_modules_deps` is used in `make_write` (or will use that name
>>>>
>>>> Why do you want to record the transitive modules? I would expect just noting the
>>>> ones with imports directly in the TU would suffice (i.e check the 'outermost' arg)
>>>
>>> FWIW, only GCC has "fat" modules. MSVC and Clang both require the
>>> transitive closure to be passed. The idea there is to minimize the size
>>> of individual module files.
>>>
>>> If GCC only reads the "fat" modules, then only those should be recorded.
>>> If it reads other modules, they should be recorded as well.
> 
> For clarification, given:
> 
> * a.cppm
> ```
> export module a;
> ```
> 
> * b.cppm
> ```
> export module b;
> import a;
> ```
> 
> * use.cppm
> ```
> import b;
> ```
> 
> in a "fat" module setup, `use.cppm` only needs to be told about
> `b.cmi` because it contains everything that an importer needs to know
> about the `a` module (reachable types, re-exported bits, whatever). With
> the "thin" modules, `a.cmi` must be specified when compiling `use.cppm`
> to satisfy anything that may be required transitively (e.g., a return

GCC matches neither of these descriptions.  A CMI does not contain the transitive
closure of its imports.  It contains an import table.  That table lists the
transitive closure of its imports (it needs that closure to do remapping), and
that table contains the CMI pathnames of the direct imports.  Those pathnames
are absolute if the mapper provided an absolute path, or otherwise relative to
the CMI repo.

The rationale here is that if you're building a CMI, Foo, which imports a bunch 
of modules, those imported CMIs will have the same (relative) location in this 
compilation and in compilations importing Foo (why would you move them?).  Note
this does NOT inhibit relocatable builds, because of the CMI repo.
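
The pathname handling described above can be sketched roughly as follows.  This
is only an illustrative model, not GCC's actual data structures: `ImportEntry`,
`resolve_cmi` and `direct_import_files` are hypothetical names, and the layout
(every transitive import listed, pathnames stored only for direct imports) is
an assumption based on the description in this message.

```cpp
#include <filesystem>
#include <optional>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical model of one import-table entry; not GCC's real layout.
struct ImportEntry {
    std::string module_name;
    bool direct;                       // imported directly by this CMI?
    std::optional<fs::path> cmi_path;  // assumed present only for direct imports
};

// Resolve a recorded CMI pathname: absolute paths are used as-is,
// relative ones are interpreted against the CMI repo directory.
fs::path resolve_cmi(const fs::path& repo, const fs::path& recorded) {
    return recorded.is_absolute() ? recorded : repo / recorded;
}

// Collect the files a reader of this table would locate for direct imports.
std::vector<fs::path> direct_import_files(const fs::path& repo,
                                          const std::vector<ImportEntry>& table) {
    std::vector<fs::path> files;
    for (const auto& e : table)
        if (e.direct && e.cmi_path)
            files.push_back(resolve_cmi(repo, *e.cmi_path));
    return files;
}
```

Because relative pathnames are resolved against the repo in this sketch, moving
the whole repo directory changes only the `repo` argument, which is the sense
in which relocatable builds are unaffected.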


> Maybe I'm missing how this *actually* works in GCC as I've really only
> interacted with it through the command line, but I've not needed to
> mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and
> read through some embedded information in `b.cmi` or does `b.cmi`
> include enough information to not need to read it at all? If the former,
> distributed builds are going to have a problem knowing what files to
> send just from the command line (I'll call this "implicit thin"). If the
> latter, that is the "fat" CMI that I'm thinking of.

Please don't use pejorative terms like 'fat' and 'thin'.

> 
>> But wouldn't the transitive modules be dependencies of the direct
>> imports, so (re)building the direct imports would first require building
>> the transitive modules anyway?  Expressing the transitive closure of
>> dependencies for each importer seems redundant when it can be easily
>> derived from the direct dependencies of each module.
> 
> I'm not concerned whether it is transitive or not, really. If a file is
> read, it should be reported here regardless of the reason. Note that
> caching mechanisms may skip actually *doing* the reading, but the
> dependencies should still be reported from the cached results as-if the
> real work had been performed.
> 
> --Ben

-- 
Nathan Sidwell


