#import and PCH

Fergus Henderson fjh@cs.mu.OZ.AU
Mon Jan 27 03:23:00 GMT 2003


On 24-Jan-2003, Stan Shebs <shebs@apple.com> wrote:
> Geoffrey Keating wrote:
> >#import worked by setting a flag bit on cpplib's hash table.  This is
> >somewhat annoying from a PCH perspective, and is kind of ugly too.
> >This patch changes the behaviour so that #import uses macro
> >definitions to track whether a header is included or not, and then
> >PCH works with it automatically.
> >
> >Bootstrapped & tested on powerpc-darwin.
> >
> >I'll wait a bit before committing it so that people can comment.
>
> Does this work right with symlinks?

What do you mean by "work right"?  Do you mean "work the way that Apple's
compiler works"? ;-)

As far as I can see, the current language specification is silent on
the issue of whether #import should treat a symlink as "the same" as
the file that the symlink refers to.  So any code which relies on this
is assuming more than the language specification guarantees.

Symlinks are just one particular system-specific way of making two file
names refer to the same storage.  But in general, with network file
systems or user-space file systems, there is no way that the compiler
can be sure whether two file names refer to the same storage.
The answer could even depend on the phase of the moon!

IMHO:
	#import is a feature of Objective C. 
	It is documented in the language reference manual.  
	It is a useful feature in widespread use.
	Although the current specification is ambiguous,
	it is possible to give it a precise, useful, and
	implementable specification.
	Therefore, GNU Objective C should support it.

	Distinguishing which files are "the same" is ambiguous.
	Therefore, the language specification should be changed to
	clarify exactly what is required here (or to explicitly
	state the range of permitted behaviours).

	There is a spectrum of possible meanings for "the same":
		- the same include file name
			(this runs into problems with e.g. #include "../foo.h"
			meaning different things when it occurs in
			files located in different directories)
		- the same fully-qualified canonicalized path name
		- the same fully-qualified path name, with symlinks resolved
		  where possible
		  	(in some cases the symlinks may not be visible,
			e.g. consider files accessed via http)
		- the same storage (e.g. device and inode number)
			(again, in some cases it may be impossible to
			determine this)
		- the same file contents
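
To make the spectrum above concrete, here is a rough sketch (in Python rather than C, purely for brevity) that computes one comparison key per interpretation and reports which pairs of names come out as "the same". It assumes a POSIX-like system where symlinks and hard links can be created in the temporary directory; all function and file names are made up for illustration and have nothing to do with cpplib's actual implementation.

```python
import hashlib
import os
import shutil
import tempfile


def key_include_name(name):
    """The file name exactly as written."""
    return name


def key_canonical_path(path):
    """Absolute path with "." and ".." folded textually; symlinks untouched."""
    return os.path.abspath(path)


def key_resolved_path(path):
    """Absolute path with every symlink the OS can see resolved."""
    return os.path.realpath(path)


def key_storage(path):
    """(device, inode) pair as reported by stat(), which follows symlinks."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def key_contents(path):
    """Digest of the raw bytes; ignores the encoding question entirely."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


KEYS = {
    "name": key_include_name,
    "canonical": key_canonical_path,
    "resolved": key_resolved_path,
    "storage": key_storage,
    "contents": key_contents,
}


def compare(a, b):
    """For each interpretation, say whether a and b count as the same file."""
    return {label: fn(a) == fn(b) for label, fn in KEYS.items()}


def demo():
    """Compare foo.h against four other ways of naming (or copying) it."""
    with tempfile.TemporaryDirectory() as d:
        foo = os.path.join(d, "foo.h")
        with open(foo, "w") as f:
            f.write("#define FOO 1\n")
        dotted = os.path.join(d, ".", "foo.h")  # same file, spelled "./foo.h"
        sym = os.path.join(d, "sym.h")
        os.symlink(foo, sym)                    # symbolic link
        hard = os.path.join(d, "hard.h")
        os.link(foo, hard)                      # hard link
        copy = os.path.join(d, "copy.h")
        shutil.copyfile(foo, copy)              # separate storage, same bytes
        others = {"dotted": dotted, "symlink": sym,
                  "hardlink": hard, "copy": copy}
        return {label: compare(foo, other) for label, other in others.items()}


if __name__ == "__main__":
    for label, result in demo().items():
        print(label, result)
```

On an ordinary local file system, each step down the list should accept strictly more pairs: "./foo.h" is caught from the canonical path onwards, the symlink only from the resolved path onwards, the hard link only from the storage comparison onwards, and the copy only by the contents comparison.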

	IMHO the most natural interpretation of "the same file" would be
	files that refer to the same storage.  But since in general it is
	impossible for a compiler to tell when two file names refer to the
	same storage, the language specification should not require this.

	The interpretation that would lead to the best portability would
	be to require implementations to treat files as "the same" if they
	have the same contents.  (Even with this, there is the issue of
	different file encodings.  However, I think it would be acceptable
	to leave it unspecified as to whether files which have the same
	contents but different encodings are treated as the same.  I don't
	think that would cause any real portability problems in practice.)
	But this would have an efficiency impact, and would also require
	changes to Apple's compiler as well as to GCC.

	Standardizing on some particular interpretation based on
	the idea of files being treated as the same if they have the same
	(canonicalized) file/path names seems a bit problematic, since
	the specification will either canonicalize too little (and hence
	cause users problems with "foo.h" vs "./foo.h", symbolic links,
	hard links, etc.) or will canonicalize too much (and hence be
	difficult or impossible to implement on some systems).
	So if this route is taken, the specification would have to be
	quite loose, e.g. "implementations are required to treat files
	with the same (fully-qualified) path name as ``the same'', and
	are permitted but not required to treat files that refer to the
	same storage or have the same contents as ``the same''."
	Such a loose specification could possibly lead to portability
	problems, because users might accidentally write code which
	works with one compiler but not with another.
	Nevertheless, agreeing on a loose specification seems
	like the most practical approach, since it corresponds most
	directly with the current ambiguous specification, and anyway I
	suspect Apple are unlikely to be willing to change their compiler
	to meet a different specification.
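
To illustrate the portability problem with such a loose specification, here is a small sketch (again Python for brevity, assuming a POSIX-like system with symlinks). The `ImportTracker` class and its `match_storage` switch are hypothetical; they model two implementations that would both conform to the loose wording quoted above, one doing only the required path comparison and one also using the permitted storage comparison.

```python
import os
import tempfile


class ImportTracker:
    """Decides whether an imported file still needs to be read.

    Required behaviour: files with the same fully-qualified path name
    are the same.  Permitted behaviour (match_storage=True): files with
    the same device and inode number are also the same.
    """

    def __init__(self, match_storage=False):
        self.match_storage = match_storage
        self.seen_paths = set()
        self.seen_storage = set()

    def should_include(self, path):
        full = os.path.abspath(path)
        if full in self.seen_paths:
            return False                 # required: same full path name
        self.seen_paths.add(full)
        if self.match_storage:
            try:
                st = os.stat(full)
            except OSError:
                return True              # storage unknown; fall back to paths
            storage = (st.st_dev, st.st_ino)
            if storage in self.seen_storage:
                return False             # permitted: same storage
            self.seen_storage.add(storage)
        return True


def count_inclusions(paths, match_storage):
    """How many of the given imports would actually be read."""
    tracker = ImportTracker(match_storage)
    return sum(1 for p in paths if tracker.should_include(p))


def demo():
    """Import foo.h, a symlink to it, and foo.h again, under both modes."""
    with tempfile.TemporaryDirectory() as d:
        real = os.path.join(d, "foo.h")
        with open(real, "w") as f:
            f.write("#define FOO 1\n")
        alias = os.path.join(d, "alias.h")
        os.symlink(real, alias)
        paths = [real, alias, real]
        return (count_inclusions(paths, match_storage=False),
                count_inclusions(paths, match_storage=True))


if __name__ == "__main__":
    print(demo())
```

With these three imports the path-only tracker reads the header twice and the storage-aware tracker reads it once, so a header without its own include guards would compile with one conforming implementation and fail with redefinition errors on the other.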

> Unfortunately this is not a hypothetical issue, a build of Mac OS X
> will likely go down in flames if you get this wrong, since there
> are cases of multiple #imports through different paths but all
> symlinking to a single file.

I'm sure Mac OS X could be fixed to work with a compiler that did not
treat symlinked files as "the same".  I don't think that should be
seen as a blocking issue.  I think that in practice, few users would
be seriously inconvenienced if GCC did not treat symlinks as referring
to "the same" file for the purposes of #import.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.


