This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: #include_next and absolute pathnames


Geoff Keating <geoffk at geoffk dot org> writes:

>> This is similar to the way it used to work before I rewrote it in June
>> 2000.  See http://gcc.gnu.org/ml/gcc-patches/2000-06/msg00720.html.
>> Please consider carefully the bugs which were fixed by that rewrite
>> and make sure you are not reintroducing them.  Please also make sure
>> you are not making the code even harder to understand than it is
>> already.
>
> I can avoid the bugs, but can't help about the code-complexity problem.
> If I leave the code as is, the performance is unacceptable in some
> cases; I believe that Apple's local version of this patch gave a 15%
> speedup on one real-world testcase, between the failing open() calls
> and the splay tree lookups, and we've made the rest of GCC faster
> since then, so it'd be even more now.

A 15% speedup is worth a bit more complexity, and complex doesn't
necessarily translate to harder to understand -- if the data
structures make sense and there are comments explaining the tricky
bits, we're good.

> Yup.  I'm not sure where the threshold will be, that'll have to wait
> on timing numbers when I finally get a patch.  I think timing results
> on the local version of the patch found that readdir() didn't hurt
> even "hello-world" programs. 

Interesting.  I'd like to see data taken on multiple platforms; I bet
this is highly OS- and libc-dependent.  (Happy to help with testing.)

> There will be some rare cases where the new code looks to see if a
> file exists even though it doesn't plan to open it, in order to keep
> its data structures efficient.  I'm planning to use open() for this,
> though.

The situation to avoid is stat() immediately followed by open() on the
same file, which can be up to 2x slower than open() followed by fstat()
due to the double namei operation.
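
To make the ordering concrete, here is a minimal sketch of the
open()-then-fstat() pattern.  The helper name open_and_stat is
hypothetical and is not taken from cpplib; it only illustrates that
the pathname is resolved once, by open(), and that fstat() then works
on the descriptor.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not cpplib code: open PATH read-only and fill
   *ST from the resulting descriptor.  Only open() walks the pathname;
   fstat() operates on the descriptor, so there is no second lookup.
   Returns the descriptor, or -1 if the file cannot be opened or
   examined.  */
static int
open_and_stat (const char *path, struct stat *st)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  if (fstat (fd, st) < 0)
    {
      close (fd);
      return -1;
    }
  return fd;
}
```

A caller that only wants to know whether the file exists can close
the descriptor straight away; one that goes on to read the file
already has it open and knows its size from st->st_size.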

>> A somewhat pie-in-the-sky idea I've had for a while is, hold open all
>> the directories on the search path, plus the normal working directory,
>> and fchdir() into each directory to access the files there -- this
>> avoids having to concatenate path names in cpplib and should be less
>> work for the kernel to boot.  (If you can persuade your kernel
>> developers to provide openrelative(dirfd, path, flags, mode) then that
>> would be even niftier.)  But it would have to be suitably
>> conditionalized for the sake of OSes without fchdir, and could run
>> into problems with max-simultaneous-open-file limits.
>
> The problem with that would be that instead of one syscall, you'd end
> up with two.  The new code shouldn't rely so much on the kernel's
> path-parsing code, so should avoid the need for this.

Right; but I suspect the additional system call will wind up being
cheaper than the string bashing and additional directory lookups that
get avoided.  Needs benchmarking though.

zw

