This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: ideas for cpplib


> I can't find this ChangeLog.  gcc 2.7.2 stops at Nov 95 and CVS only has
> back to 1997 or so.

Our cvs logs go back further!  This may be rather intimidating ...

Sun Dec 17 06:57:02 1995  Paul Eggert  <eggert@twinsun.com>

        * cccp.c: Try harder not to open or stat the same include file twice.
        Simplify include file names so that they are more likely to match.
        E.g. simplify "./a//b" to "a/b".  Represent directories with simplified
        prefixes, e.g. replace "./a//b" with "a/b/", and "." with "".
        (absolute_filename): New function.
        (do_include): Use it.
        (read_name_map): Likewise; this makes things more consistent for DOS.
        (main, do_include, open_include_file): -M output now contains
        operands of -imacros and -include.
        (skip_to_end_of_comment): When copying a // comment, don't try to
        change it to a /* comment.
        (rescan, skip_if_group, skip_to_end_of_comment, macarg1): Tune.
        (rescan, skip_if_group, skip_to_end_of_comment, macarg1):
        If warn_comments is nonzero, warn if backslash-newline appears
        in a // comment.  Simplify method for finding /* /* */ comment.
        (skip_if_group): Optionally warn if /* /* */ appears between # and
        a directive inside a skipped if group.
        (macarg): Optionally warn if /* /* */ appears in a macro argument.
        (strncat, VMS_strncat, vms_ino_t, ino_t): Remove.
        (INCLUDE_LEN_FUDGE): Add 2 if VMS, for trailing ".h".
        (INO_T_EQ, INO_T_HASH): New macros.
        (struct file_buf): New member `inc'.
        (expand_to_temp_buffer): Initialize it.
        (struct file_name_list): New member `inc'.
        (struct file_name_list): New member `st'.
        c_system_include_path is now 1 if not 0.
        fname is now an array, not a pointer.
        (struct include_file): New members `next_ino', `deps_output', `st'.
        Remove members `inode' and `dev'; they are now in `st'.
        (INCLUDE_HASHSIZE): Rename from INCLUDE_HASH_SIZE.
        (include_hashtab): Rename from include_hash_table.
        (include_ino_hashtab): New variable.
        (main): Store file status in struct stat, not in long and int pieces.
        Use base_name to strip prefixes from file names.
        When printing directory prefixes, omit trailing / and print "" as ".".
        Fatal error if the input file is a directory.
        (main, path_include): Regularize operands of -include, -imacros,
        -isystem, -iwithprefix, and -iwithprefixbefore.
        Regularize default include directories.
        (do_include):
        Allocate dsp with alloca, since fname is now dynamically allocated.
        Use -3 to represent a never-opened file descriptor.
        Make copy of file name, and simplify the copy.
        Use base_name to identify the end of fname's directory.
        Do not prepend dir for "..." if it matches the search list's first dir.
        open_include_file now subsumes redundant_include_p and lookup_import.
        Use bypass_slot to remember when to skip directories when including
        a file that has already been seen.
        Instead of using 0 to represent the working directory, and ""
        to represent a directory to be ignored, use "" for the former,
        and assume the latter has been removed before we get here.
        Assume the directory prefixes have already been simplified.
        Report as errors all open failures other than ENOENT.
        Fatal error if fstat fails.
        Use new deps_output member to avoid printing dependencies twice.
        (bypass_hashtab): New variable.
        (do_include, open_control_file, record_control_macro): New convention:
        control_macro is "" if the file was imported or had #pragma once.
        (pragma_once_marker): Remove.
        (redundant_include_p, include_hash, lookup_include, lookup_import,
        add_import, file_size_and_mode): Remove; subsumed by open_include_file.
        (skip_redundant_dir_prefix): Remove; subsumed by simplify_filename.
        (is_system_include, read_name_map, remap_include_file):
        Assume arg is a directory prefix.
        (base_name, simplify_filename, remap_include_file,
        lookup_ino_include, new_include_prefix): New functions.
        (open_include_file): New arguments `importing' and `pinc'.
        Move filename mapping into new remap_include_file function.
        First try to find file by name in include_hashtab;
        if that doesn't work, open and fstat it and try to find it
        by inode and dev in include_ino_hashtab.
        (finclude): Get file status from inc->st instead of invoking fstat.
        Store inc into fp->inc so that record_control_macro doesn't
        need to do a table lookup.
        (finclude, record_control_macro): Accept struct include_file *
        instead of char * to identify include file.  All callers changed.
        (check_precompiled): Get file status from new argument `st'.
        (do_pragma): Output at most one warning about #pragma implementation.
        Always return 0 instead of returning garbage sometimes.
        (do_pragma, hack_vms_include_specification):
        Use base_name for consistency, and remove redundant code.

        From Per Bothner:
        Unify the 3 separate mechanisms for avoiding processing
        of redundant include files: #import, #pragma once, and
        redundant_include_p to use a single more efficient data structure.
        (struct file_name_list):  Remove no-longer needed field control_macro.
        (dont_repeat_files, all_include_files):  Remove, no longer used.
        (struct import_file):  Renamed to struct include_file, moved earlier
        in file, renamed field name to fname, and added control_macro field.
        (pragma_once_marker):  New constant.
        (import_hash_table):  Renamed to include_hash_table.
        (import_hash):  Renamed to include_hash.
        (IMPORT_HASH_SIZE):  Renamed to INCLUDE_HASH_SIZE.
        (main, path_include):  Don't clear removed control_macro field.
        (lookup_include):  New function - look up fname in include_hash_table.
        (redundant_include_p):  Re-write to use lookup_include.
        (lookup_import, record_control_macro):  Likewise.
        (add_import):  Defer fstat to caller.  Combine two xmallocs into one.
        (do_once):  Use pragma_once_marker in include_hash_table.
        (do_pragma):  Re-implement to scan include_hash_table.
        (do_include):  Use new lookup_include and add_import.
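
To illustrate the name simplification the entry above describes
(simplifying "./a//b" to "a/b"), here is a minimal sketch.  It is not
the cccp.c simplify_filename; the function name is hypothetical, it
ignores ".." components, and it does not produce the trailing-slash
directory-prefix form the ChangeLog mentions.

```c
#include <string.h>

/* Hypothetical helper, not the real cccp.c code: collapse runs of '/'
   and drop "." components, rewriting PATH in place.  ".." is left
   alone, since removing it safely depends on symlinks.  */
static void
simplify_path_sketch (char *path)
{
  char *src = path, *dst = path;

  /* Keep a single leading slash of an absolute name.  */
  if (*src == '/')
    {
      *dst++ = '/';
      while (*src == '/')
        src++;
    }

  while (*src)
    {
      /* Find the end of the current component.  */
      char *end = src;
      while (*end && *end != '/')
        end++;

      /* Copy the component unless it is exactly ".".  */
      if (!(end - src == 1 && src[0] == '.'))
        {
          if (dst != path && dst[-1] != '/')
            *dst++ = '/';
          memmove (dst, src, end - src);
          dst += end - src;
        }

      /* Skip the separator(s) that follow.  */
      src = end;
      while (*src == '/')
        src++;
    }
  *dst = '\0';
}
```

Two spellings of the same header then compare equal as strings, which
is what makes a lookup by name more likely to hit.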

> Current cccp does almost as badly as cpplib on glibc.  Short of saving
> stat() information across compiles, I don't think anything will help much;
> the problem is having to examine something like 30 directories on every
> #include, most of which don't have the header it's looking for.

Maybe;  when I looked into it, I convinced myself (and Paul) that
cleaning this up could make a noticeable difference.  (That cccp
does almost as badly as cpplib could mean that more cruft has crept
into cccp.  In any case, it implies cccp *is* faster, and better
include file management is, I believe, one reason.)
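
As a rough sketch of the kind of include file management meant here,
the following records each file by its device and inode numbers so
that a second spelling of the same file is recognized.  All names are
hypothetical, the table size is arbitrary, and it calls stat on the
name rather than fstat on an open descriptor as cccp does.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

#define SEEN_HASHSIZE 64        /* arbitrary size for this sketch */

struct seen_file
{
  struct seen_file *next;
  dev_t dev;
  ino_t ino;
};

static struct seen_file *seen_tab[SEEN_HASHSIZE];

/* Return 1 if NAME is a file recorded earlier (same device and inode),
   0 if it is new (it is then recorded), -1 if stat or malloc fails.  */
static int
note_include_sketch (const char *name)
{
  struct stat st;
  struct seen_file *p;
  unsigned long h;

  if (stat (name, &st) != 0)
    return -1;

  h = ((unsigned long) st.st_ino ^ (unsigned long) st.st_dev)
      % SEEN_HASHSIZE;
  for (p = seen_tab[h]; p; p = p->next)
    if (p->ino == st.st_ino && p->dev == st.st_dev)
      return 1;

  p = malloc (sizeof *p);
  if (!p)
    return -1;
  p->dev = st.st_dev;
  p->ino = st.st_ino;
  p->next = seen_tab[h];
  seen_tab[h] = p;
  return 0;
}
```

A real preprocessor would try a lookup by name first and fall back to
this only on a miss, so the common case needs no system call at all.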

> Hm.  Trouble with that is that backslash-newline can appear anywhere, even
> in the middle of a token.  I thought about replacing it with an escape
> that meant 'bump line number here', postponed till whitespace.  That would
> be the only way to get lines and cols right in the middle of a long series
> of escaped newlines.

Not necessarily.  The key to getting the line and col numbers right
is to *not mess with the input buffer*, and get line and col numbers
directly from there.  In addition, there is a stack of other input buffers
(such as those from macro expansions).  For each input buffer
(cpp_buffer) corresponding to a file, you remember the current
input pointer, plus the previous line beginning.  Calculating
the column number is then just a subtraction.  At least, I think
that is the basic idea (it's been a while since I looked at this).
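
A minimal sketch of that idea, assuming a pared-down buffer structure
(the struct and function names below are invented for illustration and
are not the cpplib cpp_buffer): the text is never modified, the reader
remembers where the current physical line began, and the column is a
pointer subtraction.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-file input buffer.  */
struct buf_sketch
{
  const char *cur;        /* current input pointer */
  const char *line_base;  /* start of the current physical line */
  long lineno;            /* 1-based number of that line */
};

static void
buf_init_sketch (struct buf_sketch *b, const char *text)
{
  b->cur = text;
  b->line_base = text;
  b->lineno = 1;
}

/* Consume one character; the underlying text is left untouched.
   Returns -1 at the end of the text.  */
static int
buf_getc_sketch (struct buf_sketch *b)
{
  int c = (unsigned char) *b->cur;

  if (c == '\0')
    return -1;
  b->cur++;
  if (c == '\n')
    {
      b->lineno++;
      b->line_base = b->cur;
    }
  return c;
}

/* 1-based column of the next unread character.  */
static long
buf_column_sketch (const struct buf_sketch *b)
{
  return (long) (b->cur - b->line_base) + 1;
}
```

Because a backslash-newline pair stays in the buffer, the newline still
bumps the line number and resets the line start, so positions inside a
run of escaped newlines come out right without any special escape.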

> The commonest use of escaped newlines is big hairy macros.  cc1 doesn't even
> try to look into big hairy macros for error messages; you get the error on
> the line where the macro was called.  If cpplib could fix that, that would
> be nice.

Probably not when using cppmain, but I think we should try to do
better when using cpplib linked into cc1.

> On a 32Mb machine (pretty common these days) running X and 2.0 kernel, 
> things start hitting swap at around 2 megs of working set.

Which should tell you that you need at least 48MB to be productive.

> Ulrich says he wants to put mmap of read-only files into stdio, and I'd
> rather not worry about whether we have mmap support inside cpplib.

Most of the hosts we support are not using Linux, let alone glibc.

	--Per Bothner
Cygnus Solutions     bothner@cygnus.com     http://www.cygnus.com/~bothner

