This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: stage1 bootstrap failure to build gcc.3.3: cppfiles.c:1168: error: parse error before ']' token etc.
On Thu, Sep 05, 2002 at 07:05:19PM +0100, 'Neil Booth' wrote:
> Although doing a prepass on logical lines to do stages 1 and 2 first
> would make some things nice, it makes some things harder. For example,
> it's hard to avoid warning about trigraphs in comments as it has no
> idea whether it's in a comment. Issues like this make me wonder whether
> a prescan is worth it.
I was thinking of doing stage 3 (comment strip) in there too. It's
pretty easy to enumerate the cases to be dealt with:
case 1: Logical line contains no characters in the set /\?, no
characters outside the basic source set, and is entirely inside the
read() buffer. We scan for the newline, and return the interval to
the phase-4 lexer.
case 2: Logical line contains / not immediately followed by * or /; \
not immediately followed by optional whitespace and a newline; or ? not
immediately followed by the rest of a trigraph. Check for this out of
line, then jump back into the scanner loop.
case 3: Logical line ends with a // comment. Overwrite the leading
slash with a \n terminator, advance the read pointer past the comment,
and return the narrowed interval to phase 4. (This saves copying the
line.)
case 4: Logical line contains a trigraph, backslash-newline, block
comment, or character outside the basic source set. Jump to
dedicated logic for that case. All of these will probably involve
copying the logical line to a second buffer, with some sort of
annotation on the side so we don't lose track of the original source
position.
We actually get a break from the committee, in that neither \[uU]
escapes nor raw multibyte characters can validly represent characters
in the basic source set, which means we don't need to worry about
e.g. \u002F\u002A starting a comment.
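As an illustration of the \[uU] half of that, here is a small check written against my reading of the C99 constraint on universal character names (6.4.3): nothing below 0x00A0 except $, @ and `, and nothing in the surrogate range. The function name is hypothetical, and this covers only C99, not whatever C++ says.

```c
#include <stdbool.h>

/* Hypothetical helper: may a \u or \U escape designate code point cp?
   C99 6.4.3 excludes values below 0x00A0 other than 0x24 ($), 0x40 (@)
   and 0x60 (`), and excludes 0xD800 through 0xDFFF. */
bool ucn_value_allowed(unsigned long cp)
{
    if (cp >= 0xD800UL && cp <= 0xDFFFUL)
        return false;
    if (cp < 0xA0UL)
        return cp == 0x24UL || cp == 0x40UL || cp == 0x60UL;
    return true;
}
```

Since / is 0x2F and * is 0x2A, both are rejected, so the prescan never has to decode a UCN to decide whether a comment starts.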
If I get some free time (ha) I'll look at coding this up...
zw