This is the mail archive of the mailing list for the GCC project.

Re: gcc compile-time performance

> Are you referring to use *any* tool as opposed to hand-written
> parsers, or specifically Yaccs?

There are certainly better tools around; in fact it sort of amazes me that
these ancient tools are still in use. For example, see Gerry Fisher's work
at NYU in the mid-1980s.

> As I said, Bison will have GLR, which is superior to backtracking.

Well, the point is that in a hand-written parser you can do intelligent
backtracking, which can never be done by any automatic tool. That's my
opinion, and the GNAT parser was really written as a demonstration of
the sort of thing that can be done in a hand-written parser (for example,
the error messages are sensitive to the layout, using the precise layout
on the page to guess the intention of incorrect code).

> src/tiger/src % echo "1 ++ 1" | ./tc -l -
> standard input:1.4: parse error, unexpected "+"
> src/tiger/src % echo "for i = 1 to 10 do i" | ./tc -l -
> standard input:1.7: parse error, unexpected "=", expecting ":="
> src/tiger/src % echo 'let in 1' | ./tc -l -
> standard input:2.1-0: parse error, unexpected "end of file", expecting ";"

This kind of error detection and correction is trivial. The hard thing is
to do structural repair, e.g. when {} brackets do not match up, or when
extra or missing semicolons derail the structure. The bottom-up parser
used for Ada Ed had quite good structural repair (certainly far beyond
anything in Bison), but I still think it is clear from GNAT that you can
do better with a hand-written parser.
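
To make the layout point concrete, here is a minimal sketch in Python of
one way indentation can guide bracket repair. It is an illustration only,
not the algorithm used in GNAT or Ada Ed; the function name
guess_unclosed_braces is hypothetical, and the sketch assumes consistently
indented source and ignores braces inside strings and comments.

```python
def guess_unclosed_braces(source: str) -> list[int]:
    """Return 1-based line numbers of '{' that the layout suggests were
    never closed (hypothetical helper, for illustration only)."""
    stack = []     # (line number, indentation of the line holding the '{')
    suspects = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip()
        if not stripped:
            continue
        indent = len(line) - len(stripped)
        for ch in stripped:
            if ch == "{":
                stack.append((lineno, indent))
            elif ch == "}":
                # A closer on a line at this indentation most plausibly
                # belongs to an opener at the same or shallower depth.
                # Openers on more deeply indented lines are skipped over
                # and flagged as probably missing their own '}'.
                while stack and stack[-1][1] > indent:
                    suspects.append(stack.pop()[0])
                if stack:
                    stack.pop()
    # Anything still open at end of input is also unclosed.
    suspects.extend(lineno for lineno, _ in stack)
    return sorted(suspects)


example = """\
void f() {
    if (a) {
        g();
    while (b) {
        h();
    }
}
"""
print(guess_unclosed_braces(example))  # [2]
```

A plain stack matcher would report the brace on line 1 as unclosed at end
of file; using the layout points at line 2, which is where the programmer
most likely forgot the closing brace.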

Another thing that seems problematic, at least in GNU C (with which I am
more familiar than g++ as far as error message handling goes), is that the
error messages
seem to know nothing about preprocessing. They seem to work on the preprocessed
source with no knowledge of the original source structure. This means that
simple things like an extra semicolon or missing semicolon in a macro
definition can cause highly confusing messages. It seems essential to me
that error message processing understand the original structure of the
source (including its layout).
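
As a rough sketch of what "understanding the original structure" could
mean, the toy expander below (Python, purely illustrative, and not how
GCC's preprocessor is implemented) records for every token whether it came
from a macro and where that macro was defined, so a diagnostic can point
back at the definition. The names expand and describe are hypothetical,
and only object-like macros without nested expansion are handled.

```python
import re

# Crude tokenizer: identifiers, integers, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
DEFINE = re.compile(r"\s*#\s*define\s+(\w+)\s*(.*)")


def expand(lines):
    """Expand object-like macros. Each result is (text, use_line, origin),
    where origin is None or (macro_name, definition_line)."""
    macros = {}
    out = []
    for lineno, line in enumerate(lines, start=1):
        m = DEFINE.match(line)
        if m:
            macros[m.group(1)] = (TOKEN.findall(m.group(2)), lineno)
            continue
        for tok in TOKEN.findall(line):
            if tok in macros:
                body, def_line = macros[tok]
                # Keep the provenance of every token the macro produced.
                out.extend((t, lineno, (tok, def_line)) for t in body)
            else:
                out.append((tok, lineno, None))
    return out


def describe(token):
    """Format a diagnostic that mentions the macro definition, if any."""
    text, use_line, origin = token
    if origin is None:
        return f"line {use_line}: unexpected '{text}'"
    name, def_line = origin
    return (f"line {use_line}: unexpected '{text}' from expansion of "
            f"macro '{name}' defined at line {def_line}")


source = [
    "#define LIMIT 10;",
    "if (n > LIMIT) n = 0;",
]
stray = next(t for t in expand(source) if t[0] == ";" and t[2] is not None)
print(describe(stray))
```

With the provenance kept, the stray semicolon is reported against the
definition of LIMIT on line 1 rather than only against the expanded text
on line 2.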
