This is the mail archive of the
mailing list for the GCC project.
Re: gcc compile-time performance
- From: dewar at gnat dot com (Robert Dewar)
- To: akim at epita dot fr, gcc at gcc dot gnu dot org
- Date: Mon, 20 May 2002 08:29:58 -0400 (EDT)
- Subject: Re: gcc compile-time performance
> The sentence is correct, but I fail to make practical sense of it: we
> use deterministic parsing methods, and we certainly don't parse the
> whole CFG set. Bison will soon be able to address CFG via GLR, but
> for the time being, you have to do with LALR(1).
Requiring LALR(1) is indeed an annoying restriction (one we avoid in GNAT,
partly because it causes some distortion of the grammar, and we prefer to
use exactly the grammar in the RM, even though it is ambiguous).
> Additionally, you can't say that C++, or even C, are CFG, for
> foo (bar);
Well of course none of these language are context free, that goes without
saying, but in terms of syntax the above is not an issue, it can be parsed
simply as function_call_or_declaration and the semantic analyzer can
sort things out. That's the normal approach in compilers, and is what for
example is used in GNAT where the above construct can be any of a
type conversion, function call, or indexed component.
That particular example is not even an instance of supersetting, merely of
ambiguity in the tree to be resolved by semantic analysis.
Occasionally it is worth putting minor semantic kludges into parsing (e.g.
in Fortran to deal with spaces, or in C to deal with (a)-(b)), but I still
think you are better off if you keep these to a minimum.
> can be a function call or a variable declaration, depending upon the
> context. There are definitely not CFG. But I can see your point with
> `superset' here, although `superset' is certainly not a very useful
> concept, given that `.*' is also a superset of C++, and better yet: of
> all the languages GCC supports and will support (yeah! a single
> simple parser for all of them :).
.* is obviously not useful. The usual (text book and real life) idea here is
to choose a CFG that is as useful as possible for semantic analysis. Note
that it is often a good thing for it to be a superset. For example, in GNAT
we allow an others clause to appear anywhere in a case statement even though
the syntax requires it to come last, and we could enforce this in the parser.
It is smoother to do this check in the semantic analyzer, to avoid the parser
going into unnecessary error recovery mode here.
To me by the way, deciding to use a tool like BISON is tying your hands
behind your back before you start :-) In particular, there is nothing
terrible about a little bit of backup here and there, just so long as
you don't get anything that's exponential or takes noticeable time.
I also find the syntax error messages from these BISON parsers terrible.