This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
C++ lexer (GCC 3.1.1) requires knowledge of other C dialects
- From: "Gary Funck" <gary at Intrepid dot Com>
- To: "Gcc Mailing List" <gcc at gcc dot gnu dot org>
- Date: Wed, 31 Jul 2002 00:17:23 -0700
- Subject: C++ lexer (GCC 3.1.1) requires knowledge of other C dialects
While moving the GCC changes that support UPC, an experimental dialect of C,
I ran into a problem: after adding the new language support in a fashion
similar to ObjC, the C++ compiler could no longer lex and parse C++ programs
correctly. The cause turned out to be the method used in cp/lex.c to
recognize reserved words:

    /* Table mapping from RID_* constants to yacc token numbers.
       Unfortunately we have to have entries for all the keywords in all
       three languages.  */
    const short rid_to_yy[RID_MAX] =
    {
      /* RID_STATIC */    SCSPEC,
      /* RID_UNSIGNED */  TYPESPEC,
      /* RID_LONG */      TYPESPEC,

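The rid_to_yy table requires an entry for each reserved word in *all* supported
C dialects, and the table is ordered by increasing RID_* values. Any RID_* value
inserted for a new dialect therefore needs a matching entry at the same
position; otherwise every later entry maps to the wrong token. A few
observations: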
1) This dependency on other languages makes it more difficult to add a new
dialect and violates modularity.
2) If the dependency must exist, adding another dialect would be easier if, for
example, a new kind of *.def file were introduced under gcc from which both the
values currently in c-common.h and the table required by cp/lex.c could be
generated.
3) At a minimum, would it be possible to add a check to the C++ parser
initialization routine that verifies consistency (perhaps by comparing the
number of elements in rid_to_yy against the value of RID_MAX) and aborts with
a diagnostic if the definition of rid_to_yy seems inconsistent or incorrect?
4) I haven't had a chance to read the new internals document yet, but if there
is a section on adding a new dialect, it should discuss this dependency between
C++ and the other C dialects.
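
To make observations 2) and 3) concrete, here is a rough sketch of what I have
in mind. It is an illustration only, not existing GCC code: KEYWORD_LIST stands
in for the proposed *.def file, the token values are made up, and
rid_table_is_consistent is a hypothetical helper. One caveat about the check:
with the bound written as [RID_MAX], as in cp/lex.c today, C silently
zero-fills any missing trailing initializers, so the element count always
equals RID_MAX. The sketch leaves the bound open so that the count reflects the
initializers actually supplied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the yacc token numbers; values are made up.  */
enum { SCSPEC = 300, TYPESPEC = 301 };

/* Stand-in for a shared *.def file: one line per reserved word, giving
   the RID_* name and the C++ token it maps to.  Only the three entries
   quoted from cp/lex.c are listed here.  */
#define KEYWORD_LIST(X)       \
  X (RID_STATIC,   SCSPEC)    \
  X (RID_UNSIGNED, TYPESPEC)  \
  X (RID_LONG,     TYPESPEC)

/* Expansion 1: the RID_* enumeration, currently written by hand.  */
#define DEF_RID(rid, tok) rid,
enum rid { KEYWORD_LIST (DEF_RID) RID_MAX };
#undef DEF_RID

/* Expansion 2: the token table, currently written by hand.  Both
   expansions walk the same list, so the order cannot drift.  The array
   bound is left open on purpose (see the caveat above).  */
#define DEF_TOKEN(rid, tok) tok,
const short rid_to_yy[] = { KEYWORD_LIST (DEF_TOKEN) };
#undef DEF_TOKEN

#define RID_TABLE_SIZE (sizeof rid_to_yy / sizeof rid_to_yy[0])

/* Compile-time form of the check: the array size becomes -1, which is
   an error, if the table and the enumeration disagree in length.  */
typedef char rid_table_size_check
  [RID_TABLE_SIZE == (size_t) RID_MAX ? 1 : -1];

/* Run-time form of the check, for an initialization routine that wants
   to abort with a diagnostic.  Returns nonzero when the sizes agree.  */
int
rid_table_is_consistent (void)
{
  return RID_TABLE_SIZE == (size_t) RID_MAX;
}
```

A length check of this kind only catches missing or extra entries; it cannot
detect entries that are present but listed in the wrong order, which is the
argument for generating both pieces from one list.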