gengtype
As C does not have any means of reflection, and even C++98 (1) does not have enough reflective abilities, gengtype was introduced to support some GCC-specific type and variable annotations, which in turn support garbage collection inside the compiler and precompiled headers. As such, gengtype is one big kludge consisting of a rudimentary C lexer and parser.
gengtype output
gtype-desc.h: globally visible declarations: enumeration of GTY-annotated types, GC marker declarations, PCH type walker declarations.
gtype-desc.c: global definitions: GC markers and PCH type walkers.
gt-sourcefile.h: sourcefile.c-specific definitions of GTY-annotated types walker functions and definitions of lists of GC roots defined in that file.
gtype-frontend.h: frontend-specific declarations and definitions of lists of GC roots, also definitions of frontend-specific GTY-annotated types walker functions.
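To make the terms above concrete, here is a minimal sketch of the kind of source that gengtype scans. The names my_list, my_roots and my_list_length are hypothetical, and the empty GTY macro is a stand-in for the one GCC defines, so that the fragment compiles on its own. In GCC itself, a file sourcefile.c containing such annotations would typically include the generated gt-sourcefile.h at its end.

```c
#include <stddef.h>

/* Stand-in for GCC's marker macro: it expands to nothing for the C
   compiler and exists only so that gengtype can find annotations.  */
#define GTY(x)

/* A GTY-annotated type (hypothetical).  The chain_next option asks the
   generated marker to walk the `next` chain iteratively.  */
struct GTY((chain_next ("%h.next"))) my_list
{
  int value;
  struct my_list *next;
};

/* A GC root (hypothetical): a GTY-annotated global variable.  */
static GTY(()) struct my_list *my_roots;

/* Ordinary helper code using the annotated type.  */
static size_t
my_list_length (const struct my_list *l)
{
  size_t n = 0;
  for (; l != NULL; l = l->next)
    n++;
  return n;
}

/* In GCC itself the file would typically end with:
   #include "gt-sourcefile.h"  */
```

For such a file, the list of roots containing my_roots and the walker for struct my_list are the kind of definitions that end up in the generated headers listed above.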
From a long-term perspective, we could take inspiration from the Qt moc machinery (2) to systematically generate runtime-accessible metadata describing the most important GCC data types. This could permit useful extensions, such as generic dumpers for debugging the compiler, generic serialization of GCC data, and generic or generated browser code for inspecting GCC internal data (e.g. a much improved replacement for tree-browser.c, which has probably bit-rotted). Such a powerful system would be useful, since GCC is a huge code base.
Improvement areas
Please contribute with your ideas here!
Keep and enhance it (3), even if changing to C++. Tom Tromey suggested a useful hint here. The possible changes were discussed in this thread.
Make gengtype really available to plugins: GNU-ify its program argument conventions (i.e. accept --version, --help, etc.) and make it possible for plugins to run it independently of the availability of GCC source or build trees. Have it installed, probably as gcc-gengtype, by the installation procedure.
- Make gengtype usable without a huge program argument list (its argc, argv formals to main).
Currently gengtype overwrites its targets even when their contents have not changed. Make gengtype write to temporary files and then use move-if-change to conditionally move them to the final target files.
Preprocess source files before feeding them to gengtype. This would solve issues with conditionally defined GTY-annotated variables, struct fields, etc. See here for an example of issues that lack of preprocessing causes.
Split out definitions from gtype-frontend.h to gtype-frontend.c, so that the header file could be included more than once per front end. Right now there is no place for front-end specific declarations! This means that all the front-end specific types must share the same declarations in gtype-desc.h.
Consider turning gt-sourcefile.h into gt-sourcefile.c, which would remove the dependency between sourcefile.c and the GGC internal implementation. Be careful not to kill performance with out-of-line functions, though.
Consider splitting out type-specific declarations from gtype-desc.h into a new gt-sourcefile.h.
Implement dump and load of its state for plugin support. Jeremie Salvucci and Basile Starynkevitch are working on that (adding a persistent state to gengtype), and have sent a series of patches.
- Diagnose GTY option mismatches between the same-name types that are defined in different frontends.
Add a GTY option so that the generated chunks of the GGC marking and PCH walking routines are conditioned by a preprocessor #if. So for example a hypothetical field tree GTY((cppcond(ENABLE_CHECKING))) checktr; in some GTY-ed struct would have its generated marking and walking statements wrapped in #if ENABLE_CHECKING and #endif. This is low-tech, easy to implement, and definitely useful. Of course, it would be better if gengtype itself accepted preprocessed input, but that is much more difficult to achieve.
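As a rough illustration of the cppcond proposal, the sketch below is hand-written code in the shape gengtype might emit for such a field. Everything in it is an assumption: cppcond does not exist, and node, checked_pair, mark_node and mark_checked_pair are simplified, hypothetical stand-ins for GCC's tree type and the generated GGC marking routines.

```c
#include <stddef.h>

#define GTY(x)              /* stand-in for GCC's marker macro */
#ifndef ENABLE_CHECKING
#define ENABLE_CHECKING 1   /* assumed to be on for this sketch */
#endif

/* Simplified stand-in for a GC-managed object such as a tree.  */
struct node { int marked; };

struct GTY(()) checked_pair
{
  struct node *payload;
#if ENABLE_CHECKING
  /* Hypothetical: cppcond is the proposed option, not an existing one.  */
  struct node * GTY((cppcond (ENABLE_CHECKING))) checktr;
#endif
};

/* Stand-in for marking one object as reachable.  */
static void
mark_node (struct node *n)
{
  if (n != NULL)
    n->marked = 1;
}

/* What a generated marker could look like with the option applied:
   the statement for checktr is wrapped in the same condition.  */
static void
mark_checked_pair (struct checked_pair *p)
{
  if (p == NULL)
    return;
  mark_node (p->payload);
#if ENABLE_CHECKING
  mark_node (p->checktr);
#endif
}
```

With ENABLE_CHECKING defined to 0, both the field and its marking statement disappear together, so the generated code stays consistent with the struct layout without gengtype having to preprocess its input.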
Notes
One cannot code, even with complex templates, a reasonably generic serializer or debug-printer in C++. This would require access to a description of the fields inside classes, which C++ templates and RTTI do not provide. (1)
Or even Smalltalk or Common Lisp meta-classes or object systems (2)
Basile believes we cannot and should not get rid of gengtype or garbage collection even if all of GCC were in C++; other people apparently suggested getting rid of gengtype while C++-ifying GCC. (3)