GCC Documentation: Overview, Issues, Plans

Taxonomy

Manuals, Reference Material

End User

User documentation consists of user manuals for the compilers (gcc, g++, etc.), support tools (cpp), various language front ends (java, fortran), runtime libraries (c++, OpenMP, ada), and support libraries (quadmath).

These manuals are packaged for every release and put on gcc.gnu.org. The documentation is generated following the GNU Coding Standards conventions, i.e. make html and maintainer-scripts/make install-html. Manual content is authored in either Texinfo (all but libstdc++) or DocBook (libstdc++). Texinfo is a markup language created by RMS and is the official markup language used by GNU projects. Reference content for libstdc++ is generated with Doxygen.

Documents are packaged for each release via a script, maintainer-scripts/update_web_docs_svn.

Printed output requires TeX.
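
As a rough illustration of the generation step described above, the following sketch assembles the make invocations for the standard GNU html and install-html targets. The build directory name "objdir" and the helper names doc_commands and build_docs are assumptions made for this example, not part of GCC; the sketch defaults to a dry run that only prints the commands.

```python
import subprocess
from pathlib import Path

# Targets named in the text; both are standard GNU Makefile targets.
DOC_TARGETS = ("html", "install-html")


def doc_commands(build_dir, targets=DOC_TARGETS):
    """Return one `make -C <build_dir> <target>` command per target."""
    return [["make", "-C", str(Path(build_dir)), target] for target in targets]


def build_docs(build_dir, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in doc_commands(build_dir):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if make fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # "objdir" is a hypothetical, already-configured build directory.
    build_docs("objdir", dry_run=True)
```

Passing dry_run=False would actually run make in the given directory, which only makes sense inside a configured build tree.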

Known Weaknesses

Developer

A separate set of documents exists that is focused on the internals of GCC.

However, most of the recent internals documentation takes place on the wiki. Subjects include how to use git for source code control, and various higher-level overviews of the source tree and/or specific passes or sub-components.

In addition, there are source code comments. These comments vary widely in terms of completeness and relevance. A visualization of gcc sources via doxygen has been sporadically generated, but is currently not maintained.

Self-reported status of internal docs can be found here.

Known Weaknesses

Web Site

Main domain is gcc.gnu.org, although several other domains are also hosted, including cygwin.com. Content includes release notes, install and configuration details, FAQ, build results, porting information, historical release dates and version info, mission statement and information on the GCC steering committee, and links to additional resources. Content is authored in "bare" html without CSS, and then a CSS layer is applied when content is "published" on gcc.gnu.org. Sources to the website are available via CVS and are maintained by Gerald Pfeifer.

Traffic can be estimated via quantcast/compete. Roughly, compete estimates around 12.8k unique visitors for the current month, with monthly figures over the past year ranging between 11k and 22k unique visitors. Quantcast estimates the demographics as 65% male, with high education levels, evenly split between the 25-35, 35-45, and 45-55 age brackets, and weighted heavily toward Asia.

For comparison, cygwin.com is around 21k, llvm.org is around 3.4k, sourceforge.net is around 2M, gnu.org is around 105k, python.org is around 69k, and stackoverflow.com is around 750k.

Mailing List Archives

Extensive design and development history of new features is archived on various mailing lists. Partial archives of the mailing lists are available, with the last twelve years easily accessible and well-indexed by search engines. Problems, bugs, and issues are tracked in bugzilla. Before 1997, GCC development mailing lists (i.e. gcc2) were closed; no known archives exist of the first ten years of development.

GFDL

GCC manuals were switched from the GPL to the GFDL in 2001. Adoption of the GFDL has not been without controversy: Debian has issues with invariant sections, Wikipedia with license incompatibilities. The GFDL is incompatible with Creative Commons and GPL licenses. Many of these problems have been well-known for around 10 years, yet there has been no progress on resolution.

GPL

Generated documents take the same license as the originating sources. So, for the libstdc++ doxygen API reference, the files are licensed under the GPL. To be precise, they are licensed under the GPLv3 with the runtime exception.

Misc

The content on the gcc.gnu.org website is not licensed per se; instead, copying is allowed as long as the FSF copyright is preserved.

GFDL vs. GPL

Literate programming, as introduced by Knuth, is the placement of code and commentary in the same context. When the code is GPL and the commentary is GFDL, the license incompatibilities prevent this natural combination.

There are lots of ways this impacts GCC developers.

Plans

Have a Plan

Make a list of what's wrong. Prioritize. Take the top three and fix in a month, quarter, year. Re-evaluate progress and priorities on a yearly basis.

Removal of Literate Programming Restrictions

Essential for progress on multiple fronts.

Create a Doc Stage

Instead of requiring complete documentation when a new feature is being developed in stage one, let it slide until stage two, which would now combine a freeze on new features with documentation of new features and of changes to old capabilities. Then stage three would be the normal stabilization phase. Admittedly, the boundary is fuzzy. The goal is for every release to have accurate developer commentary on the contemporary state of the implementation. Regularly scheduled documentation "checkpoints" may allow the GCC community to incorporate "expert help" from doc writers and editors outside the normal development community.

Questions