This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



"Documentation by paper"


I've been noticing for a while that there are an increasing number of files
in GCC where the only overview documentation is a reference to a paper or
textbook.

I think this is totally unacceptable documentation and that we need to have a
policy about this sort of documentation.

My reasons are as follows:

(1) It's unreasonable for any person who wants to work on that file to have
to locate and read the paper.  Sure, if you are doing significant algorithmic
work on the file, you need to have read the reference.  But for small things,
or most debugging, you don't need that much information, and most changes are
small things.

(2) It's rare to implement an algorithm *exactly* as presented in the paper,
so we'd need a list of changes.  That means the description is a combination
of a reference and a set of changes, which is complex.

(3) A critical part of the overview is identifying which parts of the code
and which data structures do what.  If you have a complete description of the
algorithm in the file, this flows very naturally because you intersperse
references to the functions and structs into your description of the
algorithm.  Otherwise, you have to use odd language to link the
implementation with the algorithm.  This link is perhaps the most critical
part of the documentation but is the part most commonly left out.


Certainly the reference needs to be there as well, both for credit purposes
and to supply further details.  For example, normally a critical part of
documentation is not just what's being done but *why* it's being done and why
other things *aren't* being done.  Here, the paper can serve those purposes.

As I've been getting into some of the newer parts of the compiler, I've been
very hampered by the lack of proper documentation.  I think improving this
documentation ought to be one of the major goals of 3.5 aside from any other
changes.

I'd like to get agreement on the following documentation standard for the
cases where papers or texts are referenced:

(1) The algorithm should be described fully enough in a block of comments at
the front of the file that the goals and methods of the algorithm can be
completely understood just from those comments.

(2) As part of that narrative, any differences between the algorithm in the
reference and the code should be explained.  Likewise, any implementation
choices should be pointed out.

(3) Again as part of the narrative, each major function and data structure
should be mentioned.

(4) The reference should be supplied in a clear manner.  If it is available
online, a URL should be supplied.
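
To make the four points above concrete, here is a rough sketch of what such a
file header might look like.  Everything in it is hypothetical: the file
(a small union-find forest), the names ds_forest, ds_init, ds_find and
ds_union, and the DS_MAX limit are invented for illustration and are not
taken from GCC, and the citation is deliberately left as a placeholder.

```c
/* Disjoint-set (union-find) forest.  Hypothetical example file.

   GOAL: maintain a partition of the integers 0..N-1 into sets, supporting
   "which set is X in?" and "merge the sets of A and B".

   METHOD: each set is a tree stored in struct ds_forest.  parent[X] is the
   parent of X, and a root is its own parent; the root names the set.
   ds_find walks from X to its root.  ds_union links the root of the smaller
   tree under the root of the larger one, using size[], so trees stay
   shallow.

   DIFFERENCES FROM THE REFERENCE: textbook presentations commonly describe
   full path compression done recursively.  This file uses path halving in a
   loop instead, which avoids recursion.

   IMPLEMENTATION CHOICES: storage is a fixed array of DS_MAX elements; we do
   not grow it, because this sketch assumes N is small.

   REFERENCE: <full citation goes here, with a URL if one is available>.  */

#include <assert.h>

#define DS_MAX 64

struct ds_forest
{
  int parent[DS_MAX];	/* Parent of each element; roots point to themselves.  */
  int size[DS_MAX];	/* Number of elements in the tree; valid for roots.  */
  int n;		/* Number of elements in use.  */
};

/* Make F a forest of N singleton sets.  N must be in 0..DS_MAX.  */
void
ds_init (struct ds_forest *f, int n)
{
  assert (n >= 0 && n <= DS_MAX);
  f->n = n;
  for (int i = 0; i < n; i++)
    {
      f->parent[i] = i;
      f->size[i] = 1;
    }
}

/* Return the root of the set in F containing X.  X must be in 0..N-1.  */
int
ds_find (struct ds_forest *f, int x)
{
  assert (x >= 0 && x < f->n);
  /* Path halving: point X at its grandparent, then step there.  */
  while (f->parent[x] != x)
    {
      f->parent[x] = f->parent[f->parent[x]];
      x = f->parent[x];
    }
  return x;
}

/* Merge the sets in F containing A and B; return the root of the result.  */
int
ds_union (struct ds_forest *f, int a, int b)
{
  int ra = ds_find (f, a);
  int rb = ds_find (f, b);

  if (ra == rb)
    return ra;
  /* Union by size: make RA the root of the larger tree.  */
  if (f->size[ra] < f->size[rb])
    {
      int tmp = ra;
      ra = rb;
      rb = tmp;
    }
  f->parent[rb] = ra;
  f->size[ra] += f->size[rb];
  return ra;
}
```

The point of the sketch is the header: someone fixing a small bug can learn
what parent[] and size[] mean, and which function does which step, without
having to find the reference first.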

Of course, each file should also meet the minimum documentation requirements
in other areas:

(1) There should be a block of comments in front of every function giving the
external specification of that function, including the meaning of every
argument.

(2) Within each function, there should be enough comments to explain the role
of each part of the function in implementing those external specifications.
At a minimum, this means a sentence or two for each non-trivial "if"
statement or loop.  These should not be a translation of the code into
English, but provide the linkage between specification and implementation.

(3) Any design choices, especially choices about *not* doing something, need
to be clearly documented.
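
As an illustration of these three requirements, here is a minimal sketch of a
commented function.  The function first_not_less and its arguments are
hypothetical, made up for this example rather than taken from GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first element of VEC that is not less than KEY.
   VEC has LEN elements sorted in nondecreasing order.  Return LEN if every
   element is less than KEY.  VEC may be NULL only if LEN is zero.  */
size_t
first_not_less (const int *vec, size_t len, int key)
{
  size_t lo = 0, hi = len;

  assert (vec != NULL || len == 0);

  /* Invariant: every element before LO is less than KEY, and every element
     at or after HI is not less than KEY.  The answer is therefore always in
     [LO, HI], and the loop ends once that range is a single index.  */
  while (lo < hi)
    {
      /* Written this way rather than (lo + hi) / 2 so the sum cannot
	 overflow.  */
      size_t mid = lo + (hi - lo) / 2;

      /* If VEC[MID] is less than KEY, MID and everything before it are
	 ruled out.  Otherwise MID is still a candidate answer, so keep it
	 inside the range.  */
      if (vec[mid] < key)
	lo = mid + 1;
      else
	hi = mid;
    }

  /* Design choice: we do NOT stop early when VEC[MID] equals KEY.  The
     specification asks for the first such element, and an equal element
     found in the middle of the range need not be the first one.  */
  return lo;
}
```

None of the comments restates the code.  The one before the function is the
external specification, the ones inside tie each step back to it, and the
last one records something the code deliberately does not do.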

Does everybody agree with these standards?  If we can't get consensus, I'd
like to ask the SC to look at this issue.

