This is the mail archive of the
mailing list for the GCC project.
Re: Rant about ChangeLog entries and commit messages - better to do something than just complain
- From: Tim Josling <tejgcc at westnet dot com dot au>
- To: Richard Kenner <kenner at vlsi1 dot ultra dot nyu dot edu>
- Cc: njn at csse dot unimelb dot edu dot au, andi at firstfloor dot org, gcc at gcc dot gnu dot org
- Date: Sat, 23 Feb 2008 20:52:41 +1100
- References: <email@example.com> <firstname.lastname@example.org> <Pine.GSO.email@example.com> <firstname.lastname@example.org> <Pine.GSO.email@example.com> <10712041305.AA18081@vlsi1.ultra.nyu.edu>
- Reply-to: tejgcc at westnet dot com dot au
On the principle that it's better to do something than just complain...
I monitored the time I spent looking for the emails associated with a
given patch and I found it takes high single-digit minutes to find them.
Sometimes you can't find them (which takes a lot longer). I do this a lot.
I wrote a little proof-of-concept script to take the mailing list
archives and the ChangeLog files and annotate the ChangeLog files with
the URLs of the probable email containing the patch.
Sample output is here (annotation of the current ChangeLog file).
The program is here (not much internal documentation at all). Testing
has been limited - in any case, with processing of text written by
people, perfection is not possible.
It runs in about 25 minutes on my system and uses a few hundred MB of memory.
Things I learned:
1. There is a lot of data. It's a good thing Ruby 1.9 is a lot faster
than Ruby 1.8.
There are over 100 ChangeLog files in the GCC source, with over 600,000
lines in total. The gcc patches mailing list archives are over 2 GB in
size, and take a considerable time to download.
2. Most patches to ChangeLog have an identifiable email in the archive.
Coverage gets spotty for some branches and as you go back in time, and
there is also a large gap in the email archives from a while ago.
3. I think this may be a useful thing. If a place could be found to put
the 30MB of files I would be happy to maintain them on a weekly basis or
so. Alternatively I could update the ChangeLog files themselves but I
have reason to suspect that may not be popular.
If nothing else happens I will keep it up-to-date for my own use.
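
For readers who want a feel for the annotation idea described above, here
is a minimal sketch in Ruby. It is not the script mentioned in this
message; every name in it (Entry, Message, parse_changelog, probable_mail,
annotate) is hypothetical, and the matching heuristic is an assumption:
same sender address, posted no more than 30 days before the ChangeLog
date, ranked by how many of the entry's file names appear in the message
body. It also assumes the usual ChangeLog layout of a
"YYYY-MM-DD  Name  <address>" header followed by tab-indented "* file"
lines, and that the mail archive has already been parsed into records.

```ruby
require 'date'

# Hypothetical record types; the data model of the real script is not known.
Entry   = Struct.new(:date, :author, :email, :files)
Message = Struct.new(:url, :from, :date, :body)

# "2008-02-23  Some Name  <address>" header line (assumed layout).
HEADER = /\A(\d{4}-\d{2}-\d{2})\s+(.+?)\s+<([^>]+)>\s*\z/
# Tab-indented "* path/to/file ..." line; captures the file name only.
FILE   = /\A\t\*\s+([^\s:(,]+)/

# Split ChangeLog text into entries, collecting the files each one touches.
def parse_changelog(text)
  entries = []
  text.each_line do |line|
    line = line.chomp
    if (m = HEADER.match(line))
      entries << Entry.new(Date.parse(m[1]), m[2], m[3].downcase, [])
    elsif (m = FILE.match(line)) && !entries.empty?
      entries.last.files << m[1]
    end
  end
  entries
end

# Return the most probable message for an entry, or nil if none is plausible.
def probable_mail(entry, mails, window: 30)
  candidates = mails.select do |mail|
    mail.from.downcase == entry.email &&
      mail.date <= entry.date &&
      (entry.date - mail.date) <= window
  end
  # Score each candidate by the number of entry file names found in its body.
  scored = candidates.map do |mail|
    [entry.files.count { |f| mail.body.include?(f) }, mail]
  end
  # Ignore zero scores; break ties in favour of the most recent message.
  best = scored.select { |score, _| score > 0 }
               .max_by { |score, mail| [score, mail.date] }
  best && best[1]
end

# Append the probable URL to each header line; other lines pass through.
def annotate(text, mails)
  entries = parse_changelog(text)
  index = -1
  text.each_line.map do |line|
    next line unless HEADER.match?(line.chomp)
    index += 1
    mail = probable_mail(entries[index], mails)
    mail ? "#{line.chomp}  [#{mail.url}]\n" : line
  end.join
end
```

Calling annotate(File.read("ChangeLog"), mails) would return the ChangeLog
text with a bracketed URL appended to each header line for which a
candidate was found. A real tool would need fuzzier matching (patches
committed by someone other than the poster, renamed files, older header
formats), which is where the "perfection is not possible" caveat comes in.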
On Tue, 2007-12-04 at 08:05 -0500, Richard Kenner wrote:
> > I didn't say you cannot or should not use these tools. But a good comment
> > on a piece of code sure beats a good commit message, which must be looked at
> > separately, and can be fragmented over multiple commits, etc.
> I don't see one as "beating" the other because they have very different
> purposes. Sometimes you need one and sometimes you need the other.
> The purpose of COMMENTS is to help somebody understand the code as it
> stands at some point in time. In most cases, that means saying WHAT the
> code does and WHY (at some level) it does what it does. Once in a while,
> it also means saying why it DOESN'T do something, for example, if it might
> appear that there's a simpler way of doing what the code is doing now but
> it doesn't work for some subtle reason. But it's NOT appropriate to put
> into comments the historical remark that this code used to have a typo
> which caused a miscompilation at some specific place. However, the commit
> log IS the place for that sort of note.
> My view is that, in general, the comments are usually the most appropriate
> place to put information about how the code currently works and the commit
> log is generally the best place for information that contrasts how the code
> currently works with how it used to work and provides the motivation for
> making the change. But there are exceptions to both of those generalizations.