This is the mail archive of the
mailing list for the GCC project.
Re: Repository for the conversion machinery
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: esr at thyrsus dot com
- Cc: DJ Delorie <dj at redhat dot com>, GCC Development <gcc at gcc dot gnu dot org>
- Date: Thu, 17 Sep 2015 12:44:48 +0200
- Subject: Re: Repository for the conversion machinery
- References: <55F98EF7 dot 4030401 at redhat dot com> <20150916170644 dot GA412 at redhat dot com> <xneghxlplr dot fsf at greed dot delorie dot com> <CAFiYyc0xegRpgj+60+d15eQGx3=GhVr2swvabKw6yY+DnaVoQQ at mail dot gmail dot com> <20150917104130 dot GB1161 at thyrsus dot com>
On Thu, Sep 17, 2015 at 12:41 PM, Eric S. Raymond <esr at thyrsus dot com> wrote:
> Richard Biener <richard dot guenther at gmail dot com>:
>> Not sure why we can't label the individual commits with Authors scraped
>> from the ChangeLog entries in that commit. Some commits even have
>> multiple authors after all! And if that fails I'd rather use the @gcc.gnu.org
> Because associating ChangeLog entries with repo commits is really
> hard. You talk as though there's a neat 1-1 mapping with every commit
> containing one correctly-written ChangeLog comment. That's never the
> case in the wild, and any plan that assumes it will be is doomed.
> I've been to this rodeo before on other GNU projects and the problem
> is pretty much AI-complete. That is, a human can do it relatively easily by
> applying contextual knowledge, a computer program can't.
> We can't count on the dates to match. There's a whole world of pain
> there beginning with the fact that the ChangeLog timestamp and the
> commit timestamp can easily be generated across opposite sides
> of a top-of-second even if the ChangeLog timestamp was made by a Lisp
> hook in Emacs. And continuing with timezone and DST fooups.
> We also can't count on the Subversion username of the commit to match any
> address in the ChangeLog comment. In fact this is the exact problem we
> started out trying to solve.
Maybe I'm missing something, but apart from the CVS-imported revisions, each
SVN revision should contain the actual change plus the corresponding changes
to the ChangeLog files. (You can't count on the commit message itself, I guess,
since not everyone replicates the ChangeLog entries there.)

There may be cases we can't handle, and for those some commit ID mapping
might be OK. But I expect 95% of the cases to work out nicely, so we should
preserve what is in the ChangeLog entry (note that we have very strict
formatting requirements for the authors there).
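
To make the idea concrete, here is a rough Python sketch of scraping authors from the ChangeLog lines that a single revision adds. It is only an illustration, not part of any existing conversion tool: the function name, the regular expressions, and the sample addresses are all made up for this example, and it assumes the usual GNU header form (date, two spaces, name, two spaces, address in angle brackets) with additional authors on indented continuation lines directly below it.

```python
import re

# Assumed GNU-style header: "2015-09-17  Jane Doe  <jane@example.org>"
HEADER_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}\s+(?P<name>[^<>]+?)\s+<(?P<email>[^<>\s@]+@[^<>\s]+)>\s*$"
)
# Assumed co-author continuation line: "\t    John Roe  <john@example.org>"
COAUTHOR_RE = re.compile(
    r"^\s+(?P<name>[^<>]+?)\s+<(?P<email>[^<>\s@]+@[^<>\s]+)>\s*$"
)


def authors_from_changelog_diff(diff_text):
    """Return (name, email) pairs from lines a unified diff adds to a ChangeLog."""
    authors = []
    in_header = False  # True while we are directly below an added header line
    for line in diff_text.splitlines():
        # Only look at added lines; skip the "+++ b/ChangeLog" file header.
        if not line.startswith("+") or line.startswith("+++"):
            in_header = False
            continue
        text = line[1:]
        match = HEADER_RE.match(text)
        if match:
            in_header = True
        elif in_header:
            # Co-authors are only accepted immediately after a header line.
            match = COAUTHOR_RE.match(text)
            if match is None:
                in_header = False
        if match:
            pair = (match.group("name"), match.group("email"))
            if pair not in authors:
                authors.append(pair)
    return authors


if __name__ == "__main__":
    # Hypothetical diff of one revision touching gcc/ChangeLog.
    sample = "\n".join([
        "--- a/gcc/ChangeLog",
        "+++ b/gcc/ChangeLog",
        "@@ -1,3 +1,8 @@",
        "+2015-09-17  Jane Doe  <jane@example.org>",
        "+\t    John Roe  <john@example.org>",
        "+",
        "+\t* foo.c (bar): Handle the empty case.",
        "+",
        " 2015-09-16  Old Author  <old@example.org>",
    ])
    for name, email in authors_from_changelog_diff(sample):
        print(name, email)
```

A revision for which this returns an empty list (no ChangeLog change, or a malformed header) would be one of the cases that falls back to the commit ID mapping.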
> Eric S. Raymond <http://www.catb.org/~esr/>