Commit messages and the move to git
Eric S. Raymond
Sat Nov 9 06:01:00 GMT 2019
Richard Earnshaw (lists) <Richard.Earnshaw@arm.com>:
> Which makes me wonder if, given a commit log of the form:
> 2019-10-30 Richard Biener <email@example.com>
> PR tree-optimization/92275
> * tree-vect-loop-manip.c (slpeel_update_phi_nodes_for_loops):
> Copy all loop-closed PHIs.
> * gcc.dg/torture/pr92275.c: New testcase.
> Where the first line is a ChangeLog style date and author, we could spot the
> PR line below that and hoist it up as a more useful summary (perhaps by
> copying it rather than moving it).
> It wouldn't fix all commits, but even just doing this for those that have
> PRs would be a help.
Speaking from lots of experience with converting old repositories that
exhibited similar comment conventions, I would be nervous about trying
to do this entirely mechanically. I think the risk of mangling text
that is not formatted as you expect - and not noticing that until the
friction cost of fixing it has escalated - is rather high.
On the other hand, reposurgeon allows a semi-mechanized attack on
the problem that I think would work well, because I've done similar
things in other conversions.
There's a pair of commands that allow you to (a) extract comments from
a range of commits into a message list that looks like an RFC822
mailbox file, (b) modify those comments, and (c) weave the message
list reliably back into the repository.
If it were me doing this job, I'd write a reposurgeon command that
extracts all the comments containing PR strings into a message box.
Then I'd write an Emacs macro that moves to the next message and
hoists its PR line.
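
As a rough illustration of what the hoisting step does to a single comment, here is a minimal Python sketch. It is not reposurgeon code and not the Emacs macro described above; the function name hoist_pr and both regular expressions are assumptions made for this example, based only on the commit log quoted earlier. It copies the PR line rather than moving it, and it deliberately leaves alone any comment that does not match the expected shape, so those fall through to a human reader.

```python
import re

# Assumed shape of a ChangeLog-style first line: "YYYY-MM-DD  Name  <email>".
DATE_AUTHOR = re.compile(r"^\d{4}-\d{2}-\d{2}\s+\S.*<[^<>]+>\s*$")
# Assumed shape of a PR line: "PR component/number", possibly indented.
PR_LINE = re.compile(r"^\s*PR\s+[\w+.-]+/\d+\s*$")


def hoist_pr(comment):
    """Return (new_comment, changed).

    If the comment starts with a date/author line and contains exactly
    one PR line, copy that PR line to the top as a summary, followed by
    a blank line and the untouched original text. Anything else is
    returned unchanged so that it can be reviewed by hand.
    """
    lines = comment.splitlines()
    if not lines or not DATE_AUTHOR.match(lines[0]):
        return comment, False
    pr_lines = [ln.strip() for ln in lines[1:] if PR_LINE.match(ln)]
    if len(pr_lines) != 1:
        # Zero or several PR lines: not the expected shape, so skip it.
        return comment, False
    return pr_lines[0] + "\n\n" + comment, True
```

Returning a "changed" flag alongside the text makes it easy to count how many comments were skipped, which gives a rough idea of how much would be left for hand-editing.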
Then I'd walk through the comments applying the macro and keeping an eye on
them for cases where the macro doesn't do quite the right thing,
using undo and hand-editing to recover. Human eyes are very good at
spotting anomalies in an expected flow of text, and once you've gotten
into the rhythm of a task like this it is easily possible to filter
approximately a message per second. In round numbers, provided
the anomaly rate isn't high, that's upwards of 3000 messages per hour.
The point is that for this kind of task a human being who understands
what he's reading is likely to have a lower rate of mangling errors than
a program that doesn't.
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>