This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Repository for the conversion machinery
On Mon, 10 Oct 2016, Eric S. Raymond wrote:
> I strongly recommend that if you want to try this, you separate it from the
> initial repo conversion. That is, get the project to git first. Then
> see if you can data-mine author information out of the history. If,
> and only if, you get results that look reasonable, then you patch the repo
> and force-push it, warning everyone there'll be a flag day.
>
> The reason I recommend this is that I think you're going to have serious
> trouble getting clean authorship data with good coverage. The data
> mining will be messy and take longer than you expect.
I also think it would be too messy, and I don't think such a flag day
would be a good idea. Once we've done the conversion we should keep
commit ids stable. The commit objects from the existing git mirror should
stay available in a disjoint set of branches not connected to the cleanly
converted history, whether in a separate repository or not, so that
existing references to those commit ids continue to work. But I don't want
to add a third set of commit ids for the same history.
In practice there are a lot of ways people have messed up ChangeLog
commits or commit messages that I would expect to confuse such author
extraction, even before you get to the parts of the history converted from
CVS.
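
To illustrate the kind of fragility meant here, below is a minimal sketch
(not part of any actual conversion machinery) of author extraction that
assumes the GNU-style ChangeLog header "YYYY-MM-DD  Name  <email>". The
names HEADER_RE and extract_author and the sample addresses are
hypothetical, and a real tool would need many more fallbacks.

```python
import re

# Assumed header shape: "YYYY-MM-DD  Full Name  <user@host>".
# HEADER_RE and extract_author are hypothetical names for this sketch.
HEADER_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})[ \t]+(.+?)[ \t]+<([^<>\s]+@[^<>\s]+)>[ \t]*$"
)


def extract_author(text):
    """Return (name, email) from the first well-formed header line, else None."""
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            return match.group(2), match.group(3)
    return None


# A well-formed entry is recognized.
good = "2016-10-10  Jane Hacker  <jane@example.org>\n\n\t* foo.c (bar): Fix typo.\n"
print(extract_author(good))

# An entry committed without its header line yields nothing.
no_header = "\t* foo.c (bar): Fix typo.\n"
print(extract_author(no_header))

# An older date format with a parenthesized address is also missed.
old_style = "Mon Oct 10 12:00:00 1994  Jane Hacker  (jane@example.org)\n"
print(extract_author(old_style))
```

Only the first sample matches. The other two stand in for the sorts of
irregular entries that would need special handling or manual review.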
--
Joseph S. Myers
joseph@codesourcery.com