This is the mail archive of the mailing list for the GCC project.


Re: Repository for the conversion machinery

On Mon, 10 Oct 2016, Eric S. Raymond wrote:

> I strongly recommend that if you want to try this, you separate it from the
> initial repo conversion.  That is, get the project to git first.  Then
> see if you can data-mine author information out of the history. If,
> and only if, you get results that look reasonable, then you patch the repo
> and force-push it, warning everyone there'll be a flag day.
> The reason I recommend this is that I think you're going to have serious
> trouble getting clean authorship data with good coverage.  The data
> mining will be messy and take longer than you expect.

I also think it would be too messy, and don't think having such a flag day
would be a good idea. Once we've done the conversion we should keep
commit ids stable. The commit objects from the existing git mirror should
live in a disjoint set of branches not connected to the cleanly converted
history, whether in a separate repository or not, so that existing
references to those commit ids continue to work as well; I don't want to
add a third set of commit ids for the same history.

In practice there are a lot of ways people have messed up ChangeLog 
commits or commit messages that I would expect to confuse such author 
extraction, even before you get to the parts of the history converted from 

Joseph S. Myers
