This is the mail archive of the
mailing list for the GCC project.
Re: Proposal for the transition timetable for the move to GIT
On Thu, 26 Dec 2019, Maxim Kuvyrkov wrote:
> Reposurgeon creates merge entries on trunk when changes from a branch
> are merged into trunk. This brings entire development history from the
> branch to trunk, which is both good and bad. The good part is that we
> get more visibility into how the code evolved. The bad part is that we
> get many "noisy" commits from merged branch (e.g., "Merge in trunk"
> every few revisions) and that our SVN branches are work-in-progress
> quality, not ready for review/commit quality. It's common for files to
> be re-written in large chunks on branches.
Seeing "noisy" or possibly confusing commits in "git log" output for
master is simply a consequence of the possibly confusing defaults for how
git log behaves (showing all commits in the ancestry in reverse committer
date order). I often find "git log --first-parent" output less confusing
when dealing with any git repository making heavy use of branches (but
there are other options as well to control how it shows such histories).
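As a rough illustration of the difference, here is a minimal Python sketch on a toy commit graph. The graph, the commit names and both helper functions are made up for this example, and this is not how git itself is implemented. It only shows the idea: following the first parent at each merge stays on trunk, while walking the whole ancestry also pulls in the branch commits.

```python
# Toy commit graph (hypothetical): commit -> list of parents, first parent first.
# A, B, C are trunk commits; X, Y are branch commits; M merges the branch into trunk.
PARENTS = {
    "A": [],
    "B": ["A"],
    "X": ["A"],
    "Y": ["X"],
    "M": ["B", "Y"],
    "C": ["M"],
}


def first_parent_history(graph, tip):
    """Follow only the first parent of each commit, starting at tip."""
    history = []
    current = tip
    while current is not None:
        history.append(current)
        parents = graph[current]
        current = parents[0] if parents else None
    return history


def full_ancestry(graph, tip):
    """Collect every commit reachable from tip through any parent."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(graph[commit])
    return seen


print(first_parent_history(PARENTS, "C"))   # ['C', 'M', 'B', 'A']
print(sorted(full_ancestry(PARENTS, "C")))  # ['A', 'B', 'C', 'M', 'X', 'Y']
```

The first-parent walk shows the merge M as a single trunk step and omits X and Y, which is roughly the view "git log --first-parent" gives of master.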
If we don't want merge commits on git master for the cases where people
put merge properties on trunk in the past, we can use a reposurgeon
"unmerge" command in gcc.lift to stop the few commits in question from
being merge commits (while keeping all other merges as-is). (The merges
of trunk into other branches that copied merge properties from trunk into
those branches will still be handled correctly, with exactly two parents
rather than regaining the extra parents corresponding to the merges into
trunk that Bernd noted in an earlier version of the conversion, because
the processing that avoids redundant merge parents takes place well before
any unmerge commands are executed - so at the time of that processing,
reposurgeon knows that those other branches are in fact in the ancestry of
trunk, even if we remove that information in the final git repository.)
> Also, reposurgeon's commit logs don't have information on SVN path from
> which the change came, so there is no easy way to determine that a given
> commit is from a merged branch, not an original trunk commit. Git-svn,
I think it's idiomatic in git for a branch commit not to say "this is a
commit on X branch", i.e. this is a general property of branchy git
histories (and unmerge is the solution if we don't want a branchy history
of master, or use of smarter git tools for viewing the history that people
may well make more use of when dealing with repositories with that kind of
history).
> It appears that .gitignore has been added in r1 by reposurgeon and then
> deleted at r130805. In SVN repository .gitignore was added in r195087.
> I speculate that addition of .gitignore at r1 is expected, but its
> deletion at r130805 is highly suspicious.
I suspect this is one of the known issues related to reposurgeon-generated
.gitignore files. Since such files are not really part of the GCC
history, and the .gitignore files checked into SVN are properly preserved
as far as I can see, I don't think it's a particularly important issue for
the GCC conversion (since auto-generated .gitignore files are only
nice-to-have, not required). I've filed
https://gitlab.com/esr/reposurgeon/issues/219 anyway with a reduced test
for this oddity.
> Reposurgeon uses $email@example.com for committer email addresses even
> when it correctly detects author name from ChangeLog.
I think that's logically accurate (and certainly harmless) as a
description of commits made to a central repository on gcc.gnu.org,
although using committer = author would also be OK.
> == Bad summary line ==
> While looking around r138087, the commit below caught my eye. Are the
> contents of the summary line as expected?
> commit cc2726884d56995c514d8171cc4a03657851657e
> Author: Chris Fairles <firstname.lastname@example.org>
> Date: Wed Jul 23 14:49:00 2008 +0000
> acinclude.m4 ([GLIBCXX_CHECK_CLOCK_GETTIME]): Define GLIBCXX_LIBS.
Yes. This seems to be Richard's script working exactly as intended, by
extracting the first bit of the ChangeLog entry *after* the date/author
header as a better description than "2008-07-23 Chris Fairles
<email@example.com>" (i.e. it certainly gives more distinctive
information about the commit and is more useful than having a date/author
line as the summary line). I don't think it's a bad summary line (but
Richard's script supports hardcoding new summary lines for individual
commits where desired).
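To make the idea concrete, here is a minimal Python sketch of that kind of extraction. It is not Richard's script; the function name, the regular expression and the sample message layout (a ChangeLog-style date/author header followed by a "* " entry) are assumptions for illustration only.

```python
import re

# Assumed header shape: "YYYY-MM-DD  Name  <address>" (ChangeLog style).
HEADER_RE = re.compile(r"^\d{4}-\d{2}-\d{2}\s+\S.*<[^>]*>\s*$")


def summary_line(message):
    """Return the first non-blank line that is not a date/author header."""
    for raw in message.splitlines():
        line = raw.strip()
        if not line or HEADER_RE.match(line):
            continue
        # Drop the leading "* " bullet of a ChangeLog entry, if present.
        if line.startswith("* "):
            line = line[2:]
        return line
    return ""


# Sample message assembled from the commit quoted above (layout assumed).
message = (
    "2008-07-23  Chris Fairles  <email@example.com>\n"
    "\n"
    "\t* acinclude.m4 ([GLIBCXX_CHECK_CLOCK_GETTIME]): Define GLIBCXX_LIBS.\n"
)
print(summary_line(message))
```

With this sample the header line is skipped and the entry text becomes the summary, matching the summary line shown in the quoted commit.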
Joseph S. Myers