This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Acceptance criteria for the git conversion


On Tue, 1 Sep 2015, Eric S. Raymond wrote:

> Joseph Myers <joseph@codesourcery.com>:
> > Indeed.  Ideally the tree objects in the git conversion should have 
> > exactly the same contents as SVN commits, and so be shared with the 
> > git-svn history to reduce the eventual repository size (except where there 
> > are defects in the git-svn history, or the git conversion fixes up cvs2svn 
> > artifacts and so some old revisions end up more accurately reflecting old 
> > history than the SVN repository does).
> 
> I don't think sharing with the git-svn history will be possible.  git-svn
> is a terrible whole-history converter; the odds of getting the same
> topology out of reposurgeon are basically nil, and the problem of matching
> different topologies is quite hard.

I'm not proposing sharing topology (commit objects).  Only blob and tree 
objects.  If two files have the same hash they will share the same blob 
object, and if two trees have files with the same hashes at the same paths 
then the tree objects will also have the same hash, and will be shared.  
Now, git-svn may well have made mistakes meaning some trees in the git-svn 
repository do not accurately correspond to any SVN revision of any branch 
(and so the objects aren't shared), but I'd expect most to be shared (even 
without disabling smart ignore handling, lots of tree objects for 
subdirectories would be shared, if those subdirectories don't have any 
ignore files or svn:ignore properties).
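
As a rough illustration of why that sharing falls out automatically, here is a minimal Python sketch of how git derives blob and tree object IDs from content alone. The helper names (git_object_hash, blob_hash, tree_hash) and the example file contents are hypothetical and only for illustration; the sketch assumes SHA-1 object IDs and the usual "<type> <size>\0" object header, and it ignores everything a real converter does.

```python
import hashlib


def git_object_hash(kind, body):
    """Hash a git object: the header "<kind> <size>\\0" followed by the body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def blob_hash(content):
    """Object ID of a file's contents; the file name is not part of it."""
    return git_object_hash("blob", content)


def tree_hash(entries):
    """Object ID of one directory level.

    entries maps name -> (mode, hex object id), e.g. ("100644", blob id)
    for a regular file or ("40000", tree id) for a subdirectory.
    """
    def sort_key(item):
        name, (mode, _) = item
        # git orders entries by name, comparing directories as if they ended in "/".
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for name, (mode, oid) in sorted(entries.items(), key=sort_key):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
    return git_object_hash("tree", body)


# Hypothetical file that both conversions agree on.
main_c = blob_hash(b"int main(void) { return 0; }\n")

# Same paths and same blob hashes give the same subdirectory tree object.
subdir_a = tree_hash({"main.c": ("100644", main_c)})
subdir_b = tree_hash({"main.c": ("100644", main_c)})

# One conversion adds an ignore file at top level, the other does not.
top_a = tree_hash({"gcc": ("40000", subdir_a)})
top_b = tree_hash({
    "gcc": ("40000", subdir_b),
    ".gitignore": ("100644", blob_hash(b"*.o\n")),
})

print(subdir_a == subdir_b)  # True: the subdirectory tree would be shared
print(top_a == top_b)        # False: the top-level trees differ
```

Commit objects are not modelled here because they also hash in parent, author, and message data, which is why differing topology does not prevent blob and tree objects from being shared.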

The point is that since the git-svn repository has been in use for years, 
and there are many git-only branches there with lots of development on 
them, there are also many git commit references in list archives etc. 
which need to remain meaningful.  While it would be possible to move the 
existing repository to a different URI (or put the new repository at a 
less-obvious URI), it seems simpler to put both sets of objects (with many 
objects in common) in the same repository (with appropriately renamed refs 
from the git-svn repository so that the objects aren't garbage-collected).

This isn't something for reposurgeon to do.  It's something that should be 
easy to do at the pure git level.  At a minimum, I think it might be just 
one command to add the git-svn objects to a repository converted with 
reposurgeon.  Untested, but should give an idea of what I'm thinking of:

git fetch git://gcc.gnu.org/git/gcc.git \
    'refs/heads/*:refs/heads/git-old/*' \
    'refs/remotes/*:refs/heads/git-svn-old/*' \
    'refs/tags/*:refs/tags/git-old/*'

(You would also want to run git gc afterwards to repack the whole repository.)

-- 
Joseph S. Myers
joseph@codesourcery.com

