This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Offer of help with move to git


On Mon, 24 Aug 2015, Eric S. Raymond wrote:

> Joseph Myers <joseph@codesourcery.com>:
> > Hence my suggestion in <https://gcc.gnu.org/ml/gcc/2015-08/msg00150.html> 
> > of reconverting and then combining with the existing git-svn history via 
> > renaming all the refs in the existing git repository, so as to preserve 
> > the validity of commit references and git-only branches there while having 
> > the main copy of the history properly converted.
> 
> Sorry, but I can't even imagine how to recombine in that way with the tools
> I have.  If you still think it's worth trying after seeing the reposurgeon
> conversion I deliver, we can investigate that I suppose.

I'm pretty sure it should be doable with pure git.  Use git for-each-ref 
on (a copy of) the old repository to script renaming of all refs, so the 
two repositories don't have any conflicting refs.  Then use 
git fast-import to import all the content of one repository into the other 
(or add one repository as a remote to the other, fetch and then rename all 
the refs from remotes/).  Then git gc --aggressive to repack it all.  
Optionally, add a "git merge -s ours" commit to new master to show it as 
merged from the git-svn master, if it's considered beneficial to converge 
the history like that.
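
The steps above can be sketched as a small Python script.  This is only an
illustration under stated assumptions, not a tested migration procedure: the
refs/git-svn/ namespace and all function names are hypothetical, and the copy
of the old repository is assumed to contain no symbolic refs other than HEAD.
It takes the "add as a remote and fetch" route rather than fast-import, using
git for-each-ref, git update-ref --stdin, git fetch with an explicit refspec,
and git gc --aggressive.

```python
import subprocess

# Hypothetical namespace for the old git-svn refs; not named in the thread.
OLD_PREFIX = "refs/git-svn/"


def namespaced(ref, prefix=OLD_PREFIX):
    """Map a full ref name such as refs/heads/foo to <prefix>heads/foo."""
    if not ref.startswith("refs/"):
        raise ValueError(f"not a full ref name: {ref!r}")
    return prefix + ref[len("refs/"):]


def rename_script(listing, prefix=OLD_PREFIX):
    """Turn '<sha> <refname>' lines into input for `git update-ref --stdin`."""
    commands = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split(" ", 1)
        if ref.startswith(prefix):
            continue  # already moved by an earlier run
        commands.append(f"create {namespaced(ref, prefix)} {sha}")
        commands.append(f"delete {ref} {sha}")
    return "\n".join(commands) + "\n" if commands else ""


def rename_all_refs(old_copy, prefix=OLD_PREFIX):
    """Rename every ref in a COPY of the old repository (assumes no symrefs)."""
    listing = subprocess.run(
        ["git", "-C", old_copy, "for-each-ref",
         "--format=%(objectname) %(refname)"],
        check=True, capture_output=True, text=True,
    ).stdout
    script = rename_script(listing, prefix)
    if script:
        subprocess.run(["git", "-C", old_copy, "update-ref", "--stdin"],
                       input=script, check=True, text=True)


def import_old_history(new_repo, old_copy, prefix=OLD_PREFIX):
    """Fetch the renamed refs into the new repository, then repack."""
    refspec = f"+{prefix}*:{prefix}*"
    subprocess.run(["git", "-C", new_repo, "fetch", old_copy, refspec],
                   check=True)
    subprocess.run(["git", "-C", new_repo, "gc", "--aggressive"], check=True)
```

A run would call rename_all_refs() on the copy of the old repository and then
import_old_history() on the newly converted one.  The optional "-s ours" merge
on new master is left out here since it is a policy decision.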

People with existing git-only branches might need to take extra care the 
first time they merge after this is done (maybe merge up to the last 
revision of the git-svn master then do their own -s ours merge from the 
new master, if git doesn't get it right automatically given such a merge 
commit on master), but I don't see any reason this approach shouldn't work 
to keep existing references to git-svn commit hashes meaningful (without 
needing to have a renamed git-svn repository sit on the side for that 
purpose) and to keep existing git-only branches (of which there are lots) 
usable.  And with blobs and hopefully most tree objects shared between the 
two histories, I hope this won't make the repository too much larger.
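
For someone with a git-only branch, the first merge described above might
look roughly like the following.  It is a sketch only: the function names and
arguments are hypothetical, and the --allow-unrelated-histories option (needed
by newer git versions when the two histories share no commits) is an
assumption about the setup rather than something stated in this thread.

```python
import subprocess


def first_merge_commands(last_git_svn_master, new_master, unrelated=False):
    """Build the two merges for the first sync after the switch."""
    # 1. Catch up to the final commit of the old git-svn master.
    catch_up = ["git", "merge", last_git_svn_master]
    # 2. Record the new master as merged while keeping the branch's own tree.
    record = ["git", "merge", "-s", "ours"]
    if unrelated:
        # Only needed if new master has no merge linking it to git-svn master.
        record.append("--allow-unrelated-histories")
    record.append(new_master)
    return [catch_up, record]


def run_first_merge(repo, last_git_svn_master, new_master, unrelated=False):
    """Run both merges in `repo`, with the git-only branch checked out."""
    for argv in first_merge_commands(last_git_svn_master, new_master,
                                     unrelated):
        subprocess.run(argv, cwd=repo, check=True)
```

Later merges from the new master should then proceed as ordinary merges,
since the "-s ours" commit marks the new history as already incorporated.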

> The GCC repo is pretty huge, but I've been hunting mastodons like it
> for years now - there's a row of trophy heads in the reposurgeon
> documentation.  I ended up building a machine with a processor and
> cache specifically designed to handle non-parallelizable graph-theory
> computations multiple gigabytes wide - SMP is no help here and you
> want extra-large primary memory caches. On this hardware, conversion
> runs will merely be painfully slow rather than die-of-old-age
> interminable.

FWIW, Jason's own trial conversion with reposurgeon got up to at least 
45GB memory consumption on a 32GB repository.

-- 
Joseph S. Myers
joseph@codesourcery.com

