This is the mail archive of the
mailing list for the GCC project.
Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git
- From: Maxim Kuvyrkov <maxim dot kuvyrkov at linaro dot org>
- To: Jason Merrill <jason at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Paolo Bonzini <pbonzini at redhat dot com>
- Date: Fri, 2 Aug 2019 11:41:44 +0300
- Subject: Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git
- References: <E8A06A10-5BBC-4C2F-9C09-D5413B98D2DC@linaro.org> <8C62F814-2F57-4D1A-B66F-5C5ACFF37D6C@linaro.org> <4E46E435-F95C-46AD-87F0-8220D2BF4CD4@linaro.org> <CADzB+2nTUSH+i-XzAavnL3BfZjXLm53d0e3JgPfKZi5X8ijA9g@mail.gmail.com> <BC4A0163-3A45-4C3D-AA79-5DCEB6BF524A@linaro.org> <7FA7C370-04F5-448E-95D2-426607B99CF4@linaro.org> <CADzB+2=B=Fv34nqt+D103YCQocBTsVs80CCNFHkv_4cJ0gKfWQ@mail.gmail.com>
> On Aug 1, 2019, at 11:43 PM, Jason Merrill <firstname.lastname@example.org> wrote:
> On Mon, Jul 22, 2019 at 5:05 AM Maxim Kuvyrkov
> <email@example.com> wrote:
>>> On Jul 16, 2019, at 5:14 PM, Maxim Kuvyrkov <firstname.lastname@example.org> wrote:
>>>> On Jul 16, 2019, at 3:34 PM, Jason Merrill <email@example.com> wrote:
>>>> On Tue, Jul 16, 2019 at 12:18 PM Maxim Kuvyrkov
>>>> <firstname.lastname@example.org> wrote:
>>>>> Hi Everyone,
>>>>> I've been swamped with other projects for most of June, which gave me time to digest all the feedback I've got on GCC's conversion from SVN to Git.
>>>>> The scripts have heavily evolved from the initial version posted here. They have become fairly generic in that they have no implied knowledge about GCC's repo structure. Due to this I no longer plan to merge them into the GCC tree, but rather to publish them as a separate project on GitHub. For now, you can track the current [hairy] version at https://review.linaro.org/c/toolchain/gcc/+/31416 .
>>>>> The initial version of the scripts used heuristics to construct the branch tree, which turned out to be error-prone. The current version parses the entire history of the SVN repo to detect all trees that start at /trunk@1. Therefore all branches in the converted repo converge to the same parent at the beginning of their histories.
>>>>> As far as GCC conversion goes, below is what I plan to do and what not to do. This is based on comments from everyone in this thread:
>>>>> 1. Construct GCC's git repo from SVN using same settings as current git mirror.
>>>>> 2. Compare the resulting git repo with current GCC mirror -- they should match on the commit hash level for trunk, branches/gcc-*-branch, and other "normal" branches.
>>>>> 3. Investigate any differences between converted GCC repo and current GCC mirror. These can be due to bugs in git-svn or other misconfigurations.
>>>>> 4. Import git-only branches from current GCC mirror.
>>>>> 5. Publish this "raw" repo for community to sanity-check its contents.
>>>> Why not start from the current mirror? Perhaps a mirror of the mirror?
>>> To check that git-svn is self-consistent and generates the same commits now as it did several years ago when you set up the current mirror.
>> Unfortunately, the current mirror does not and could not account for rewrites of SVN commit log messages. For trunk the histories diverge in 2008 due to the commit message change of r138154. This is not a single occurrence; I've compared histories only of trunk and gcc-6-branch, and both had a commit message change (for gcc-6-branch see r259978).
>> It's up to the community to weigh the pros and cons of re-using the existing GCC mirror as the conversion base vs regenerating history from scratch:
>> Pros of using GCC mirror:
>> + No need to rebase public git-only branches
>> + No need to rebase private branches
>> + No need to rebase current clones, checkouts, work-in-progress trees
>> Cons of using GCC mirror:
>> - Poor author / committer IDs (this breaks patch statistics software)
>> - Several commit messages will not be the current "fixed" version
> I'm still inclined to stick with the mirror. I would expect patch
> statistics software to be able to be taught about multiple addresses
> for the same person.
Patch tracking software breaks on emails like <fxcoudert@138bc75d-0d04-0410-961f-82ee72b054a4>, where 138bc75d-0d04-0410-961f-82ee72b054a4 is not a reasonable domain name.
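As a rough illustration of what such software would need to special-case, here is a minimal Python sketch that flags addresses whose domain part is a bare UUID, like the one above. It assumes the synthetic addresses always have the form user@UUID; the function name is_synthetic_address is hypothetical and not taken from any existing tool.

```python
import re

# Assumption: the problematic addresses have the form "<user>@<UUID>",
# as in the fxcoudert example above. Real mail domains never look like this.
UUID_DOMAIN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def is_synthetic_address(address: str) -> bool:
    """Return True if the domain part of the address is a bare UUID."""
    # Drop surrounding angle brackets and spaces, then split on the last "@".
    _, sep, domain = address.strip("<> ").rpartition("@")
    return bool(sep) and UUID_DOMAIN.match(domain) is not None
```

A statistics tool could use a check like this to route such IDs through a lookup table of known committers rather than treating them as email addresses.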
For completeness, I'll generate and upload a repo based on the current mirror with all branches and tags converted.
In the end, I don't care much to which version of the repo we switch, as long as we switch.