This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Git conversion: fixing email addresses from ChangeLog files


On 28/12/2019 20:11, Segher Boessenkool wrote:
> On Sat, Dec 28, 2019 at 04:34:20PM +0000, Richard Earnshaw (lists) wrote:
>> On 28/12/2019 14:54, Segher Boessenkool wrote:
>>> On Sat, Dec 28, 2019 at 01:05:13PM +0000, Joseph Myers wrote:
>>>> On Sat, 28 Dec 2019, Segher Boessenkool wrote:
>>>>
>>>>> On Fri, Dec 27, 2019 at 07:47:02PM +0000, Richard Earnshaw (lists) wrote:
>>>>>>       1 Author: Segher Boessenkool <segher@kernel,crashing.org>
>>>>>> *    730 Author: Segher Boessenkool <segher@kernel.crashing.org>
>>>>>>       2 Author: Segher Boesssenkool <segher@kernel.crashing.org>
>>>>>
>>>>> The first and third are only in changelogs.  The second even happened
>>>>> only once, afaics?
>>>>>
>>>>> These errors only happen in the reposurgeon conversion.
>>>>
>>>> This is about extracting attributions from changelogs when unambiguous 
>>>> there, and then correcting mistakes or otherwise making minor variants 
>>>> more uniform.
>>>
>>> Yes, and I'm saying you probably shouldn't do that.
>>
>> Why, for heaven's sake?  Even Maxim's conversion is doing this.
> 
> No, it doesn't.  If people sometimes mispel their own name in a changelog
> it does not put that mispeling as Author: in the git commit.

Then either it's psychic, or Maxim is already doing what I suggest.  The
information must come from *somewhere*.

> 
>>> Note that these errors did not exist in the changelog in the commit
>>> message, for example.
>>
>> Yes, they did.  Or at least, they did at the time of the original commit.
> 
> No, they never did.  I always cut off the date/name/email line from the
> changelog in the commit message.

The changelogs command does not extract the data from the commit
message.  I never suggested that it did.

> 
>>> Since people very often typo their own name (as the evidence shows), the
>>> heuristic for deriving it should be robust against that.
>>
>> And the statistics show that it's not hard to identify the odd cases and
>> fix them up.  Only committers with just a single commit are really hard
>> to spot since we don't have data to compare against other entries.
> 
> Sure, so do that?  :-)
> 

Which is the very purpose of this email thread ;-)

R.
> 
> Segher
> 
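
The fix-up step discussed in this thread (spotting rare variants of an Author line and mapping them to the dominant spelling) can be sketched roughly as follows. This is only an illustration, not what reposurgeon or Maxim's conversion actually does: the function name `suggest_fixes` and the `min_ratio` and `dominance` thresholds are assumptions made up for this example, and the counts are the three lines quoted earlier in the thread.

```python
from collections import Counter
from difflib import SequenceMatcher


def suggest_fixes(counts, min_ratio=0.9, dominance=10):
    """Map each rare author string to a far more frequent near-identical one.

    min_ratio and dominance are illustrative thresholds, not values taken
    from any real conversion tool.
    """
    fixes = {}
    # Most frequent spellings first, so they are tried first as candidates.
    by_freq = sorted(counts.items(), key=lambda kv: -kv[1])
    for rare, n in by_freq:
        for common, m in by_freq:
            if m < n * dominance:
                break  # remaining candidates are not frequent enough
            if SequenceMatcher(None, rare, common).ratio() >= min_ratio:
                fixes[rare] = common
                break
    return fixes


# Counts as quoted earlier in the thread.
counts = Counter({
    "Segher Boessenkool <segher@kernel,crashing.org>": 1,
    "Segher Boessenkool <segher@kernel.crashing.org>": 730,
    "Segher Boesssenkool <segher@kernel.crashing.org>": 2,
})

fixes = suggest_fixes(counts)
for wrong, right in fixes.items():
    print(f"{wrong}  ->  {right}")
```

With these inputs both rare variants are mapped to the spelling that occurs 730 times. An author string that appears only once and has no frequent near-match is left alone, which matches the point above that committers with a single commit give the heuristic nothing to compare against.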

