This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: GCC Commit Stats [was: [GCC Steering Committee attention] [PING] [PING] [PING] libgomp [...]]


On 5 August 2016 at 18:34, James Greenhalgh <james.greenhalgh@arm.com> wrote:

> I've given the 2012-2015 numbers below, just to show that (for the files
> in gcc/*.[ch]) your hypothesis doesn't hold. The vast majority of
> committers make <20 commits in a year.

My hypothesis is that fewer people are increasingly doing most of the
work. If the vast majority of committers do very few commits, that
neither disproves nor necessarily proves my hypothesis. Either
very few committers do most of the commits, which is consistent with my
hypothesis, or most of the commits are done by a very large number of
committers who individually do very few commits, which contradicts my
hypothesis. Either case is consistent with your numbers.

However, your numbers do not seem to indicate any sudden changes in
trends, which argues against my hypothesis of "increasingly". Yet, I
would argue that the trend not changing despite the continuous
increase in code is itself worrying. But yes, the "increasingly" is
definitely the weakest part of my hypothesis.

>> * 100 commits is less than 2%. Quite a low threshold. Perhaps 1%, 25%,
>> 50%, 75%, 90% are more informative.
>
> Again, just done for time. I've changed the last two buckets to 100-199
> and 200+ in this run. If you'd like to do, I'd be happy to see the
> results.

200 is around 10% of commits. Thus, 2 people do at least 20% of commits.

>> that is, most of the commits are done by smaller fraction of the
>> total.
>
> For 2015 I found the 4 "25%" marks to be:
>
>   26%    1-4
>   25%    5-13
>   25%    14-39
>   23%    40+

> So 75% of the work is being done by people who commit fewer than 40
> patches in a year. Encouragingly 50% of the people who committed in
> 2015 committed at least one patch per month (on average).

Sorry, what is each column? If 26% of people commit between 1-4, this
does not mean that 26% of the work is done by people who commit
between 1-4 patches.
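
To illustrate the difference between "share of people" and "share of
work", here is a minimal Python sketch. The commit counts are made up
for illustration (they are not the GCC numbers), and the function name
bucket_shares is hypothetical.

```python
def bucket_shares(commits_per_committer, low, high):
    """Return (share of committers, share of commits) for the bucket [low, high]."""
    in_bucket = [c for c in commits_per_committer if low <= c <= high]
    people_share = len(in_bucket) / len(commits_per_committer)
    work_share = sum(in_bucket) / sum(commits_per_committer)
    return people_share, work_share


# Hypothetical data: one entry per committer, value = commits in a year.
counts = [1, 2, 3, 4, 10, 20, 60, 100]

# Half of the committers fall in the 1-4 bucket, but together they
# account for only 10 of the 200 commits.
people, work = bucket_shares(counts, 1, 4)
print(f"{people:.0%} of committers, {work:.0%} of commits")
```

With these invented counts, the 1-4 bucket holds half of the people but
only a twentieth of the commits, so the two percentages cannot be used
interchangeably.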

> Personally, I think that looks like a fairly stable and healthy community,
> but you're welcome to draw your own conclusions from the data.

It looks more stable than I would have expected. I'm not totally
convinced about the thresholds, though. I believe the best way to
measure is closer to how openhub does it: sort committers by number
of commits, calculate each one's percentage w.r.t. the total,
summarize the cumulative percentage at given thresholds, then print
"%total commits" and "%committers". Do the same for every year.
Exclude Ada, Fortran, and Go if possible.
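
A rough Python sketch of that measurement, under assumptions: the input
is a mapping from committer to commit count for one year, the names and
counts are invented, the function name concentration is hypothetical,
and any filtering (e.g. excluding Ada, Fortran, and Go) is assumed to
have happened before the counts were produced. This is not how openhub
itself computes its figures.

```python
def concentration(commits_by_author, thresholds=(25, 50, 75, 90)):
    """For each threshold (percent of total commits), return the smallest
    fraction of committers, taken most active first, whose commits
    together reach that percentage."""
    counts = sorted(commits_by_author.values(), reverse=True)
    total = sum(counts)
    pending = sorted(thresholds)
    rows = []
    running = 0
    for rank, commits in enumerate(counts, start=1):
        running += commits
        # Integer arithmetic avoids floating-point surprises at the boundary.
        while pending and running * 100 >= pending[0] * total:
            rows.append((pending.pop(0), rank / len(counts)))
    return rows


# Hypothetical commits per committer for a single year.
year = {"a": 50, "b": 25, "c": 15, "d": 5, "e": 3, "f": 2}
rows = concentration(year)

print("%total commits  %committers")
for pct_commits, frac_committers in rows:
    print(f"{pct_commits:>14}  {frac_committers:>11.0%}")
```

Running the same function on each year's counts would give one such
table per year, which makes a change in concentration over time
directly comparable.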

It would be interesting if someone could generate those numbers. If
you get me the raw numbers of commits per committer per year, I can
easily generate the stats and even nice plots if you wish.

I'd be happy to see my hypothesis disproved.

Cheers,

Manuel.

