This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
"ways of accomplishing ... the basic goal"
- From: Tom Lord <lord at emf dot net>
- To: gcc at gcc dot gnu dot org
- Cc: mark at codesourcery dot com
- Date: Thu, 6 Feb 2003 15:35:05 -0800 (PST)
- Subject: "ways of accomplishing ... the basic goal"
Mark writes:
I am very willing to consider other alternative ways of
accomplishing what I consider the basic goal: making high-quality
releases on a frequent enough basis that most development is
focused on the FSF versions of GCC, rather than on various forks.
[....]
I'd expect these to be 3-5 page proposals. When we've got one
or two ready, let's talk about them a bit, present them to the
SC, and decide.
The enclosed proposal is about 5 pages, though appended to it is a QnA
from another mailing list that increases its length. I included that
since it seems to cover the kinds of questions that I think the
proposal is likely to generate.
-t
GCC Development Process Proposal
Author: Thomas Lord (lord@emf.net)
With contributions from: Robert Collins (rbcollins@cygwin.com)
Mark Mitchell, the GCC release manager, posed these questions
regarding proposed improvements to the GCC development process:
- What methodology do you have in mind?
- When do release branches get made?
- When do releases get made?
- Are there development stages? If so, what are they, and
how are the transitions managed?
- Who will make decisions?
- How will conflicts and disagreements be dealt with?
- How will this be an improvement over our current system?
- What primary problems are you trying to solve? How are
they solved?
- What new problems do you foresee? How will they be dealt
with?
________________________________
* "What methodology do you have in mind?"
** Source Management Refinements
The GCC project should adopt a source management model which combines two
simpler models in a clever way. The simpler models can be called:
Continuous Release Management (CRM)
and
Hierarchical Software Management (HSM)
In CRM, changes are made only after they pass rigorous acceptance
tests. A hurdle is set -- and changes which do not make it over
that hurdle are rejected. If the hurdle is high enough to define
the quality criteria for a release or release candidate, then the
tree being managed by CRM is maintained in a perpetually releasable
state and, by a useful metric, increases monotonically in quality
over time.
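
A minimal sketch of the CRM rule, assuming a hypothetical hurdle
function `passes_tests' that stands in for a real acceptance test
run. The class and names below are illustrative only and do not
describe any existing GCC tooling:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Tree:
    """A source tree managed by CRM: only changes that clear the hurdle land."""
    name: str
    hurdle: Callable[[str], bool]  # hypothetical acceptance test
    accepted: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)

    def submit(self, change: str) -> bool:
        """Run the hurdle; record the change as accepted or rejected."""
        ok = self.hurdle(change)
        (self.accepted if ok else self.rejected).append(change)
        return ok


def passes_tests(change: str) -> bool:
    # Stand-in for a real test run: a change "fails" if its
    # description admits to a regression.
    return "regression" not in change


mainline = Tree("devo", passes_tests)
results = [mainline.submit(c) for c in
           ["fix typo", "new pass with regression", "speed up parser"]]
```

Because rejected changes never land, the `accepted' list only ever
grows with changes that cleared the hurdle, which is the sense in
which the tree stays releasable.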
The problem with CRM is that rigorous acceptance testing becomes
quite expensive as the requirements grow stricter. In effect, CRM
is a limit on the rate at which changes can be made. (See the
endnote [GCC change rate].)
In HSM, a single, primary tree is replaced by a "tree of trees" with
the primary tree at the root. Changes flow from the leaves towards
the root of a tree. Each node in the tree is managed by CRM
principles, with the acceptance requirements becoming progressively
stricter, closer to the root.
HSM overcomes the rate-limiting problems of CRM by: (a) transforming a
large number of changes from leaf nodes into a small number of changes
to parent nodes; (b) vetting changes before they reach parent nodes
and thus reducing the probability that they will be rejected.
Diagrammatically, we might have an HSM tree:
gcc--devo--5.2
|____gcc--simple-red--5.2
. |____ gcc maintainer Alice
. |____ gcc maintainer Bob
. |____ gcc maintainer Candice
. ...
|____gcc--simple-blue--5.2
. |____ gcc maintainer Derick
. |____ gcc maintainer Ellen
. |____ gcc maintainer Frederick
. ....
|____gcc--critical--5.2
|____gcc--java--5.2
|____gcc--fortran--5.2
|____gcc--ada--5.2
|____gcc--big-changes
. |____gcc--feature-X--5.2
. |____gcc--feature-Y--5.2
|____gcc--vendor-A--5.2
|____gcc--vendor-B--5.2
...
In this situation, Alice, Bob, and Candice, rather than committing
their simple changes directly to `devo', commit them to `simple-red',
which undergoes nightly testing. At intervals (say, O(1/day) since
these are allegedly "simple" changes), their accumulated changes are
considered for merging into `devo'.
The situation is similar for Derick, Ellen, and Frederick -- except
that they have a separate branch.
As the `devo' tree changes, the changes made there are merged back
into the next level of the tree (`simple-red', `simple-blue',
`critical', `java', etc.)
In this way, if Alice, Bob or Candice make a mistake -- their
mistake is more likely to be resolved in the red team branch
(`simple-red') and has no impact on the blue team branch
(`simple-blue').
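
The batching effect of HSM (many leaf commits becoming few merges at
the parent) can be sketched as follows. The commit log is invented
for illustration, and the rule that each team branch is merged into
`devo' at most once per day is an assumption taken from the O(1/day)
figure above:

```python
from collections import defaultdict


def mainline_merges_per_day(commits, team_of):
    """Count merges reaching the root per day.

    commits: list of (day, author) pairs made on team branches.
    team_of: maps each author to a team branch.
    Assumes each team branch is merged to the root at most once a day.
    """
    teams_by_day = defaultdict(set)
    for day, author in commits:
        teams_by_day[day].add(team_of[author])
    return {day: len(teams) for day, teams in teams_by_day.items()}


team_of = {
    "Alice": "simple-red", "Bob": "simple-red", "Candice": "simple-red",
    "Derick": "simple-blue", "Ellen": "simple-blue",
    "Frederick": "simple-blue",
}

# Hypothetical log: five commits on day 1, three on day 2.
commits = [(1, "Alice"), (1, "Bob"), (1, "Alice"), (1, "Derick"),
           (1, "Ellen"), (2, "Candice"), (2, "Candice"), (2, "Bob")]

merges = mainline_merges_per_day(commits, team_of)
```

In this made-up log, eight individual commits reach `devo' as three
merges, so the root tree needs three acceptance test runs rather than
eight.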
One problem with HSM is that it requires a tool more powerful than
CVS to manage the flow of changes. To be practical, it
requires a revision control tool with better support for branching
and for repeated, bi-directional merges between branches.
Another problem with HSM is that, to be practical for GCC, it
requires enhanced testing facilities for the various branches, to
overcome the oft observed problem that the primary tree gets the
best testing.
(See the endnote [Branching is not Heavyweight].)
The ideal fan-out and partitioning of efforts into sub-projects is
an open and fluid issue. The political question of how the teams
for various branches are formed is also not addressed in this short
proposal.
** Political Refinements
The SC is charged with protecting the project against the conflicts
of interest experienced by major contributors and with advancing the
goals of the FSF and the GNU project (see http://gcc.gnu.org/steering.html
and http://gcc.gnu.org/gccmission.html).
Therefore the SC, in cooperation with and seeking the endorsement of
the FSF should:
1) Develop a more detailed long-term mission for GCC with
a rationale that states how that mission relates to the
goals of the GNU project.
2) For each upcoming release, develop a set of criteria
stating what tests must be passed, what features must
be present, and what features may be present before a
release candidate is declared. These criteria should
include a rationale that states how the criteria relate
to the goals of the GNU project in general, and the
long-term mission of the GCC project in particular.
By taking these steps, should anyone ask someday "Is the GCC project
suffering from conflicts of interest among the developers?" the SC
can point to these documents, their timing, and their endorsement by
the FSF and say "Certainly not."
** "When do release branches get made?"
When (a) criteria are developed for a release, (b) contributors want
to work on features for that release, (c) release manager and
acceptance testing bandwidth is available for that release.
** "When do releases get made?"
Candidates are released when the release criteria are satisfied.
Final releases are made after a period of quiet in critical bug
reports for the latest candidate. Critical bug reports for a
candidate, of course, become new criteria for that release.
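
One way to read this release rule, as a sketch only: the length of
the "period of quiet" is not specified here, so the `quiet_days'
default of 14 is an assumption, as are the function and its return
strings.

```python
from datetime import date


def release_action(criteria_satisfied, candidate_date,
                   critical_report_dates, today, quiet_days=14):
    """Decide the next release step under the rule sketched above.

    quiet_days is a hypothetical threshold; the proposal does not
    fix the length of the quiet period.
    """
    if not criteria_satisfied:
        # Open critical reports count as unmet criteria.
        return "continue development"
    if candidate_date is None:
        return "release candidate"
    # Only reports filed against the latest candidate matter.
    reports = [d for d in critical_report_dates if d >= candidate_date]
    last_activity = max([candidate_date] + reports)
    if (today - last_activity).days >= quiet_days:
        return "final release"
    return "wait"


rc = date(2003, 1, 1)
report = date(2003, 1, 20)
early = release_action(True, rc, [report], date(2003, 1, 25))
late = release_action(True, rc, [report], date(2003, 2, 6))
```

Here `early' is "wait" (only 5 quiet days have passed since the last
critical report) and `late' is "final release" (17 quiet days).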
** "Are there development stages? If so, what are they, and
how are the transitions managed?"
There do not need to be project-wide development stages. An RM may
decide, upon making a first candidate release, to limit the kinds of
subsequent mainline changes that are permitted -- but as the
effectiveness of the acceptance tests for the mainline increases, the
need for such restrictions diminishes.
Other nodes in the hierarchy might or might not impose cycles of
their own. As each change passes up the hierarchy of nodes
towards the mainline, it is subject to increasingly strict review
and testing. In effect, each _change_ goes through cycles as it
moves towards the mainline, but it does so asynchronously with many
other changes.
Most importantly, the acceptance tests will eliminate one of the
most significant problems with the current development stages: that
of maintainers checking in changes that are known (or easily
discovered) to introduce regressions, thus increasing the amount of
work that needs to be done before the sources are stable enough for
a candidate or actual release.
* "Who will make decisions?"
The SC should be the author of the long-term and per-release goals;
however, those documents should be endorsed by the FSF.
Decisions about patch acceptance are made by a combination of
factors:
1) Maintainers of the hierarchical nodes immediately
below the mainline apply their judgement and testing
resources, seeking to ensure both that the changes
are consistent with the project goals and are likely
to pass acceptance tests.
2) The RM (gatekeeper of the mainline) primarily relies
on the acceptance tests and judgement of the
maintainers of immediately subordinate nodes in the
hierarchy, but should also review incoming changes
to ensure that they are consistent with the project
goals.
* "How will conflicts and disagreements be dealt with?"
Through reasoned discourse, moderated if necessary, aimed at
a mixture of persuasion and comfortable compromise.
* "How will this be an improvement over our current system?"
It will reduce the rate of commits (not the rate of lines changed)
to the GCC mainline, enabling frequent and fine-grained acceptance
testing.
It will increase the quality of commits to the GCC mainline.
It will clarify the FSF's interests in the project and help to
ensure that they are well served by development.
It will create new opportunities for cooperation among contributors
at various vendors.
It will improve the "emergency responsiveness" of GCC by lowering the
cost of making quick releases from the latest sources.
It will objectify many decisions that are currently the "judgment"
of the RM or (alleged) "consensus" of maintainers.
It will increase the number of opportunities to review and test
changes before they reach the mainline.
* "What primary problems are you trying to solve? How are
they solved?"
1) To depoliticize the _timing_ of changes to the mainline
by: (a) imposing acceptance tests on any proposed changes,
(b) maintaining the mainline in a (nearly) continuously
releasable state so that change timing depends more on
the quality of the change than on any vendor's schedule.
In other words -- a change is made when it is _ready_, not
when it is politically expedient to force it to appear in
the next release.
2) To firmly and unambiguously avoid conflict of interest issues
by making a higher priority of relating the project's goals
to the goals of the FSF and the GNU project. It is worth
noting that, by all accounts, the FSF's goals are not only not
inherently hostile to the goals of the corporations who are the
major contributors to GCC, but are in fact often synergistic
with those goals.
3) To intelligently divide up the work needed to make changes
to the mainline -- to organize the work of fixing new
regressions _off_ the mainline. In other words, to decrease
the burden on the RM by creating a set of "trusted lieutenants"
who take on some of that responsibility.
4) To increase the incentive of contributors to cooperate with
one another as their changes are "grouped" at intermediate
nodes in the hierarchy.
5) To increase the emergency responsiveness of the GCC project.
* "What new problems do you foresee? How will they be dealt
with?"
There are three key challenges.
* Developing the acceptance testing infrastructure. This can,
of course, grow largely out of existing testing resources --
but the challenge is to automate that infrastructure for
this proposal.
* Forming effective teams for subprojects.
* Arming the teams with tools and processes that enable smooth
operation of the HSM and CRM software release processes described
here.
It is proposed to:
1) Seek tentative approval from the FSF for the concepts presented
here
2) With the FSF's tentative approval, ask the major corporate
participants in the project to pledge resources to develop the
tools and formalise the processes.
The development of the tools and processes will require careful
dovetailing with the existing and final processes - a close working
relationship will be essential. If carried out with care, the
resulting tools and process documentation will be a blueprint for
managing many large scale free software projects and will be
reusable in such projects, allowing significant amortization of the
resource costs contributed by the commercial participants.
________________________________
Endnotes:
[Branching is not Heavyweight]
Judging by the `gcc' mailing list, developers object to branches for
two reasons:
1) Development on branches doesn't get as much testing as
development on the trunk.
2) CVS tagging is slow and, in general, branches are awkward
to manage with CVS.
One of the premises of this proposal is that a greater level of
investment is called for in further automating the existing testing
resources. That effort should be sure to make ample testing of
branches a high priority.
Modern revision control systems are getting far handier at making
branches convenient. As an example of a revision control
paradigm for HSM branches, please see:
http://regexps.srparish.net/tutorial/elementary-branches.html
as background and:
http://regexps.srparish.net/tutorial/development-branches.html
for how I think most HSM branches should work.
[GCC change rate]
The change rate of GCC appears to be notably high:
Period            Avg commits/day
May 2002          38
Dec 2002          33
Jan 2003          51
01..04 Feb 2003   47.5
In addition, there are currently about 250 people with some form of
write access to the mainline, roughly half of whom do not require
prior approval to commit.
In an ideal world, with "infinite" and "infinitely fast" testing
servers, we might consider imposing a requirement that every change to
the mainline must pass some rigorous, automated acceptance tests
before the `commit' completes.
Not all of those commits need to be tested, of course: for example,
documentation changes and commits to non-critical components could
be exempt. However, let's be pessimistic and assume a commit rate
that peaks at
75/day, all of which need testing.
Unfortunately, testing GCC is not cheap. Let's be pessimistic and
assume that on target platforms, bootstrapping the compiler takes two
hours. Let's suppose that we set an upper bound on acceptance test
times at an additional two hours. Therefore, for each acceptance test
target platform, to test each commit individually requires:
75 commits * 4 hours / commit * 1 testing server / 24 hours
~= 13 test servers (per target platform)
and we haven't yet begun to consider the lag time this introduces
between initiating a commit, and its appearance in the mainline
sources.
If individual acceptance testing for each change is economical at
all, it is barely so. That is why this proposal combines CRM (based
on acceptance testing) with HSM (aimed at reducing the frequency of
changes, though not the rate of lines changed).
________________________________
Postscript: a good QnA from the arch-users list
From: Tom Lord <lord@emf.net>
To: arch-users@lists.fifthvision.net
In-reply-to: <1044506763.15599.82.camel@lan1> (message from Robert Anderson on
05 Feb 2003 20:46:02 -0800)
Subject: Re: new draft Re: [arch-users] towards a free software process
Date: Wed, 5 Feb 2003 21:46:13 -0800 (PST)
Good questions.
Bob:
1) Who writes the acceptance tests, and what motivates them to
do it?
The GCC developers (currently) write such tests. They seem to be
motivated by the fact that it improves the quality of GCC and reduces
the cost of GCC maintenance.
Specifically, GCC has a large and actively maintained/extended test
suite which is most of what would make a very good starting point for
acceptance tests. In addition, many contributors have their private
tests with varying degrees of automation that could conceivably be
made part of acceptance testing.
There is already a patch submitter guideline along the lines of "make
sure your patch passes these tests, at least on your platform" and a
general presumption that maintainers already do that themselves or
have very good judgement about safe changes when they skip testing.
There is an almost-acceptance-test policy that regression-inducing
patches that aren't fixed within 48 hours can be reverted -- however
the mechanism for doing so, while getting better with the binary
regression searching effort, is still expensive and politicized.
The policy is politicized because it isn't strict. It only really
kicks in if a regression gets in the way of some maintainer or
important contributor and that party makes noise. Otherwise,
regressions are allowed to persist until "later" -- after all, there
will always be a bug-fixing phase before the next release. Witness
Mark just tut-tutting about the number of regressions as they
accumulate while also noting that there is no supervisory role in the
process that can force those to be fixed in a timely manner. Note
that there also seems to be a game just below the surface here: if
Apple forces IBM's change to be reverted, will the team at IBM then
engage in tit-for-tat? Would an avalanche of strict reversions bring
progress to a halt?
It appears that, for large changes, it is de rigueur to merge into
the mainline first, then fix some of the new regressions second.
This happens when, for example, a vendor providing such a change
extracts a promise from the release manager that a certain change
will appear in a particular release -- then the merge happens
before the release cycle switches to bug fixing. As I said, as long
as the regressions don't interfere with all the other developers
scrambling to merge -- the regressions can just pile up until the
feature freeze.
In other words -- the acceptance tests are basically there -- they
just are currently used ex post facto as "how bad are we doing" tests
instead of "how good are we required to do".
There is a tension in the current development environment between
developing on branches and on the mainline (or -- I guess they call
it, "the trunk"). The feeling seems to be that development on
branches doesn't get testing and that only by merging things onto the
mainline can you get testing. Investment in automated testing
(applicable at any node in the hierarchy) along with a strict policy
of acceptance testing can help to resolve that tension.
The topic comes up often, in one form or another, on the gcc list --
one example is the thread containing this message:
http://gcc.gnu.org/ml/gcc/2001-07/msg01574.html
Test results aren't uniform across all platforms. Thus, one way to
turn the existing development tests into acceptance tests is simply to
dedicate a farm of servers, one per official target, on which the
tests can be run. The HSM organizational structure of the proposal is
aimed, in part, at reducing the required capacity of such a test farm
to a reasonable size.
2) Who organizes the working groups?
It's supposed to be a 3-5 page proposal. I think that within that
constraint, that has to be left as an open question. There are too
many people involved to make up a pat answer.
Currently, there are over 200 people with write privileges for GCC,
about half of whom don't require review before commit. In my
informal sampling, the number of committers per day was around half
the number of commits.
While I don't claim it will be easy, I don't think it's a radical
claim that that many committers, making 30-50 commits per day, is
inherently destabilizing. (Yes, yes, it's a nice hypothesis that
these folks do near-perfect work, that regressions are rare and
easily reverted -- but in practice, people are "gaming" the process
of what regressions they can get away with in order to take advantage
of anemic testing resources and to impose forcing functions on the
scheduling of features wrt. releases).
3) Are you really proposing a process for which the tools
don't exist?
"aren't finished" rather than "don't exist" -- Yes.
4) Even if the proposal was accepted, how would it be phased in?
Get a budget. Hire some hackers. Hire some software managers who are
good at managing large scale projects. Make sure those hires include
plenty of people who are good field engineers. Treat it as a 1-2 year
practical research problem whose success will be judged by technology
transfer.
The section that Rob rewrote is critical. I don't think it can be
done grass roots. GCC hackers at various vendors have to know that
their employer wants this to happen; the SC has to see it as part of
their mission.
-t