This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Beyond GCC 3.0

To: gcc at gcc dot gnu dot org
Subject: Beyond GCC 3.0
From: Mark Mitchell <mark at codesourcery dot com>
Date: Wed, 27 Jun 2001 23:11:06 -0700
Some of you have started to ask about 3.1.  Here is the proposal I put 
before the SC about how to manage the 3.1 and subsequent releases.  It has 
been slightly modified to incorporate feedback from the SC.

The SC is favorably disposed to this proposal; there have been no major
objections.  Therefore, it is likely that we will end up with a process
similar to this one.  There is definitely agreement on the SC that our 
current process (or lack thereof) has become far too unwieldy and is not
a good way to produce quality software.  Thus, it seems almost certain
that the SC will decide to do *something* different.

The SC would very much like to hear from developers and users about
what you think is good and what you think is bad about
this proposal.  I'm sure that there are lots of ways it could be
improved.  Any process needs refinement as you work with it; I'm confident
that if/when we switch to this process we will find that we want
to make some changes.  So, nothing is meant to be set in stone here.

If you don't like this proposal at all, that's fine too.  The SC
would definitely welcome alternative proposals as well.  It could
be that there's some other process that would do a much better
job accomplishing the same goals.

If this process stuff seems a little bureaucratic, it is.  It's just
that we're about an order of magnitude too big, as a project, to
get by without it. :-(

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com

I think the single biggest problem in our current release process
is that we have a total lack of predictability: of effort required,
delivery date, and quality.

In particular, our method is:

  - Develop for a long time.

  - Decide we have done enough, and branch for a release.

  - Try desperately to fix bugs.

This is sort of like trying to have a nice garden by planting new
plants for a year or two, and then trying to pull all the weeds
right before company comes to visit.  You end up with 8 foot tall
thistles.  I know, because last weekend I had to cut down some
thistles that were much bigger than me.  Ouch.

This method works OK for smaller projects, but GCC has hundreds
of developers, supports multiple languages and tens of platforms,
and is very complex.  I think the method does not scale to the
kind of development we are doing now.

Lots of things go wrong:

  - Many bugs creep in during the first phase.  They are harder
    to fix when you find them in the last phase because you don't
    know what caused them, and if you can track down the changes
    that caused them, the people responsible no longer remember
    what was going on.

  - Developers are afraid there will not be another release for
    years, so they rush to try to get all of their
    changes in.

  - Users are unhappy because the bugs they find in one release
    remain unfixed for eons.  Often, they are fixed in the
    development sources, so they end up using the development
    sources in production situations.  Then, they find other bugs,
    and are disappointed.

  - Quality is not what it should be because there are too many
    bugs to fix.  Do y'all know that we fixed literally several
    *hundred* regressions from GCC 2.95 on the GCC 3.0 release
    branch?  We introduced *hundreds* of new regressions in the
    development period between 2.95 and 3.0.  In all parts of the
    compiler.  And, for the most part, we didn't notice until
    we started working on the release.

  - Volunteers are (empirically) most energetic before a release.
    They are eager to make sure that the final product is good;
    when it is a year away, they do not feel the sense of
    urgency.  As a result, for most of the cycle we are in the
    state where they are not too energetic.

  - We burn out release managers.  I can see why Jeff got exhausted.
    There is no way to know when you are going to be done.  As RM, you
    feel responsible for the release, and you try to fix all kinds
    of problems that you did not create.  You work 16-hour days
    for weeks straight.  There is no way to estimate "it will take
    this much work to do the release" or "I can work on this
    for N hours a day and be successful".  And all this while you
    get flamed for not doing one thing or another, branching too
    early or too late, not fixing people's favorite bugs, and such.

I think that we need a new method, one that gives better
predictability for all:

  - Do a new release every six months.

  - Do bug fix releases two and four months after the major
    releases, as required.
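
To make the cadence concrete, here is a minimal sketch in Python of
the calendar those two rules imply.  The function names and the
starting month are placeholders I made up for illustration; they are
not proposed dates, and the bug fix releases are still only "as
required".

```python
from datetime import date


def add_months(d, months):
    """Return the first day of the month `months` months after `d`."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def release_calendar(first_release, cycles):
    """List (kind, month) pairs for `cycles` six-month release cycles.

    Each cycle has one major release, followed by bug fix releases
    two and four months later (which may be skipped if not required).
    """
    plan = []
    for n in range(cycles):
        major = add_months(first_release, 6 * n)
        plan.append(("major", major))
        plan.append(("bug-fix", add_months(major, 2)))
        plan.append(("bug-fix", add_months(major, 4)))
    return plan


if __name__ == "__main__":
    # Placeholder starting month, chosen only for the example.
    for kind, when in release_calendar(date(2002, 1, 1), 2):
        print(when.strftime("%Y-%m"), kind)
```

The point is only that every date is known far in advance, which is
exactly the predictability we lack today.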

I think we can achieve this by making a few changes to the mainline
development process, in order to maintain greater stability.  In
particular:

  - All development of major new features (e.g., a new register
    allocator, a new C++ parser, a new Makefile scheme) must be
    done on a branch, or in a local tree -- not on the mainline.
    That way, partially complete features do not show up on the mainline.
    (I spent a not insignificant amount of time removing command-line
    options on the branch for things that people started to implement,
    but did not finish, resulting in user and developer confusion.)

    Before merging to the mainline, the merger should confirm
    non-regression in the testsuites on three or more platforms,
    using volunteers if they themselves do not have access to
    three platforms.  Also, all documentation should be in order.

  - Patches that cause regressions, even on the mainline, must
    be in the process of being fixed within 48 hours, or else can be
    reverted by anyone with global write privileges, if they think
    that's best.  The idea is that you mustn't check stuff in and then
    leave problems lying around for a week.  This happened more
    than once in our 3.0 development cycle.

    One of the reasons for this is that other developers cannot
    make progress if things don't build or work; people can't
    remember which test-cases are known to pass and which aren't,
    and so forth.

  - Our development is broken into two-month windows.  Here
    is how it works, first for the mainline:

    Months 1,2: "Free for All"
    --------------------------

    Merges from development branches to the mainline are allowed,
    assuming testing is successful.  Anything goes during this
    period -- this is just like our normal mainline development
    process right now.

    Months 3,4: "Hack On"
    ---------------------

    No merges are allowed from development branches.

    However, improvements of a less major nature (say, refinements to
    the pipeline description for a chip, or enhancements to the loop
    optimizer to make it recognize more GIVs, or to speed up
    the preprocessor by making it use a hash table instead of a list)
    are fine.

    The idea is that we are beginning to reduce the influx of risky
    changes.

    Months 5,6: "Stabilize"
    -----------------------

    The only patches allowed during this time are bug-fixes.  These
    need not be for regressions from the previous release, but they
    must be bug-fixes.  This makes sure that we are paying attention
    to quality on a regular basis.  When we start writing bug-free
    code, this period can be eliminated.

    On the release branch:

    Months 1,2: "Release"
    ---------------------

    Create the branch.
    Fix regressions from the previous release.
    Prepare prereleases.
    Test.
    Release.

    Note that this time will immediately follow a "Stabilize" period
    on the mainline.  Therefore, it is reasonable to expect that we
    are going to be in decent shape by the time we branch -- we will
    have just spent two months fixing bugs.

    Months 2,3: "Dot release 1"
    ---------------------------

    Fix critical bugs.

    Months 3,4: "Dot release 2 (optional)"
    --------------------------------------

    Fix critical bugs, if necessary.
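
The mainline windows above can be sketched as a small lookup table.
This is only an illustration in Python, assuming every patch can be
sorted into one of three kinds; the names (Change, PHASES,
allowed_on_mainline) are made up for the example and are not part of
any GCC tooling.  In practice, deciding whether something is a "major
new feature" or a "less major" improvement will take human judgment.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical classification of a mainline patch."""
    BRANCH_MERGE = "merge of a major feature from a development branch"
    IMPROVEMENT = "improvement of a less major nature"
    BUG_FIX = "bug fix"


# One entry per two-month window of the six-month mainline cycle:
# (phase name, kinds of change allowed during that phase).
PHASES = [
    ("Free for All", {Change.BRANCH_MERGE, Change.IMPROVEMENT, Change.BUG_FIX}),
    ("Hack On", {Change.IMPROVEMENT, Change.BUG_FIX}),
    ("Stabilize", {Change.BUG_FIX}),
]


def _phase(month):
    """Return the PHASES entry for a month of the cycle (1 through 6)."""
    if not 1 <= month <= 6:
        raise ValueError("month must be between 1 and 6")
    return PHASES[(month - 1) // 2]


def mainline_phase(month):
    """Name of the mainline phase in the given month of the cycle."""
    return _phase(month)[0]


def allowed_on_mainline(month, change):
    """True if this kind of change may go on the mainline that month."""
    return change in _phase(month)[1]


if __name__ == "__main__":
    for month in range(1, 7):
        kinds = sorted(c.name for c in Change if allowed_on_mainline(month, c))
        print(month, mainline_phase(month), kinds)
```

Note that the set of allowed changes only ever shrinks as the cycle
goes on, which is the whole idea: the influx of risky changes drops
off as the branch point approaches.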

Note that particular goals do not play a part in this strategy.
The goal will always be monotonic improvement: more features, more
platforms, fewer bugs, faster code.  I think that setting concrete
release goals (in terms of new features) is a mistake for us because
we cannot ensure that they get done.  I was a strong proponent of
release goals, and I think that I was wrong.  Unlike a typical
corporate development process, we have no way of actually ensuring
that we meet goals. We cannot open a req to get additional staffing,
or pull people off of one project to work on another.  We have to go
with what people decide to do.

The goal we should strive for is that every release should be
better than the one before.