This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: GCC 3.3, GCC 3.4
- From: Tom Lord <lord at emf dot net>
- To: kenner at vlsi1 dot ultra dot nyu dot edu, mark at codesourcery dot com
- Cc: gcc at gcc dot gnu dot org
- Date: Fri, 31 Jan 2003 14:58:25 -0800 (PST)
- Subject: Re: GCC 3.3, GCC 3.4
- References: <10301311356.AA01395@vlsi1.ultra.nyu.edu>
I would like to try to step lightly around questions and issues that
are likely to lead to, at best, flammage -- and cut right to proposing
a positive agenda for improving the relationship between corporate
contributors and the GCC project.
My goals for the agenda are:
1) It's self-evidently good for the FSF, the GNU project, and the free
software movement. We don't need a deep philosophical discussion
about free software to evaluate this.
2) It's self-evidently good for the corporate contributors (thus
encourages them to keep contributing and to contribute more).
3) It's self-evidently good for the individual contributors.
4) It results in a project management model and technology
infrastructure that benefit other free software projects as well.
5) It relieves Mark, as far as is practical, of the "very hard
balancing act" he faces as RM.
* Problem Statement
I'll just take this straight from Mark himself:
I find [being RM] a very hard balancing act. Do releases
often, and you're always working on fixing regressions,
testing, packaging, etc. Do releases rarely and the bugs get
so bad it's incredibly hard to fix them all. Do them on a
major GNU/Linux vendor's release schedule and you get more
help from those vendors, but maybe not at a time that's
terribly natural for the pace of other development. Decide
that there have been enough major changes for one release, and
you risk irritating the people who have been working on the
branch that didn't get in. Take the code anyhow and you risk
introducing more bugs and exposing users to immature
technology. You get the picture. :-)
I completely agree that more automated testing, and
non-automated testing, would be extremely helpful. I also
agree that more bug-fixing would be good; right now we *know*
about a lot of regressions -- but we don't have fixes for most
of them. Architecture cleanups that make it harder to
introduce bugs also help a lot.
* Problem Analysis
** The RM and volunteer costs of making releases are too high.
Mark says:
Do releases often, and you're always working on fixing
regressions, testing, packaging, etc. Do releases rarely and
the bugs get so bad it's incredibly hard to fix them all.
I believe that this is a symptom of GCC's phased development
practices, which are themselves a necessary consequence of
the overloading of the purpose of the mainline, which in turn is a
consequence of underinvestment in process automation.
The GCC mainline rises and falls dramatically in quality over
fairly short periods of time. It currently _must_ behave that way
because of two of the primary uses of the mainline: (1) The
mainline is (roughly speaking) the focus of testing. Regressions,
for example, are often not discovered until after they appear on
the mainline. (2) The mainline is (roughly speaking) the locus of
merging activities. Major new feature additions must appear on the
mainline before they propagate out to developers working in
parallel.
A third use of the mainline is that it is the source of releases.
Generally speaking, it is a project goal that releases be of
monotonically increasing quality over time. Yet the first two
uses of the mainline ensure that the mainline itself does not have
that monotonic property. Consequently, development must be broken
up into phases: periods of time must be established during which
the goal is to ensure that the mainline sources are unambiguously
better than the previous release. During such phases, time is
taken up by the tasks Mark mentions ("fixing regressions, testing,
packaging, etc.").
The best solution I know of to the problems of phased project
management is *continuous release management*:
In theory, the situation could be improved by separating the
concerns of the mainline into separate branches. Perhaps, for
example, a primary development branch, in which most merging
activity takes place. A testing branch from the development
branch, on which regressions are continuously identified and fixed.
And finally a release branch, protected by invariants that assure
its continuous increase in quality, which is perpetually ready to
cut a new release at the "push of a button".
In practice, that ideal is difficult to achieve because of
established investments in tools and practices and because of
deficiencies in the tools that are available to the project.
Yet if the ideal could be achieved: an "orthogonalization" of
merging, testing, and release cutting, then contributor labor could
be used more opportunistically. For example, a vendor anxious to
see a feature that is stuck in testing appear in a release could,
at any time, simply contribute fixes to existing regressions and,
thus, promote the wanted feature to the release branch. There
would be no need to pressure or wait for the RM to declare a
release phase -- anyone could cut a release at any time from
sources that have already and continuously been vetted as being of
release quality.
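To make the idea of a release branch "protected by invariants" a little more concrete, here is a minimal sketch of one possible invariant: a snapshot is promoted from the test branch only if every test that passes on the release branch still passes on it. The names Snapshot, regressions, and may_promote, and the test names, are hypothetical and chosen only for illustration; they do not describe any existing GCC tool or process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """A tree state on some branch plus the tests it is known to pass."""
    name: str
    passing: frozenset


def regressions(release: Snapshot, candidate: Snapshot) -> list:
    """Tests that pass on the release branch but not on the candidate."""
    return sorted(release.passing - candidate.passing)


def may_promote(release: Snapshot, candidate: Snapshot) -> bool:
    """Hypothetical invariant: promotion never loses a passing test."""
    return not regressions(release, candidate)


# Illustrative data only: "t2" regressed on the test branch.
release = Snapshot("release", frozenset({"t1", "t2"}))
testing = Snapshot("testing", frozenset({"t1", "t3"}))
fixed = Snapshot("testing+fix", frozenset({"t1", "t2", "t3"}))

print(regressions(release, testing))  # the work list for anyone wanting a release
print(may_promote(release, testing))
print(may_promote(release, fixed))
```

Under this assumption, the list returned by regressions is exactly what a vendor would have to fix to get a wanted feature promoted, with no RM decision required.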
** The fear of forks is too high.
There is a difference between "fragmenting forks" and "cooperative
forks". Fragmenting forks divide the loyalties of contributors.
Cooperative forks preserve project coherence, but enable users
to choose their own release points.
Mark says:
Do [releases] on a major GNU/Linux vendor's release schedule
and you get more help from those vendors, but maybe not at a
time that's terribly natural for the pace of other
development. Decide that there have been enough major changes
for one release, and you risk irritating the people who have
been working on the branch that didn't get in. Take the code
anyhow and you risk introducing more bugs and exposing users
to immature technology. You get the picture. :-)
Here again, practices based on continuous release management (CRM)
rather than phased development would largely solve the problem.
Any vendor could, with CRM, cut a release at any time. Any branch
author wanting to reach a releasable state could contribute or
muster the effort to get that branch through the test branch to the
release branch. In other words, demands on releases and
contributions to releases would be balanced and matched in a very
fine-grained way.
The natural worry is that if vendors begin cutting their own
releases, how can bug reports be interpreted and where do bug fixes
go? Won't a plethora of custom releases fragment the feedback from
the user community?
A modest amount of discipline exhibited by the major contributors
can alleviate that concern. Major contributors need only commit,
for each regression reported against a custom release, to
add a test to the project's test branch and contribute the bug fix
there (even if it is also made on a branch that represents a custom
release).
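The bookkeeping this discipline implies is small. A rough sketch, assuming a single shared test branch that records which custom releases reported each regression and which regressions have fixes contributed, follows. SharedTestBranch, its methods, and the release and test names are all hypothetical.

```python
from collections import defaultdict


class SharedTestBranch:
    """Hypothetical model of the project's common test branch."""

    def __init__(self):
        self.tests = set()                   # regression tests added so far
        self.reported_by = defaultdict(set)  # test -> custom releases reporting it
        self.fixed = set()                   # tests whose fix was contributed here

    def report(self, custom_release: str, test: str) -> None:
        """A regression seen in a custom release becomes a shared test."""
        self.tests.add(test)
        self.reported_by[test].add(custom_release)

    def contribute_fix(self, test: str) -> None:
        """Record that the fix landed on the shared branch as well."""
        if test not in self.tests:
            raise KeyError(f"no shared test named {test!r}")
        self.fixed.add(test)

    def open_regressions(self) -> list:
        """Regressions known to the whole project that still lack a fix."""
        return sorted(self.tests - self.fixed)


branch = SharedTestBranch()
branch.report("vendor-a-release", "regress-1")
branch.report("vendor-b-release", "regress-1")  # same bug, different custom release
branch.report("vendor-b-release", "regress-2")
branch.contribute_fix("regress-1")
print(branch.open_regressions())
```

Because both vendors' reports land on the same test, feedback from custom releases accumulates in one place instead of fragmenting.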
** The SC and RM are unfunded mandates.
Mark says (note: I'm cutting up his text non-linearly here):
I completely agree that more automated testing, and
non-automated testing, would be extremely helpful.
[...] Architecture cleanups that make it harder to introduce
bugs also help a lot.
These appear to be socio-political problems more than technical
ones. Mark can identify these needs clearly -- yet who is
listening and responding?
In the glorious unix-industry days of the 80s, the compiler groups
at each vendor had real clout. They had the ears of VPs. They
could put their foot down and get money.
Things are more complicated now. The SC and RM do not appear on
any corporation's org chart. They have no legal standing to
receive or administer a budget.
Those circumstances are a bug. They're a regression in the
organizational "program". It sounds corny but I'm quite serious
when I say that the core maintainers, and the SC (including the RM)
need to develop a better sense of themselves as Labor and as
custodians of an important vendor-neutral project. They need a
sense of solidarity.
How many years are we going to lament that the project can't
properly do its job because of underfunding? As engineers,
especially engineers working on critical, public projects -- we
have an obligation to find ways to put our foot down. I don't
personally care whether it happens quietly on back-channels or
openly through a general strike, but there *is* a funding crisis
here. The free software vendors are behaving like the free-riders
that so many of their theorists once worried would undercut them.
** Volunteers need not be volunteers at all.
Mark:
I also agree that more bug-fixing would be good; right now we
*know* about a lot of regressions -- but we don't have fixes
for most of them.
There is a huge number of skilled, unemployed techies.
It doesn't cost anything (on a balance sheet) for Mark to declare
how important those regressions are. It doesn't cost anything to
promote the myth that GCC volunteerism (still) leads to a career.
Yet I think we have to question the ethics of relying on those
solutions when even modest "bug bounties" would be ethically
unambiguous.
* Towards a Solution
The analysis points towards the two components of solutions to
Mark's "very hard balancing act":
1) Tools and policies for continuous release management.
2) Socio-political efforts to change the way corporations spend
money towards free software projects such as GCC.
Now, you know me: Give me half a chance and I'll start telling you
how `arch' can contribute a heck of a lot towards continuous release
management. This isn't, at the moment, the right forum to say more
about that. I'd also add that CodeSourcery's QMTest project, while
I find some design problems in that system, does at
least have the fairly unusual virtue of being designed with lots of
good ideas about improving testing automation. But, ultimately,
all of this stuff is "easy", or at least straightforward. This
isn't rocket science -- just common sense.
No, problem (2), the funding situation, is the root problem.
Mark, SC -- it's time to put your foot down.
Regards,
-t