
Re: Testing GCC & OSDL


On Fri, 17 Sep 2004, Laurent GUERBY wrote:

> On Fri, 2004-09-17 at 10:43, Joseph S. Myers wrote:
> > A regression tester that tests every mainline commit rather than batching 
> > them would be feasible on a small cluster of machines (about 30 commits a 
> > day to mainline, a bit more if you want to test release branches as well), 
> 
> BTW, is there a script that assumes a local CVS repository (rsync'ed)
> and gives you the list of CVS dates in between commits? (so that
> "cvs co -D X" for X in this list gives the interesting list of sources)

Watching gcc-cvs was the most obvious method that occurred to me for such 
a tester to identify changesets.  Things would be easier with a version 
control system that actually had changesets, but I don't think this is a 
major obstacle if the machines were there to devote to testing every 
commit (covering *every* branch would take about twice the resources of 
mainline plus release branches alone); you can approximate changesets 
well enough from CVS.  For extracting past changesets, the scripts that 
convert existing repositories to other version control systems (e.g. the 
one that converts CVS to Subversion) may be of use, though I think they 
would be too slow for a real-time tester.
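
To make the approximation concrete, here is a rough sketch (in Python, 
purely illustrative) of grouping per-file CVS commits into changesets by 
author and log message within a short time window; the record format, the 
three-minute window and the example data are assumptions of mine, not 
anything that exists in contrib today.  The dates it emits are the kind 
of thing you could feed to "cvs co -D X":

from datetime import datetime, timedelta

WINDOW = timedelta(minutes=3)   # per-file commits by one author with one
                                # log message within this window count as
                                # a single approximate changeset

def group_changesets(records):
    """records: iterable of (date, author, message), any order.
    Yields one date per approximate changeset (the latest file commit
    in the group), usable as X in "cvs co -D X"."""
    current = None  # (latest_date, author, message) of the open group
    for date, author, message in sorted(records, key=lambda r: r[0]):
        if (current is not None
                and author == current[1]
                and message == current[2]
                and date - current[0] <= WINDOW):
            current = (date, author, message)     # extend the group
        else:
            if current is not None:
                yield current[0]
            current = (date, author, message)     # start a new group
    if current is not None:
        yield current[0]

# Hypothetical input, as if parsed from "cvs log" or gcc-cvs mail:
sample = [
    (datetime(2004, 9, 17, 10, 43, 0), "someone", "Fix PR middle-end/12345"),
    (datetime(2004, 9, 17, 10, 43, 50), "someone", "Fix PR middle-end/12345"),
    (datetime(2004, 9, 17, 11, 2, 5), "someoneelse", "Doc tweak"),
]
for d in group_changesets(sample):
    print(d.strftime("%Y-%m-%d %H:%M:%S UTC"))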

> 1. keep only one out of N so you still speed up the binary search but
> need to finish by building a few compilers

We do indeed effectively have something as good as this.  Phil's 
Regression Hunter <http://www.devphil.com/~reghunt/> tracks problems down 
to a daily build, and the scripts in contrib/reghunt then do a binary 
search (on timestamps rather than changesets) to locate the individual 
patch that caused a problem; these methods are used to fill out 
regression bugs with details of what caused the regression.
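
The core of that search is simple enough to sketch; the following is only 
an illustration of the timestamp bisection idea, not the actual 
contrib/reghunt scripts, and checkout_and_test is a hypothetical hook 
that would check out the tree at a given date, build it and run the 
failing test:

from datetime import timedelta

def bisect_regression(good, bad, checkout_and_test,
                      granularity=timedelta(minutes=5)):
    """good, bad: datetimes at which the test is known to pass / fail.
    Narrows the window until it is no wider than granularity and
    returns (last known good, first known bad)."""
    while bad - good > granularity:
        mid = good + (bad - good) / 2
        if checkout_and_test(mid):   # test still passes at mid
            good = mid               # the breakage came later
        else:                        # test already fails at mid
            bad = mid                # the breakage came earlier
    return good, bad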

Keeping builds for every changeset would be a minor refinement to speed up 
the process of locating the cause of a newly found regression that may 
have been there for a while.

Having regression testers test every changeset as it is made and report 
regressions within a few hours would mean that people are told of 
regressions their patches have caused while the patches are still fresh in 
their minds, rather than getting a regression report covering many patches 
at once and suspecting it's probably someone else's patch.  (Checkins 
made while there's a build failure cause trouble in this regard, but 
enough regressions appear while the tree keeps building throughout that I 
think the principle is still useful.)

> 4. compress and/or use binary deltas (documentation, language, platform
> and library patches will affect only one part of the installed files),
> but you will have to uncompress before use (not much of a problem given
> CPU/disk performance ratio these days).

I think this would be a useful practical approach - e.g. keep one build 
in 32 (so approximately one a day for mainline), keep those halfway in 
between as binary deltas from the builds 16 before, those a quarter of 
the way along as deltas from the builds 8 before, and so on, so that a 
regression test would finish by copying an endpoint tree and applying a 
delta to get a midpoint tree to test (five times for a 32-build window).  
The 1.5GB of disk writing involved could be saved (reduced to 0.3GB of 
reading the startpoint tree) by using a big enough filesystem in RAM.
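
To spell out why five single-delta steps suffice: if every 32nd build is 
kept whole and each other build is stored as a delta against the build 
whose offset is its own with the lowest set bit cleared, then each 
midpoint the binary search visits is exactly one delta away from a tree 
the search has already reconstructed.  A small sketch (illustrative only; 
fails_at stands in for the real build-and-test step):

def bisect_with_deltas(fails_at, window=32):
    """Simulate one bisection over a window with full trees at both ends.
    fails_at(offset): hypothetical predicate, True if the build at that
    offset shows the regression.  Prints which already-built tree each
    midpoint's delta is applied to, and returns the first failing offset."""
    lo, hi = 0, window          # full trees exist at both endpoints
    available = {lo, hi}        # trees reconstructed so far
    while hi - lo > 1:
        mid = (lo + hi) // 2
        base = mid & (mid - 1)  # clear lowest set bit: the delta's base
        assert base in available
        print("copy tree %d, apply one delta -> tree %d" % (base, mid))
        available.add(mid)
        if fails_at(mid):
            hi = mid            # regression is at or before mid
        else:
            lo = mid            # regression is after mid
    return hi

# e.g. if the regression entered at the 21st build of the window, the
# search does exactly five copy-and-apply-one-delta steps:
bisect_with_deltas(lambda offset: offset >= 21)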

Experiment would be needed to determine how much disk space it takes to 
store trees as deltas in this fashion, and how fast regression testing 
could be with this approach.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
  http://www.srcf.ucam.org/~jsm28/gcc/#c90status - status of C90 for GCC 4.0
    jsm@polyomino.org.uk (personal mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

