This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Notes from the testing BOF at the summit


On Sat, 5 Jun 2004, Ben Elliston wrote:

> Expect is also independently maintained, yet an old (and potentially
> patched) version is carried in src/expect.  This, too, can be removed,
> however it would be a good idea to review all local patches made to
> Expect since its initial import into sources.redhat.com and send those
> patches upstream to Don Libes.

What of HJ's patch for output truncation (an intermittent problem with
several of my tests that generate large amounts of output)?  (Bug 12096;
for further references see
<http://kegel.com/crosstool/current/patches/expect-5.31/pr12096.patch>.)
That doesn't seem to be in src/expect, but I don't know whether src/expect
has the problem with output truncation.

> Once this code has been flushed, developers will have to install
> Dejagnu and Expect themselves.  Many distributions and packaging
> systems now carry the correct versions.  In the case of GCC, we need
> to ensure that the requirements for testing are clearly stated, if
> they are not already.

The requirements on Tcl and Expect versions aren't stated because no-one
knows what they are (or if someone does, they haven't said), and "a
version with this old patch applied" isn't something particularly good to
document (standard versions ought to work without special patches needed);
I don't know if HJ or someone else who understands the patch has submitted
it upstream.

Desirable Dejagnu enhancement: dg-error should match errors only (not
warnings), dg-warning should match warnings only (not errors); this would
simplify the many testcases that currently need a more complicated idiom
(documented in sourcebuild.texi) to make sure something is an error.

> The idea of requiring more coverage from new test cases (perhaps with
> a coverage report included with patches submitted) was discussed, but
> no real consensus was reached.  Ditto for the idea of creating a
> collection of unit tests (for which code such as real.c would be a
> good candidate).

I'm all for more thorough testsuite coverage (having written the
recommendation in codingconventions.html for every feature to have
"testcases thoroughly covering both its specification and its
implementation"), and indeed for systematically adding tests for existing
features using coverage testing as a guide to what is inadequately tested.

I'd just be concerned about further increase in the minimum time required
to test a patch (to a part of the compiler requiring a full bootstrap):

At present this needs two bootstraps + testsuite runs (each taking 3.5
hours for me; longer when Ada is restored to operation), one to establish
a baseline and one with the patch applied; often more if a problem with
the patch is shown up by the testing, requiring a third (etc.) bootstrap
with that problem fixed.

This can be optimized slightly when more than one patch is being tested,
by testing multiple patches in succession against the same baseline
compiler version, so needing n+1 bootstraps for n patches rather than 2n,
at the expense of increasing the distance between the version against
which a patch is tested and the version against which it is committed.  
Testing multiple patches simultaneously is safe only for straightforward
patches which are clearly independent and going to be committed in quick
succession anyway, and is still risky if one then needs to be reverted,
having ipso facto turned out not to be so straightforward.
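
As a rough sketch of the arithmetic above (the function names are purely
illustrative and not part of any GCC script; the 3.5 hour figure is the one
quoted for my machine and will differ elsewhere):

```python
# Hypothetical helpers illustrating the bootstrap counts discussed above.

HOURS_PER_BOOTSTRAP = 3.5  # assumption: one bootstrap + testsuite run


def bootstraps_separate(n):
    """Each of n patches gets its own baseline: 2 bootstraps per patch."""
    return 2 * n


def bootstraps_shared_baseline(n):
    """n patches tested in succession against one baseline: n + 1."""
    return n + 1


if __name__ == "__main__":
    for n in (1, 2, 5):
        separate = bootstraps_separate(n)
        shared = bootstraps_shared_baseline(n)
        print(
            f"{n} patch(es): {separate} vs {shared} bootstraps, "
            f"{separate * HOURS_PER_BOOTSTRAP} vs "
            f"{shared * HOURS_PER_BOOTSTRAP} hours"
        )
```

For a single patch the two approaches coincide at two bootstraps; the saving
only appears from the second patch onwards, and neither count includes the
extra bootstraps needed when testing shows up a problem.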

A formal requirement for coverage information (presumably with a full
--enable-coverage bootstrap and testsuite run; after all, a patch could
affect the coverage of code not touched in that patch) would increase the
minimum requirement to four bootstraps for a patch (or 2n+2 or 4n for n
patches), two to establish the baseline for test results and coverage
results and two to test the patch itself.  (You could try to save one of
the baseline bootstraps by presuming that the test results from a coverage
bootstrap are the same as from a normal one, but that seems like a bad
idea to me; the *only* variable changed when testing should be the patch
you are trying to test, and that it does work in a normal non-coverage
bootstrap is a fundamental thing to test, i.e. tests should be in matched
pairs with and without the single patch being tested.)
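
The counts in the paragraph above follow from pairing every normal bootstrap
with a coverage bootstrap.  A minimal sketch, assuming exactly that pairing
(the function name and the shared_baseline parameter are hypothetical, for
illustration only):

```python
# Hypothetical helper: bootstraps needed if coverage results are required.

def bootstraps_with_coverage(n, shared_baseline=True):
    """Return the bootstrap count for n patches when each normal
    bootstrap is matched by a separate coverage bootstrap.

    shared_baseline=True  -> patches tested in succession against one
                             baseline: 2 * (n + 1) = 2n + 2
    shared_baseline=False -> each patch has its own baseline:
                             2 * (2n) = 4n
    """
    normal = n + 1 if shared_baseline else 2 * n
    return 2 * normal


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, bootstraps_with_coverage(n),
              bootstraps_with_coverage(n, shared_baseline=False))
```

For one patch both variants give four bootstraps, which is the minimum
mentioned above.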

-- 
Joseph S. Myers
jsm@polyomino.org.uk