GCC Buildbot Update - Definition of regression
Joseph Myers
joseph@codesourcery.com
Tue Oct 10 21:25:00 GMT 2017
On Tue, 10 Oct 2017, Paulo Matos wrote:
> ANY -> no test ; Test disappears
No, that's not a regression. Simply adding a line to a testcase will
change the line number that appears in the PASS / FAIL line for an
individual assertion therein. Or the names will change when e.g.
-std=c++2a becomes -std=c++20 and all the tests with a C++ standard
version in them change their names. Or if a bogus test is removed.
> ANY / XPASS -> XPASS ; Test goes from any status other than XPASS
> to XPASS
> ANY / KPASS -> KPASS ; Test goes from any status other than KPASS
> to KPASS
No, that's not a regression. It's inevitable that XFAILing conditions may
sometimes be broader than ideal, if it's not possible to describe the
exact failure conditions to the testsuite, and so sometimes a test may
reasonably XPASS. Such tests *may* sometimes be candidates for a more
precise XFAIL condition, but they aren't regressions.
> new test -> FAIL ; New test starts as fail
No, that's not a regression, but you might want to treat it as one (in the
sense that it's a regression at the higher level of "testsuite run should
have no unexpected failures", even if the test in question would have
failed all along if added earlier and so the underlying compiler bug, if
any, is not a regression). It should have human attention to classify it
and either fix the test or XFAIL it (with an issue filed in Bugzilla if it
is a bug), but it's not a regression. (Exception: where a test failing
results in its name changing, e.g. through adding "(internal compiler
error)".)
> PASS -> ANY ; Test moves away from PASS
No, only a regression if the destination result is FAIL (if it's
UNRESOLVED then there might be a separate regression - execution test
becoming UNRESOLVED should be accompanied by compilation becoming FAIL).
If it's XFAIL, it might formally be a regression, but one already being
tracked in another way (presumably Bugzilla) which should not turn the bot
red. If it's XPASS, that simply means the XFAILing conditions are slightly
wider than necessary in order to mark failure in another configuration as
expected.
My suggestion is:

PASS -> FAIL is an unambiguous regression.

Anything else -> FAIL and new FAILing tests aren't regressions at the
individual test level, but may be treated as such at the whole testsuite
level.

Any transition where the destination result is not FAIL is not a
regression.
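
As a rough illustration, the three rules above could be encoded as follows. This is only a sketch: the names `Verdict` and `classify` are hypothetical, `None` is assumed to stand for "test not present in that run", and treating an unchanged FAIL -> FAIL the same as "anything else -> FAIL" is an assumption rather than something stated above.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    REGRESSION = "regression"                # PASS -> FAIL
    SUITE_LEVEL = "testsuite-level failure"  # anything else -> FAIL, or new FAIL
    NOT_REGRESSION = "not a regression"      # destination result is not FAIL


def classify(old: Optional[str], new: Optional[str]) -> Verdict:
    """Classify one result transition; None means the test is absent."""
    if new != "FAIL":
        # Covers XPASS, KPASS, XFAIL, UNRESOLVED and disappearing tests.
        return Verdict.NOT_REGRESSION
    if old == "PASS":
        return Verdict.REGRESSION
    # New FAILing test, or any non-PASS status moving to FAIL
    # (assumption: an unchanged FAIL also lands here).
    return Verdict.SUITE_LEVEL
```

A bot might turn red only on `Verdict.REGRESSION` and merely report `Verdict.SUITE_LEVEL` for human attention, but that policy is a choice for the bot and is not prescribed here.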
ERRORs in the .sum or .log files should be watched out for as well,
however, as sometimes they may indicate broken Tcl syntax in the
testsuite, which may cause many tests not to be run.
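
A minimal sketch of such a check, assuming the harness reports these problems on lines beginning with `ERROR:` (the function name `find_errors` and the sample lines are purely illustrative):

```python
from typing import Iterable, List, Tuple


def find_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for every line that starts with 'ERROR:'."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if line.startswith("ERROR:")
    ]


# Illustrative input only, not taken from a real testsuite run.
sample = [
    "PASS: gcc.dg/example-1.c (test for excess errors)\n",
    "ERROR: tcl error sourcing some-file.exp.\n",
    "FAIL: gcc.dg/example-2.c (test for excess errors)\n",
]
for number, text in find_errors(sample):
    print(f"line {number}: {text}")
```

Any non-empty result is worth flagging separately from the PASS / FAIL comparison, since tests that never ran will not show up as failures.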
Note that the test names that come after PASS:, FAIL: etc. aren't unique
between different .sum files, so you need to associate tests with a tuple
(.sum file, test name) (and even then, sometimes multiple tests in a .sum
file have the same name, but that's a testsuite bug). If you're using
--target_board options that run tests for more than one multilib in the
same testsuite run, add the multilib to that tuple as well.
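
One way to build such keys is sketched below. It assumes that each multilib's results in a .sum file follow a line of the form `Running target <board>`, and that result lines look like `STATUS: name`; `parse_sum` and the sample test names are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

Key = Tuple[str, Optional[str], str]  # (.sum file, multilib/board, test name)

RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED): (.*)$"
)
# Assumption: the board/multilib is announced as "Running target <board>".
TARGET_RE = re.compile(r"^Running target (\S+)")


def parse_sum(sum_file: str, lines: Iterable[str]) -> Tuple[Dict[Key, str], Counter]:
    """Map (sum file, multilib, test name) to status and count duplicate names."""
    results: Dict[Key, str] = {}
    duplicates: Counter = Counter()
    board: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\n")
        target = TARGET_RE.match(line)
        if target:
            board = target.group(1)
            continue
        result = RESULT_RE.match(line)
        if not result:
            continue
        status, name = result.groups()
        key = (sum_file, board, name)
        if key in results:
            duplicates[key] += 1  # same name twice: a testsuite bug
        results[key] = status     # the last occurrence wins in this sketch
    return results, duplicates
```

Keys reported in `duplicates` cannot be compared reliably between runs, so a bot would probably want to list them rather than guess which result belongs to which assertion.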
--
Joseph S. Myers
joseph@codesourcery.com