This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: GCC Buildbot Update - Definition of regression

From: Paulo Matos <pmatos@linki.tools>
To: gcc at gcc dot gnu dot org
Date: Wed, 11 Oct 2017 11:03:08 +0200
Subject: Re: GCC Buildbot Update - Definition of regression
Authentication-results: sourceware.org; auth=none
References: <3b204c53-7613-ffe2-172b-69ad2a781ceb@linki.tools> <alpine.DEB.2.20.1710102111080.20946@digraph.polyomino.org.uk> <726e85ce-9ded-e9b4-6412-edb1da76c768@linki.tools> <CAKdteOYgCJarP+XAvJyYCt3Ei-8-=g8miFJrw_YSc_LBO4gm6A@mail.gmail.com>


On 11/10/17 10:35, Christophe Lyon wrote:
> 
> FWIW, we consider regressions:
> * any->FAIL because we don't want such a regression at the whole testsuite level
> * any->UNRESOLVED for the same reason
> * {PASS,UNSUPPORTED,UNTESTED,UNRESOLVED}-> XPASS
> * new XPASS
> * XFAIL disappears (may mean that a testcase was removed, worth a manual check)
> * ERRORS
> 

That's certainly stricter than what it was proposed by Joseph. I will
run a few tests on historical data to see what I get using both approaches.

> 
> 
>>> ERRORs in the .sum or .log files should be watched out for as well,
>>> however, as sometimes they may indicate broken Tcl syntax in the
>>> testsuite, which may cause many tests not to be run.
>>>
>>> Note that the test names that come after PASS:, FAIL: etc. aren't unique
>>> between different .sum files, so you need to associate tests with a tuple
>>> (.sum file, test name) (and even then, sometimes multiple tests in a .sum
>>> file have the same name, but that's a testsuite bug).  If you're using
>>> --target_board options that run tests for more than one multilib in the
>>> same testsuite run, add the multilib to that tuple as well.
>>>
>>
>> Thanks for all the comments. Sounds sensible.
>> By not being unique, you mean between languages?
> Yes, but not only as Joseph mentioned above.
> 
> You have the obvious example of c-c++-common/*san tests, which are
> common to gcc and g++.
> 
>> I assume that two gcc.sum from different builds will always refer to the
>> same test/configuration when referring to (for example):
>> PASS: gcc.c-torture/compile/20000105-1.c   -O1  (test for excess errors)
>>
>> In this case, I assume that "gcc.c-torture/compile/20000105-1.c   -O1
>> (test for excess errors)" will always be referring to the same thing.
>>
> In gcc.sum, I can see 4 occurrences of
> PASS: gcc.dg/Werror-13.c  (test for errors, line )
> 
> Actually, there are quite a few others like that....
> 

That actually surprised me.

I also see:
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )
PASS: gcc.dg/Werror-13.c  (test for errors, line )

among others like it. Looks like a line number is missing?

In any case, it feels like the code I have to track this down needs to
be improved.

-- 
Paulo Matos

Follow-Ups:
- Re: GCC Buildbot Update - Definition of regression
  - From: Christophe Lyon

References:
- GCC Buildbot Update - Definition of regression
  - From: Paulo Matos
- Re: GCC Buildbot Update - Definition of regression
  - From: Joseph Myers
- Re: GCC Buildbot Update - Definition of regression
  - From: Paulo Matos
- Re: GCC Buildbot Update - Definition of regression
  - From: Christophe Lyon

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]