This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: comparing DejaGNU results


> Please do.  I'd welcome it (and scripts to generate html, to track  
> known issues, to trim log files, to drive things and so on)...  I  
> think having a few different styles would be good, then people can  
> try them all out and see which ones they like and why.  Anyway, for  
> me, it isn't yet better:
:
> I tried all the results and couldn't find any that your script could  
> analyze, except libgomp, but only because libgomp doesn't run on my  
> system, so there are no results in it?  Is it just me?  I hand edited  
> my results to be the first 30 or so from the Objective C++ testsuite,  
> and then I got it to analyze and not dump out.
> 
> When I tried gcc, I had a chance to notice the timings, your version:
> 
> real    8m44.413s
> user    2m0.714s
> sys     7m54.847s
> 
> mine:
> 
> real    0m1.994s
> user    0m1.756s
> sys     0m0.556s

As I said, I had only used it on Linux; I assume you're using Darwin.
I tried it there and had to fix the following.  You had caught the
first two but not the last, so my script was completely non-functional:
- sed uses -E instead of -r (as you noted)
- sort supports -m but not --merge (as you noted)
- sed doesn't honour \t or \s
> :-)  Maybe you only run the script on toy results?  Or, do you just  
> drink lots of coffee?  Now, I know mine is well more than 10-100x  
> slower than it could be, but waiting 2 seconds isn't a hardship for  
> me.  Waiting 8 minutes strikes me as just a little slow.
I said it was slow :-)
On NetBSD your approach is at least 50x faster.  Strangely, on FC3 the
two have similar elapsed times:

time ./compare_tests gcc-sim-20050503.sum gcc-sim-20060528.sum >compare2.lst
real    4m0.729s
user    4m5.880s
sys     0m0.231s

time ./dg-cmp-results.sh -v "" gcc-sim-20050503.sum gcc-sim-20060528.sum >dg-cmp2.lst
real    4m25.941s
user    1m29.850s
sys     3m8.021s

> The output of yours doesn't seem to be targeted for human eyes, the  
> verbosity (at the least verbose setting) is about 121x more than mine  
> for two random sets I had lying around, and that is with it cutting  
> the output off really early due to the above problem.  I predict that  
> in normal day to day use, it is well better than 120,000x larger.   
> What use do you get out of it?
It obviously wasn't working properly for you due to Linux/NetBSD
differences.
wc -l *2.lst
  66 compare2.lst
  63 dg-cmp2.lst

Results from mine should have looked like:
dg-cmp-results.sh: Verbosity is 1, Variant is ""

Older log file: gcc-sim-20050503.sum
Test Run By jim on Mon May  2 12:29:08 2005
Target is xscale-unknown-elf
Host   is i686-pc-linux-gnu

Newer log file: gcc-sim-20060528.sum
Test Run By jim on Mon May 29 12:55:01 2006
Target is xscale-unknown-elf
Host   is i686-pc-linux-gnu

PASS->FAIL: gcc.c-torture/execute/920428-2.c compilation,  -O1
PASS->FAIL: gcc.c-torture/execute/920428-2.c compilation,  -O2
PASS->FAIL: gcc.c-torture/execute/920428-2.c compilation,  -O3 -fomit-frame-pointer
PASS->FAIL: gcc.c-torture/execute/920428-2.c compilation,  -O3 -g
PASS->FAIL: gcc.c-torture/execute/920428-2.c compilation,  -Os
PASS->UNRESOLVED: gcc.c-torture/execute/920428-2.c execution,  -O1
PASS->UNRESOLVED: gcc.c-torture/execute/920428-2.c execution,  -O2
:
PASS->UNRESOLVED: gcc.c-torture/execute/nestfunc-6.c execution,  -Os
FAIL->PASS: gcc.dg/20030324-1.c execution test
FAIL->PASS: gcc.dg/debug/dwarf2/dwarf-die7.c scan-assembler 1.*DW_AT_inline
FAIL->PASS: gcc.dg/range-test-1.c execution test
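Transitions like those above can be computed from two .sum files with a
short awk pass over the result lines; a sketch (the sum_diff name and
the file names are mine, not part of either script discussed here):

```shell
# Report status transitions between two DejaGnu .sum files.
# Usage: sum_diff old.sum new.sum
sum_diff() {
  awk '
    # .sum result lines look like "PASS: testname ...".
    /^(PASS|FAIL|XPASS|XFAIL|UNRESOLVED|UNTESTED|UNSUPPORTED): / {
      status = $1; sub(/:/, "", status)       # strip the trailing colon
      name = substr($0, index($0, " ") + 1)   # rest of line is the test name
      if (FILENAME == ARGV[1])
        before[name] = status                 # remember old status
      else if ((name in before) && before[name] != status)
        printf "%s->%s: %s\n", before[name], status, name
    }' "$1" "$2"
}

# Demo with two tiny made-up .sum files:
printf 'PASS: a.c execution\nPASS: b.c execution\n' > old.sum
printf 'FAIL: a.c execution\nPASS: b.c execution\n' > new.sum
sum_diff old.sum new.sum
```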

> Mine was designed to do two things, figure out if the results are  
> interesting and not send email, if they are not, and to show  
> engineers the `interesting' detailed results in priority order.  It's  
> meant to be run daily, and on good days, it produces no output.  On  
> normal days, output is 20-30 lines at most.  It tries to balance the  
> complaints with the occasional attaboy, to help with morale.
Mine is used at the end of full build & test runs, so a few minutes
didn't bother me.  A look at output like the above tells me what I
need.

Your approach is faster, esp. on Darwin / NetBSD.
The only advantages I see to mine are handling variants (Richard's patch
fixes that), verbosity control, and detail -- compare_tests only looks
at X?(PASS|X?FAIL).
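To make the last point concrete: the X?(PASS|X?FAIL) pattern catches
only PASS/FAIL/XPASS/XFAIL lines, so UNRESOLVED, UNTESTED, and
UNSUPPORTED results slip through.  A sketch of the difference
(sample.sum is a made-up file, not from the runs above):

```shell
# Four result lines, only two of which the narrow pattern sees.
cat > sample.sum <<'EOF'
PASS: a.c execution
XFAIL: b.c execution
UNRESOLVED: c.c execution
UNSUPPORTED: d.c execution
EOF

# compare_tests-style pattern: matches 2 of the 4 lines.
grep -c -E '^X?(PASS|X?FAIL):' sample.sum

# Broader pattern covering the other DejaGnu statuses: matches all 4.
grep -c -E '^(X?(PASS|FAIL)|UNRESOLVED|UNTESTED|UNSUPPORTED):' sample.sum
```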

Jim.

-- 
Jim Lemke   jwlemke@wasabisystems.com   Orillia, Ontario

