This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: gzip performance test
- To: Joe Buck <jbuck at synopsys dot COM>, jfm2 at club-internet dot fr
- Subject: Re: gzip performance test
- From: Jean Francois Martinez <jfm2 at club-internet dot fr>
- Date: Fri, 25 May 2001 23:18:01 +0200
- Cc: dje at watson dot ibm dot com (David Edelsohn), jbuck at synopsys dot COM (Joe Buck), gcc at gcc dot gnu dot org
- References: <200105251948.MAA09862@toledo.synopsys.com>
- Reply-To: jfm2 at club-internet dot fr
On Friday 25 May 2001 21:48, Joe Buck wrote:
> [ Re: tests ]
>
> > There are some remarks I would like to make:
> >
> > 1) You should try to eliminate "machine noise": benchmarks should be run
> > in single user mode (avoids daemon activity) and with network modules
> > unloaded to avoid interrupts generated by network activity. And don't
> > forget to do a sync before each run.
>
> It suffices to run on a relatively inactive machine, run the test multiple
> times, and discard outliers (that's why I reported the median and not the
> average of five runs -- occasionally I got one test that was much slower
> due to such activity, so I'd have one slow and four fast results).
>
I already do this as an additional safety measure.
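
For illustration, the "run several times and report the median" procedure
quoted above might look like the sketch below. The function names and the
sample timings are made up for this example; they are not the measurements
discussed in this thread.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times; return the wall-clock time of each run in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (median, mean) so the effect of an outlier on each is visible."""
    return statistics.median(timings), statistics.mean(timings)


# Hypothetical timings: four fast runs and one run disturbed by other activity.
# The slow run barely moves the median but clearly inflates the mean.
median, mean = summarize([1.00, 1.10, 1.00, 5.00, 1.05])
```

With one slow and four fast results, the median stays with the fast group,
which is why it is the more robust figure to report.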
> > 2) Ideally each round of benchmarks should use a glibc compiled by its
> > own compiler (and don't use the glibc shipped with your system, since the
> > vendor could be using more aggressive compile parameters than you).
>
> I disagree (and of course this is only relevant for Gnu/Linux and maybe
> Gnu/Hurd folks). It's valid for all tests to use the same glibc, because
> we're only looking for performance improvement or regression, we are not
> trying to come up with numbers for formal publication. And since the
> person who installs gcc 3.0 in /usr/local/bin will probably continue to use
> the vendor supplied glibc, what you advocate would give a distorted result:
> end users will, at first, continue using their glibc and just update their
> compiler.
>
If you get a worsening of 2% you are going to say it is acceptable. However,
if the test was spending 90% of its time in glibc, only the remaining 10% was
code built by the new compiler, so the real performance hit on that code was
20%, and this is definitely not acceptable. So unless you recompile libc or
can evaluate how much time is spent in it, you can be led to wrong
conclusions.
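
The arithmetic behind that 20% figure can be sketched as follows. This
assumes the time spent in glibc is identical in both runs (same glibc
binary) and that the fraction of time spent in glibc has been measured
somehow, for example with a profiler. The function name is only
illustrative.

```python
def compiled_code_slowdown(total_slowdown, library_fraction):
    """Slowdown of the recompiled code alone.

    total_slowdown   -- observed overall slowdown, e.g. 0.02 for 2%
    library_fraction -- share of the baseline run time spent in the
                        unchanged library, e.g. 0.90 for 90%

    With baseline time 1, the library part f is unchanged and the compiled
    part (1 - f) becomes (1 - f) * (1 + s), so total_slowdown = (1 - f) * s.
    """
    if not 0.0 <= library_fraction < 1.0:
        raise ValueError("library_fraction must be in [0, 1)")
    return total_slowdown / (1.0 - library_fraction)


# 2% overall with 90% of the time in glibc: roughly 0.20, i.e. about 20%.
hit = compiled_code_slowdown(0.02, 0.90)
```

If no time at all is spent in the library, the function simply returns the
observed slowdown unchanged.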
> > 3) You ran a gzip compiled with gcc 2.95 and then a gzip compiled with gcc
> > 3.0. Right? Wrong. Problem is processor temperature.
>
> The effects you're seeing in this area probably get amplified because
> you're doing 1), and thus starting with a relatively cold machine, plus
> you're using typically crappy PC hardware. Again, such effects are
> handled by doing multiple runs, and the runs can be interleaved as well.
Hum, no: the box runs each test many times, and when it has finished with
one hypothesis it recompiles libm (this allows partial cooling) and goes on
to test another hypothesis, so the processor does not start cool. Agreed,
desktop PCs are built for being quiet, not for handling high loads for
extended periods of time. But it will still be a good idea to run "sensors"
and start runs when the processor is at roughly the same temperature for
every test.
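
A rough sketch of both ideas, interleaving the runs and waiting for a
comparable starting temperature, is given below. It is only an
illustration: read_temp is a hypothetical callable that the caller must
supply (for instance by parsing the output of "sensors" on their own
machine), and the target, tolerance and polling values are arbitrary
assumptions.

```python
import time


def interleaved_schedule(binaries, rounds):
    """Order the runs A, B, A, B, ... so slow drift affects every binary alike."""
    return [binary for _ in range(rounds) for binary in binaries]


def wait_for_temperature(read_temp, target, tolerance=2.0,
                         poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Block until read_temp() is within `tolerance` degrees of `target`.

    read_temp is a caller-supplied function returning the current processor
    temperature; nothing here knows how to query real hardware.
    """
    for _ in range(max_polls):
        temp = read_temp()
        if abs(temp - target) <= tolerance:
            return temp
        sleep(poll_seconds)
    raise TimeoutError("processor did not reach the target temperature")


# Hypothetical binary names, used only to show the resulting run order.
order = interleaved_schedule(["gzip-2.95", "gzip-3.0"], rounds=2)
```

Calling wait_for_temperature before each entry in the schedule makes every
run start from about the same thermal state.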
JFM