This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
- To: <gcc at gcc dot gnu dot org>
- Subject: Benchmarking theory
- From: "Joseph S. Myers" <jsm28 at cam dot ac dot uk>
- Date: Sat, 26 May 2001 23:35:02 +0100 (BST)
Benchmark results seem to get posted to the gcc list as a single figure per
test for each of the old and new compilers, with assertions that the
differences seem significant or are consistent between runs. Why are
benchmarks done on this basis rather than with actual statistical
significance tests?
Could someone point me to appropriate references on the theory of
benchmarking that explain this?
Joseph S. Myers
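
As an illustration of what an "actual statistical significance test" on benchmark timings could look like, here is a minimal sketch of a two-sided permutation test on the difference in mean run time. It is one possible approach, not a method taken from the gcc list. The function name `permutation_test`, its parameters, and the timing figures are all hypothetical, and the sketch assumes each compiler's benchmark was run several times under comparable conditions.

```python
import random
import statistics


def permutation_test(old_times, new_times, rounds=10000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference of mean run times.

    Returns (observed_difference, p_value), where the difference is
    mean(new_times) - mean(old_times).
    """
    rng = random.Random(seed)  # fixed seed so repeated calls give the same p-value
    observed = statistics.mean(new_times) - statistics.mean(old_times)
    pooled = list(old_times) + list(new_times)
    n_old = len(old_times)
    extreme = 0
    for _ in range(rounds):
        # Under the null hypothesis the compiler labels are interchangeable,
        # so reassign them at random and recompute the difference.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[n_old:]) - statistics.mean(pooled[:n_old])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (rounds + 1)


# Hypothetical wall-clock timings in seconds, six runs per compiler.
old_runs = [12.41, 12.38, 12.45, 12.40, 12.43, 12.39]
new_runs = [12.21, 12.25, 12.19, 12.24, 12.22, 12.26]

diff, p = permutation_test(old_runs, new_runs)
print(f"mean difference: {diff:.3f} s, p = {p:.4f}")
```

A permutation test is used here because it makes no assumption that run times are normally distributed. A small p-value only says the difference is unlikely to be run-to-run noise; it says nothing about whether the measurement setup itself was fair.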