This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
On Mon, 28 Nov 2005, Mark Mitchell wrote:
> As a strawman, perhaps we could add a small integer program (bzip?) and
> a small floating-point program to the testsuite, and have DejaGNU print
> out the number of iterations of each that run in 10 seconds.

Please make it the other way round (time for a fixed number of iterations, perhaps with the number being settable); it's generally easier to adapt that to testing with simulators. Simulator testing seems generally regarded as a poor cousin here, but has some distinct advantages: the results carry over between different test-beds/hosts and are not subject to noise from e.g. system load, as long as the number of cycles is the unit of measure of run time. On the other hand, digging out the number of cycles is done slightly differently depending on target. I hope to provide tools for that.

> Again, that's a strawman. I'm just looking for suggestions about what
> we might to do -- or even feedback that there's no need to do anything.

I'm working on a csibe.exp for use with CSiBE-2.1.1, focusing on simulator testing. Native testing works too of course, but I haven't really solved the problems with system noise and (still) getting a usable time-scale. I admit csibe isn't aimed at being an execution-time performance regression tool: I chose csibe rather than something homegrown mainly so I'd not have to invest time in a discussion regarding the choice of benchmarks and test input. (FWIW, I do have a number of homegrown tests too, but none that work within the gcc testing framework.) Anyway, the testing framework isn't supposed to be tied to CSiBE (lots can and should be extracted as generic tools); it just seemed sane enough to start with.
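The fixed-iteration measurement suggested above could look roughly like the following Python sketch. It is not taken from csibe.exp; the function name `time_fixed_iterations` and the pluggable `counter` argument are assumptions made for illustration. On a simulator, `counter` would be replaced by whatever target-specific mechanism reports the cycle count; natively, a nanosecond clock stands in for it.

```python
import time
from typing import Callable


def time_fixed_iterations(
    workload: Callable[[], object],
    iterations: int = 1000,
    counter: Callable[[], int] = time.perf_counter_ns,
) -> int:
    """Run `workload` exactly `iterations` times and return the counter delta.

    `counter` is a hypothetical hook: a cycle counter under a simulator,
    a wall-clock timer (the default here) on native hardware.
    """
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    start = counter()
    for _ in range(iterations):
        workload()
    return counter() - start


if __name__ == "__main__":
    # The iteration count is settable, as suggested in the message.
    elapsed = time_fixed_iterations(lambda: sum(range(1000)), iterations=200)
    print(f"200 iterations took {elapsed} counter units")
```

Because the amount of work is fixed and only the counter varies, the same harness can report cycles on a simulator and nanoseconds natively without changing the test itself.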
I've attached the work-in-progress so I don't have to get into detail about what it does :-) except noting that you'll see in gcc.sum something like:

PASS: csibe -O1 runtime zlib-1.1.4:minigzip not slower than best
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than .1% slower than best
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than 1% slower than best
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than 10% slower than best
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not slower than milestone
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than .1% slower than milestone
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than 1% slower than milestone
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than 10% slower than milestone
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not slower than previous
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than .1% slower than previous
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than 1% slower than previous
PASS: csibe -O1 runtime zlib-1.1.4:minigzip not more than 10% slower than previous

(repeated for each different test and gcc options in the chosen set.)

Roughly, the tester decides (or relies on defaults) on a number of baselines like the arbitrary set shown above: "best", "milestone" and "previous". The runtime (seen above), compile time and size of the test programs are each compared to these baselines against some set of criteria, iterating over a set of compiler options not unlike the torture iterations. Updated baseline data is also output by the tests, to simplify feedback (just the "previous"; "best" not currently implemented).

Before you guys hose it completely, let me repeat: this is work in progress. I'm not sure what's useful yet; perhaps just one baseline should be the default. Perhaps some of the test results should be accumulated, to avoid 43445 different sub-tests.

Note that csibe doesn't have integrity checks for its (few) runtime tests; a patch for that is attached.
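As a rough illustration of the baseline comparison described above, here is a minimal Python sketch that produces gcc.sum-style lines. It is not the logic of the attached csibe.exp (which is a DejaGnu script); the function name `compare_to_baselines`, the integer cycle counts, and the use of FAIL for a missed threshold are all assumptions.

```python
from typing import Dict, List, Tuple

# (description, allowed slowdown in tenths of a percent); hypothetical table
# mirroring the four checks shown in the gcc.sum excerpt.
CRITERIA: List[Tuple[str, int]] = [
    ("not slower than", 0),
    ("not more than .1% slower than", 1),
    ("not more than 1% slower than", 10),
    ("not more than 10% slower than", 100),
]


def compare_to_baselines(
    test: str, options: str, measured: int, baselines: Dict[str, int]
) -> List[str]:
    """Compare a measured cycle count against each named baseline."""
    lines = []
    for name, base in baselines.items():
        for text, tenths in CRITERIA:
            # Integer arithmetic avoids floating-point rounding at the limit:
            # measured <= base * (1 + tenths / 1000).
            ok = measured * 1000 <= base * (1000 + tenths)
            status = "PASS" if ok else "FAIL"  # FAIL is an assumption
            lines.append(f"{status}: csibe {options} runtime {test} {text} {name}")
    return lines


if __name__ == "__main__":
    # Made-up cycle counts, for illustration only.
    baselines = {"best": 1000, "milestone": 1000, "previous": 1000}
    for line in compare_to_baselines("zlib-1.1.4:minigzip", "-O1", 1005, baselines):
        print(line)
```

With three baselines and four criteria this yields twelve lines per test and option set, which shows why accumulating results may be needed to keep the number of sub-tests manageable.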
(No, I haven't contacted the csibe people yet.) One of the tests has lots of off-by-one bugs causing SEGV on cris-axis-linux-gnu (the equivalent within the simulator); a patch is attached for that too. Another of the programs, "flex", is definitely simulator-unfriendly: it relies heavily on fork and executing sub-programs for its final output. Most of the other programs could do with some editing to avoid constructs rarely present in simulators, and perhaps some pruning to cut down the time from ~2h per iteration to a few minutes (in total, 1h for cris-axis-linux-gnu + sim/cris).

To wit: I agree we need some performance tests other than SPEC, and I think *something* like the above should be done, and optionally run as part of the usual testsuite.

brgds, H-P
Attachment: csibe.exp
Description: gcc/testsuite/gcc.performance/csibe.exp

Attachment: csibe112-test-patch4
Description: CSiBE integrity checks for runtime tests

Attachment: csibe112-test-patch4-2
Description: Bugfix for CSiBE jikespg