MiDataSets for MiBench to enable more realistic benchmarking and better tuning of the GCC optimization heuristic

Andrew Pinski pinskia@gmail.com
Mon Mar 19 16:00:00 GMT 2007

On 3/19/07, Grigori Fursin <gfursin@gmail.com> wrote:
> Hi all,
> In case someone is interested, we are developing a set of inputs
> (MiDataSets) for the MiBench benchmark. Iterative optimization
> is now a popular technique to obtain performance or code size improvements
> over the default settings in a compiler. However, in most
> of the research projects, the best configuration is found
> for one arbitrary dataset and it is assumed that this configuration
> will work well with any other dataset that a program uses.
> We created 20 different datasets per program for the free MiBench benchmark
> to evaluate this assumption and analyze the behavior of various
> programs with multiple datasets. We hope that this will enable more
> realistic benchmarking, practical iterative optimizations (iterative compilation),
> and help to automatically improve the GCC optimization heuristic.

I think this is nice but semi-useless unless you also look into why
one configuration is better than another.
The analysis is really the hard part, but it is also the most useful
part: figuring out why GCC is failing to produce good code.

As an example, I was working on a patch which speeds up most
code (and reduces code size there) but slows down some code (and
increases the code size too).  I found that the scheduling and block
reordering decisions would change, which caused the code to become
slower/larger.  This analysis was necessary to figure out that my
patch/pass was not directly causing the slower/larger code.  The
same kind of analysis is needed for any kind of heuristic tuning.
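
To make the dataset question from the quoted message concrete, here is a
minimal sketch of checking whether a configuration tuned on one dataset
holds up on the others.  The dataset names (d1..d3) and all timings are
made up for illustration, and the helper names are hypothetical; only
the flags are real GCC options.  Note that this only shows that the
best configuration differs per dataset, not why, which is the part that
still needs manual analysis.

```python
# Hypothetical run times in seconds: timings[config][dataset].
# The numbers and dataset names are invented for illustration only.
timings = {
    "-O2": {"d1": 1.00, "d2": 1.10, "d3": 0.95},
    "-O3": {"d1": 0.90, "d2": 1.20, "d3": 0.97},
    "-O3 -funroll-loops": {"d1": 0.92, "d2": 1.05, "d3": 0.99},
}


def best_config(timings, dataset):
    """Return the configuration with the lowest run time on one dataset."""
    return min(timings, key=lambda cfg: timings[cfg][dataset])


def regret(timings, tuned_on):
    """For each dataset, the slowdown from using the configuration tuned on
    `tuned_on` instead of that dataset's own best one (1.0 means no loss)."""
    chosen = best_config(timings, tuned_on)
    return {
        d: timings[chosen][d] / timings[best_config(timings, d)][d]
        for d in timings[chosen]
    }


if __name__ == "__main__":
    print("tuned on d1:", best_config(timings, "d1"))
    for d, r in regret(timings, "d1").items():
        print(f"  {d}: {r:.3f}x of the per-dataset best")
```

With these invented numbers, the configuration that wins on d1 is not the
best choice on d2 or d3, which is the assumption the datasets are meant
to test.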

Andrew Pinski
