This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: flow speed regression [Re: How long should -O1 compiles take?]
- To: jbuck at synopsys dot com (Joe Buck)
- Subject: Re: flow speed regression [Re: How long should -O1 compiles take?]
- From: Brad Lucier <lucier at math dot purdue dot edu>
- Date: Wed, 6 Oct 1999 08:14:25 -0500 (EST)
- Cc: lucier at math dot purdue dot edu (Brad Lucier), rth at cygnus dot com, law at cygnus dot com, gcc at gcc dot gnu dot org, hosking at cs dot purdue dot edu, feeley at iro dot umontreal dot ca, gcc-patches at gcc dot gnu dot org
> > The first column is today's gcc-2.96 with Richard's patch, compiled
> > with gcc-2.95.1. Unfortunately, compile times are still nearly three
> > times as long as with egcs-1.1.2.
>
> How does the code quality (size, speed) compare with egcs-1.1.2?
Here's some data:
                          (1)        (2)       (3)       (4)
egcs-1.1.2 -O1            496.0u     1847624   2130084   187.0u
gcc-2.96 19991005 -O1     933.68u    1837560   2003304   163.93u
egcs-1.1.2 -O2            39287.0u   1947160   2202316   188.0u
gcc-2.96 19991005 -O2     4395.95u   1932888   2109640   171.02u
This is a Scheme runtime library and Scheme->C compiler, itself written
in Scheme by Marc Feeley. (1) is the time to compile the C files for
the runtime and compiler with -mcpu=supersparc -fPIC;
(2) is the compiler size; (3) is the runtime size; and
(4) is the time it takes to compile all the Scheme source files
to C with the resulting compiler.
So with gcc-2.96 at -O1 the code is somewhat smaller, and the run time (4) is 12.5% shorter.
And, yes, -O2 results in larger, slower code.
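The -O1 comparison can be checked directly from the table above. The short Python sketch below copies the two -O1 rows and computes the relative changes; the dictionary keys and the function name are illustrative, not part of the original measurements. With these figures the compile time comes out at about 1.9 times that of egcs-1.1.2, and the run time at roughly 12% shorter.

```python
# Figures copied from the -O1 rows of the table above.
# Key names are illustrative labels for columns (1)-(4).
egcs = {"compile_s": 496.0, "compiler_bytes": 1847624,
        "runtime_bytes": 2130084, "run_s": 187.0}
gcc296 = {"compile_s": 933.68, "compiler_bytes": 1837560,
          "runtime_bytes": 2003304, "run_s": 163.93}


def percent_change(old, new):
    """Signed change from old to new, as a percentage of old."""
    return 100.0 * (new - old) / old


# Relative change of every column, gcc-2.96 versus egcs-1.1.2.
changes = {key: percent_change(egcs[key], gcc296[key]) for key in egcs}

# How many times longer the gcc-2.96 compile takes.
compile_ratio = gcc296["compile_s"] / egcs["compile_s"]

for key, pct in changes.items():
    print(f"{key:15s} {pct:+6.1f}%")
print(f"compile-time ratio: {compile_ratio:.2f}x")
```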
But those long -O1 compile times (especially before the flow.c fix)
are/were not really necessary. The long times in the global register
allocator and in flow.c come from algorithms that are worse than
linear in the number of edges or the number of registers. Richard got
rid of the flow problems by using a different data structure. Joern
suggested changing the data structure for the global register allocator
to a hash table, which may reduce the run times for the global
register allocator significantly.
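As a rough illustration of why the choice of data structure matters here, the Python sketch below contrasts a dense N x N conflict matrix with a hash set of conflicting pairs. It is not GCC's allocator code and not necessarily what Joern had in mind; the class names, the `cells` method, and the chain-shaped test input are all hypothetical. The point is only that when conflicts are sparse, the hashed form stores and visits work proportional to the number of conflicts rather than to the square of the number of registers.

```python
# Hypothetical illustration only; these are not GCC's data structures.

class DenseConflicts:
    """N x N boolean matrix: storage grows as N**2 regardless of sparsity."""

    def __init__(self, n_regs):
        self.n = n_regs
        self.bits = [[False] * n_regs for _ in range(n_regs)]

    def add(self, a, b):
        self.bits[a][b] = True
        self.bits[b][a] = True

    def conflicts(self, a, b):
        return self.bits[a][b]

    def cells(self):
        # Number of entries that must be allocated (and scanned in a full pass).
        return self.n * self.n


class HashedConflicts:
    """Hash set of unordered pairs: storage grows with the number of conflicts."""

    def __init__(self):
        self.pairs = set()

    def add(self, a, b):
        self.pairs.add((min(a, b), max(a, b)))

    def conflicts(self, a, b):
        return (min(a, b), max(a, b)) in self.pairs

    def cells(self):
        return len(self.pairs)


def build(store, edges):
    """Record every conflict edge in the given store and return it."""
    for a, b in edges:
        store.add(a, b)
    return store


n_regs = 500
# Assumed sparse input: each pseudo-register conflicts only with its neighbour.
edges = [(i, i + 1) for i in range(n_regs - 1)]

dense = build(DenseConflicts(n_regs), edges)
hashed = build(HashedConflicts(), edges)

print("dense cells: ", dense.cells())
print("hashed cells:", hashed.cells())
```

Both stores answer the same membership queries; they differ only in how much has to be allocated and walked when most pairs of registers do not conflict.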
Some applications really need a "pretty good" compiler that runs
relatively quickly. For example, I believe that the fastest way
to do genetic programming on image data is to compile the evolving
programs before testing them. When you're testing a program on
hundreds of thousands of pixels, it should pay to compile it. Last
fall, the Scheme->C compiler took three times as long to generate
the C code as egcs-1.whatever took to compile it, so I spent several
weeks reducing the Scheme->C run times to match the egcs run times. Now
gcc is slowing down by a factor of two or three, for an improvement
in code speed of 12.5% in the test above. It's not worth it: gcc is
now the slowest part of the entire genetic programming system,
by far. (Through program transformation techniques, we get the
average runtime of the evolving programs down to < 30 cycles/pixel,
so the Scheme->C + C->machine code compile time is twice as long as
the test time; the transformations themselves take only 1-2% of the total
time.) -O2 does not result in a reduction of total system time,
and neither does -O0.
So, I think it is important for the gcc developers to be careful about
what they put into -O1. I would expect only local optimizations, or
optimizations that can run in linear (or N log N) time. Either
find algorithms/data structures to do that, or keep it for -O2.
Brad Lucier