This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Compiler Analysis: 3.3, 3.4, or tree-ssa?
- From: Scott Robert Ladd <coyote at coyotegulch dot com>
- To: Falk Hueffner <falk dot hueffner at student dot uni-tuebingen dot de>
- Cc: gcc mailing list <gcc at gcc dot gnu dot org>
- Date: Fri, 17 Oct 2003 06:30:51 -0400
- Subject: Re: Compiler Analysis: 3.3, 3.4, or tree-ssa?
- References: <3F8D50D5.1020401@coyotegulch.com> <87ptgwkyhu.fsf@student.uni-tuebingen.de>
Falk Hueffner wrote:
> Interestingly, I did something similar just this week. My attempt was
> pretty half-assed though, just a small Python script. In each
> generation, it keeps the best 16 and creates 16 new by crossover of
> two random others. The genome is based only on bits, which is also
> not so great, because I end up with stuff like "-mmemory-latency=12
> -mmemory-latency=256 -mmemory-latency=3".
My Acovea framework is written in C++; I began working on it early this
year, in my spare time.
Until two months ago, I, too, was using bit-strings to represent
compiler option sets. That proved inadequate, considering that some
options have numeric values, while others can have several specific
states (e.g., -mfpmath=387, -mfpmath=sse, and -mfpmath=sse,387). I found
a combination of polymorphism and templates to provide the most
flexibility and functional coverage.
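[Editorial illustration, not part of the original message.] One way to picture typed options, as opposed to raw bits, is a small class hierarchy in which each gene knows how to randomize itself and how to render itself as command-line flags. The sketch below is in Python and uses only polymorphism; it does not reproduce Acovea's C++ templates, and every class name, the value range for -mmemory-latency, and the mixing of x86 and Alpha flags in one genome are assumptions chosen purely to mirror the examples in this thread.

```python
import random


class Option:
    """One gene: renders to zero or more command-line flags."""

    def randomize(self, rng):
        raise NotImplementedError

    def render(self):
        raise NotImplementedError


class ToggleOption(Option):
    """A flag that is simply present or absent."""

    def __init__(self, flag):
        self.flag = flag
        self.enabled = False

    def randomize(self, rng):
        self.enabled = rng.random() < 0.5

    def render(self):
        return [self.flag] if self.enabled else []


class EnumOption(Option):
    """A flag that takes exactly one of several named states."""

    def __init__(self, prefix, choices):
        self.prefix = prefix
        self.choices = list(choices)
        self.value = self.choices[0]

    def randomize(self, rng):
        self.value = rng.choice(self.choices)

    def render(self):
        return [self.prefix + self.value]


class NumericOption(Option):
    """A flag that carries one integer from a range (range is hypothetical)."""

    def __init__(self, prefix, low, high):
        self.prefix = prefix
        self.low, self.high = low, high
        self.value = low

    def randomize(self, rng):
        self.value = rng.randint(self.low, self.high)

    def render(self):
        return [f"{self.prefix}{self.value}"]


def command_line(genome):
    """Flatten a list of Option genes into a flat list of flags."""
    return [flag for gene in genome for flag in gene.render()]


genome = [
    ToggleOption("-funroll-loops"),
    EnumOption("-mfpmath=", ["387", "sse", "sse,387"]),
    NumericOption("-mmemory-latency=", 1, 256),
]
```

Since each gene renders itself exactly once, a multi-state or numeric option can appear at most once in the generated command line, whatever values the search assigns to it.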
At the moment, I'm testing combinations of approximately 50 different
options. I can test with and without certain option sets; for example,
most of my current runs avoid any options implied by
-ffast-math, since those options tend to mask the effects of other
optimizations.
Once I have the runtime "speed" tests complete, I'll work on my accuracy
test, then (given time) on a compile-time optimization. The framework
should be capable of handling different goals. In essence, Acovea solves
a minimization problem, a task well-suited to evolutionary analysis.
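[Editorial illustration, not part of the original message.] Framing the search as minimization means the goal can be changed by swapping the cost function while the selection logic stays the same. The Python sketch below is a minimal illustration of that idea; the function names are hypothetical, and timing a Python callable is only a stand-in for timing a compiled benchmark.

```python
import time


def best_time(task, repeats=3):
    """Lowest wall-clock time in seconds over `repeats` calls of task()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        best = min(best, time.perf_counter() - start)
    return best


def fittest(candidates, cost):
    """Minimization: the fittest candidate is the one with the lowest cost."""
    return min(candidates, key=cost)


# Toy usage: pick the workload that finishes fastest.
workloads = {
    "small": lambda: sum(range(1_000)),
    "large": lambda: sum(range(1_000_000)),
}
winner = fittest(workloads, lambda name: best_time(workloads[name]))
```

In a real run, `task` would typically wrap something like `subprocess.run` on the compiled benchmark, and the cost passed to `fittest` could just as well be a numeric error (for an accuracy goal) or the compiler's own run time (for a compile-time goal).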
> Still, I got some interesting results, e. g., all of the fittest
> individuals have "-mno-bwx", which for this test case (gzip) is
> probably the option I would have bet to have the *worst* possible
> effect (it turns off byte access instructions on Alpha). Overall, I
> could improve run time on my gzip test case from 10s at -O3 to 9.3s.
I've been able to identify improvements over -O3 (and -O2 and -O1) of up
to 40%. Note that most of my early tests involve floating-point
intensive code, given that my work often requires number crunching. I'm
testing other benchmarks (bit-twiddling, text processing, etc.) at the
moment.
The initial tests focus on Pentium systems, since those are what I have
at hand. Once I get the first cut published, I'll expand into other
processors (SPARC, as I have the hardware, and AMD if I can procure the
hardware) and languages other than C and C++ (GNU Fortran 95, for example).
> Is your framework available somewhere? I'd be interested in comparing
> the methods...
The Acovea framework is code- and concept-complete, and is in testing
even as we speak. If testing goes well, I'll release the code in the
next week or so.
--
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing