This is the mail archive of the gcc@gcc.gnu.org
mailing list for the GCC project.
[lto] preliminary SPECint benchmark numbers
- From: Nathan Froyd <froydnj at codesourcery dot com>
- To: gcc at gcc dot gnu dot org
- Date: Mon, 24 Dec 2007 12:21:48 -0800
- Subject: [lto] preliminary SPECint benchmark numbers
In one of my recent messages about a patch to the LTO branch, I
mentioned that we could compile and successfully run all of the C
SPECint benchmarks except 176.gcc. Chris Lattner asked if I had done
any benchmarking now that real programs could be run; I said that I
hadn't but would try to do some soon. This is the result of that.
I don't have compile-time numbers, but I don't think they're good.
The LTO step for 176.gcc (basically -flto *.o, not counting the time to
compile the individual .o files) takes several minutes; the other
benchmarks each take a minute or more.
Executive summary: LTO is currently *not* a win.
In the table below, runtimes are in seconds. I ran the tests on an
8-core 1.6GHz machine with 8 GB RAM. I believe the machine was
relatively idle; I ran the tests over a weekend evening. The last merge
from mainline to the LTO branch was mainline r130155, so that's about
what the -O2 numbers correspond to--I don't think we've changed too much
core code on the branch. The % change figures are just in-my-head
estimates, using -O2 as a baseline.
Benchmark      -O2 (s)   -flto (s)   % change
164.gzip        174       176        + 1
175.vpr         139       143        + 3
181.mcf         162       166        + 3
186.crafty       65.2      66.6      + < 1
197.parser      240       261        + 9
253.perlbmk     119       133        + 13
254.gap          84.4      87        + 4
256.bzip2       131       145        + 11
300.twolf       202       193        - 4 (!)
176.gcc doesn't run correctly with LTO yet; 255.vortex didn't run
correctly with "mainline", but it did with -flto, which is curious. We
don't do C++ yet, so 252.eon is not included.
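The % change column can be recomputed exactly from the two runtime
columns. What follows is a minimal Python sketch, taking -O2 as the
baseline; the names RUNTIMES and percent_change are illustrative only and
are not part of any GCC or SPEC tooling.

```python
# Runtimes in seconds, copied from the table above: (benchmark, -O2, -flto).
RUNTIMES = [
    ("164.gzip", 174.0, 176.0),
    ("175.vpr", 139.0, 143.0),
    ("181.mcf", 162.0, 166.0),
    ("186.crafty", 65.2, 66.6),
    ("197.parser", 240.0, 261.0),
    ("253.perlbmk", 119.0, 133.0),
    ("254.gap", 84.4, 87.0),
    ("256.bzip2", 131.0, 145.0),
    ("300.twolf", 202.0, 193.0),
]


def percent_change(baseline: float, measured: float) -> float:
    """Change of `measured` relative to `baseline`, in percent.

    Positive means the -flto binary ran slower than the -O2 binary.
    """
    return (measured - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Print one line per benchmark with a signed percentage.
    for name, o2, lto in RUNTIMES:
        print(f"{name:<12} {percent_change(o2, lto):+6.1f}%")
```

Computed this way, a few rows differ slightly from the estimates in the
table: 186.crafty works out to roughly +2% and 253.perlbmk to roughly
+12%. The overall picture is unchanged.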
In general, things get worse with LTO, sometimes much worse. I can
think of at least three possible reasons off the top of my head:
- Alias information. We don't have any type-based alias information in
-flto, which hurts.
- We don't merge types between compilation units, which could account
for poor optimization behavior.
- I believe we lose some information in the LTO write/read process; edge
probabilities, estimated # instructions in functions, etc. get lost.
This hurts inlining decisions, block layout, alignment of jump
targets, etc. So there's information we need to write out or