This is the mail archive of the
mailing list for the GCC project.
Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64
Xinliang David Li wrote:
On Thu, Apr 29, 2010 at 9:25 AM, Vladimir Makarov <firstname.lastname@example.org> wrote:
Thanks for the comments. FDO will probably improve the SPEC2000 score,
although that is not obvious for some tests: their train data sets
differ from the reference data sets, so the profile might
actually mislead the compiler.
GCC-4.5.0 and LLVM-2.7 were released recently. To understand
where we stand after releasing GCC-4.5.0 I benchmarked it on SPEC2000
for x86/x86-64 and posted the comparison of it with the
previous GCC releases and LLVM-2.7.
Even benchmarking SPEC2000 takes a lot of time on the fastest
machine I have. So I don't plan to use SPEC2006 for this in the near
future.
You can find the comparison at
http://vmakarov.fedorapeople.org/spec/ (please just click the links at the
bottom of the left frame, starting with the link "GCC release comparison").
If you need exact numbers, please use the tables (the links to them
are also given) which were used to generate the corresponding bar
graphs.
In general, GCC-4.5.0 became faster at compiling (up to 10%) in -O2 mode.
This is the first considerable compilation speed improvement since GCC-4.2.
GCC-4.5.0 also generates better code than the previous release (1-2% on
average, up to 4% for x86-64 SPECFP2000 in -O2 mode). That does not
include LTO and Graphite, which can give even more (especially LTO) in
many cases.
GCC-4.5.0 has two big new optimizations, LTO and Graphite (more
accurately, Graphite was introduced in the previous release).
Therefore I ran additional benchmarks to test them.
LTO is a promising technology, especially for integer benchmarks, for
which it results in smaller and faster code. But it might also result in
a degradation on SPECFP2000, mainly because of big degradations on a
few benchmarks like wupwise or facerec. Another annoying thing about
LTO is that it considerably slows down the compiler.
The LTO improvement on spec2000int is only 1.86%:
Benchmark      4.5     4.5+lto   Improvement
164.gzip       955     950       -0.52%   <-- degrade
175.vpr        588     594        1.02%
176.gcc        1211    1216       0.41%
181.mcf        699     698       -0.14%
186.crafty     1011    987       -2.37%   <-- degrade
197.parser     792     813        2.65%
252.eon        1026    1023      -0.29%   <-- degrade
253.perlbmk    1312    1294      -1.37%   <-- degrade
254.gap        1021    1037       1.57%
255.vortex     1123    1319      17.45%
256.bzip2      737     768        4.21%
300.twolf      773     779        0.78%
SPECint2000    913     930        1.86%
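The Improvement column can be recomputed from the two score columns. The following is a minimal sketch in Python, not part of the original benchmarking setup: the names `SCORES`, `improvement_pct`, and `geometric_mean` are made up for illustration, and the geometric-mean step assumes the SPECint2000 row is the usual SPEC composite (a geometric mean of the per-benchmark scores).

```python
import math

# Scores copied from the table: benchmark -> (GCC 4.5, GCC 4.5 + LTO).
SCORES = {
    "164.gzip": (955, 950),
    "175.vpr": (588, 594),
    "176.gcc": (1211, 1216),
    "181.mcf": (699, 698),
    "186.crafty": (1011, 987),
    "197.parser": (792, 813),
    "252.eon": (1026, 1023),
    "253.perlbmk": (1312, 1294),
    "254.gap": (1021, 1037),
    "255.vortex": (1123, 1319),
    "256.bzip2": (737, 768),
    "300.twolf": (773, 779),
}


def improvement_pct(base, new):
    """Relative change of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


def geometric_mean(values):
    """Geometric mean, computed through logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumption: the composite row is the geometric mean of the 12 scores.
base_mean = geometric_mean(base for base, _ in SCORES.values())
lto_mean = geometric_mean(new for _, new in SCORES.values())

for name, (base, new) in SCORES.items():
    print(f"{name:12s} {improvement_pct(base, new):+6.2f}%")
print(f"composite    {base_mean:.0f} -> {lto_mean:.0f}")
```

The table's 1.86% is derived from the rounded composite scores (913 and 930), so a percentage computed from the unrounded means can differ slightly in the last digit.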
This matches our previous observation that to bring the best out of
LTO, FDO is also needed. (As a reference, LIPO improves over plain FDO
by ~4.5%, and vortex improves by 23%.) You will probably see an even smaller
improvement in SPEC2006.
FDO is important for optimizations in cases where all possible data sets
leave the branch probability distribution largely unchanged. IMHO that is
why FDO is not widely used by most developers (although I am sure that for
Google applications it is extremely important), and therefore I don't
measure it and it is not so interesting for me. A bigger reason not to use
FDO, though, is how inconvenient it is for a regular compiler user.
As for the vortex FDO improvement, vortex contains a moderate-size loop in
which most of the time is spent. The loop has an if-then-else at its top
level. On all SPEC2000 data sets, one branch of the if is taken practically
always (the ratio is something like 1 to 1,000,000). So it is not surprising
to me that FDO gives such an improvement for vortex.
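To make the branch-probability argument concrete, here is a toy model in Python. It is only a sketch: it does not reproduce how GCC instruments code or what the vortex loop actually does, and the names `taken_probability` and `common_case` as well as all three data sets are hypothetical. It counts how often a loop's if-condition holds on a training input and compares that with two possible reference inputs.

```python
from collections import Counter


def taken_probability(inputs, condition):
    """Fraction of loop iterations in which `condition` holds."""
    counts = Counter(bool(condition(x)) for x in inputs)
    total = counts[True] + counts[False]
    return counts[True] / total if total else 0.0


def common_case(record):
    """Hypothetical if-condition at the top level of a hot loop."""
    return record != 0


# Hypothetical data sets, for illustration only.
train_set = [1] * 999_999 + [0]            # roughly 1 to 1,000,000 bias
ref_like_train = [1] * 1_999_998 + [0, 0]  # larger input, same bias
ref_unlike_train = [1, 0] * 1000           # same code, 50/50 bias

p_train = taken_probability(train_set, common_case)
p_like = taken_probability(ref_like_train, common_case)
p_unlike = taken_probability(ref_unlike_train, common_case)

print(f"train: {p_train:.6f}")
print(f"ref (similar bias): {p_like:.6f}")
print(f"ref (different bias): {p_unlike:.6f}")
```

When the reference input has the same bias as the training input, a layout chosen from the training profile stays appropriate. When the bias differs, as in the last data set, the profile describes behavior the real run does not have, which is the case where profile data can mislead the compiler.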
Usually after posting such comparisons, I get a lot of
requests. I'd like to do all of them, but unfortunately running the
benchmarks and preparing the results takes a lot of my time. Maybe I'll do
such a comparison next year.
It would be great if numbers were collected comparing LTO + FDO vs
plain FDO in the same setup.