This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64
I did some measurements (64-bit).
Experiment 1:
-O2 -funroll-loops vs -O2
-funroll-loops improves performance by 0.56% (geomean), which is not much:
Benchmark         O2     O2 unroll-loops    Change
164.gzip        1324                1331     0.56%
175.vpr         1694                1605    -5.24%
176.gcc         2293                2350     2.47%
181.mcf         1772                1788     0.90%
186.crafty      2320                2326     0.26%
197.parser      1166                1162    -0.32%
252.eon         2443                2529     3.50%
253.perlbmk     2410                2460     2.07%
254.gap         1987                2019     1.58%
255.vortex      2392                2406     0.58%
256.bzip2       1719                1715    -0.25%
300.twolf       2288                2308     0.88%
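
For anyone who wants to recompute the summary figure, here is a minimal sketch. It assumes the geomean is the geometric mean of the per-benchmark score ratios and that higher scores are better. The names SCORES, percent_change and geomean_gain are made up for illustration. The scores in the table are rounded, so the per-benchmark percentages recomputed this way can differ from the listed ones in the last digit.

```python
import math

# Scores copied from the Experiment 1 table:
# (benchmark, -O2 score, -O2 -funroll-loops score).
SCORES = [
    ("164.gzip", 1324, 1331),
    ("175.vpr", 1694, 1605),
    ("176.gcc", 2293, 2350),
    ("181.mcf", 1772, 1788),
    ("186.crafty", 2320, 2326),
    ("197.parser", 1166, 1162),
    ("252.eon", 2443, 2529),
    ("253.perlbmk", 2410, 2460),
    ("254.gap", 1987, 2019),
    ("255.vortex", 2392, 2406),
    ("256.bzip2", 1719, 1715),
    ("300.twolf", 2288, 2308),
]


def percent_change(base, new):
    """Relative change of new over base, in percent."""
    return (new - base) / base * 100.0


def geomean_gain(rows):
    """Geometric mean of the new/base ratios, as a percent gain."""
    # Averaging logarithms avoids overflow from multiplying many ratios.
    log_sum = sum(math.log(new / base) for _, base, new in rows)
    return (math.exp(log_sum / len(rows)) - 1.0) * 100.0


if __name__ == "__main__":
    for name, base, new in SCORES:
        print(f"{name:<12} {percent_change(base, new):+6.2f}%")
    print(f"{'geomean':<12} {geomean_gain(SCORES):+6.2f}%")
```

The same two helpers work for the other experiments if the score pairs are swapped in.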
Experiment 2: O3 vs O2:
The improvement on SPEC2k is larger than on the large internal programs
tested: geomean 2.38%.
Benchmark         O2      O3    Change
164.gzip        1324    1329     0.40%
175.vpr         1694    1700     0.31%
176.gcc         2293    2336     1.89%
181.mcf         1772    1739    -1.81%
186.crafty      2320    2323     0.14%
197.parser      1166    1252     7.39%
252.eon         2443    2645     8.23%
253.perlbmk     2410    2452     1.74%
254.gap         1987    2020     1.62%
255.vortex      2392    2473     3.39%
256.bzip2       1719    1766     2.74%
300.twolf       2288    2350     2.70%
Experiment 3: O2 lto vs O2: geomean 0.72%
Benchmark         O2    O2 LTO    Change
164.gzip        1324      1317    -0.53%
175.vpr         1694      1697     0.18%
176.gcc         2293      2291    -0.08%
181.mcf         1772      1760    -0.65%
186.crafty      2320      2245    -3.26%
197.parser      1166      1163    -0.29%
252.eon         2443      2576     5.44%
253.perlbmk     2410      2433     0.93%
254.gap         1987      1995     0.36%
255.vortex      2392      2588     8.19%
256.bzip2       1719      1729     0.56%
300.twolf       2288      2248    -1.77%
David
On Mon, Nov 15, 2010 at 9:54 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> For peak, FDO is the most effective option. It can boost performance
>> by 7-10% depending on the program. The options you suggested probably
>> won't make too big a dent. -funroll-loops can hurt performance
>> without profiling. More aggressive inlining, ipa-cp, unswitching, etc.
>
> -funroll-loops was a 2.2% win overall on SPECint, and -funroll-all-loops 2.5%, the last
> time I noted down the SPECint results for this (that was in 2003, heh :)
> http://www.ucw.cz/~hubicka/papers/amd64/node4.html
>
>> enabled by O3 may help a little, if at all. -ffast-math won't
>> help for integer benchmarks other than eon. Traditionally, O3 helps
>> FP performance because of the loop transformations it enables, but this
>> is not the case for gcc for now.
>
> Function inlining definitely helps. -O3 also implies vectorization and other stuff.
>
> Honza
>>
>> Thanks,
>>
>> David
>>
>> On Mon, Nov 15, 2010 at 4:29 AM, Andrey Belevantsev <abel@ispras.ru> wrote:
>> > Hello,
>> >
>> > On 14.11.2010 0:08, Xinliang David Li wrote:
>> >>
>> >> I re-measured the performance difference using trunk gcc and trunk
>> >> clang/llvm on a core-2 box. -fno-strict-aliasing is added to gcc
>> >> because clang/llvm's type-based aliasing is incomplete and not
>> >> enabled by default. I also added -fomit-frame-pointer to clang/llvm as
>> >> this is gcc's default. The base option is -O2.
>> >
>> > It would be very interesting to compare also peak numbers, i.e. with LTO and
>> > strict aliasing enabled, as well as -O3 and -ffast-math/-funroll-loops,
>> > similar to Vlad's or OpenSUSE's options. Can you try to measure these?
>> > Maybe you can also run SPEC2k6, if there are enough machine resources, but
>> > that's probably asking too much...
>> >
>> > Andrey
>> >
>> >
>