This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM
- From: "Bin.Cheng" <amker dot cheng at gmail dot com>
- To: Bingfeng Mei <bmei at broadcom dot com>
- Cc: Vladimir Makarov <vmakarov at redhat dot com>, Ramana Radhakrishnan <ramana dot radhakrishnan at arm dot com>, "gcc.gcc.gnu.org" <gcc at gcc dot gnu dot org>
- Date: Wed, 25 Jun 2014 17:53:49 +0800
- Subject: Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM
On Wed, Jun 25, 2014 at 5:47 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Wed, Jun 25, 2014 at 5:26 PM, Bingfeng Mei <bmei@broadcom.com> wrote:
>> Thanks for the nice benchmarks, Vladimir.
>>
>> Why is GCC code size so much bigger than LLVM? Does -Ofast have more unrolling
> On the contrary, I don't think RTL unrolling is enabled by default in
> GCC at -O3/-Ofast. There is no unroll dump file at all unless
> -funroll-loops/-funroll-all-loops is explicitly specified.
To clarify: I did see cases in which GCC's RTL unroller is more
aggressive than LLVM's once it is explicitly enabled.
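
One way to check which optimizations a given level turns on is to ask GCC
itself with `gcc -Q -O3 --help=optimizers` and look for the unrolling
options in the report. Below is a minimal sketch of a parser for that
report. The function names and the SAMPLE text are illustrative
assumptions (SAMPLE is not captured from a real run), and the exact report
layout may differ between GCC versions.

```python
import re
import subprocess


def parse_optimizer_report(report: str) -> dict:
    """Map each -f option found in the report to True (enabled) or False."""
    status = {}
    for line in report.splitlines():
        # Expect lines shaped like "  -funroll-loops    [disabled]".
        m = re.match(r"\s+(-f[\w-]+)\s+\[(enabled|disabled)\]", line)
        if m:
            status[m.group(1)] = (m.group(2) == "enabled")
    return status


def query_gcc(level: str = "-O3", gcc: str = "gcc") -> dict:
    """Run the compiler and parse its report (needs gcc on PATH)."""
    out = subprocess.run(
        [gcc, "-Q", level, "--help=optimizers"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_optimizer_report(out)


# Illustrative excerpt only; a real report is much longer.
SAMPLE = """\
The following options control optimizations:
  -ftree-vectorize                      [enabled]
  -funroll-all-loops                    [disabled]
  -funroll-loops                        [disabled]
"""

if __name__ == "__main__":
    for option, enabled in sorted(parse_optimizer_report(SAMPLE).items()):
        print(option, "enabled" if enabled else "disabled")
```

Calling `query_gcc("-O3")["-funroll-loops"]` on a machine with GCC
installed would show what that particular compiler reports.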
>
> Thanks,
> bin
>
>> on GCC? It doesn't seem that the increased code size helps performance (164.gzip & 197.parser).
>> Are there comparisons for -O2? I guess that would be more useful for typical
>> mobile/embedded programmers.
>>
>> Bingfeng
>>
>>> -----Original Message-----
>>> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of
>>> Vladimir Makarov
>>> Sent: 24 June 2014 16:07
>>> To: Ramana Radhakrishnan; gcc.gcc.gnu.org
>>> Subject: Re: Comparison of GCC-4.9 and LLVM-3.4 performance on
>>> SPECInt2000 for x86-64 and ARM
>>>
>>> On 06/24/2014 10:57 AM, Ramana Radhakrishnan wrote:
>>> >
>>> > The ball-park number you have probably won't change much.
>>> >
>>> >>>
>>> >> Unfortunately, that is the only configuration I can use on my system
>>> >> because of the lack of libraries for other configurations.
>>> >
>>> > Using --with-fpu={neon / neon-vfpv4} shouldn't cause you ABI issues
>>> > with libraries for any other configurations. neon / neon-vfpv4 enable
>>> > use of the neon unit in a manner that is ABI compatible with the rest
>>> > of the system.
>>> >
>>> > For more on command line options for AArch32 and how they map to
>>> > various CPUs, you might find this blog interesting.
>>> >
>>> > http://community.arm.com/groups/tools/blog/2013/04/15/arm-cortex-a-processors-and-gcc-command-lines
>>> >
>>> >
>>> >>
>>> >> I don't think Neon can improve the SPECInt2000 score significantly,
>>> >> but maybe I am wrong.
>>> >
>>> > It probably won't improve the overall score by a large amount, but some
>>> > individual benchmarks will get some help.
>>> >
>>> There are a few benchmarks which benefit from autovectorization (eon
>>> particularly).
>>> >>> Did you add any other architecture specific options to your SPEC2k
>>> >>> runs ?
>>> >>>
>>> >>>
>>> >> No. The only option I used is -Ofast.
>>> >>
>>> >> Could you recommend the best options you think I should use for
>>> >> this processor?
>>> >>
>>> >
>>> > I would personally use --with-cpu=cortex-a15 --with-fpu=neon-vfpv4
>>> > --with-float=hard on this processor as that maps to the processor
>>> > available on that particular piece of silicon.
>>> Thanks, Ramana. Next time, I'll try these options.
>>> >
>>> > Also, given it's a big.LITTLE system, probably with kernel switching,
>>> > it may be better to also make sure that you are always running on the
>>> > big core.
>>> >
>>> The results are pretty stable. Also this version of Fedora does not
>>> implement switching from Big to Little processors.
>>
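
The --with-cpu/--with-fpu/--with-float values suggested in the quoted
message are configure-time defaults. On ARM targets the same settings can
normally be passed per compilation as -mcpu, -mfpu and -mfloat-abi, so a
rebuilt compiler should not be strictly required to try them. The sketch
below only illustrates that correspondence; the table and function names
are hypothetical helpers, not part of GCC. Note that -mfloat-abi=hard
changes the calling convention and needs matching system libraries.

```python
# Configure-time default -> per-compilation option (ARM targets).
CONFIGURE_TO_RUNTIME = {
    "--with-cpu": "-mcpu",
    "--with-fpu": "-mfpu",
    "--with-float": "-mfloat-abi",
}


def runtime_flags(configure_args):
    """Translate known --with-X=value arguments; ignore everything else."""
    flags = []
    for arg in configure_args:
        name, sep, value = arg.partition("=")
        if sep and name in CONFIGURE_TO_RUNTIME:
            flags.append(f"{CONFIGURE_TO_RUNTIME[name]}={value}")
    return flags


if __name__ == "__main__":
    suggested = [
        "--with-cpu=cortex-a15",
        "--with-fpu=neon-vfpv4",
        "--with-float=hard",
    ]
    # Combine with the optimization level used in the benchmark runs.
    print(" ".join(["gcc", "-Ofast"] + runtime_flags(suggested)))
```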