This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: RFA: patch - tuning gcc for Intel Nocona (64 bit).
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Vladimir Makarov <vmakarov at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Fri, 2 Apr 2004 00:21:45 +0200
- Subject: Re: RFA: patch - tuning gcc for Intel Nocona (64 bit).
- References: <406C74C9.9040305@redhat.com>
> The following patch tunes gcc for Nocona Xeon. The SPEC2000
> results for mainline gcc on a 2.8 GHz Nocona Xeon in 64-bit mode (using
> x86_64 insns) are given below.
>
> SPECFP2000 is 28% faster with the tuning. SPECInt2000 is 0.6%
> faster. The code size (text segment) is also usually smaller.
>
> IMHO, there are still other possibilities for tuning the code for
> Nocona. These are only the simplest ones.
>
> I see one problem: which tuning will be the default for x86_64? I
> don't think that e.g. Linux distributions will have different code for
> Intel Nocona and AMD Opteron.
>
> Vlad
>
>
> Base: -O2
> Peak: -O2 -mtune=nocona
>
> SPECInt2000     ---------- base ----------    ---------- peak ----------
>                 ref time  run time   ratio    ref time  run time   ratio
> ========================================================================
> 164.gzip            1400       186    753*        1400       183    764*
> 175.vpr             1400       218    641*        1400       215    650*
> 176.gcc             1100       102   1081*        1100      98.4   1118*
> 181.mcf             1800       362    497*        1800       364    494*
> 186.crafty          1000      85.7   1167*        1000      87.0   1150*
> 197.parser          1800       273    659*        1800       275    654*
> 252.eon                                 X                             X
> 253.perlbmk         1800       153   1176*        1800       152   1186*
> 254.gap             1100      95.8   1148*        1100      95.4   1153*
> 255.vortex          1900       154   1230*        1900       157   1214*
> 256.bzip2           1500       180    832*        1500       176    853*
> 300.twolf           3000       390    770*        3000       388    774*
> Est. SPECint_base2000                 868
> Est. SPECint2000                                                    873
>
> ----------------CINT2000-----------------
>  change         base         peak  benchmark
> -4.003%       101249        97196  197.parser
> -1.721%       137763       135392  175.vpr
> -3.579%       573087       552578  255.vortex
> -1.506%        30800        30336  256.bzip2
> -3.582%       550705       530980  253.perlbmk
> -4.354%       523548       500752  252.eon
> -1.416%       185083       182462  300.twolf
> -3.727%       471741       454160  254.gap
> -2.783%  1.41256e+06  1.37325e+06  176.gcc
> -0.065%       166675       166566  186.crafty
> -2.033%        37627        36862  164.gzip
> -1.723%        12074        11866  181.mcf
>
> SPECFp2000      ---------- base ----------    ---------- peak ----------
>                 ref time  run time   ratio    ref time  run time   ratio
> ========================================================================
> 168.wupwise         1600       171    936*        1600       132   1216*
> 171.swim            3100       270   1150*        3100       210   1477*
> 172.mgrid           1800       492    366*        1800       303    595*
> 173.applu           2100       306    686*        2100       231    908*
> 177.mesa            1400       122   1152*        1400       120   1167*
> 178.galgel                              X                             X
> 179.art             2600       334    779*        2600       326    798*
> 183.equake          1300       110   1179*        1300      93.4   1392*
> 187.facerec                             X                             X
> 188.ammp                                X                             X
> 189.lucas                               X                             X
> 191.fma3d                               X                             X
> 200.sixtrack        1100       435    253*        1100       242    455*
> 301.apsi            2600       429    606*        2600       369    705*
> Est. SPECfp_base2000                  706
> Est. SPECfp2000                                                     904
>
> ----------------CFP2000-----------------
>  change         base         peak  benchmark
>  5.005%         8792         9232  171.swim
> -1.158%        19340        19116  183.equake
> -0.251%        12769        12737  172.mgrid
> -1.619%        15316        15068  179.art
> -0.527%        28821        28669  168.wupwise
> -4.078%       495562       475354  177.mesa
> -0.072%       865962       865338  200.sixtrack
> -0.483%       125935       125327  301.apsi
>  4.811%        46231        48455  173.applu
>
> *************** extern int x86_prefetch_sse;
> *** 674,679 ****
> --- 677,687 ----
> builtin_define ("__pentium4"); \
> builtin_define ("__pentium4__"); \
> } \
> + else if (ix86_arch == PROCESSOR_NOCONA) \
> + { \
> + builtin_define ("__nocona"); \
> + builtin_define ("__nocona__"); \
I am not sure about this, but perhaps we want to define __pentium4 and
__pentium4__ here as well. Nocona is, after all, a variant of the P4 core,
so older code that tests those macros can still use it.
(We follow this practice for Pentium/PentiumMMX and so on.)
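
To illustrate the concern from the user-code side, here is a minimal sketch. The helper names are hypothetical and are not part of the patch or of GCC. It assumes a piece of code written before Nocona existed, which can only test the Pentium 4 macros; when built for Nocona, such code reaches its P4-specific path only if the compiler defines those macros too.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical legacy helper, written before Nocona existed: it only
   knows the Pentium 4 macros and falls back to generic code otherwise.  */
const char *legacy_tuned_path(void)
{
#if defined(__pentium4__) || defined(__pentium4)
    return "pentium4";
#else
    return "generic";
#endif
}

/* Returns 1 if the legacy code selected its Pentium 4 path, else 0.  */
int legacy_uses_p4_path(void)
{
    return strcmp(legacy_tuned_path(), "pentium4") == 0;
}

/* Prints which path was chosen at compile time.  */
void report_legacy_path(void)
{
    printf("legacy code path: %s\n", legacy_tuned_path());
}
```

If only __nocona and __nocona__ are defined for a Nocona target, code like this silently takes the "generic" branch even though the P4-specific path would presumably still be appropriate.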
Honza