[PATCH] Enabling Software Prefetching by Default at -O3

Christian Borntraeger borntraeger@de.ibm.com
Sun Jun 20 00:11:00 GMT 2010


On Saturday 19 June 2010, 00:07:07, H.J. Lu wrote:
> > We never tried software prefetch since Intel processors rarely need
> > it.  We will try it. It will take some time.
> >
> 
> Here is what we got on a 64-bit Intel Core i7 with -O3 vs -O3
> -fprefetch-loop-arrays:
> 
> 400.perlbench 			 -0.369004%
> 401.bzip2 			 -3%
> 403.gcc 			 -1.5748%
> 429.mcf 			 0.784314%
> 445.gobmk 			 -1.28755%
> 456.hmmer 			 -1.67364%
> 458.sjeng 			 0%
> 462.libquantum 			 2.57827%
> 464.h264ref 			 -0.806452%
> 471.omnetpp 			 -1.51515%
> 473.astar 			 -0.581395%
> 483.xalancbmk 			 0%
> SPECint(R)_base2006 			 -0.766284%
> 410.bwaves 			 -1.27796%
> 416.gamess 			 -1.2605%
> 433.milc 			 0%
> 434.zeusmp 			 1.24481%
> 435.gromacs 			 -0.478469%
> 436.cactusADM 			 -5.07813%
> 437.leslie3d 			 -1.10294%
> 444.namd 			 0%
> 450.soplex 			 0.628931%
> 453.povray 			 0.392157%
> 454.calculix 			 0%
> 459.GemsFDTD 			 -1.51515%
> 465.tonto 			 0%
> 470.lbm 			 -1.16279%
> 481.wrf 			 -0.442478%
> 482.sphinx3 			 2.51397%
> SPECfp(R)_base2006 			 -0.769231%
> 
> It doesn't help Intel Core i7.  I think it should be enabled
> with -mtune=XXX on x86 where prefetch improves
> performance on XXX.

It might also be worth investigating whether overriding the parameters per
-mtune=XXX results in an overall win for -fprefetch-loop-arrays. We did
that on s390 since the default values were not ideal;
e.g. we have several of these
[...]
  if (!PARAM_SET_P (PARAM_PREFETCH_MIN_INSN_TO_MEM_RATIO))
    set_param_value ("prefetch-min-insn-to-mem-ratio", 2);
  if (!PARAM_SET_P (PARAM_SIMULTANEOUS_PREFETCHES))
    set_param_value ("simultaneous-prefetches", 6);
[...]
in override_options.

For example, if software prefetch is expensive, you could make it happen
less often for Core i7, or vice versa.
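
To make the idea concrete, here is a minimal standalone sketch of the
"only override what the user did not set" pattern. It is not GCC code: the
parameter table, the helper names (find_param, user_set_param,
set_default_unless_user, apply_tune_defaults) and the "hypothetical-cpu"
entry are all made up for illustration, and the initial values in the table
are placeholders rather than GCC's real defaults. Only the two s390 values
(2 and 6) come from the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the compiler's parameter table. */
struct param
{
  const char *name;
  int value;
  int set_by_user;   /* nonzero once given explicitly, as with --param */
};

/* Placeholder initial values; not taken from GCC's params.def. */
static struct param params[] = {
  { "prefetch-min-insn-to-mem-ratio", 3, 0 },
  { "simultaneous-prefetches", 3, 0 },
};

static struct param *
find_param (const char *name)
{
  size_t i;
  for (i = 0; i < sizeof params / sizeof params[0]; i++)
    if (strcmp (params[i].name, name) == 0)
      return &params[i];
  assert (!"unknown parameter");
  return NULL;
}

int
get_param (const char *name)
{
  return find_param (name)->value;
}

/* Simulates an explicit user choice; it must win over any tuning default. */
void
user_set_param (const char *name, int value)
{
  struct param *p = find_param (name);
  p->value = value;
  p->set_by_user = 1;
}

/* Mirrors the PARAM_SET_P / set_param_value pairing: install a
   target-specific default only if the user has not set the parameter. */
void
set_default_unless_user (const char *name, int value)
{
  struct param *p = find_param (name);
  if (!p->set_by_user)
    p->value = value;
}

/* Per-tuning defaults, in the spirit of override_options.  */
void
apply_tune_defaults (const char *tune)
{
  if (strcmp (tune, "s390") == 0)
    {
      /* The two values quoted in the mail.  */
      set_default_unless_user ("prefetch-min-insn-to-mem-ratio", 2);
      set_default_unless_user ("simultaneous-prefetches", 6);
    }
  else if (strcmp (tune, "hypothetical-cpu") == 0)
    {
      /* Invented value, not measured: a higher minimum ratio should let
         fewer loops qualify, so prefetching happens less often.  */
      set_default_unless_user ("prefetch-min-insn-to-mem-ratio", 6);
    }
  /* Any other tuning keeps the generic values.  */
}
```

Calling apply_tune_defaults ("s390") on a fresh table yields 2 and 6; if
user_set_param ("simultaneous-prefetches", 4) was called first, that 4 is
kept and only the other parameter changes.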

Christian


