[PATCH] Enabling Software Prefetching by Default at -O3
Christian Borntraeger
borntraeger@de.ibm.com
Sun Jun 20 00:11:00 GMT 2010
On Saturday 19 June 2010, 00:07:07, H.J. Lu wrote:
> > We never tried software prefetch since Intel processors rarely need
> > it. We will try it. It will take some time.
> >
>
> Here is what we got on Intel Core i7 64bit with -O3 vs -O3
> -fprefetch-loop-arrays:
>
> 400.perlbench -0.369004%
> 401.bzip2 -3%
> 403.gcc -1.5748%
> 429.mcf 0.784314%
> 445.gobmk -1.28755%
> 456.hmmer -1.67364%
> 458.sjeng 0%
> 462.libquantum 2.57827%
> 464.h264ref -0.806452%
> 471.omnetpp -1.51515%
> 473.astar -0.581395%
> 483.xalancbmk 0%
> SPECint(R)_base2006 -0.766284%
> 410.bwaves -1.27796%
> 416.gamess -1.2605%
> 433.milc 0%
> 434.zeusmp 1.24481%
> 435.gromacs -0.478469%
> 436.cactusADM -5.07813%
> 437.leslie3d -1.10294%
> 444.namd 0%
> 450.soplex 0.628931%
> 453.povray 0.392157%
> 454.calculix 0%
> 459.GemsFDTD -1.51515%
> 465.tonto 0%
> 470.lbm -1.16279%
> 481.wrf -0.442478%
> 482.sphinx3 2.51397%
> SPECfp(R)_base2006 -0.769231%
>
> It doesn't help Intel Core i7. I think it should be enabled
> with -mtune=XXX on x86 where prefetch improves
> performance on XXX.
It might also be worth investigating whether overriding the parameters per
-mtune=XXX results in an overall win for -fprefetch-loop-arrays. We did
that on s390, since the default values were not ideal there.
For example, we have several of these in override_options:
[...]
  if (!PARAM_SET_P (PARAM_PREFETCH_MIN_INSN_TO_MEM_RATIO))
    set_param_value ("prefetch-min-insn-to-mem-ratio", 2);
  if (!PARAM_SET_P (PARAM_SIMULTANEOUS_PREFETCHES))
    set_param_value ("simultaneous-prefetches", 6);
[...]
So if software prefetch is expensive on Core i7, you could tune the
parameters so that it happens less often there, or vice versa.
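
To illustrate why the PARAM_SET_P guard matters, here is a minimal standalone sketch of the pattern: a target-specific default is applied only when the user has not already chosen a value with --param. This is not GCC code. The struct, the table, the initial values, and the names find_param, user_set_param, tune_param, param_value and override_options_sketch are all hypothetical stand-ins for illustration; only the two parameter names and the values 2 and 6 come from the s390 snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GCC's parameter table (not the real params.h). */
struct param {
    const char *name;
    int value;          /* illustrative starting values, not GCC's defaults */
    bool set_by_user;   /* plays the role of what PARAM_SET_P reports */
};

static struct param params[] = {
    { "prefetch-min-insn-to-mem-ratio", 3, false },
    { "simultaneous-prefetches",        3, false },
};

/* Look up a parameter by name; returns NULL if it is unknown. */
struct param *find_param(const char *name)
{
    for (size_t i = 0; i < sizeof params / sizeof params[0]; i++)
        if (strcmp(params[i].name, name) == 0)
            return &params[i];
    return NULL;
}

/* Models an explicit "--param name=value" given by the user. */
void user_set_param(const char *name, int value)
{
    struct param *p = find_param(name);
    assert(p != NULL);
    p->value = value;
    p->set_by_user = true;
}

/* Models the guarded set_param_value call: the tuning default is
   applied only if the user did not pick a value. */
void tune_param(const char *name, int value)
{
    struct param *p = find_param(name);
    assert(p != NULL);
    if (!p->set_by_user)
        p->value = value;
}

/* Read back the current value of a parameter. */
int param_value(const char *name)
{
    struct param *p = find_param(name);
    assert(p != NULL);
    return p->value;
}

/* Hypothetical per-target hook, analogous in spirit to override_options. */
void override_options_sketch(void)
{
    tune_param("prefetch-min-insn-to-mem-ratio", 2);
    tune_param("simultaneous-prefetches", 6);
}
```

In this sketch, calling user_set_param ("simultaneous-prefetches", 4) before override_options_sketch () leaves that parameter at 4, while prefetch-min-insn-to-mem-ratio picks up the tuned value 2. A per--mtune variant for x86 could follow the same shape, with different values per processor.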
Christian