This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] Improve prefetch heuristics
- From: Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>
- To: "Fang, Changpeng" <Changpeng dot Fang at amd dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "rguenther at suse dot de" <rguenther at suse dot de>, "sebpop at gmail dot com" <sebpop at gmail dot com>
- Date: Wed, 12 May 2010 09:52:07 +0200
- Subject: Re: [patch] Improve prefetch heuristics
- References: <20100430010543.GA30055@kam.mff.cuni.cz> <1C13CD442679CE45A2E80AE9251D7EF921803A43@SAUSEXMBP01.amd.com>
Hi,
> > >> Patch5: 0005-Also-apply-the-insn-to-prefetch-ratio-heuristic-to-l.patch
> > >> This patch applies the instruction to prefetch ratio heuristic also to loops with
> > >> known loop trip count. It improves 416.gamess by 3~4% and 445.gobmk by 3%.
> >
> > in principle, I agree that the patch might be reasonable. But, I would like to see
> > the comparison with the results that you get by decreasing SIMULTANEOUS_PREFETCHES to
> > some reasonable value (say 10),
>
> First of all, decreasing SIMULTANEOUS_PREFETCHES to 10 seems to drop some useful prefetches and
> causes a ~20+% performance degradation on 470.lbm.
>
> For 416.gamess, with SIMULTANEOUS_PREFETCHES=10, the performance is still worse (~1%) than what I got
> by applying the instruction-to-prefetch-ratio heuristic. But we see a ~2% gain compared with
> SIMULTANEOUS_PREFETCHES=100 (default).
>
> For 445.gobmk, the performance is the same with SIMULTANEOUS_PREFETCHES=10 and =100. So, for
> this benchmark, too many prefetches may not be the reason for the degradation.
thanks for the comparison. The patch is OK,
Zdenek