This is the mail archive of the
mailing list for the GCC project.
Re: [patch] Improve loop array prefetch for IA-64
- From: "Steven Bosscher" <stevenb dot gcc at gmail dot com>
- To: "Davis, Mark" <mark dot davis at intel dot com>
- Cc: "Canqun Yang" <canqun at yahoo dot com dot cn>, gcc at gcc dot gnu dot org, gcc-patches at gcc dot gnu dot org
- Date: Sat, 3 Jun 2006 00:17:33 +0200
- Subject: Re: [patch] Improve loop array prefetch for IA-64
- References: <E11A89E888E04547A5E0158061F40A2008234B@hdsmsx412.amr.corp.intel.com>
On 6/2/06, Davis, Mark <email@example.com> wrote:
> Question: does gcc now know the difference between prefetching to cache L1 via
> "lfetch", as opposed to prefetching only to level L2 via "lfetch.nt1"?
The ia64 backend knows the difference, see the prefetch pattern in ia64.md.
But ia64 is the only backend that supports this kind of explicit
locality parameter. And since no-one from the ia64 community cared
much about gcc until recently, gcc's prefetching pass (which is
limited anyway) does not generate lfetch.nt1 or other prefetches with
explicit locality parameters.
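For anyone who wants to try explicit locality hints from C source instead of
waiting for the prefetching pass, here is a minimal sketch using GCC's
__builtin_prefetch. Its third argument is a locality hint from 0 (no temporal
locality) to 3 (high temporal locality). Which lfetch variant each value turns
into is decided by the prefetch pattern in ia64.md, so check the generated
assembly rather than assuming that locality 1 yields lfetch.nt1. The function
name and the prefetch distance of 16 elements are made up for illustration and
are not tuned for any particular machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_AHEAD 16

/* Sum an array of doubles, issuing a read prefetch with a low
   temporal-locality hint for an element a fixed distance ahead. */
double sum_with_prefetch(const double *a, size_t n)
{
    double s = 0.0;

    assert(a != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Second argument 0 = prefetch for read; third argument 1 =
           low temporal locality. Both must be compile-time constants. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 1);
        s += a[i];
    }
    return s;
}
```

Compiling this with -S for an ia64 target and looking for lfetch instructions
should show what the backend emits for a given locality value.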
> For floating point data, the latter is the only interesting case because float loads only
> access the L2. Thus using "lfetch" for floating point arrays will unnecessarily wipe out
> the contents of L1. (gcc 3.2.3 only seems to generate "lfetch", which is why I ask...)
You could experiment with this for ia64 by hacking issue_prefetch_ref
in tree-ssa-loop-prefetch.c to issue a prefetch to L2 for floating