This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [patch, PR44297] prefetch improvements to fix 465.tonto from non-constant step prefetching
- From: "Fang, Changpeng" <Changpeng dot Fang at amd dot com>
- To: Nathan Froyd <froydnj at codesourcery dot com>
- Cc: Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "sebpop at gmail dot com" <sebpop at gmail dot com>
- Date: Mon, 7 Jun 2010 18:20:37 -0500
- Subject: RE: [patch, PR44297] prefetch improvements to fix 465.tonto from non-constant step prefetching
- References: <D4C76825A6780047854A11E93CDE84D02F7723@SAUSEXMBP01.amd.com>,<20100607231237.GL19235@codesourcery.com>
Hi, Nathan:
Thanks for pointing that out. Here is the attachment.
(Zdenek sent me the same notice and I replied with the attachment; unfortunately,
that reply went off the mailing list.)
Thanks,
Changpeng
________________________________________
From: Nathan Froyd [froydnj@codesourcery.com]
Sent: Monday, June 07, 2010 6:12 PM
To: Fang, Changpeng
Cc: Zdenek Dvorak; gcc-patches@gcc.gnu.org; sebpop@gmail.com
Subject: Re: [patch, PR44297] prefetch improvements to fix 465.tonto from non-constant step prefetching
On Mon, Jun 07, 2010 at 05:26:58PM -0500, Fang, Changpeng wrote:
> Attached is the patch to fix 465/tonto regression (> 9%) caused (or
> exposed) by non-constant step prefetching.
You did not attach the patch.
-Nathan
From 989d719dec43195efdffce4e3dce90d3c291f9b1 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@houghton.(none)>
Date: Mon, 7 Jun 2010 14:57:27 -0700
Subject: [PATCH 2/2] Account for prefetch_mod and unroll_factor in the computation of the prefetch count
* tree-ssa-loop-prefetch.c (estimate_prefetch_count): Use prefetch_mod
and unroll_factor to estimate the prefetch_count.
(loop_prefetch_arrays): Re-estimate the prefetch count by considering
the unroll_factor and prefetch_mod for is_loop_prefetching_profitable.
(is_loop_prefetching_profitable): Return false if the estimated prefetch
count is 0.
---
gcc/tree-ssa-loop-prefetch.c | 18 +++++++++++++-----
1 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index d63ede1..dd3dabf 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -987,18 +987,24 @@ schedule_prefetches (struct mem_ref_group *groups, unsigned unroll_factor,
   return any;
 }
-/* Estimate the number of prefetches in the given GROUPS. */
+/* Estimate the number of prefetches in the given GROUPS.
+   UNROLL_FACTOR is the factor by which LOOP was unrolled. */
 static int
-estimate_prefetch_count (struct mem_ref_group *groups)
+estimate_prefetch_count (struct mem_ref_group *groups, unsigned unroll_factor)
 {
   struct mem_ref *ref;
+  unsigned n_prefetches;
   int prefetch_count = 0;
   for (; groups; groups = groups->next)
     for (ref = groups->refs; ref; ref = ref->next)
       if (should_issue_prefetch_p (ref))
-        prefetch_count++;
+        {
+          n_prefetches = ((unroll_factor + ref->prefetch_mod - 1)
+                          / ref->prefetch_mod);
+          prefetch_count += n_prefetches;
+        }
   return prefetch_count;
 }
@@ -1620,7 +1626,7 @@ is_loop_prefetching_profitable (unsigned ahead, HOST_WIDE_INT est_niter,
 {
   int insn_to_mem_ratio, insn_to_prefetch_ratio;
-  if (mem_ref_count == 0)
+  if (mem_ref_count == 0 || prefetch_count == 0)
     return false;
   /* Prefetching improves performance by overlapping cache missing
@@ -1709,7 +1715,7 @@ loop_prefetch_arrays (struct loop *loop)
   /* Step 2: estimate the reuse effects. */
   prune_by_reuse (refs);
-  prefetch_count = estimate_prefetch_count (refs);
+  prefetch_count = estimate_prefetch_count (refs, 1);
   if (prefetch_count == 0)
     goto fail;
@@ -1733,6 +1739,8 @@ loop_prefetch_arrays (struct loop *loop)
              ahead, unroll_factor, est_niter,
              ninsns, mem_ref_count, prefetch_count);
+  /* Re-estimate the prefetch count based upon loop unrolling. */
+  prefetch_count = estimate_prefetch_count (refs, unroll_factor);
   if (!is_loop_prefetching_profitable (ahead, est_niter, ninsns, prefetch_count,
                                        mem_ref_count, unroll_factor))
     goto fail;
--
1.6.3.3
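
The count added to estimate_prefetch_count in the patch above is a ceiling
division: each reference contributes ceil (unroll_factor / prefetch_mod)
prefetches to the unrolled loop body. The following is a minimal standalone
sketch of that arithmetic only, not the GCC code itself. The names
ref_sketch, issue_prefetch and count_prefetches are hypothetical stand-ins
for struct mem_ref, should_issue_prefetch_p and estimate_prefetch_count, and
the sketch assumes prefetch_mod is never zero.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's struct mem_ref, reduced to the two
   pieces of information the count depends on.  */
struct ref_sketch
{
  unsigned prefetch_mod;  /* Prefetch once every PREFETCH_MOD iterations.  */
  int issue_prefetch;     /* Nonzero if a prefetch should be issued.  */
};

/* Return the estimated number of prefetches in a loop body unrolled
   UNROLL_FACTOR times.  Each selected reference contributes
   ceil (UNROLL_FACTOR / prefetch_mod), computed in integer arithmetic
   as (a + b - 1) / b.  */
static unsigned
count_prefetches (const struct ref_sketch *refs, unsigned n_refs,
                  unsigned unroll_factor)
{
  unsigned i, count = 0;

  for (i = 0; i < n_refs; i++)
    if (refs[i].issue_prefetch)
      {
        /* Assumption of this sketch: a zero modulus never occurs.  */
        assert (refs[i].prefetch_mod != 0);
        count += ((unroll_factor + refs[i].prefetch_mod - 1)
                  / refs[i].prefetch_mod);
      }

  return count;
}
```

With three selected references whose prefetch_mod values are 1, 4 and 8, the
sketch gives 3 for an unroll factor of 1 (the same as the old one-per-reference
count), 6 for an unroll factor of 4, and 11 for an unroll factor of 8. This is
why the patch passes 1 before the unroll factor is known and re-estimates
afterwards.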