This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[patch 3/4] Prefetch: Ignore large values of prefetch_before
- From: Christian Borntraeger <borntraeger at de dot ibm dot com>
- To: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Cc: Richard Guenther <rguenther at suse dot de>, Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>, Changpeng Fang <Changpeng dot Fang at amd dot com>
- Date: Wed, 05 May 2010 12:09:19 +0200
- Subject: [patch 3/4] Prefetch: Ignore large values of prefetch_before
- References: <20100505100916.707088000@de.ibm.com>
There is a heuristic in the prefetch code that detects multiple accesses
which will "meet" at some point in the future. The prefetch code assumes
that, from that point on, one access is already "prefetched" by the other.
It therefore sets prefetch_before, stating that only the first
prefetch_before iterations need a prefetch.
The problem is that the current prefetch code does not issue a prefetch
at all once prefetch_before is set.
Most of the time this is not a problem, but with large prefetch_before
values the data may already have been evicted from the cache before the
reuse really happens, or prefetch_before may be of the same order of
magnitude as the number of loop iterations.
This is the current state of a patch addressing this problem: we simply
reset prefetch_before to PREFETCH_ALL if prefetch_before is too large.
I decided to use the number of L2 cache lines as the breaking point.
Bootstrapped and tested on s390x-ibm-linux-gnu.
Christian.
2010-05-05 Christian Borntraeger <borntraeger@de.ibm.com>
* tree-ssa-loop-prefetch.c (prune_ref_by_group_reuse): Ignore large
values of prefetch_before.
Index: b/gcc/tree-ssa-loop-prefetch.c
===================================================================
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -705,6 +705,9 @@ prune_ref_by_group_reuse (struct mem_ref
       hit_from = ddown (delta_b, PREFETCH_BLOCK) * PREFETCH_BLOCK;
       prefetch_before = (hit_from - delta_r + step - 1) / step;
+      /* Do not reduce prefetch_before if we meet beyond cache size.  */
+      if (prefetch_before > abs (L2_CACHE_SIZE_BYTES / step))
+        prefetch_before = PREFETCH_ALL;
       if (prefetch_before < ref->prefetch_before)
         ref->prefetch_before = prefetch_before;
@@ -735,6 +738,9 @@ prune_ref_by_group_reuse (struct mem_ref
                                reduced_prefetch_block, align_unit);
       if (miss_rate <= ACCEPTABLE_MISS_RATE)
         {
+          /* Do not reduce prefetch_before if we meet beyond cache size.  */
+          if (prefetch_before > abs (L2_CACHE_SIZE_BYTES / PREFETCH_BLOCK))
+            prefetch_before = PREFETCH_ALL;
           if (prefetch_before < ref->prefetch_before)
             ref->prefetch_before = prefetch_before;