This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[patch 3/4 v3] Prefetch: Ignore large values of prefetch_before
- From: Christian Borntraeger <borntraeger at de dot ibm dot com>
- To: Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>
- Cc: "gcc-patches" <gcc-patches at gcc dot gnu dot org>, Richard Guenther <rguenther at suse dot de>, Changpeng Fang <Changpeng dot Fang at amd dot com>, Andreas Krebbel <krebbel at linux dot vnet dot ibm dot com>
- Date: Fri, 7 May 2010 17:37:00 +0200
- Subject: [patch 3/4 v3] Prefetch: Ignore large values of prefetch_before
- References: <20100507135917.951354000@de.ibm.com> <20100507140449.603260000@de.ibm.com> <20100507145141.GA27608@kam.mff.cuni.cz>
Am Freitag 07 Mai 2010 16:51:41 schrieb Zdenek Dvorak:
> Missing space after "abs".
[...]
> No need for "abs" here.
Right. I also added a dot and a space at the end of the comments.
Prefetch: Ignore large values of prefetch_before
There is a heuristic in the prefetch code to detect multiple accesses
that will "meet" at some point in the future. The prefetch code assumes
that at that point one access has already been prefetched by the other
one, and therefore sets prefetch_before, stating that only the first
prefetch_before iterations need a prefetch.
The problem is that the current prefetch code does not issue a prefetch
if prefetch_before is set.
Most of the time this is not a problem, but with large values of
prefetch_before we might have exceeded the cache before the reuse
actually happens, or prefetch_before might be of the same order of
magnitude as the number of loop iterations.
This patch addresses the first problem: we simply reset
prefetch_before to PREFETCH_ALL if prefetch_before is too large.
The size of the L2 cache, divided by the step size or the cache line
size, is used as the breaking point.
I did not add a check in should_issue_prefetch_p, because I think we
should distinguish between (step > PREFETCH_BLOCK) and (step < PREFETCH_BLOCK).
On s390 this patch results in a 6.6% win for lbm. All other tests are
within the noise, with small losses and wins.
Bootstrapped and tested on s390x-ibm-linux-gnu.
Ok to apply?
Christian.
2010-05-07 Christian Borntraeger <borntraeger@de.ibm.com>
* tree-ssa-loop-prefetch.c (prune_ref_by_group_reuse): Reset
prefetch_before to PREFETCH_ALL if two accesses "meet" beyond
cache size.
Index: gcc/tree-ssa-loop-prefetch.c
===================================================================
*** gcc/tree-ssa-loop-prefetch.c.orig
--- gcc/tree-ssa-loop-prefetch.c
*************** prune_ref_by_group_reuse (struct mem_ref
*** 704,709 ****
--- 704,712 ----
hit_from = ddown (delta_b, PREFETCH_BLOCK) * PREFETCH_BLOCK;
prefetch_before = (hit_from - delta_r + step - 1) / step;
+ /* Do not reduce prefetch_before if we meet beyond cache size. */
+ if (prefetch_before > abs (L2_CACHE_SIZE_BYTES / step))
+ prefetch_before = PREFETCH_ALL;
if (prefetch_before < ref->prefetch_before)
ref->prefetch_before = prefetch_before;
*************** prune_ref_by_group_reuse (struct mem_ref
*** 734,739 ****
--- 737,745 ----
reduced_prefetch_block, align_unit);
if (miss_rate <= ACCEPTABLE_MISS_RATE)
{
+ /* Do not reduce prefetch_before if we meet beyond cache size. */
+ if (prefetch_before > L2_CACHE_SIZE_BYTES / PREFETCH_BLOCK)
+ prefetch_before = PREFETCH_ALL;
if (prefetch_before < ref->prefetch_before)
ref->prefetch_before = prefetch_before;