This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: Patch to Avoid Bad Prefetching

From: Steven Bosscher <stevenb dot gcc at gmail dot com>
To: "Shobaki, Ghassan" <Ghassan dot Shobaki at amd dot com>
Cc: Richard Guenther <richard dot guenther at gmail dot com>, Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>, gcc-patches at gcc dot gnu dot org
Date: Wed, 3 Jun 2009 09:11:54 +0200
Subject: Re: Patch to Avoid Bad Prefetching
References: <84fc9c000904150216h170269b7v70f5ae2a10110296@mail.gmail.com> <20090415155239.GA20198@kam.mff.cuni.cz> <D60441ACF713E84E9952ADA42F2E08B70149981A@ssvlexmb2.amd.com> <20090416010249.GB21250@kam.mff.cuni.cz> <D60441ACF713E84E9952ADA42F2E08B70149988B@ssvlexmb2.amd.com> <20090416052522.GB3021@kam.mff.cuni.cz> <D60441ACF713E84E9952ADA42F2E08B701499986@ssvlexmb2.amd.com> <20090416172534.GB19702@kam.mff.cuni.cz> <84fc9c000904240451g28482e85mcd202d894f29b3f@mail.gmail.com> <912DA18E911D8B418824641EF1541F3C417E84@ssvlexmb2.amd.com>

On Wed, Jun 3, 2009 at 2:19 AM, Shobaki, Ghassan
<Ghassan.Shobaki@amd.com> wrote:
> First Heuristic:  Disable prefetching in a loop if the
> potential benefit is insignificant. Prefetching improves
> performance by overlapping cache-missing memory operations
> with CPU operations. Therefore, if the loop does not have
> a significant amount of CPU operations for the machine to
> execute while waiting on cache misses, the gain from
> prefetching will be insignificant and hence unlikely to
> pay for the prefetching cost. To be precise, an upper bound
> on the benefit from prefetching can be computed by
> estimating the time needed to execute the CPU operations
> and dividing that by the time needed to execute the entire
> loop (with cache misses taken into account). However, this
> patch avoids these instruction-by-instruction calculations
> and adopts an approximation that simply looks at the ratio
> between the total instruction count and the memory reference
> count and disables prefetching if that ratio is less than a
> certain threshold (PREFETCH_MIN_INSN_TO_MEM_RATIO).
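
The approximation described in the quoted paragraph can be sketched in C roughly as follows. This is only an illustration, not the code from the patch: the function name, the parameter names, and the threshold value are all assumptions made for the example.

```c
#include <stdbool.h>

/* Illustrative threshold only. The quoted text names the parameter
   PREFETCH_MIN_INSN_TO_MEM_RATIO but does not state its value. */
#define EXAMPLE_MIN_INSN_TO_MEM_RATIO 3

/* Hypothetical helper: decide whether prefetching is worth attempting
   for a loop with insn_count instructions, mem_ref_count of which are
   memory references.  */
static bool prefetch_is_worthwhile(unsigned insn_count,
                                   unsigned mem_ref_count)
{
    /* No memory references means there is nothing to prefetch.  */
    if (mem_ref_count == 0)
        return false;

    /* Too few instructions per memory reference: there is little CPU
       work to overlap with the cache misses, so skip prefetching.  */
    return insn_count / mem_ref_count >= EXAMPLE_MIN_INSN_TO_MEM_RATIO;
}
```

With the example threshold, a loop of 30 instructions and 5 memory references (ratio 6) would keep prefetching, while one with 4 instructions and 2 memory references (ratio 2) would not.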


Is loop blocking already implemented in Graphite?  If so, it would be
interesting to see if you can use loop blocking with a prefetch in one
of the outer loops, e.g.:

! Original loop
REAL A(N,M), B(N,M)
DO J = 1, M
  DO I = 1, N
    A(I,J) = A(I,J) + B(I,J)
  ENDDO
ENDDO


! Transformed loop after blocking and inserting prefetches

REAL A(N,M), B(N,M)
DO J = 1, M, BS
  DO I = 1, N, BS
    PREFETCH(A(I,J+BS))    ! Or something like this, don't...
    PREFETCH(B(I,J+BS))    ! ...mind the details ;-)
    DO JJ = J, MIN(J+BS-1, M)
      DO II = I, MIN(I+BS-1, N)
        A(II,JJ) = A(II,JJ) + B(II,JJ)
      ENDDO
    ENDDO
  ENDDO
ENDDO


This way, the prefetches are issued once per block in the outer loops,
and you can raise the PREFETCH_MIN_INSN_TO_MEM_RATIO for the inner two
loops.
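
For comparison, the same blocking idea can be sketched in C using GCC's __builtin_prefetch. This is a rough sketch under stated assumptions, not something taken from the patch or from Graphite: the function name blocked_add, the IDX macro, the flat column-major layout (chosen to mirror the Fortran arrays), and the choice to prefetch only the first element of the block BS columns ahead are all made up for the example.

```c
#include <stddef.h>

/* Column-major index, mirroring the Fortran layout of A(I,J).  */
#define IDX(i, j, n) ((size_t)(j) * (size_t)(n) + (size_t)(i))

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/* Computes a += b over an n-by-m column-major array in bs-by-bs blocks,
   issuing the prefetches in the block loops rather than the inner loops.  */
void blocked_add(double *a, const double *b, size_t n, size_t m, size_t bs)
{
    if (bs == 0)
        return;  /* a zero block size would never advance */

    for (size_t j = 0; j < m; j += bs) {
        for (size_t i = 0; i < n; i += bs) {
            /* Hint the start of the block bs columns ahead, if any.
               a is read and written (rw = 1), b is only read (rw = 0).  */
            if (j + bs < m) {
                __builtin_prefetch(&a[IDX(i, j + bs, n)], 1, 3);
                __builtin_prefetch(&b[IDX(i, j + bs, n)], 0, 3);
            }

            /* Clamp the block so partial blocks at the edges are handled.  */
            size_t j_end = min_sz(j + bs, m);
            size_t i_end = min_sz(i + bs, n);
            for (size_t jj = j; jj < j_end; jj++)
                for (size_t ii = i; ii < i_end; ii++)
                    a[IDX(ii, jj, n)] += b[IDX(ii, jj, n)];
        }
    }
}
```

A single prefetch per array touches only one cache line of the next block, so a real implementation would probably need one prefetch per cache line or per column of the block; the sketch only shows where the prefetches sit relative to the inner loops.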

Ciao!
Steven

References:
- RE: Patch to Avoid Bad Prefetching
  - From: Shobaki, Ghassan
