[patch] Improve prefetch heuristics

Fang, Changpeng Changpeng.Fang@amd.com
Mon May 3 21:20:00 GMT 2010


>> Patch5: 0005-Also-apply-the-insn-to-prefetch-ratio-heuristic-to-l.patch
>> This patch applies the instruction to prefetch ratio heuristic also to loops with
>> known loop trip count.  It improves 416.gamess by 3~4% and 445.gobmk by 3%.

>I think it would be better to find out why the prefetching for the loops in
>these examples is currently performed (or why it causes the degradation).  If
>the instruction count is too small, AHEAD should be big enough to prevent the
>prefetching.  Currently we just test whether est_niter <= ahead, which is
>rather aggressive. We should only emit the prefetches if # of iterations is
>significantly bigger than ahead, so that could be the place that needs to be
>fixed.  Which ...


The degradation comes from issuing too many prefetches. Here is an example:
ahead 5, unroll factor 1, trip count 216, insn count 49, mem ref count 15, prefetch count 15

Here we inserted one prefetch for each memory reference, i.e. 15 prefetches in a
49-instruction loop body. That seems like too many for this loop. I think the prefetches
compete with loads/stores for memory ports and bandwidth, and thus cause the performance
degradation in this case.
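
To make the numbers above concrete, here is a minimal sketch of an instruction to prefetch ratio check. It is not the code from the patch: the function names, the `_SKETCH` threshold of 10, and the formula (unrolled insn count divided by prefetch count, integer division) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold, picked only for this illustration. */
#define MIN_INSN_TO_PREFETCH_RATIO_SKETCH 10

/* Instructions per prefetch in the (unrolled) loop body.
   Assumed formula; integer division truncates toward zero. */
static int
insn_to_prefetch_ratio (int insn_count, int unroll_factor, int prefetch_count)
{
  assert (insn_count >= 0 && unroll_factor >= 1 && prefetch_count >= 1);
  return (unroll_factor * insn_count) / prefetch_count;
}

/* Return true if the loop has enough instructions per prefetch
   for prefetching to be considered worthwhile in this sketch. */
static bool
prefetch_ratio_ok (int insn_count, int unroll_factor, int prefetch_count)
{
  return insn_to_prefetch_ratio (insn_count, unroll_factor, prefetch_count)
         >= MIN_INSN_TO_PREFETCH_RATIO_SKETCH;
}
```

For the loop above, insn_to_prefetch_ratio (49, 1, 15) is 3, so under the assumed threshold this loop would get no prefetches.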


>> Patch6: 0006-Define-the-TRIP_COUNT_TO_AHEAD_RATIO-heuristic.patch
>> This patch defines the trip count to ahead ratio heuristic in the cost
>> model: don't generate prefetches for loops where the trip count is
>> less than TRIP_COUNT_TO_AHEAD_RATIO times the ahead iterations.

>... you actually do here.  This is OK (I assume you did some experiments with
>the value of TRIP_COUNT_TO_AHEAD_RATIO?)

>> Patch 7: 0007-Don-t-generate-prefetches-for-loops-with-small-trip-.patch
>> Don't generate prefetch for loops with small trip count.  We found that
>> loop prefetch is not effective for loops with small trip counts, which
>> usually have big loop bodies and appropriate prefetch scheduling is
>> required.  This patch improves 454.calculix by ~5%.

>This looks a bit redundant with the previous patch -- can't you just set
>TRIP_COUNT_TO_AHEAD_RATIO to 5 (or whatever), so that it would take care
>of this case as well?

Yes, patches 6 and 7 can be combined by adjusting TRIP_COUNT_TO_AHEAD_RATIO.
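
As a rough sketch of the trip count to ahead ratio check discussed above (not the patch itself): the function name, the value 4 for the constant, and the convention that a negative est_niter means "trip count unknown" are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical value; the value used in the patch is not stated here. */
#define TRIP_COUNT_TO_AHEAD_RATIO 4

/* Return true if the estimated trip count is large enough, relative to
   the prefetch distance AHEAD (in iterations), to emit prefetches.
   Assumption: a negative EST_NITER means the trip count is unknown,
   in which case this check does not reject the loop.  */
static bool
trip_count_allows_prefetch (long est_niter, long ahead)
{
  assert (ahead >= 0);
  if (est_niter < 0)
    return true;
  return est_niter >= TRIP_COUNT_TO_AHEAD_RATIO * ahead;
}
```

With ahead 5, this sketch rejects any loop with fewer than 20 iterations. Raising the constant raises that cutoff, which is how a single constant could also cover the small trip count case of patch 7.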

Thanks,

Changpeng









