[patch] Improve prefetch heuristics
Fang, Changpeng
Changpeng.Fang@amd.com
Mon May 3 21:20:00 GMT 2010
>> Patch5: 0005-Also-apply-the-insn-to-prefetch-ratio-heuristic-to-l.patch
>> This patch applies the instruction to prefetch ratio heuristic also to loops with
>> known loop trip count. It improves 416.gamess by 3~4% and 445.gobmk by 3%.
>I think it would be better to find out why the prefetching for the loops in
>these examples is currently performed (or why it causes the degradation). If
>the instruction count is too small, AHEAD should be big enough to prevent the
>prefetching. Currently we just test whether est_niter <= ahead, which is
>rather aggressive. We should only emit the prefetches if # of iterations is
>significantly bigger than ahead, so that could be the place that needs to be
>fixed. Which ...
The reason is that there are too many prefetches. Here is an example:
Ahead 5, unroll factor 1, trip count 216, insn count 49, mem ref count 15, prefetch count 15
Here we inserted one prefetch for each memory reference, and I am uncomfortable with
inserting that many prefetches in this loop. I think the prefetches compete with loads/stores
for memory ports and bandwidth, and thus cause the performance degradation in this case.
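
To make the ratio concrete, here is a rough sketch in C of an instruction-to-prefetch check applied to the numbers above. The function name and the threshold value are assumptions made up for illustration; they are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical threshold: the minimum number of loop instructions
   required per prefetch.  The value 9 is an assumption chosen for
   illustration only.  */
#define MIN_INSN_TO_PREFETCH_RATIO 9

/* Return true if the loop issues too many prefetches relative to its
   instruction count.  Integer division truncates the ratio, so
   49 insns with 15 prefetches gives a ratio of 3.  */
static bool
too_many_prefetches_p (unsigned ninsns, unsigned prefetch_count)
{
  /* No prefetches means nothing to reject.  */
  if (prefetch_count == 0)
    return false;

  return ninsns / prefetch_count < MIN_INSN_TO_PREFETCH_RATIO;
}
```

For the loop in the example, 49 / 15 is 3 in integer arithmetic, so any threshold above 3 would reject prefetching no matter whether the trip count is known.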
>> Patch6: 0006-Define-the-TRIP_COUNT_TO_AHEAD_RATIO-heuristic.patch This
>> patch defines the trip count to ahead ratio heuristic in the cost
>> model: don't generate prefetches for loops where the trip count is
>> less than TRIP_COUNT_TO_AHEAD_RATIO times the ahead iterations.
>... you actually do here. This is OK (I assume you did some experiments with
>the value of TRIP_COUNT_TO_AHEAD_RATIO?)
>> Patch 7: 0007-Don-t-generate-prefetches-for-loops-with-small-trip-.patch
>> Don't generate prefetch for loops with small trip count. We found that
>> loop prefetch is not effective for loops with small trip counts, which
>> usually have big loop bodies and appropriate prefetch scheduling is
>> required. This patch improves 454.calculix by ~5%.
>This looks a bit redundant with the previous patch -- can't you just set
>TRIP_COUNT_TO_AHEAD_RATIO to 5 (or whatever), so that it would take care
>of this case as well?
Yes, patches 6 and 7 can be combined by adjusting TRIP_COUNT_TO_AHEAD_RATIO.
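
A minimal sketch of what the combined check could look like, assuming a single threshold covers both cases. The function name is hypothetical, and the value 5 is only the illustrative figure from the review comment, not a tuned setting.

```c
#include <stdbool.h>

/* Illustrative value only ("5 (or whatever)" in the review); the
   actual setting would come from experiments.  */
#define TRIP_COUNT_TO_AHEAD_RATIO 5

/* Return true if the loop is expected to run too few iterations for
   prefetching AHEAD iterations in advance to pay off.  A negative
   EST_NITER means the trip count is unknown, in which case this
   check does not reject the loop.  */
static bool
trip_count_too_small_p (long est_niter, unsigned ahead)
{
  if (est_niter < 0)
    return false;

  return est_niter < (long) TRIP_COUNT_TO_AHEAD_RATIO * (long) ahead;
}
```

With ahead 5 the cutoff in this sketch is 25 iterations, so the example loop with trip count 216 passes this check and would have to be rejected by the instruction-to-prefetch ratio instead.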
Thanks,
Changpeng