This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch] Improve prefetch heuristics


Somehow gcc-patches got dropped from my CC.
I'm sending it back to the list.

Sebastian

On Fri, May 7, 2010 at 11:17, Sebastian Pop <sebpop@gmail.com> wrote:
> Changpeng,
>
> I committed 0002 and 0003 to trunk as revisions r159161 and r159162.
> Could you please send the corrected version of 0001 and add the
> suggested comment to 0004?
>
> Thanks,
> Sebastian
>
> On Fri, May 7, 2010 at 04:51, Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote:
>> Hi,
>>
>>> 0001-Reduce-useless-and-redundant-prefetches.patch
>>> ==========================================
>>> This patch has two parts:
>>> The first is in schedule_prefetches. If the actual unroll_factor is far less than what is
>>> required by the prefetch (i.e. the loop is not sufficiently unrolled), redundant prefetches
>>> will be introduced. For example, if the prefetch requires unrolling 16 times and the actual
>>> unroll factor is 1 (not unrolled), 15 out of 16 iterations of the loop will issue redundant
>>> prefetches (16 prefetches fall on the same cache line).
>>> We add the following lines in schedule_prefetches to disable prefetches for such cases:
>>> if (prefetch_mod / unroll_factor > 8)
>>>   continue;
>>
>> this part is OK.
>>
>>> The second part is in issue_prefetch_refs. If, for some reason, the computed "ahead"
>>> is too small, and the prefetch would most likely fall on the same cache line as the existing
>>> memory reference, the prefetch is considered useless. This patch avoids such useless
>>> prefetches.
>>>
>>> Prefetch distance is how far ahead we should issue the prefetch. The ideal prefetch distance
>>> is the prefetch latency. We should schedule the prefetch neither too early nor too late. In loop prefetching,
>>> we compute ahead, which is how many iterations ahead we should issue the prefetch. Essentially,
>>> ahead = prefetch_latency/loop_body_size. If the loop body is too big, ahead is very small (possibly
>>> less than 1 -- we round it to 1), and thus the address difference from the memory reference is too
>>> small.
>>
>> This part is not.  issue_prefetch_refs is the wrong place for this decision; the decision about which prefetches
>> will be issued should be made in schedule_prefetches.  Anyway, if I follow your reasoning, one
>> would conclude that we need to disable prefetching for loops with PREFETCH_LATENCY < time
>> (in loop_prefetch_arrays) completely.  This IMHO makes no sense, as one would expect
>> prefetching to be profitable exactly for these loops, which do enough extra work to make it
>> possible to hide memory latency through prefetching.
>>
>>> 0002-Dump-a-diagnostic-info-when-the-insn-to-mem-ratio-is.patch
>>> =======================================================================
>>> This patch adds diagnostic statements if the instruction to memory ratio is too small and the prefetch is
>>> not generated. This helps us find the reason why a prefetch is not generated in a loop.
>>> =======================================================================
>>
>> OK.
>>
>>> 0003-Account-for-loop-unrolling-in-the-insn-to-prefetch-r.patch
>>> =================================================================
>>> This patch accounts for loop unrolling when applying the instruction to prefetch heuristic for loops with unknown
>>> trip count. (unroll_factor * ninsns) is used to estimate the number of instructions in a loop. This approach is
>>> too simple, and is aggressive in generating prefetches. We add comments from Zdenek with suggestions
>>> for further improvements.
>>> ==================================================================
>>
>> OK.
>>
>>> 0004-Define-the-TRIP_COUNT_TO_AHEAD_RATIO-heuristic.patch
>>> ====================================================
>>> This patch defines the trip count to ahead ratio heuristic in the cost
>>> model: don't generate prefetches for loops where the trip count is
>>> less than TRIP_COUNT_TO_AHEAD_RATIO times the ahead iterations.
>>> ====================================================
>>
>> OK; but I would suggest updating the comment explaining TRIP_COUNT_TO_AHEAD_RATIO
>> ("For example, in a loop with a prefetch ahead distance of 10, supposing that
>> TRIP_COUNT_TO_AHEAD_RATIO is equal to ...") for its current value of 4.
>>
>> Zdenek
>>
>
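
For readers following the 0001 discussion, the two quantities involved can be
sketched in standalone C: the rounded "ahead" value and the redundancy check
from the quoted snippet. The helper names compute_ahead and is_redundant, the
macro name MAX_REDUNDANT_RATIO, and the round-up division are assumptions made
for illustration; this is not the code from the patch itself.

```c
#include <assert.h>

/* Threshold from the quoted snippet.  The macro name is hypothetical.  */
#define MAX_REDUNDANT_RATIO 8

/* Number of iterations ahead at which to issue the prefetch.
   Rounds up (an assumption of this sketch), so for any nonzero latency
   a loop body longer than the latency still yields 1.  */
static unsigned
compute_ahead (unsigned prefetch_latency, unsigned loop_body_time)
{
  assert (loop_body_time > 0);
  return (prefetch_latency + loop_body_time - 1) / loop_body_time;
}

/* Nonzero when the loop is unrolled so little, relative to what the
   prefetch needs (prefetch_mod), that most issued prefetches would hit
   a cache line that has already been prefetched.  */
static int
is_redundant (unsigned prefetch_mod, unsigned unroll_factor)
{
  assert (unroll_factor > 0);
  return prefetch_mod / unroll_factor > MAX_REDUNDANT_RATIO;
}
```

With the example from the thread, is_redundant (16, 1) is nonzero, so that
prefetch would be skipped, while is_redundant (16, 4) is zero. A loop body
longer than the latency, e.g. compute_ahead (200, 500), gives 1.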

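The trip count check described for 0004 can be sketched the same way. The
helper name trip_count_allows_prefetch and the convention that a negative
est_niter means "trip count unknown" (and is let through) are assumptions of
this sketch; only the ratio value of 4 comes from the review above.

```c
#include <assert.h>

/* Current value according to the review comment.  */
#define TRIP_COUNT_TO_AHEAD_RATIO 4

/* Nonzero when the loop is expected to run long enough for prefetches
   issued `ahead` iterations early to pay off.  A negative est_niter
   stands for an unknown trip count; letting such loops through is an
   assumption of this sketch.  */
static int
trip_count_allows_prefetch (long est_niter, unsigned ahead)
{
  assert (ahead > 0);
  if (est_niter < 0)
    return 1;
  return est_niter >= (long) TRIP_COUNT_TO_AHEAD_RATIO * (long) ahead;
}
```

With an ahead distance of 10 and a ratio of 4, a loop needs at least 40
iterations before this check lets prefetches through.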
