This is the mail archive of the
mailing list for the GCC project.
Re: Patch to Avoid Bad Prefetching
On Thu, Jun 4, 2009 at 11:53 PM, Shobaki, Ghassan
> I meant the time estimate.
> For example, consider a loop with 2 cache-missing memory ops and 4 CPU
> ops. Assuming that we set the min_insn_to_mem_ratio to 3, the current
> *imprecise* heuristic will calculate an insn-to-mem ratio of (2+4)/2=3
> and conclude that this loop will benefit from prefetching whether the 4
> CPU ops are integer adds or FP divides. A more precise calculation will
> go like this:
> Time for memory ops = 200 cycles (assuming a cache-miss latency of 200
> and that the machine can do 2 mem ops simultaneously)
> If the 4 CPU ops are integer arithmetic:
> Time for CPU ops = 4*1=4 (assuming that int arith ops take one cycle)
> Max potential benefit from prefetching = 4/204 = 2% (probably not
> significant enough to pay off for the prefetching cost)
> On the other hand,
> If the 4 CPU ops are FP divides:
> Time for CPU ops = 4*20=80 (assuming that FP divides take 20 cycles)
> Max potential benefit from prefetching = 80/(200+80) = 29% (most likely
> significant enough to pay off for the prefetching cost and deliver a good
> performance gain)
> As you can see, this is much more precise than simply looking at the
> ratio, but it requires a good time estimate for each operation. I assume
> that the function tree_num_loop_insns() internally computes such time
> estimates if we pass it time weights. Of course, I know that middle-end
Right. And the prefetcher already computes this as 'time' to compute
the ahead distance (I was suggesting to use that here as a start).
> time estimates will not be very precise compared to backend estimates.
> BTW, are the middle-end time estimates machine dependent?
No, they are not. And they are indeed very imprecise right now.
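
For reference, the arithmetic in the quoted example can be reproduced
with a small sketch like the one below. It only illustrates the two
calculations from the message. The function and variable names are
made up for this sketch and are not GCC functions, and the latencies
(200 cycles for the memory ops, 1 cycle per integer op, 20 cycles per
FP divide) are the assumptions stated in the quoted text, not measured
values.

```python
# Sketch of the two estimates discussed above. All names are hypothetical
# and the cycle counts are the assumptions from the quoted example.

def insn_to_mem_ratio(n_mem_ops, n_cpu_ops):
    """Ratio as described in the message: all insns divided by memory ops."""
    return (n_mem_ops + n_cpu_ops) / n_mem_ops


def max_prefetch_benefit(mem_time, n_cpu_ops, cycles_per_cpu_op):
    """Upper bound on the gain: CPU time that could overlap with the
    memory time, as a fraction of the unoverlapped total."""
    cpu_time = n_cpu_ops * cycles_per_cpu_op
    return cpu_time / (mem_time + cpu_time)


MIN_INSN_TO_MEM_RATIO = 3   # threshold assumed in the example
MEM_TIME = 200              # cycles for the 2 overlapping cache misses

ratio = insn_to_mem_ratio(2, 4)
int_benefit = max_prefetch_benefit(MEM_TIME, 4, 1)    # 4 integer ops
fp_benefit = max_prefetch_benefit(MEM_TIME, 4, 20)    # 4 FP divides

print(f"insn-to-mem ratio: {ratio:.0f} "
      f"(passes threshold: {ratio >= MIN_INSN_TO_MEM_RATIO})")
print(f"integer ops: max benefit {int_benefit:.0%}")
print(f"FP divides:  max benefit {fp_benefit:.0%}")
```

The ratio is the same (3) in both cases, while the time-based estimate
separates them: about 2% for the integer ops versus about 29% for the
FP divides.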