This is the mail archive of the mailing list for the GCC project.

RE: Patch to Avoid Bad Prefetching

An estimate of 10 iterations can lead to a serious degradation in
performance if the actual trip count turns out to be less than 10 and
less than the ahead distance. For example, this is exactly the cause of
the 27% degradation in bwaves reported in my original email. Bwaves's
hottest innermost loop has a trip count of 5 that GCC does not know at
compile time and the ahead distance for that loop is 8. So, GCC issues a
bunch of useless prefetches in that loop. The most reasonable solution
that I can think of, when the trip count is not known at compile time, is
to conservatively assume a small trip count of 4, 3, or even 2 in this
profitability cost model. This assumes that missing an improvement
opportunity is not as bad as degrading performance, right?
However, the user can always override that conservative estimate by
setting the prefetch-unknown-trip-count parameter that I am proposing.
It seems to me that setting this parameter provides the programmer with
an easy way to give a hint about the trip count if he wants. Of course,
enabling profile feedback is always a more precise option, but we don't
expect most users to take the extra time and effort that are needed to
enable this expensive option. If someone on this list can think of a
better solution to this unknown-trip-count problem, I am open to
pursuing that instead. Any ideas?
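
To make the proposal concrete, here is a rough sketch in C of the kind of
check I have in mind. It is not the actual patch and not GCC's code: the
function name, its arguments, and the "negative means unknown" convention
are made up for illustration only. The only idea taken from the text above
is that prefetching is skipped when the (estimated or assumed) trip count
does not exceed the ahead distance.

```c
#include <stdbool.h>

/* Hypothetical helper, for illustration only.
   est_trip_count:     estimated trip count, or a negative value if it is
                       not known at compile time (assumed convention).
   ahead:              prefetch ahead distance, in iterations.
   unknown_trip_count: conservative value to assume when the trip count
                       is unknown (the role of the proposed parameter).  */
static bool
prefetch_is_profitable (long est_trip_count, long ahead,
                        long unknown_trip_count)
{
  long trips = est_trip_count >= 0 ? est_trip_count : unknown_trip_count;

  /* A prefetch issued in iteration i is meant for iteration i + ahead.
     If the loop does not run longer than the ahead distance, the
     prefetched data is never used inside the loop.  */
  return trips > ahead;
}
```

With the bwaves numbers, an ahead distance of 8 and a default guess of 10
makes this check pass, so prefetches are issued even though the real trip
count is 5. With an assumed trip count of 4 the check fails and no
prefetches are issued.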


-----Original Message-----
From: Zdenek Dvorak
Sent: Wednesday, April 15, 2009 8:53 AM
To: Shobaki, Ghassan
Subject: Re: Patch to Avoid Bad Prefetching


> > I think it would be better to use expected_loop_iterations if the
> > estimate is not available, rather than introducing a new param
> > (while it is likely not more precise, we have way too many params
> > as it is).
> > I agree.  IMHO for prefetching we should recommend profile-feedback.
> The point is that in the prefetching profitability analysis, we must
> have a very conservative estimate for unknown trip counts or we may
> get a big performance degradation as shown in the benchmark numbers I
> included. By conservative, I mean something like 3 or 4 iterations
> only.  In other words, if we don't know the trip count, we should
> assume the worst in this particular analysis to avoid seriously
> degrading performance. I wonder how the estimate of
> expected_loop_iterations is computed and whether it gives us the
> conservative estimate that we need here?

in the absence of any other hints obtained from the program, the
loops are predicted to iterate 10 times,

