This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Patch to Avoid Bad Prefetching


On Thu, Apr 16, 2009 at 7:25 PM, Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote:
> Hi,
>
>> However, using the command-line option you propose likely won't do the
>> job for this anyway, as different loops behave differently. A better
>> solution for such optimization would be per-loop hints, e.g.
>> #pragma loop count used by the Intel compiler.
>>
>> [Ghassan] Totally agree that such user hints will give more precise
>> information and hence better performance, but the point is: what's the
>> best that we can do when that precise information is not available?
>> Should we just give up? My answer is "No".
>
> right; so why don't you implement #pragma loop count? It should take
> just a few hours to do, and would be hugely more useful, as other
> passes can take advantage of it too.
>
>> Also, at the point where the customer pays you to spend hours or days on
>> fiddling with the compiler options, it likely won't hurt you to spend a
>> few minutes on modifying the makefiles (and getting the necessary
>> testcases)
>> to enable profile feedback, either,
>>
>> [Ghassan] Yes, but what about the compile time cost? Many users are not
>> willing to pay that cost every time they compile.
>
> You only need to enable the profile feedback for the final compilation
> (or before performance testing etc.)

To move forward with this, may I propose splitting this patch further?

The first patch would disable prefetching completely for loops with an
unknown trip count; that is, a patch introducing
is_loop_prefetching_profitable, but with only

+     prefetching may cause serious performance degradation. To avoid this
+     problem when the trip count cannot be guessed at compile time,
+     do not issue prefetches in this case.  */
+  if (est_niter < 0)
+    return false;

without any new params.  This would reduce the SPEC degradation
caused by prefetching.  In that light, it may be possible to
turn on prefetching by default at -O3 (or at least with
-fprofile-use/-fprofile-generate and -O3).  Detailed SPEC numbers
with and without prefetching would be useful here (also with and
without profile feedback).
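
For illustration, the reduced check could look roughly like the
standalone sketch below.  It is not the code from the patch: the
function name with the _sketch suffix and the plain long parameter are
assumptions made here so that the example compiles outside of GCC.
The only behaviour taken from the fragment above is that a negative
est_niter, meaning the trip count could not be estimated, disables
prefetching.

```c
#include <stdbool.h>

/* Hypothetical standalone stand-in for the proposed
   is_loop_prefetching_profitable.  EST_NITER is the estimated number
   of iterations of the loop; a negative value means that no estimate
   is available at compile time.  */
static bool
is_loop_prefetching_profitable_sketch (long est_niter)
{
  /* Unknown trip count: do not issue prefetches, since they may cause
     serious performance degradation in this case.  */
  if (est_niter < 0)
    return false;

  /* With a known estimate this sketch always allows prefetching; any
     further heuristics are deliberately left out.  */
  return true;
}
```

A caller would skip prefetch generation for a loop whenever this
returns false, so the decision needs no new parameter.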

A second patch would completely remove the artificial limit on the
number of basic blocks for unrolling, again without any new params,
provided that SPEC numbers do not degrade with this.

Note that, in addition to SPEC, Polyhedron is also a good source
of benchmarks.

Thanks,
Richard.

