[RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

Kugan Vivekanandarajah kugan.vivekanandarajah@linaro.org
Sat Sep 16 22:51:00 GMT 2017


Hi Andrew,

On 15 September 2017 at 13:20, Andrew Pinski <pinskia@gmail.com> wrote:
> On Thu, Sep 14, 2017 at 6:28 PM, Kugan Vivekanandarajah
> <kugan.vivekanandarajah@linaro.org> wrote:
>> This patch adds number of hw prefetchers available to
>> cpu_prefetch_tune so it can be used in loop unrolling decisions.
>
> Can you explain the difference between this and num_slots
> (PARAM_SIMULTANEOUS_PREFETCHES)?  Because it seems like they should be
> the same here.
>
I kept it separate for two reasons.

1. I am not sure whether this would have the same effect on all
microarchitectures. Keeping it separate allows each microarchitecture
to enable prefetch loop arrays and to aid the hw prefetcher (my goal
here) by limiting prefetch streams.

2. The value used for PARAM_SIMULTANEOUS_PREFETCHES seems to be
determined by experimentation rather than based on functional units
in hardware. Keeping the two separate also allows tuning them
separately.

Thanks,
Kugan


