[PATCH 5/6][AArch64] Enable -fprefetch-loop-arrays at -O3 for cores that benefit from prefetching.
Maxim Kuvyrkov
maxim.kuvyrkov@linaro.org
Fri Feb 3 11:58:00 GMT 2017
> On Jan 30, 2017, at 5:50 PM, Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> wrote:
>
>> On Jan 30, 2017, at 3:23 PM, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>>
>> Hi Maxim,
>>
>> On 30/01/17 12:06, Maxim Kuvyrkov wrote:
>>> This patch enables prefetching at -O3 for aarch64 cores that set "simultaneous prefetches" parameter above 0. There are currently no such settings, so this patch doesn't change default code generation.
>>>
>>> I'm now working on improvements to -fprefetch-loop-arrays pass to make it suitable for -O2. I'll post this work in the next month.
>>>
>>> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.
>>>
>>
>> Are you aiming to get this in for GCC 8?
>> I have one small comment on this patch:
>>
>> + /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we
>> + have deemed it beneficial (signified by setting
>> + prefetch.num_slots to 1 or more). */
>> + if (flag_prefetch_loop_arrays < 0
>> + && HAVE_prefetch
>>
>> HAVE_prefetch will always be true on aarch64.
>> I imagine midend code that had logic like this would need this check, but aarch64-specific code shouldn't need it.
>
> Agree, I'll remove HAVE_prefetch.
>
> This pattern was copied from other backends, and HAVE_prefetch is most likely a historical artifact.
Andrew raised a good point in the review of his patch that it is a bad idea to use one of prefetching parameters (simultaneous_prefetches) as indicator for whether to enable prefetching pass by default. Indeed there are cases when we want to set simultaneous_prefetch according to HW documentation (or experimental results), but not enable prefetching pass by default.
This update to the patch addresses it. The patch adds a new explicit field to prefetch tuning structure "default_opt_level" that sets optimization level from which prefetching should be enabled by default. The current value is to enable prefetching at -O3; additionally, this parameter will come handy for enabling prefetching at -O2 [when it is ready].
--
Maxim Kuvyrkov
www.linaro.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-Enable-fprefetch-loop-arrays-at-O3-for-cores-that-be.patch
Type: application/octet-stream
Size: 3781 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170203/a8b2bce2/attachment.obj>
More information about the Gcc-patches
mailing list