This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 5/6][AArch64] Enable -fprefetch-loop-arrays at -O3 for cores that benefit from prefetching.
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Maxim Kuvyrkov <maxim dot kuvyrkov at linaro dot org>
- Cc: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Andrew Pinski <apinski at cavium dot com>, "Richard Guenther" <richard dot guenther at gmail dot com>, <nd at arm dot com>
- Date: Thu, 8 Jun 2017 17:31:38 +0100
- Subject: Re: [PATCH 5/6][AArch64] Enable -fprefetch-loop-arrays at -O3 for cores that benefit from prefetching.
- References: <F7C2520D-866C-4293-831D-815BF466DFA2@linaro.org> <CABFF4CD-C5A5-4DBF-99EA-ED1F7D984430@linaro.org> <588F3050.firstname.lastname@example.org> <19F76073-6A79-41C6-8D7E-E91A450A54D0@linaro.org> <58AEEA90-95A0-4BC6-BB90-F4C781F87AC4@linaro.org>
On Fri, Feb 03, 2017 at 02:58:23PM +0300, Maxim Kuvyrkov wrote:
> > On Jan 30, 2017, at 5:50 PM, Maxim Kuvyrkov <email@example.com> wrote:
> >> On Jan 30, 2017, at 3:23 PM, Kyrill Tkachov <firstname.lastname@example.org> wrote:
> >> Hi Maxim,
> >> On 30/01/17 12:06, Maxim Kuvyrkov wrote:
> >>> This patch enables prefetching at -O3 for aarch64 cores that set the "simultaneous prefetches" parameter above 0. No core currently sets it, so this patch doesn't change default code generation.
> >>> I'm now working on improvements to the -fprefetch-loop-arrays pass to make it suitable for -O2. I'll post this work in the next month.
> >>> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.
> >> Are you aiming to get this in for GCC 8?
> >> I have one small comment on this patch:
> >> + /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we
> >> + have deemed it beneficial (signified by setting
> >> + prefetch.num_slots to 1 or more). */
> >> + if (flag_prefetch_loop_arrays < 0
> >> + && HAVE_prefetch
> >> HAVE_prefetch will always be true on aarch64.
> >> I imagine midend code that had logic like this would need this check, but aarch64-specific code shouldn't need it.
> > Agree, I'll remove HAVE_prefetch.
> > This pattern was copied from other backends, and HAVE_prefetch is most likely a historical artifact.
> Andrew raised a good point in the review of his patch: it is a bad idea
> to use one of the prefetching parameters (simultaneous_prefetches) as the
> indicator for whether to enable the prefetching pass by default. Indeed, there
> are cases when we want to set simultaneous_prefetches according to HW
> documentation (or experimental results), but not enable the prefetching pass
> by default.
> This update to the patch addresses that. The patch adds a new explicit field,
> "default_opt_level", to the prefetch tuning structure. It sets the optimization
> level from which prefetching should be enabled by default. The current value
> enables prefetching at -O3; additionally, this parameter will come in handy
> for enabling prefetching at -O2 [when it is ready].
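For readers following the thread, the default_opt_level idea described in the quoted update can be sketched roughly as below. This is an illustration only, not the code from the patch: the struct name prefetch_tune, the helper prefetch_enabled_p, and the convention that a negative default_opt_level means "never enable by default" are all assumptions made for the example. Only the field name default_opt_level and the "flag < 0 means not set by the user" test come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-core prefetch tuning
   structure.  Only default_opt_level is named in the thread.  */
struct prefetch_tune
{
  int num_slots;          /* Simultaneous prefetches, from HW docs or
                             experiments; no longer used as the switch.  */
  int default_opt_level;  /* Lowest -O level at which prefetching is on
                             by default; negative means never (assumed
                             convention for this sketch).  */
};

/* Decide whether the prefetching pass should run.
   flag_value < 0 means the user did not pass the option explicitly,
   mirroring the `flag_prefetch_loop_arrays < 0` test in the quoted
   hunk.  An explicit user setting always wins.  */
static bool
prefetch_enabled_p (const struct prefetch_tune *tune,
                    int flag_value, int optimize_level)
{
  assert (tune != NULL);

  if (flag_value >= 0)
    return flag_value != 0;

  return tune->default_opt_level >= 0
         && optimize_level >= tune->default_opt_level;
}
```

With this shape, a core can describe its hardware accurately (num_slots above 0) and still leave the pass off by default by using a negative default_opt_level, which is the case the update was written to allow. Moving the default to -O2 later would be a change to one field per core.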
I really don't like the scheme of changing the optimisation threshold when
profiling data is used.
I've seen too many reports and presentations by the uninitiated who believe
that the use of profiling data has made the difference, when in reality
it is just GCC changing behaviour on which passes run. It is very
With that line removed, and any rebasing needed over changes to the macro,
I'm happy with this patch.