[PATCH 1/2] Introduce prefetch-minimum stride option
Luis Machado
luis.machado@linaro.org
Mon May 7 15:51:00 GMT 2018
On 05/07/2018 12:15 PM, H.J. Lu wrote:
> On Mon, May 7, 2018 at 7:09 AM, Luis Machado <luis.machado@linaro.org> wrote:
>>
>>
>> On 05/01/2018 03:30 PM, Jeff Law wrote:
>>>
>>> On 01/22/2018 06:46 AM, Luis Machado wrote:
>>>>
>>>> This patch adds a new option to control the minimum stride of a memory
>>>> reference above which the loop prefetch pass may issue software prefetch
>>>> hints. There are two motivations:
>>>>
>>>> * Make the pass less aggressive, only issuing prefetch hints for bigger
>>>> strides that are more likely to benefit from prefetching. I've noticed a
>>>> case in cpu2017 where we were issuing thousands of hints, for example.
>>>>
>>>> * For processors that have a hardware prefetcher, like Falkor, it allows
>>>> the loop prefetch pass to defer prefetching of smaller (less than the
>>>> threshold) strides to the hardware prefetcher instead. This prevents
>>>> conflicts between the software prefetcher and the hardware prefetcher.
>>>>
>>>> I've noticed a considerable reduction in the number of prefetch hints and
>>>> slightly positive performance numbers. This aligns GCC and LLVM in terms
>>>> of prefetch behavior for Falkor.
>>>>
>>>> The default settings should guarantee no changes for existing targets.
>>>> Those targets are free to tweak the settings as necessary.
>>>>
>>>> No regressions in the testsuite and bootstrapped ok on aarch64-linux.
>>>>
>>>> Ok?
>>>>
>>>> 2018-01-22 Luis Machado <luis.machado@linaro.org>
>>>>
>>>> Introduce option to limit software prefetching to known constant
>>>> strides above a specific threshold with the goal of preventing
>>>> conflicts with a hardware prefetcher.
>>>>
>>>> gcc/
>>>> * config/aarch64/aarch64-protos.h (cpu_prefetch_tune)
>>>> <minimum_stride>: New const int field.
>>>> * config/aarch64/aarch64.c (generic_prefetch_tune): Update to
>>>> include minimum_stride field.
>>>> (exynosm1_prefetch_tune): Likewise.
>>>> (thunderxt88_prefetch_tune): Likewise.
>>>> (thunderx_prefetch_tune): Likewise.
>>>> (thunderx2t99_prefetch_tune): Likewise.
>>>> (qdf24xx_prefetch_tune): Likewise. Set minimum_stride to 2048.
>>>> (aarch64_override_options_internal): Update to set
>>>> PARAM_PREFETCH_MINIMUM_STRIDE.
>>>> * doc/invoke.texi (prefetch-minimum-stride): Document new option.
>>>> * params.def (PARAM_PREFETCH_MINIMUM_STRIDE): New.
>>>> * params.h (PARAM_PREFETCH_MINIMUM_STRIDE): Define.
>>>> * tree-ssa-loop-prefetch.c (should_issue_prefetch_p): Return
>>>> false if stride is constant and is below the minimum stride
>>>> threshold.
>>>
>>> OK for the trunk.
>>> jeff
>>>
>>
>> Thanks. Committed as revision 259995 now.
>
> This breaks bootstrap on x86:
>
> ../../src-trunk/gcc/tree-ssa-loop-prefetch.c: In function ‘bool
> should_issue_prefetch_p(mem_ref*)’:
> ../../src-trunk/gcc/tree-ssa-loop-prefetch.c:1010:54: error:
> comparison of integer expressions of different signedness: ‘long long
> unsigned int’ and ‘int’ [-Werror=sign-compare]
> && absu_hwi (int_cst_value (ref->group->step)) < PREFETCH_MINIMUM_STRIDE)
> ../../src-trunk/gcc/tree-ssa-loop-prefetch.c:1014:4: error: format
> ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘long
> long int’ [-Werror=format=]
> "Step for reference %u:%u (%d) is less than the mininum "
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> "required stride of %d\n",
> ~~~~~~~~~~~~~~~~~~~~~~~~~
> ref->group->uid, ref->uid, int_cst_value (ref->group->step),
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
I've reverted this for now while I address the bootstrap problem.