This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][AArch64] PR79262: Adjust vector cost
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
- Date: Fri, 9 Nov 2018 14:54:13 +0000
- Subject: Re: [PATCH][AArch64] PR79262: Adjust vector cost
- References: <DB6PR0801MB2053221E018DE366B60EAC1E83EC0@DB6PR0801MB2053.eurprd08.prod.outlook.com> <DB5PR08MB1030E3F6C94348770C9BB83283C60@DB5PR08MB1030.eurprd08.prod.outlook.com>
On Fri, Nov 09, 2018 at 08:14:27AM -0600, Wilco Dijkstra wrote:
> PR79262 has been fixed for almost all AArch64 cpus, however the example is still
> vectorized in a few cases, resulting in lower performance. Increase the cost of
> vector-to-scalar moves so it is more similar to the other vector costs. As a result
> -mcpu=cortex-a53 no longer vectorizes the testcase - libquantum and SPECv6
> performance improves.
>
> OK for commit?

No.

We have 7 unique target tuning structures in the AArch64 backend, of which
only one has a 2x ratio between scalar_int_cost and vec_to_scalar_cost. Other
ratios are 1, 3, 8, 3, 4, 6.

What makes this choice correct? What makes it more correct than what we
have now? On which of the 28 entries in config/aarch64/aarch64-cores.def does
performance improve? Are the SPEC benchmarks sufficiently representative to
change the generic vectorisation costs?

Please validate the performance effect of this patch, which changes default
code generation for everyone, on more than one testcase in a bug report.

Thanks,
James

> ChangeLog:
> 2018-01-22 Wilco Dijkstra <wdijkstr@arm.com>
>
> PR target/79262
> * config/aarch64/aarch64.c (generic_vector_cost): Adjust vec_to_scalar_cost.
> --