[PATCH][AArch64] PR79262: Adjust vector cost
Richard Sandiford
richard.sandiford@arm.com
Wed Oct 16 21:35:00 GMT 2019
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> ping
>
> PR79262 has been fixed for almost all AArch64 cpus, however the example is still
> vectorized in a few cases, resulting in lower performance. Increase the cost of
> vector-to-scalar moves so it is more similar to the other vector costs. As a result
> -mcpu=cortex-a53 no longer vectorizes the testcase - libquantum and SPECv6
> performance improves.
>
> OK for commit?
>
> ChangeLog:
> 2018-01-22 Wilco Dijkstra <wdijkstr@arm.com>
>
> PR target/79262
> * config/aarch64/aarch64.c (generic_vector_cost): Adjust vec_to_scalar_cost.
OK, thanks, and sorry for the delay.

qdf24xx_vector_cost is the only specific CPU cost table with a
vec_to_scalar_cost as low as 1. It's not obvious how emphatic
that choice is though. It looks like qdf24xx_vector_cost might
(very reasonably!) have started out as a copy of the generic costs
with some targeted changes.

But even if 1 is accurate there from a h/w perspective, the problem
is that the vectoriser's costings have a tendency to miss additional
overhead involved in scalarisation. Although increasing the cost
to avoid that might be a bit of a hack, it's the accepted hack.

So I suspect in practice all CPUs will benefit from a higher cost,
not just those whose CPU tables already have one. On that basis,
increasing the generic cost by the smallest possible amount should
be a good change across the board.

If anyone finds a counter-example, please let us know or file a bug.

Richard
> --
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index c6a83c881038873d8b68e36f906783be63ddde56..43f5b7162152ca92a916f4febee01f624c375202 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -403,7 +403,7 @@ static const struct cpu_vector_cost generic_vector_cost =
> 1, /* vec_int_stmt_cost */
> 1, /* vec_fp_stmt_cost */
> 2, /* vec_permute_cost */
> - 1, /* vec_to_scalar_cost */
> + 2, /* vec_to_scalar_cost */
> 1, /* scalar_to_vec_cost */
> 1, /* vec_align_load_cost */
> 1, /* vec_unalign_load_cost */
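
To illustrate why a one-unit bump in vec_to_scalar_cost can flip a
vectorisation decision, here is a toy sketch. It is not GCC's cost
model: struct toy_vector_cost, struct toy_block and the toy_* functions
are hypothetical names invented for this example, and the statement
counts used with them are made up. The only thing taken from the patch
is the change of vec_to_scalar_cost from 1 to 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for a vector cost table.
   This is NOT GCC's struct cpu_vector_cost.  */
struct toy_vector_cost
{
  int scalar_stmt_cost;   /* assumed cost of one scalar statement */
  int vec_stmt_cost;      /* assumed cost of one vector statement */
  int vec_to_scalar_cost; /* cost of one vector-to-scalar move */
};

/* Hypothetical description of a candidate code region.  */
struct toy_block
{
  int scalar_stmts; /* statements in the scalar version */
  int vector_stmts; /* statements in the vector version */
  int extracts;     /* vector-to-scalar moves the vector version needs */
};

/* Total cost of the scalar version.  */
static int
toy_scalar_total (const struct toy_vector_cost *c, const struct toy_block *b)
{
  assert (b->scalar_stmts >= 0);
  return b->scalar_stmts * c->scalar_stmt_cost;
}

/* Total cost of the vector version, including every vector-to-scalar move.  */
static int
toy_vector_total (const struct toy_vector_cost *c, const struct toy_block *b)
{
  assert (b->vector_stmts >= 0 && b->extracts >= 0);
  return b->vector_stmts * c->vec_stmt_cost
	 + b->extracts * c->vec_to_scalar_cost;
}

/* Vectorise only if the vector version is strictly cheaper.  */
static bool
toy_should_vectorize (const struct toy_vector_cost *c,
		      const struct toy_block *b)
{
  return toy_vector_total (c, b) < toy_scalar_total (c, b);
}
```

With made-up inputs of 5 scalar statements, 2 vector statements and 2
extracts, and all other costs set to 1, the vector total is 4 when
vec_to_scalar_cost is 1 (cheaper than the scalar total of 5) and 6 when
it is 2 (more expensive), so the toy decision changes from "vectorise"
to "keep scalar".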