[PATCH][AARCH64] Enable compare branch fusion
Richard Sandiford
richard.sandiford@arm.com
Fri Jan 17 10:02:00 GMT 2020
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Enable the most basic form of compare-branch fusion since various CPUs
> support it. This has no measurable effect on cores which don't support
> branch fusion, but increases fusion opportunities on cores which do.
If you're able to say for the record which cores you tested, then that'd
be good.
> Bootstrapped on AArch64, OK for commit?
>
> ChangeLog:
> 2019-12-24 Wilco Dijkstra <wdijkstr@arm.com>
>
> * config/aarch64/aarch64.c (generic_tunings): Add branch fusion.
> (neoversen1_tunings): Likewise.
OK, thanks. I agree there doesn't seem to be an obvious reason why this
would pessimise any cores significantly. And from a quick check, all
AArch64 cores appear to give these compares the lowest in-use latency
(as expected).
We can revisit this if anyone finds any counterexamples.
Richard
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -726,7 +726,7 @@ static const struct tune_params generic_tunings =
> SVE_NOT_IMPLEMENTED, /* sve_width */
> 4, /* memmov_cost */
> 2, /* issue_rate */
> - (AARCH64_FUSE_AES_AESMC), /* fusible_ops */
> + (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */
> "16:12",/* function_align. */
> "4",/* jump_align. */
> "8",/* loop_align. */
> @@ -1130,7 +1130,7 @@ static const struct tune_params neoversen1_tunings =
> SVE_NOT_IMPLEMENTED, /* sve_width */
> 4, /* memmov_cost */
> 3, /* issue_rate */
> - AARCH64_FUSE_AES_AESMC, /* fusible_ops */
> + (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */
> "32:16",/* function_align. */
> "32:16",/* jump_align. */
> "32:16",/* loop_align. */