This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][AArch64] Improve Cortex-A53 FP scheduler
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>, <kyrtka01 at arm dot com>, <ramana dot radhakrishnan at arm dot com>, <richard dot earnshaw at arm dot com>
- Date: Wed, 14 Jun 2017 15:20:56 +0100
- Subject: Re: [PATCH][AArch64] Improve Cortex-A53 FP scheduler
- References: <AM5PR0802MB2610CF23364D273805919F7C83CD0@AM5PR0802MB2610.eurprd08.prod.outlook.com>
On Mon, Jun 12, 2017 at 02:16:44PM +0100, Wilco Dijkstra wrote:
> The Cortex-A53 scheduler model of FMAC bypass is not quite right
> for FMAC to FMAC forwarding. Experiments also show the latencies of
> FP operations are too high. Rather than adding more bypasses,
> adjust the latencies of FP instructions to get a better schedule on
> average. As a result SPECFP2006 is 1.1% faster.
From an AArch64 perspective this is OK, but it will need an ARM OK too,
as it is shared code.
Thanks,
James
> Passes AArch64 and ARM bootstrap and regress.
>
> ChangeLog:
> 2017-05-30 Wilco Dijkstra <wdijkstr@arm.com>
>
> * config/arm/cortex-a53.md (cortex_a53_fpalu): Adjust latency.
> (cortex_a53_fconst): Likewise.
> (cortex_a53_fpmul): Likewise.
> (cortex_a53_f_load_64): Likewise.
> (cortex_a53_f_load_many): Likewise.
> (cortex_a53_advsimd_alu): Likewise.
> (cortex_a53_advsimd_alu_q): Likewise.
> (cortex_a53_advsimd_mul): Likewise.
> (cortex_a53_advsimd_mul_q): Likewise.
> (fpmac bypass): Add new bypass for fpmac-fpmac case.
> Add missing fmul, r2f_cvt and fconst cases.
> --
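
For readers unfamiliar with GCC pipeline descriptions, the two mechanisms
the patch touches (the default latency of an instruction reservation, and
a bypass that overrides it for one producer/consumer pair) can be sketched
roughly as below. This is an illustration only, not the contents of
cortex-a53.md: the "example_*" names, the "examplecore" tune value and all
latency numbers are hypothetical, and only the define_* constructs and the
fmuls/fmuld/fmacs/fmacd instruction types are real GCC names.

```
;; Hypothetical automaton with a single FP pipeline unit.
(define_automaton "example_core")
(define_cpu_unit "example_fp" "example_core")

;; Default result latency of FP multiplies (4 is an illustrative value).
;; "Adjust latency" in the ChangeLog corresponds to changing this number.
(define_insn_reservation "example_fpmul" 4
  (and (eq_attr "tune" "examplecore")   ; hypothetical tune value
       (eq_attr "type" "fmuls,fmuld"))
  "example_fp")

;; Default result latency of FP multiply-accumulates (illustrative value).
(define_insn_reservation "example_fpmac" 8
  (and (eq_attr "tune" "examplecore")
       (eq_attr "type" "fmacs,fmacd"))
  "example_fp")

;; Bypass: when one FMAC feeds another FMAC, use latency 4 instead of the
;; default 8. An optional fourth operand can name a guard function that
;; restricts the bypass, e.g. to forwarding into the accumulator operand.
(define_bypass 4 "example_fpmac" "example_fpmac")
```

Lowering a default latency affects every consumer of that instruction
class, whereas a bypass changes the latency only for the listed
producer/consumer pairs.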