This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][AArch64] Improve register allocation of fma
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
- Date: Tue, 15 May 2018 18:11:41 +0100
- Subject: Re: [PATCH][AArch64] Improve register allocation of fma
- References: <DB6PR0801MB2053C98A886E390B17BF308D831F0@DB6PR0801MB2053.eurprd08.prod.outlook.com> <DB5PR08MB10308C7801C8085382AA917883930@DB5PR08MB1030.eurprd08.prod.outlook.com>
On Tue, May 15, 2018 at 08:00:49AM -0500, Wilco Dijkstra wrote:
>
> ping
This seems like a fairly horrible hack around the register allocator
behaviour.
But, OK.
James
> This patch improves register allocation of fma by preferring to update the
> accumulator register. This is done by adding fma insns with operand 1 as the
> accumulator. The register allocator considers copy preferences only in operand
> order, so if the first operand is dead, it has the highest chance of being
> reused as the destination. As a result, code using fma often gets a better
> register allocation. Performance of SPECFP2017 improves by over 0.5% on some
> implementations, while it has no effect on others. Fma code is more
> readable too; in a simple example we now generate:
>
> fmadd s16, s2, s1, s16
> fmadd s7, s17, s16, s7
> fmadd s6, s16, s7, s6
> fmadd s5, s7, s6, s5
>
> instead of:
>
> fmadd s16, s16, s2, s1
> fmadd s7, s7, s16, s6
> fmadd s6, s6, s7, s5
> fmadd s5, s5, s6, s4
>
> Bootstrap OK. OK for commit?
>
> ChangeLog:
> 2018-01-04 Wilco Dijkstra <wdijkstr@arm.com>
>
> gcc/
> * config/aarch64/aarch64.md (fma<mode>4): Change into expand pattern.
> (fnma<mode>4): Likewise.
> (fms<mode>4): Likewise.
> (fnms<mode>4): Likewise.
> (aarch64_fma<mode>4): Rename insn, reorder accumulator operand.
> (aarch64_fnma<mode>4): Likewise.
> (aarch64_fms<mode>4): Likewise.
> (aarch64_fnms<mode>4): Likewise.
> (aarch64_fnmadd<mode>4): Likewise.
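
For illustration only, here is a minimal C sketch of the kind of source that
can produce a chain of fmadd instructions like the one quoted above. The
function name fma_chain and its parameters are hypothetical and do not come
from the patch or its testsuite. It assumes GCC's default -ffp-contract=fast
setting, under which each multiply-add below may be contracted into a single
fmadd on AArch64 at -O2. The old value of each accumulator is dead once the
statement has executed, which is the situation where reusing the accumulator
register as the destination avoids an extra move.

```c
/* Hypothetical example, not taken from the patch: a chain of
   multiply-accumulate steps in which each result overwrites its
   accumulator.  With contraction enabled, each statement may become
   one fmadd whose destination is the accumulator register. */
float fma_chain(float x, float y, float z,
                float a, float b, float c, float d)
{
    a = x * y + a;   /* accumulator a: old value is dead after this */
    b = z * a + b;   /* accumulator b */
    c = a * b + c;   /* accumulator c */
    d = b * c + d;   /* accumulator d */
    return d;
}
```

Whether the allocator actually picks the accumulator as the destination
depends on the compiler version and the surrounding code, so the generated
assembly should be inspected (for example with -O2 -S) rather than assumed.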