This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53
- From: Tamar Christina <Tamar dot Christina at arm dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: nd <nd at arm dot com>, Kyrylo Tkachov <Kyrylo dot Tkachov at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>, James Greenhalgh <James dot Greenhalgh at arm dot com>, "nickc at redhat dot com" <nickc at redhat dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>
- Date: Tue, 2 May 2017 15:37:21 +0000
- Subject: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53
- Authentication-results: sourceware.org; auth=none
- Authentication-results: arm.com; dkim=none (message not signed) header.d=none;arm.com; dmarc=none action=none header.from=arm.com;
- Nodisclaimer: True
- Spamdiagnosticmetadata: NSPM
- Spamdiagnosticoutput: 1:99
Hi All,
This patch adjusts the cost model for Cortex-A53 to increase the costs of
an integer division. The reason for this is that we want to always expand
the division to a multiply when doing a division by constant.
On the Cortex-A53 shifts are modeled to cost 1 instruction,
when doing the expansion we have to perform two shifts and an addition.
However because the cost model can't model things such as fusing of shifts,
we have to fully cost both shifts.
This leads to the cost model telling us that for the Cortex-A53 we can never
do the expansion. By increasing the costs of the division by two instructions
we recover the room required in the cost calculation to do the expansions.
The reason for all of this is that currently the code does not produce what you'd expect,
which is that division by constants are always expanded. Also it's inconsistent because
unsigned division does get expanded.
This all reduces the ability to do CSE when using signed modulo since that one is also expanded.
Given:
void f5(void)
{
int x = 0;
while (x > -1000)
{
g(x % 300);
x--;
}
}
we now generate
smull x0, w19, w21
asr x0, x0, 37
sub w0, w0, w19, asr 31
msub w0, w0, w20, w19
sub w19, w19, #1
bl g
as opposed to
sdiv w0, w19, w20
msub w0, w0, w20, w19
sub w19, w19, #1
bl g
Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
OK for trunk?
Thanks,
Tamar
gcc/
2017-05-02 Tamar Christina <tamar.christina@arm.com>
* config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv cost.
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 68f84b04fe2f2cbdb66c6dd0c7add097606a7878..8cff517861aea2c249a07a07f6775c60f75fb9a0 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -154,7 +154,7 @@ const struct cpu_cost_table cortexa53_extra_costs =
COSTS_N_INSNS (1), /* extend. */
COSTS_N_INSNS (1), /* add. */
COSTS_N_INSNS (1), /* extend_add. */
- COSTS_N_INSNS (7) /* idiv. */
+ COSTS_N_INSNS (9) /* idiv. */
},
/* MULT DImode */
{