This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: James Greenhalgh <James dot Greenhalgh at arm dot com>
- Cc: nd <nd at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 16 May 2016 10:38:04 +0000
- Subject: Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting
- References: <AM3PR08MB008846A786228571DC6E0731836F0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com>
ping
________________________________________
From: Wilco Dijkstra
Sent: 22 April 2016 17:15
To: gcc-patches@gcc.gnu.org
Cc: nd
Subject: [PATCH][AArch64] Improve aarch64_case_values_threshold setting
GCC expands switch statements in a very simplistic way and tries to use a table
expansion even when it is a bad idea for performance or code size.
GCC typically emits extremely sparse tables that contain mostly default entries
(something which currently cannot be tuned by backends). Additionally the
computation of the minimum/maximum label offsets is too simplistic so the tables
are often twice as large as necessary.
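To make the sparseness point concrete, here is a minimal sketch (not GCC code) that counts how many slots a jump table spanning the smallest to the largest case value needs, and how many of those slots can only hold the default label. The helper names table_slots and default_slots are invented for this illustration, and the sketch assumes at least one case value and that all values are distinct.

```c
#include <stddef.h>

/* Hypothetical helper: number of slots in a jump table that covers every
   value from the smallest to the largest case value.
   Assumes n >= 1.  */
static unsigned long
table_slots (const long *values, size_t n)
{
  long min = values[0];
  long max = values[0];

  for (size_t i = 1; i < n; i++)
    {
      if (values[i] < min)
        min = values[i];
      if (values[i] > max)
        max = values[i];
    }
  return (unsigned long) (max - min) + 1;
}

/* Hypothetical helper: slots that do not correspond to a real case label
   and therefore point at the default label.
   Assumes the n values are distinct.  */
static unsigned long
default_slots (const long *values, size_t n)
{
  return table_slots (values, n) - (unsigned long) n;
}
```

For the four case values 0, 9, 18 and 27 this gives a 28-entry table of which 24 entries are defaults.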
The cost of a table switch is significant due to the setup overhead, the table
lookup (which, because the table is sparse and large, adds unnecessary cache
misses) and the hard-to-predict indirect jump. Therefore it is best to avoid
using a table unless there are many real case labels.
This patch addresses that by making aarch64_case_values_threshold return 17
when the per-CPU tuning is not set and we are not optimizing for size, so that
tables are only used for more than 16 cases. On SPEC2006 this improves the
switch-heavy benchmarks GCC and perlbench in both performance (1-2%) and size
(0.5-1% smaller).
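The selection logic can be sketched as a standalone function. This is an illustration under stated assumptions, not the code in aarch64.c: struct tune_sketch, the opt_level and opt_size parameters and generic_default are stand-ins invented here for selected_cpu->tune, the optimization flags and default_case_values_threshold (). The -O3 check follows the comment in the patch, since the first half of that condition is not visible in the hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU tuning data; 0 means "not set".  */
struct tune_sketch
{
  unsigned int max_case_values;
};

/* Sketch of the threshold selection.  All parameters are assumptions made
   for this example:
     opt_level        the -O level (3 for -O3)
     opt_size         true when optimizing for size
     generic_default  whatever the generic default threshold would be  */
static unsigned int
case_values_threshold_sketch (int opt_level, bool opt_size,
                              const struct tune_sketch *tune,
                              unsigned int generic_default)
{
  /* At -O3 a per-CPU value, if set, takes priority.  */
  if (opt_level > 2 && tune->max_case_values != 0)
    return tune->max_case_values;

  /* Otherwise keep the generic default for size, and require at least
     17 cases (that is, more than 16) when optimizing for speed.  */
  return opt_size ? generic_default : 17;
}

/* A switch becomes a table candidate once it has at least `threshold'
   distinct case values.  */
static bool
table_candidate_p (unsigned int n_cases, unsigned int threshold)
{
  return n_cases >= threshold;
}
```

With no per-CPU tuning and optimization for speed, a 16-case switch stays a compare tree in this sketch and a 17-case switch becomes a table candidate.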
OK for trunk?
ChangeLog:
2016-04-22 Wilco Dijkstra <wdijkstr@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_case_values_threshold):
Return a better case_values_threshold when optimizing.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0620f1e..a240635 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3546,7 +3546,12 @@ aarch64_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
return aarch64_tls_referenced_p (x);
}
-/* Implement TARGET_CASE_VALUES_THRESHOLD. */
+/* Implement TARGET_CASE_VALUES_THRESHOLD.
+ The expansion for a table switch is quite expensive due to the number
+ of instructions, the table lookup and hard to predict indirect jump.
+ When optimizing for speed, with -O3 use the per-core tuning if set,
+ otherwise use tables for > 16 cases as a tradeoff between size and
+ performance. */
static unsigned int
aarch64_case_values_threshold (void)
@@ -3557,7 +3562,7 @@ aarch64_case_values_threshold (void)
&& selected_cpu->tune->max_case_values != 0)
return selected_cpu->tune->max_case_values;
else
- return default_case_values_threshold ();
+ return optimize_size ? default_case_values_threshold () : 17;
}