This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting
- From: Jim Wilson <jim dot wilson at linaro dot org>
- To: James Greenhalgh <james dot greenhalgh at arm dot com>
- Cc: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, nd <nd at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, philipp dot tomsich at theobroma-systems dot com, benedikt dot huber at theobroma-systems dot com, pinskia at gmail dot com
- Date: Wed, 18 May 2016 18:03:15 -0700
- Subject: Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting
- References: <AM3PR08MB008846A786228571DC6E0731836F0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com> <AM3PR08MB0088B30E6CF907BA2A29414B83770 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com> <20160516113055 dot GB23900 at arm dot com>
On Mon, May 16, 2016 at 4:30 AM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
> As this change will change code generation for all cores (except
> Exynos-M1), I'd like to hear from those with more detailed knowledge of
> ThunderX, X-Gene and qdf24xx before I take this patch.
It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see
about a 0.37% loss on the integer benchmarks, and no significant
change on the FP benchmarks. The integer loss is mainly due to
458.sjeng which drops 2%. We had tried various values for
max_case_values earlier, and didn't see any performance improvement
from setting it, so we are using the default value.
We've been tracking changes to the FSF tree, and adjusting our tuning
structure as necessary, so I'm not too concerned about this. We will
just set the max_case_values field in the tuning structure to get the
result we want. What I am slightly concerned about is that the
max_case_values field is only used at -O3 and above, which limits its
usefulness. If a port has specified a value, it probably should be
used for all non-size optimization, which means we should check for
optimize_size first, then check for a cpu specific value, then use the
default. If you do that, then you don't need to change the default to
get better generic/a53 code, you can change it in the generic and/or
a53 tuning tables.
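
The lookup order suggested above can be sketched roughly as follows. This is
only an illustration, not the actual aarch64 backend code: the struct, the
function name and the parameters are hypothetical stand-ins, and it assumes
that a max_case_values of 0 means "no CPU-specific value" and that size
optimization simply falls back to the default.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU tuning table.  Only the field
   discussed in this thread is modelled; 0 is assumed to mean that the
   port has not specified a value.  */
struct tune_params_sketch
{
  unsigned int max_case_values;
};

/* Suggested order: check for size optimization first, then a
   CPU-specific value, then fall back to the default.  The default is
   passed in rather than computed, since this sketch does not model how
   the compiler derives it.  */
static unsigned int
case_values_threshold_sketch (bool optimize_size,
                              const struct tune_params_sketch *tune,
                              unsigned int default_threshold)
{
  /* Assumption: when optimizing for size, ignore the CPU-specific value.  */
  if (optimize_size)
    return default_threshold;

  /* A value set by the port applies at every non-size optimization level.  */
  if (tune->max_case_values != 0)
    return tune->max_case_values;

  return default_threshold;
}
```

With this ordering, a tuning table that sets the field (48 for exynos-m1)
gets that value at any non-size optimization level, and a table that leaves
it at 0 keeps the default.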
Though I see that the original patch from Samsung that added the
max_case_values field has the -O3 check, so there was apparently some
reason why they wanted it to work that way. The value that the
exynos-m1 is using, 48, looks pretty large, so maybe they thought that
the code size expansion from that is only OK at -O3 and above. Worst
case, we might need two max_case_values fields, one to use at -O1/-O2,
and one to use at -O3.
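
A rough sketch of what the two-field variant might look like, purely to make
the idea concrete. The field names, the function and its parameters are all
hypothetical and do not exist in GCC; it assumes that 0 means "not set", that
-O0 and size optimization use the default, and that the optimization level is
available as a plain integer.

```c
#include <stdbool.h>

/* Hypothetical tuning table with separate thresholds: one for -O1/-O2
   and one for -O3 and above.  0 is assumed to mean "not set".  */
struct tune_two_thresholds_sketch
{
  unsigned int max_case_values_o2;  /* used at -O1 and -O2 */
  unsigned int max_case_values_o3;  /* used at -O3 and above */
};

static unsigned int
case_values_threshold_by_level (int optimize_level, bool optimize_size,
                                const struct tune_two_thresholds_sketch *tune,
                                unsigned int default_threshold)
{
  unsigned int cpu_value;

  /* Assumption: size optimization and -O0 always use the default.  */
  if (optimize_size || optimize_level < 1)
    return default_threshold;

  /* Pick the field that matches the optimization level.  */
  cpu_value = (optimize_level >= 3) ? tune->max_case_values_o3
                                    : tune->max_case_values_o2;

  /* Fall back to the default when the port left the field unset.  */
  return cpu_value != 0 ? cpu_value : default_threshold;
}
```

A port could then keep a large value such as 48 for -O3 only, while using a
smaller one, or none at all, at -O1/-O2.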
Jim