This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Update X86_TUNE_AVOID_256FMA_CHAINS for znver2
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: gcc-patches at gcc dot gnu dot org
- Date: Tue, 23 Jul 2019 11:34:37 +0200
- Subject: Update X86_TUNE_AVOID_256FMA_CHAINS for znver2
Hi,
this patch enables the logic that avoids FMA in matrix multiplication loops
for 256-bit vectors on znver2. The underlying issue is the same as with znver1: while
the combined latency of separate multiply and add operations is higher than that of
FMA, the loop-carried dependency chain in matrix multiplication runs only through
the additions, which are faster.
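To illustrate the dependency-chain argument, here is a minimal sketch of the kind of reduction loop in question. The function names `dot` and `chain_cycles` and the latency parameter are hypothetical and only serve the illustration; they are not part of the patch, and no specific znver2 latencies are implied.

```c
#include <stddef.h>

/* Inner loop of a matrix multiply: a reduction into `acc`.
   The only loop-carried dependency is through `acc`.  */
double dot(const double *a, const double *b, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Lowered as FMA:      acc = fma (a[i], b[i], acc)
             -> every iteration waits a full FMA latency on acc.
           Lowered as mul+add:  t = a[i] * b[i]; acc = acc + t
             -> t does not depend on acc, so only the add sits on
                the loop-carried chain.  */
        acc += a[i] * b[i];
    }
    return acc;
}

/* Hypothetical cost model (an assumption for illustration): cycles spent
   on the loop-carried chain, given the latency of the one operation that
   sits on it.  */
unsigned long chain_cycles(unsigned long iterations, unsigned latency_on_chain)
{
    return iterations * latency_on_chain;
}
```

Under this model the mul+add form wins whenever the add latency alone is lower than the FMA latency, even though mul+add together take longer than a single FMA.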
Bootstrapped/regtested x86_64-linux, committed.
* config/i386/i386-options.c (ix86_option_override_internal): Default
PARAM_AVOID_FMA_MAX_BITS to 256 for znver2.
* config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS): Set for
ZNVER2.
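
The defaulting order in ix86_option_override_internal can be modelled roughly as below. This is a simplified sketch, not the GCC code: `struct param`, `maybe_set` and `default_avoid_fma_max_bits` are hypothetical stand-ins, and it assumes that maybe_set_param_value only installs a default when the user has not set the parameter explicitly. The 256-bit flag is tested first, so it takes precedence when both tuning flags are enabled for a target.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one --param slot: its value and whether
   the user set it explicitly on the command line.  */
struct param {
    int value;
    bool set_by_user;
};

/* Assumed behaviour of maybe_set_param_value: install a default only
   when the user did not set the parameter.  */
static void maybe_set(struct param *p, int dflt)
{
    if (!p->set_by_user)
        p->value = dflt;
}

/* Model of the patched logic: check the 256-bit flag before the
   128-bit one, then return the resulting parameter value.  */
int default_avoid_fma_max_bits(bool avoid_256, bool avoid_128,
                               struct param *p)
{
    if (avoid_256)
        maybe_set(p, 256);
    else if (avoid_128)
        maybe_set(p, 128);
    return p->value;
}
```

In this model an explicit user setting always survives, and a target with both flags set ends up with 256.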
Index: config/i386/i386-options.c
===================================================================
--- config/i386/i386-options.c (revision 273727)
+++ config/i386/i386-options.c (working copy)
@@ -2779,7 +2779,11 @@ ix86_option_override_internal (bool main
opts->x_flag_cf_protection
= (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
- if (ix86_tune_features [X86_TUNE_AVOID_128FMA_CHAINS])
+ if (ix86_tune_features [X86_TUNE_AVOID_256FMA_CHAINS])
+ maybe_set_param_value (PARAM_AVOID_FMA_MAX_BITS, 256,
+ opts->x_param_values,
+ opts_set->x_param_values);
+ else if (ix86_tune_features [X86_TUNE_AVOID_128FMA_CHAINS])
maybe_set_param_value (PARAM_AVOID_FMA_MAX_BITS, 128,
opts->x_param_values,
opts_set->x_param_values);
Index: config/i386/x86-tune.def
===================================================================
--- config/i386/x86-tune.def (revision 273727)
+++ config/i386/x86-tune.def (working copy)
@@ -431,6 +431,10 @@ DEF_TUNE (X86_TUNE_USE_GATHER, "use_gath
smaller FMA chain. */
DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER)
+/* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
+ smaller FMA chain. */
+DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2)
+
/*****************************************************************************/
/* AVX instruction selection tuning (some of SSE flags affects AVX, too) */
/*****************************************************************************/