This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Update X86_TUNE_AVOID_256FMA_CHAINS for znver2


Hi,
this patch enables the logic that avoids FMA in matrix multiplication
loops for 256-bit vectors. The underlying issue is the same as with
znver1: while the combined latency of separate multiply and add
operations is higher than that of a single FMA, the dependency chain in
matrix multiplication depends only on the additions, which are faster.
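
For illustration only (this is not part of the patch, and the function
name and signature are made up), the kind of loop in question looks
roughly like the sketch below. If the multiply and add are fused, the
loop-carried chain through `sum` goes through an FMA on every iteration;
if they stay separate, the multiply is independent across iterations and
only the add sits on the chain.

```c
#include <stddef.h>

/* Hypothetical example, not code from GCC: inner loop of a matrix
   multiplication, computing one element of the result as the dot
   product of a row of A and a column of B.  */
double
dot_row_col (const double *a, const double *b, size_t n, size_t stride)
{
  double sum = 0.0;
  for (size_t k = 0; k < n; k++)
    {
      /* Independent of previous iterations.  */
      double prod = a[k] * b[k * stride];
      /* Loop-carried dependency: only the addition is on the chain.  */
      sum = sum + prod;
    }
  return sum;
}
```

Whether the compiler actually keeps the two operations separate depends
on the tuning flags and on options such as -ffp-contract; the source
above only shows where the dependency chain lies.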

Bootstrapped/regtested x86_64-linux, committed.

	* config/i386/i386-options.c (ix86_option_override_internal): Default
	PARAM_AVOID_FMA_MAX_BITS to 256 for znver2.
	* config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS): Set for
	ZNVER2.

Index: config/i386/i386-options.c
===================================================================
--- config/i386/i386-options.c	(revision 273727)
+++ config/i386/i386-options.c	(working copy)
@@ -2779,7 +2779,11 @@ ix86_option_override_internal (bool main
     opts->x_flag_cf_protection
       = (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
 
-  if (ix86_tune_features [X86_TUNE_AVOID_128FMA_CHAINS])
+  if (ix86_tune_features [X86_TUNE_AVOID_256FMA_CHAINS])
+    maybe_set_param_value (PARAM_AVOID_FMA_MAX_BITS, 256,
+			   opts->x_param_values,
+			   opts_set->x_param_values);
+  else if (ix86_tune_features [X86_TUNE_AVOID_128FMA_CHAINS])
     maybe_set_param_value (PARAM_AVOID_FMA_MAX_BITS, 128,
 			   opts->x_param_values,
 			   opts_set->x_param_values);
Index: config/i386/x86-tune.def
===================================================================
--- config/i386/x86-tune.def	(revision 273727)
+++ config/i386/x86-tune.def	(working copy)
@@ -431,6 +431,10 @@ DEF_TUNE (X86_TUNE_USE_GATHER, "use_gath
    smaller FMA chain.  */
 DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER)
 
+/* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
+   smaller FMA chain.  */
+DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2)
+
 /*****************************************************************************/
 /* AVX instruction selection tuning (some of SSE flags affects AVX, too)     */
 /*****************************************************************************/
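
As a reading aid for the i386-options.c hunk, the order of the two checks
can be sketched as a standalone function. This is a simplified stand-in,
not the GCC code: the function name, the boolean arguments, and the
return value 0 for "no default chosen here" are assumptions made for the
example. The 256-bit check comes first so that a target with both tuning
flags set gets 256 (assuming m_ZNVER also covers znver2, that would be
the case for znver2).

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: returns the default that the
   patched code would pick for PARAM_AVOID_FMA_MAX_BITS, or 0 when
   neither tuning flag is set.  */
int
default_avoid_fma_max_bits (bool avoid_256fma_chains,
			    bool avoid_128fma_chains)
{
  if (avoid_256fma_chains)
    return 256;
  else if (avoid_128fma_chains)
    return 128;
  return 0;
}
```

In the real code the value is passed to maybe_set_param_value, so it acts
as a default rather than overriding a value given explicitly by the user.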

