Zen tuning part 4: Avoid 512bit memcpy/memset expansions on AVX128 optimal targets

Jan Hubicka hubicka@ucw.cz
Sun Oct 8 10:58:00 GMT 2017


Hi,
ix86_expand_set_or_movmem tries to use the widest vector mode available.
This does not help on Ryzen, because 512-bit operations are performed by halves.
This patch by itself does not affect generated code, because the memcpy/memset
expansion tables still need to be updated.
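
To illustrate the intent, here is a rough standalone sketch of the mode
selection with the new clamp. It is not the code in i386.c: pick_move_bits,
mode_bits, max_supported_bits and prefer_128 are hypothetical names made up
for this example. prefer_128 stands in for TARGET_AVX128_OPTIMAL, and plain
bit widths stand in for machine modes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the integer move modes, as widths in bits
   (roughly DI, TI, OI, XI on x86_64).  */
static const unsigned mode_bits[] = { 64, 128, 256, 512 };

/* Illustrative only.  Return the move width in bits.
   MAX_SUPPORTED_BITS models the widest mode that has a move pattern;
   PREFER_128 models TARGET_AVX128_OPTIMAL.  */
unsigned
pick_move_bits (unsigned max_supported_bits, bool prefer_128)
{
  /* Start at the word size and widen while the next mode is supported.  */
  unsigned bits = mode_bits[0];
  for (size_t i = 1; i < sizeof mode_bits / sizeof mode_bits[0]; i++)
    {
      if (mode_bits[i] > max_supported_bits)
        break;
      bits = mode_bits[i];
    }

  /* The clamp this patch adds: fall back to 128 bits (TImode in the
     real code) when the target prefers 128-bit operations.  */
  if (prefer_128 && bits > 128)
    bits = 128;

  return bits;
}
```

With prefer_128 set, any width above 128 bits is reduced to 128, while
narrower choices are left alone.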

Bootstrapped/regtested x86_64-linux.

Honza

	* i386.c (ix86_expand_set_or_movmem): Disable 512bit loops for targets
	that prefer 128bit.
Index: i386.c
===================================================================
--- i386.c	(revision 253513)
+++ i386.c	(working copy)
@@ -28947,6 +28947,9 @@ ix86_expand_set_or_movmem (rtx dst, rtx
 	     && optab_handler (mov_optab, wider_mode) != CODE_FOR_nothing)
 	move_mode = wider_mode;
 
+      if (TARGET_AVX128_OPTIMAL && GET_MODE_BITSIZE (move_mode) > 128)
+	move_mode = TImode;
+
       /* Find the corresponding vector mode with the same size as MOVE_MODE.
 	 MOVE_MODE is an integer mode at the moment (SI, DI, TI, etc.).  */
       if (GET_MODE_SIZE (move_mode) > GET_MODE_SIZE (word_mode))


