This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[PATCH, rs6000] don't use unaligned vsx for memset of less than 32 bytes


In gcc 8 I added support for unaligned vsx stores in the builtin
expansion of memset(x,0,y). It turns out that for memset of less than
32 bytes this doesn't help much, and it also runs into an egregious
load-hit-store case in the CPU2006 components gcc and hmmer.

This patch reverts to the previous (gcc 7) behavior for memset of 16-31
bytes, which is to use vsx stores only if the destination is 16 byte
aligned. For 32 bytes or more, unaligned vsx stores will still be used.
Performance testing of the memset expansion shows that not much is
given up by using scalar stores for 16-31 bytes, and CPU2006 runs show
that the performance regression is fixed.
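
The change in the selection condition can be sketched outside the
compiler as two plain C predicates. This is only an illustration, not
the code in rs6000-string.c: the struct target_flags type and the
function names use_vsx_old and use_vsx_new are hypothetical stand-ins
for TARGET_ALTIVEC, TARGET_EFFICIENT_UNALIGNED_VSX, and the test in
expand_block_clear, and align is assumed to be in bits, so 128 means
16 byte aligned.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target macros used in the patch.  */
struct target_flags
{
  bool altivec;                 /* models TARGET_ALTIVEC */
  bool efficient_unaligned_vsx; /* models TARGET_EFFICIENT_UNALIGNED_VSX */
};

/* Condition before the patch: any clear of 16 bytes or more may use
   a 16 byte vector store if the destination is aligned or unaligned
   vsx is efficient.  ALIGN is assumed to be in bits.  */
static bool
use_vsx_old (const struct target_flags *t, unsigned bytes, unsigned align)
{
  return bytes >= 16 && t->altivec
	 && (align >= 128 || t->efficient_unaligned_vsx);
}

/* Condition after the patch: 16-31 byte clears need an aligned
   destination; unaligned vsx is only considered from 32 bytes up.  */
static bool
use_vsx_new (const struct target_flags *t, unsigned bytes, unsigned align)
{
  return t->altivec
	 && ((bytes >= 16 && align >= 128)
	     || (bytes >= 32 && t->efficient_unaligned_vsx));
}
```

With both flags set, a 24 byte clear of an 8 byte aligned destination
(align of 64) satisfies the old predicate but not the new one, so it
falls back to scalar stores. An aligned 24 byte clear and an unaligned
32 byte clear satisfy both.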

Bootstrap and regression testing pass on powerpc64le. OK for trunk and
backport to 8?

Thanks,
   Aaron

2018-06-25  Aaron Sawdey  <acsawdey@linux.ibm.com>

	* config/rs6000/rs6000-string.c (expand_block_clear): Don't use
	unaligned vsx for memset of less than 32 bytes.


-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c	(revision 261808)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -90,7 +90,9 @@
       machine_mode mode = BLKmode;
       rtx dest;
 
-      if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
+      if (TARGET_ALTIVEC
+	  && ((bytes >= 16 && align >= 128)
+	      || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))
 	{
 	  clear_bytes = 16;
 	  mode = V4SImode;
