[PATCH] Optimize store_expr from STRING_CST [PR95052]

Jakub Jelinek jakub@redhat.com
Tue May 12 08:12:01 GMT 2020


Hi!

In the following testcase, store_expr of e.g. a 97-byte string literal
into a 1MB array is implemented by copying the 97 bytes from the .rodata
section and then clearing the remaining bytes.  But, as the STRING_CST
has type char[1024*1024], we actually allocate the whole 1MB in the .rodata
section for it, even though we only use the first 97 bytes of it.

The following patch tweaks it so that if we are only going to initialize
from a small leading part of the STRING_CST, we don't emit all the trailing
zeros that are never used.
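
To illustrate the idea outside the compiler, here is a minimal C sketch of
the copy-then-clear sequence and of the size check the patch adds.  Both
helpers (worth_trimming, init_from_literal) are hypothetical names made up
for this example, not GCC functions; the 32-byte slack mirrors the constant
in the patch, and lit_len is assumed to count the terminating NUL the way
TREE_STRING_LENGTH does for a C string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: mirrors the patch's condition
   that the object is non-empty and more than 32 bytes larger than the
   string data actually present.  */
static int
worth_trimming (size_t object_size, size_t string_length)
{
  return object_size != 0 && object_size > string_length + 32;
}

/* Hypothetical helper, not part of GCC: initialize DST the way the
   expanded code does, by copying only the LIT_LEN literal bytes and
   then zeroing the rest of the DST_SIZE-byte object.  */
static void
init_from_literal (char *dst, size_t dst_size,
                   const char *lit, size_t lit_len)
{
  assert (lit_len <= dst_size);
  memcpy (dst, lit, lit_len);
  memset (dst + lit_len, 0, dst_size - lit_len);
}
```

The result is byte-for-byte the same as initializing from a full-size
zero-padded constant, which is why only the leading bytes need to live in
.rodata.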

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-05-12  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/95052
	* expr.c (store_expr): If expr_size is constant and significantly
	larger than TREE_STRING_LENGTH, set temp to just the
	TREE_STRING_LENGTH portion of the STRING_CST.

	* gcc.target/i386/pr95052.c: New test.

--- gcc/expr.c.jj	2020-04-16 12:57:16.584912448 +0200
+++ gcc/expr.c	2020-05-11 20:58:46.436785403 +0200
@@ -5749,7 +5749,31 @@ store_expr (tree exp, rtx target, int ca
       /* If we want to use a nontemporal or a reverse order store, force the
 	 value into a register first.  */
       tmp_target = nontemporal || reverse ? NULL_RTX : target;
-      temp = expand_expr_real (exp, tmp_target, GET_MODE (target),
+      tree rexp = exp;
+      if (TREE_CODE (exp) == STRING_CST
+	  && tmp_target == target
+	  && GET_MODE (target) == BLKmode
+	  && TYPE_MODE (TREE_TYPE (exp)) == BLKmode)
+	{
+	  rtx size = expr_size (exp);
+	  if (CONST_INT_P (size)
+	      && size != const0_rtx
+	      && (UINTVAL (size)
+		  > ((unsigned HOST_WIDE_INT) TREE_STRING_LENGTH (exp) + 32)))
+	    {
+	      /* If the STRING_CST has much larger array type than
+		 TREE_STRING_LENGTH, only emit the TREE_STRING_LENGTH part of
+		 it into the rodata section as the code later on will use
+		 memset zero for the remainder anyway.  See PR95052.  */
+	      tmp_target = NULL_RTX;
+	      rexp = copy_node (exp);
+	      tree index
+		= build_index_type (size_int (TREE_STRING_LENGTH (exp) - 1));
+	      TREE_TYPE (rexp) = build_array_type (TREE_TYPE (TREE_TYPE (exp)),
+						   index);
+	    }
+	}
+      temp = expand_expr_real (rexp, tmp_target, GET_MODE (target),
 			       (call_param_p
 				? EXPAND_STACK_PARM : EXPAND_NORMAL),
 			       &alt_rtl, false);
--- gcc/testsuite/gcc.target/i386/pr95052.c.jj	2020-05-11 21:03:05.635238485 +0200
+++ gcc/testsuite/gcc.target/i386/pr95052.c	2020-05-11 21:02:47.473487020 +0200
@@ -0,0 +1,20 @@
+/* PR middle-end/95052 */
+/* { dg-do compile } */
+/* { dg-options "-Os -mtune=skylake" } */
+/* Verify we don't waste almost 2 megabytes of .rodata.  */
+/* { dg-final { scan-assembler-not "\.zero\t1048\[0-9]\[0-9]\[0-9]" } } */
+extern void foo (char *, unsigned);
+
+int
+main ()
+{
+  char str[1024 * 1024] =
+    "fooiuhluhpiuhliuhliyfyukyfklyugkiuhpoipoipoipoipoipoipoipoipoipoipoipoipoimipoipiuhoulouihnliuhl";
+  char arr[1024 * 1024] =
+    { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 6, 2, 3,
+      4, 5, 6, 7, 8, 9, 0, 3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6,
+      7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
+  foo (str, sizeof (str));
+  foo (arr, sizeof (arr));
+  return 0;
+}

	Jakub


