This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH 6/6] Make SRA replace constant-pool loads


This has changed quite a bit since the previous revision
(https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01484.html), mostly due to Ada
and specifically Ada on ARM.

I didn't find a good alternative to scanning for constant-pool accesses "as we
go" through the function, and although I didn't find any of these accesses
being disqualified after we'd found them in C, some awkward constant-pool
expressions in Ada forced me to disqualify them for a new reason (inability to
constant-propagate). Hence, I introduced a new bitmap of
disqualified_constants, rather than an additional pass through the function.

Next, the approach of using a replacement_expr (with tho constant value) in
place of the old replacement_decl, also failed (the decl could be used in an
expression where substituting in the expr produced ill-formed gimple). Hence,
constant-pool entries now use replacement_decls just like everything else, for
which I generate initializers in initialize_constant_pool_replacements.

However, I was able to avoid generating the replacement constants early, and to
do so late in initialize_constant_pool_replacements; the main trick here was to
use fold_array_ctor_reference, kindly pointed out by Richie :), to fold the
MEM_REFs built in analyze_access_subtree.

Finally, I found completely_scalarize was still failing to fold all the
constant expressions, because Ada was putting VIEW_CONVERT_EXPRs in the
constant pool, and fold-const.c did not deal with ARRAY_REFs of these. However,
requiring completely_scalarize to be able to fold everything, seemed fragile,
instead I decoupled from fold-const by allowing to fail.

With these changes, ssa-dom-cse-2.c is fixed on all platforms with appropriate
--param.

There are still a few remaining test failures, on AArch64:
gcc.dg/guality/pr54970.c   -O1  line 15 a[0] == 1
gcc.dg/guality/pr54970.c   -O1  line 20 a[0] == 1
gcc.dg/guality/pr54970.c   -O1  line 25 a[0] == 1
(also at -O2, -O2 -flto with/without plugin, -O3 -g and -Os).
...which I'm working on. I also tested with the hack at https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01483.html . This revealed one new failure in advsimd-intrinsics/vldX.c on AArch64 (fixed by https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02777.html), and fixed a bunch of guality failures on ARM:
gcc.dg/guality/pr54970.c   -O1  line 31 a[0] == 4
gcc.dg/guality/pr54970.c   -O1  line 36 a[0] == 4
gcc.dg/guality/pr54970.c   -O1  line 45 a[0] == 4
gcc.dg/guality/pr54970.c   -O1  line 45 p[-2] == 4
gcc.dg/guality/pr54970.c   -O1  line 45 q[-1] == 4
(also at -O2, -O2 -flto with/without plugin, -O3 -g and -Os)

--Alan

gcc/ChangeLog:

	* tree-sra.c (disqualified_constants): New.
	(sra_initialize): Initialize it.
	(sra_deinitialize): Deallocate it.
	(disqualify_candidate): Set bit in disqualified_constants.
	(subst_constant_pool_initial): New.
	(create_access): Scan for constant-pool entries as we go.
	(scalarizable_type_p): Disallow types containing placeholders.
	(completely_scalarize): Return bool to allow failure.
	(scalarize_elem): Likewise; check we can generate constant replacements.
	(maybe_add_sra_candidate): Allow constant-pool entries.
	(analyze_access_subtree): Checking-assert that we can fold any refs
	built for constant-pool entries.
	(analyze_all_variable_accesses): Deal with completely_scalarize failing.
	(initialize_constant_pool_replacements): New.
	(sra_modify_function_body): Call previous.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Add sra-max-scalarization-size param,
	remove xfail.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c |   7 +-
 gcc/tree-sra.c                                | 221 ++++++++++++++++++++++++--
 2 files changed, 208 insertions(+), 20 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
index 9eccdc9..b13d583 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */
+/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized --param sra-max-scalarization-size-Ospeed=32" } */
 
 int
 foo ()
@@ -17,7 +17,4 @@ foo ()
 /* After late unrolling the above loop completely DOM should be
    able to optimize this to return 28.  */
 
-/* See PR63679 and PR64159, if the target forces the initializer to memory then
-   DOM is not able to perform this optimization.  */
-
-/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail aarch64*-*-* alpha*-*-* hppa*-*-* powerpc*-*-* sparc*-*-* s390*-*-* } } } */
+/* { dg-final { scan-tree-dump "return 28;" "optimized" } } */
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 358db79..1920265 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -90,6 +90,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimplify-me.h"
 #include "gimple-walk.h"
+#include "gimple-fold.h"
 #include "tree-cfg.h"
 #include "flags.h"
 #include "insn-config.h"
@@ -336,6 +337,10 @@ candidate (unsigned uid)
    those which cannot be (because they are and need be used as a whole).  */
 static bitmap should_scalarize_away_bitmap, cannot_scalarize_away_bitmap;
 
+/* Bitmap of candidates in the constant pool, which cannot be scalarized
+   because this would produce non-constant expressions (e.g. Ada).  */
+static bitmap disqualified_constants;
+
 /* Obstack for creation of fancy names.  */
 static struct obstack name_obstack;
 
@@ -658,6 +663,7 @@ sra_initialize (void)
     (vec_safe_length (cfun->local_decls) / 2);
   should_scalarize_away_bitmap = BITMAP_ALLOC (NULL);
   cannot_scalarize_away_bitmap = BITMAP_ALLOC (NULL);
+  disqualified_constants = BITMAP_ALLOC (NULL);
   gcc_obstack_init (&name_obstack);
   base_access_vec = new hash_map<tree, auto_vec<access_p> >;
   memset (&sra_stats, 0, sizeof (sra_stats));
@@ -676,6 +682,7 @@ sra_deinitialize (void)
   candidates = NULL;
   BITMAP_FREE (should_scalarize_away_bitmap);
   BITMAP_FREE (cannot_scalarize_away_bitmap);
+  BITMAP_FREE (disqualified_constants);
   access_pool.release ();
   assign_link_pool.release ();
   obstack_free (&name_obstack, NULL);
@@ -690,6 +697,8 @@ disqualify_candidate (tree decl, const char *reason)
 {
   if (bitmap_clear_bit (candidate_bitmap, DECL_UID (decl)))
     candidates->remove_elt_with_hash (decl, DECL_UID (decl));
+  if (TREE_CODE (decl) == VAR_DECL && DECL_IN_CONSTANT_POOL (decl))
+    bitmap_set_bit (disqualified_constants, DECL_UID (decl));
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -841,8 +850,76 @@ create_access_1 (tree base, HOST_WIDE_INT offset, HOST_WIDE_INT size)
   return access;
 }
 
+/* Given EXPR a chain of COMPONENT_REFs and/or ARRAY_REFs built around a
+   constant-pool declaration VAR, builds a tree like EXPR but with VAR replaced
+   by its DECL_INITIAL, and tries to fold it to a constant.  If folding succeeds
+   then return the folded tree, else NULL_TREE.  */
+
+static tree
+subst_constant_pool_initial (tree expr, tree var)
+{
+  if (TREE_CODE (expr) == VAR_DECL)
+    {
+      gcc_assert (DECL_IN_CONSTANT_POOL (expr));
+      gcc_assert (expr == var);
+      return DECL_INITIAL (expr);
+    }
+  if (TREE_CODE (expr) == COMPONENT_REF)
+    {
+      tree field = TREE_OPERAND (expr, 1);
+      gcc_assert (TREE_CODE (field) == FIELD_DECL);
+      gcc_assert (TREE_OPERAND (expr, 2) == NULL_TREE);
+      tree record = subst_constant_pool_initial (TREE_OPERAND (expr, 0), var);
+      if (!record)
+	return NULL_TREE;
+      tree ref = build3 (COMPONENT_REF, TREE_TYPE (expr),
+			 record, field, NULL_TREE);
+      tree res = fold (ref);
+      if (res != ref)
+	return res;
+    }
+  else if (TREE_CODE (expr) == ARRAY_REF)
+    {
+      tree array = subst_constant_pool_initial (TREE_OPERAND (expr, 0), var);
+      if (!array)
+	return NULL_TREE;
+      tree ref = build4 (ARRAY_REF, TREE_TYPE (expr),
+			 array,
+			 TREE_OPERAND (expr, 1),
+			 TREE_OPERAND (expr, 2),
+			 TREE_OPERAND (expr, 3));
+      /* Ada (at least) builds ARRAY_REFs around strings.  */
+      if (tree folded = fold_read_from_constant_string (ref))
+	return folded;
+      tree res = fold (ref);
+      if (res != ref)
+	return res;
+    }
+  else if (TREE_CODE (expr) == MEM_REF)
+    {
+      gcc_assert (TREE_CODE (TREE_OPERAND (expr, 0)) == ADDR_EXPR);
+      tree type = TREE_TYPE (expr);
+      tree sz = TYPE_SIZE (TREE_TYPE (expr));
+      tree val = TREE_OPERAND (TREE_OPERAND (expr, 0), 0);
+      val = subst_constant_pool_initial (val, var);
+      if (TREE_CODE (val) != CONSTRUCTOR
+	  || !tree_fits_uhwi_p (TREE_OPERAND (expr, 1))
+	  || !tree_fits_uhwi_p (sz))
+	return NULL_TREE;
+      unsigned HOST_WIDE_INT offset =
+	  tree_to_uhwi (TREE_OPERAND (expr, 1)) * BITS_PER_UNIT;
+      return fold_ctor_reference (type, val, offset, tree_to_uhwi (sz), var);
+    }
+
+  /* Unknown expression, or could not constant-fold.  */
+  return NULL_TREE;
+}
+
+static bool maybe_add_sra_candidate (tree var);
+
 /* Create and insert access for EXPR. Return created access, or NULL if it is
-   not possible.  */
+   not possible.  Also scan for uses of constant pool as we go along and add
+   to candidates.  */
 
 static struct access *
 create_access (tree expr, gimple *stmt, bool write)
@@ -865,6 +942,32 @@ create_access (tree expr, gimple *stmt, bool write)
   else
     ptr = false;
 
+  /* For constant-pool entries, check we can substitute the constant value.  */
+  if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base)
+      && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA))
+    {
+      gcc_assert (!write);
+      if (bitmap_bit_p (disqualified_constants, DECL_UID (base)))
+	return NULL;
+      tree repl = subst_constant_pool_initial (expr, base);
+      if (dump_file && (dump_flags & TDF_DETAILS))
+      {
+	fprintf (dump_file, "Trying to scalarize constant-pool ");
+	print_generic_expr (dump_file, base, 0);
+	fprintf (dump_file, "\n In expression: ");
+	print_generic_expr (dump_file, expr, 0);
+	fprintf (dump_file, "\n Constant replacement: ");
+	print_generic_expr (dump_file, repl, 0);
+	fprintf (dump_file, "\n");
+      }
+      /* If we can't resolve the expression to a constant, we risk creating
+	  a malformed expression (e.g. Ada testsuite, ACATS chapter C3).  */
+      if (!repl || !TREE_CONSTANT (repl))
+	disqualify_candidate (base, "Cannot constant-propagate out of pool.");
+      else
+	maybe_add_sra_candidate (base);
+    }
+
   if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
     return NULL;
 
@@ -923,6 +1026,8 @@ static bool
 scalarizable_type_p (tree type)
 {
   gcc_assert (!is_gimple_reg_type (type));
+  if (type_contains_placeholder_p (type))
+    return false;
 
   switch (TREE_CODE (type))
   {
@@ -970,15 +1075,19 @@ scalarizable_type_p (tree type)
   }
 }
 
-static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree);
+static bool scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree);
 
 /* Create total_scalarization accesses for all scalar fields of a member
    of type DECL_TYPE conforming to scalarizable_type_p.  BASE
    must be the top-most VAR_DECL representing the variable; within that,
    OFFSET locates the member and REF must be the memory reference expression for
-   the member.  */
+   the member.
 
-static void
+   Return TRUE if successful; may fail in the case of constant-pool entry BASE
+   whose DECL_INITIAL contains values to which we cannot fold ARRAY_REF or
+   COMPONENT_REF expressions.  */
+
+static bool
 completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref)
 {
   switch (TREE_CODE (decl_type))
@@ -991,10 +1100,11 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref)
 	    tree ft = TREE_TYPE (fld);
 	    tree nref = build3 (COMPONENT_REF, ft, ref, fld, NULL_TREE);
 
-	    scalarize_elem (base, pos, tree_to_uhwi (DECL_SIZE (fld)), nref,
-			    ft);
+	    if (!scalarize_elem (base, pos, tree_to_uhwi (DECL_SIZE (fld)),
+				 nref, ft))
+	      return false;
 	  }
-      break;
+      return true;
     case ARRAY_TYPE:
       {
 	tree elemtype = TREE_TYPE (decl_type);
@@ -1026,12 +1136,13 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref)
 				    ref, size_int (idx + min),
 				    NULL_TREE, NULL_TREE);
 		int el_off = offset + idx * el_size;
-		scalarize_elem (base, el_off, el_size, nref, elemtype);
+		if (!scalarize_elem (base, el_off, el_size, nref, elemtype))
+		  return false;
 	      }
 	    while (++idx <= lenlessone);
 	  }
       }
-      break;
+      return true;
     default:
       gcc_unreachable ();
     }
@@ -1040,22 +1151,32 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref)
 /* Create total_scalarization accesses for a member of type TYPE, which must
    satisfy either is_gimple_reg_type or scalarizable_type_p.  BASE must be the
    top-most VAR_DECL representing the variable; within that, POS and SIZE locate
-   the member and REF must be the reference expression for it.  */
+   the member and REF must be the reference expression for it.
 
-static void
+   Return TRUE if successful; may fail in the case of constant-pool entry BASE
+   whose DECL_INITIAL contains values to which we cannot fold ARRAY_REF or
+   COMPONENT_REF expressions.  */
+
+static bool
 scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size,
 		 tree ref, tree type)
 {
   if (is_gimple_reg_type (type))
   {
+    if (TREE_CODE (base) == VAR_DECL
+	&& DECL_IN_CONSTANT_POOL (base)
+	&& subst_constant_pool_initial (ref, base) == NULL_TREE)
+      return false;
     struct access *access = create_access_1 (base, pos, size);
     access->expr = ref;
     access->type = type;
     access->grp_total_scalarization = 1;
     /* Accesses for intraprocedural SRA can have their stmt NULL.  */
+
+    return true;
   }
-  else
-    completely_scalarize (base, type, pos, ref);
+
+  return completely_scalarize (base, type, pos, ref);
 }
 
 /* Create a total_scalarization access for VAR as a whole.  VAR must be of a
@@ -1843,7 +1964,12 @@ maybe_add_sra_candidate (tree var)
       reject (var, "not aggregate");
       return false;
     }
-  if (needs_to_live_in_memory (var))
+  /* Allow constant-pool entries (that "need to live in memory")
+     unless we are doing IPA SRA.  */
+  if (needs_to_live_in_memory (var)
+      && (sra_mode == SRA_MODE_EARLY_IPA
+	  || TREE_CODE (var) != VAR_DECL
+	  || !DECL_IN_CONSTANT_POOL (var)))
     {
       reject (var, "needs to live in memory");
       return false;
@@ -2343,6 +2469,10 @@ analyze_access_subtree (struct access *root, struct access *parent,
 		       (unsigned) root->offset, (unsigned) root->size);
 	      fprintf (dump_file, " to an integer.\n");
 	    }
+	  gcc_checking_assert (TREE_CODE (root->base) != VAR_DECL
+			       || !DECL_IN_CONSTANT_POOL (root->base)
+			       || subst_constant_pool_initial (root->expr,
+							       root->base));
 	}
 
       root->grp_to_be_replaced = 1;
@@ -2604,8 +2734,17 @@ analyze_all_variable_accesses (void)
 	    if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
 		<= max_scalarization_size)
 	      {
+		vec<access_p> &accesses = base_access_vec->get_or_insert (var);
+		unsigned orig_count = accesses.length ();
+		if (!completely_scalarize (var, TREE_TYPE (var), 0, var))
+		  {
+		    /* Failed.  Discard any accesses created during attempt.  */
+		    for (unsigned i = orig_count; i < accesses.length (); i++)
+		      access_pool.remove (accesses[i]);
+		    accesses.truncate (orig_count);
+		    continue;
+		  }
 		create_total_scalarization_access (var);
-		completely_scalarize (var, TREE_TYPE (var), 0, var);
 		if (dump_file && (dump_flags & TDF_DETAILS))
 		  {
 		    fprintf (dump_file, "Will attempt to totally scalarize ");
@@ -3480,6 +3619,56 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
     }
 }
 
+static void
+initialize_constant_pool_replacements (void)
+{
+  gimple_seq seq = NULL;
+  gimple_stmt_iterator gsi = gsi_start (seq);
+  bitmap_iterator bi;
+  unsigned i;
+
+  EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
+    if (bitmap_bit_p (should_scalarize_away_bitmap, i)
+	&& !bitmap_bit_p (cannot_scalarize_away_bitmap, i))
+      {
+	tree var = candidate (i);
+	if (TREE_CODE (var) != VAR_DECL || !DECL_IN_CONSTANT_POOL (var))
+	  continue;
+	vec<access_p> *access_vec = get_base_access_vector (var);
+	if (!access_vec)
+	  continue;
+	for (unsigned i = 0; i < access_vec->length (); i++)
+	  {
+	    struct access *access = (*access_vec)[i];
+	    tree val = subst_constant_pool_initial (access->expr, access->base);
+	    if (dump_file && (dump_flags & TDF_DETAILS))
+	      {
+		fprintf (dump_file, "Expression ");
+		print_generic_expr (dump_file, access->expr, 0);
+		fprintf (dump_file, " replaced with ");
+		print_generic_expr (dump_file, access->replacement_decl, 0);
+		fprintf (dump_file, " initialized to ");
+		print_generic_expr (dump_file, val, 0);
+		fprintf (dump_file, "\n");
+		if (!access->replacement_decl)
+		  dump_access (dump_file, access, true);
+	      }
+	    if (!access->replacement_decl)
+	      continue;
+	    gcc_assert (val);
+	    gassign *stmt = gimple_build_assign (
+	      get_access_replacement (access), val);
+	    gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+	    update_stmt (stmt);
+	  }
+      }
+
+  seq = gsi_seq (gsi);
+  if (seq)
+    gsi_insert_seq_on_edge_immediate (
+      single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)), seq);
+}
+
 /* Traverse the function body and all modifications as decided in
    analyze_all_variable_accesses.  Return true iff the CFG has been
    changed.  */
@@ -3490,6 +3679,8 @@ sra_modify_function_body (void)
   bool cfg_changed = false;
   basic_block bb;
 
+  initialize_constant_pool_replacements ();
+
   FOR_EACH_BB_FN (bb, cfun)
     {
       gimple_stmt_iterator gsi = gsi_start_bb (bb);
-- 
1.9.1


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]