This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Fix PR63677 and a regression from alias-improvements


PR63677 shows that late complete unrolling can expose quite some memory
CSE opportunities (and followup simplifications) which we fail to
exploit because currently DOMs ability to do memory CSE is quite
limited (it requires the redundant load to optimize to directly follow the
load or store).  This is actually a regression from before
alias-improvements, thus to the time where we had multiple virtual
operand symbols which made us handle some cases better than now
with a single memory operand symbol.

The following patch makes DOM use the alias walker to fix that.

The easiest way to do that is to no longer hash virtual operands
and use the expression hash lookup result as "candidate" if
the expression involves memory.  We can then verify if the candidate
can be used by using the alias walker to walk to the candidate.
If there is an intermediate possibly aliasing stmt we simply
replace the hashtable entry and set up enough info to be able
to rewind to the previous state during unwinding.

This means the DOM implementation of memory CSE is a tad bit more
efficient than that of SCCVN.  But it is also very simplistic as
it requires 1:1 requivalence for the memory access structure
(as compared to SCCVN which can look through various kinds of
aggregate copies or initializations as well as handling accessing
the same memory location with different memory access structures).
This is also the reason that the patch doesn't fix PR63864 which
requires us to look through aggregate copies.

To fix PR63677 I also had to teach DOM the trick of propagating
&a + CST as &MEM[&a, CST] which is used by all other propagators.

Given how easy this was to restore pre-alias-improvements state
I wonder why I didn't think of this before...

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok for trunk?

Thanks,
Richard.

2014-11-19   Richard Biener  <rguenther@suse.de>

	PR tree-optimization/63677
	* tree-ssa-dom.c: Include gimplify.h for unshare_expr.
	(avail_exprs_stack): Make a vector of pairs.
	(struct hash_expr_elt): Replace stmt member with vop member.
	(expr_elt_hasher::equal): Simplify.
	(initialize_hash_element): Adjust.
	(initialize_hash_element_from_expr): Likewise.
	(dom_opt_dom_walker::thread_across_edge): Likewise.
	(record_cond): Likewise.
	(dom_opt_dom_walker::before_dom_children): Likewise.
	(print_expr_hash_elt): Likewise.
	(remove_local_expressions_from_table): Restore previous state
	if requested.
	(record_equivalences_from_stmt): Record &x + CST as constant
	&MEM[&x, CST] for further propagation.
	(vuse_eq): New function.
	(lookup_avail_expr): For loads use the alias oracle to see
	whether a candidate from the expr hash is usable.
	(avail_expr_hash): Do not hash VUSEs.

	* gcc.dg/tree-ssa/ssa-dom-cse-2.c: New testcase.
	* gcc.dg/tree-ssa/ssa-dom-cse-3.c: Likewise.

Index: gcc/tree-ssa-dom.c
===================================================================
*** gcc/tree-ssa-dom.c.orig	2014-11-19 13:28:46.123015743 +0100
--- gcc/tree-ssa-dom.c	2014-11-19 13:58:33.122937543 +0100
*************** along with GCC; see the file COPYING3.
*** 66,71 ****
--- 66,72 ----
  #include "tree-ssa-threadedge.h"
  #include "tree-ssa-dom.h"
  #include "inchash.h"
+ #include "gimplify.h"
  
  /* This file implements optimizations on the dominator tree.  */
  
*************** struct edge_info
*** 139,145 ****
     marker.  */
  typedef struct expr_hash_elt * expr_hash_elt_t;
  
! static vec<expr_hash_elt_t> avail_exprs_stack;
  
  /* Structure for entries in the expression hash table.  */
  
--- 140,146 ----
     marker.  */
  typedef struct expr_hash_elt * expr_hash_elt_t;
  
! static vec<std::pair<expr_hash_elt_t, expr_hash_elt_t> > avail_exprs_stack;
  
  /* Structure for entries in the expression hash table.  */
  
*************** struct expr_hash_elt
*** 151,158 ****
    /* The expression (rhs) we want to record.  */
    struct hashable_expr expr;
  
!   /* The stmt pointer if this element corresponds to a statement.  */
!   gimple stmt;
  
    /* The hash value for RHS.  */
    hashval_t hash;
--- 152,160 ----
    /* The expression (rhs) we want to record.  */
    struct hashable_expr expr;
  
!   /* The virtual operand associated with the nearest dominating stmt
!      loading from or storing to expr.  */
!   tree vop;
  
    /* The hash value for RHS.  */
    hashval_t hash;
*************** expr_elt_hasher::hash (const value_type
*** 187,196 ****
  inline bool
  expr_elt_hasher::equal (const value_type &p1, const compare_type &p2)
  {
-   gimple stmt1 = p1->stmt;
    const struct hashable_expr *expr1 = &p1->expr;
    const struct expr_hash_elt *stamp1 = p1->stamp;
-   gimple stmt2 = p2->stmt;
    const struct hashable_expr *expr2 = &p2->expr;
    const struct expr_hash_elt *stamp2 = p2->stamp;
  
--- 189,196 ----
*************** expr_elt_hasher::equal (const value_type
*** 198,222 ****
    if (stamp1 == stamp2)
      return true;
  
!   /* FIXME tuples:
!      We add stmts to a hash table and them modify them. To detect the case
!      that we modify a stmt and then search for it, we assume that the hash
!      is always modified by that change.
!      We have to fully check why this doesn't happen on trunk or rewrite
!      this in a more  reliable (and easier to understand) way. */
!   if (((const struct expr_hash_elt *)p1)->hash
!       != ((const struct expr_hash_elt *)p2)->hash)
      return false;
  
    /* In case of a collision, both RHS have to be identical and have the
       same VUSE operands.  */
    if (hashable_expr_equal_p (expr1, expr2)
        && types_compatible_p (expr1->type, expr2->type))
!     {
!       /* Note that STMT1 and/or STMT2 may be NULL.  */
!       return ((stmt1 ? gimple_vuse (stmt1) : NULL_TREE)
! 	      == (stmt2 ? gimple_vuse (stmt2) : NULL_TREE));
!     }
  
    return false;
  }
--- 198,211 ----
    if (stamp1 == stamp2)
      return true;
  
!   if (p1->hash != p2->hash)
      return false;
  
    /* In case of a collision, both RHS have to be identical and have the
       same VUSE operands.  */
    if (hashable_expr_equal_p (expr1, expr2)
        && types_compatible_p (expr1->type, expr2->type))
!     return true;
  
    return false;
  }
*************** initialize_hash_element (gimple stmt, tr
*** 387,393 ****
      gcc_unreachable ();
  
    element->lhs = lhs;
!   element->stmt = stmt;
    element->hash = avail_expr_hash (element);
    element->stamp = element;
  }
--- 376,382 ----
      gcc_unreachable ();
  
    element->lhs = lhs;
!   element->vop = gimple_vuse (stmt);
    element->hash = avail_expr_hash (element);
    element->stamp = element;
  }
*************** initialize_hash_element_from_expr (struc
*** 429,435 ****
  {
    element->expr = *expr;
    element->lhs = lhs;
!   element->stmt = NULL;
    element->hash = avail_expr_hash (element);
    element->stamp = element;
  }
--- 418,424 ----
  {
    element->expr = *expr;
    element->lhs = lhs;
!   element->vop = NULL_TREE;
    element->hash = avail_expr_hash (element);
    element->stamp = element;
  }
*************** add_hashable_expr (const struct hashable
*** 681,690 ****
  static void
  print_expr_hash_elt (FILE * stream, const struct expr_hash_elt *element)
  {
!   if (element->stmt)
!     fprintf (stream, "STMT ");
!   else
!     fprintf (stream, "COND ");
  
    if (element->lhs)
      {
--- 670,676 ----
  static void
  print_expr_hash_elt (FILE * stream, const struct expr_hash_elt *element)
  {
!   fprintf (stream, "STMT ");
  
    if (element->lhs)
      {
*************** print_expr_hash_elt (FILE * stream, cons
*** 758,770 ****
          }
          break;
      }
-   fprintf (stream, "\n");
  
!   if (element->stmt)
      {
!       fprintf (stream, "          ");
!       print_gimple_stmt (stream, element->stmt, 0, 0);
      }
  }
  
  /* Delete variable sized pieces of the expr_hash_elt ELEMENT.  */
--- 744,757 ----
          }
          break;
      }
  
!   if (element->vop)
      {
!       fprintf (stream, " with ");
!       print_generic_expr (stream, element->vop, 0);
      }
+ 
+   fprintf (stream, "\n");
  }
  
  /* Delete variable sized pieces of the expr_hash_elt ELEMENT.  */
*************** remove_local_expressions_from_table (voi
*** 1067,1076 ****
    /* Remove all the expressions made available in this block.  */
    while (avail_exprs_stack.length () > 0)
      {
!       expr_hash_elt_t victim = avail_exprs_stack.pop ();
        expr_hash_elt **slot;
  
!       if (victim == NULL)
  	break;
  
        /* This must precede the actual removal from the hash table,
--- 1054,1064 ----
    /* Remove all the expressions made available in this block.  */
    while (avail_exprs_stack.length () > 0)
      {
!       std::pair<expr_hash_elt_t, expr_hash_elt_t> victim
! 	= avail_exprs_stack.pop ();
        expr_hash_elt **slot;
  
!       if (victim.first == NULL)
  	break;
  
        /* This must precede the actual removal from the hash table,
*************** remove_local_expressions_from_table (voi
*** 1079,1090 ****
        if (dump_file && (dump_flags & TDF_DETAILS))
          {
            fprintf (dump_file, "<<<< ");
!           print_expr_hash_elt (dump_file, victim);
          }
  
!       slot = avail_exprs->find_slot (victim, NO_INSERT);
!       gcc_assert (slot && *slot == victim);
!       avail_exprs->clear_slot (slot);
      }
  }
  
--- 1067,1084 ----
        if (dump_file && (dump_flags & TDF_DETAILS))
          {
            fprintf (dump_file, "<<<< ");
!           print_expr_hash_elt (dump_file, victim.first);
          }
  
!       slot = avail_exprs->find_slot (victim.first, NO_INSERT);
!       gcc_assert (slot && *slot == victim.first);
!       if (victim.second != NULL)
! 	{
! 	  free_expr_hash_elt (*slot);
! 	  *slot = victim.second;
! 	}
!       else
! 	avail_exprs->clear_slot (slot);
      }
  }
  
*************** dom_opt_dom_walker::thread_across_edge (
*** 1171,1177 ****
  
    /* Push a marker on both stacks so we can unwind the tables back to their
       current state.  */
!   avail_exprs_stack.safe_push (NULL);
    const_and_copies_stack.safe_push (NULL_TREE);
  
    /* Traversing E may result in equivalences we can utilize.  */
--- 1165,1172 ----
  
    /* Push a marker on both stacks so we can unwind the tables back to their
       current state.  */
!   avail_exprs_stack.safe_push
!     (std::pair<expr_hash_elt_t, expr_hash_elt_t> (NULL, NULL));
    const_and_copies_stack.safe_push (NULL_TREE);
  
    /* Traversing E may result in equivalences we can utilize.  */
*************** record_cond (cond_equivalence *p)
*** 1412,1418 ****
            print_expr_hash_elt (dump_file, element);
          }
  
!       avail_exprs_stack.safe_push (element);
      }
    else
      free_expr_hash_elt (element);
--- 1407,1414 ----
            print_expr_hash_elt (dump_file, element);
          }
  
!       avail_exprs_stack.safe_push
! 	(std::pair<expr_hash_elt_t, expr_hash_elt_t> (element, NULL));
      }
    else
      free_expr_hash_elt (element);
*************** dom_opt_dom_walker::before_dom_children
*** 1971,1977 ****
  
    /* Push a marker on the stacks of local information so that we know how
       far to unwind when we finalize this block.  */
!   avail_exprs_stack.safe_push (NULL);
    const_and_copies_stack.safe_push (NULL_TREE);
  
    record_equivalences_from_incoming_edge (bb);
--- 1967,1974 ----
  
    /* Push a marker on the stacks of local information so that we know how
       far to unwind when we finalize this block.  */
!   avail_exprs_stack.safe_push
!     (std::pair<expr_hash_elt_t, expr_hash_elt_t> (NULL, NULL));
    const_and_copies_stack.safe_push (NULL_TREE);
  
    record_equivalences_from_incoming_edge (bb);
*************** dom_opt_dom_walker::before_dom_children
*** 1982,1988 ****
    /* Create equivalences from redundant PHIs.  PHIs are only truly
       redundant when they exist in the same block, so push another
       marker and unwind right afterwards.  */
!   avail_exprs_stack.safe_push (NULL);
    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
      eliminate_redundant_computations (&gsi);
    remove_local_expressions_from_table ();
--- 1979,1986 ----
    /* Create equivalences from redundant PHIs.  PHIs are only truly
       redundant when they exist in the same block, so push another
       marker and unwind right afterwards.  */
!   avail_exprs_stack.safe_push
!     (std::pair<expr_hash_elt_t, expr_hash_elt_t> (NULL, NULL));
    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
      eliminate_redundant_computations (&gsi);
    remove_local_expressions_from_table ();
*************** record_equivalences_from_stmt (gimple st
*** 2192,2197 ****
--- 2190,2221 ----
        }
      }
  
+   /* Make sure we can propagate &x + CST.  */
+   if (lhs_code == SSA_NAME
+       && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+       && TREE_CODE (gimple_assign_rhs1 (stmt)) == ADDR_EXPR
+       && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST)
+     {
+       tree op0 = gimple_assign_rhs1 (stmt);
+       tree op1 = gimple_assign_rhs2 (stmt);
+       tree new_rhs
+ 	= build_fold_addr_expr (fold_build2 (MEM_REF,
+ 					     TREE_TYPE (TREE_TYPE (op0)),
+ 					     unshare_expr (op0),
+ 					     fold_convert (ptr_type_node,
+ 							   op1)));
+       if (dump_file && (dump_flags & TDF_DETAILS))
+ 	{
+ 	  fprintf (dump_file, "==== ASGN ");
+ 	  print_generic_expr (dump_file, lhs, 0);
+ 	  fprintf (dump_file, " = ");
+ 	  print_generic_expr (dump_file, new_rhs, 0);
+ 	  fprintf (dump_file, "\n");
+ 	}
+ 
+       set_ssa_name_value (lhs, new_rhs);
+     }
+ 
    /* A memory store, even an aliased store, creates a useful
       equivalence.  By exchanging the LHS and RHS, creating suitable
       vops and recording the result in the available expression table,
*************** optimize_stmt (basic_block bb, gimple_st
*** 2511,2516 ****
--- 2535,2560 ----
      }
  }
  
+ /* Helper for walk_non_aliased_vuses.  Determine if we arrived at
+    the desired memory state.  */
+ 
+ static void *
+ vuse_eq (ao_ref *, tree vuse1, unsigned int cnt, void *data)
+ {
+   tree vuse2 = (tree) data;
+   if (vuse1 == vuse2)
+     return data;
+ 
+   /* This bounds the stmt walks we perform on reference lookups
+      to O(1) instead of O(N) where N is the number of dominating
+      stores leading to a candidate.  We re-use the SCCVN param
+      for this as it is basically the same complexity.  */
+   if (cnt > (unsigned) PARAM_VALUE (PARAM_SCCVN_MAX_ALIAS_QUERIES_PER_ACCESS))
+     return (void *)-1;
+ 
+   return NULL;
+ }
+ 
  /* Search for an existing instance of STMT in the AVAIL_EXPRS table.
     If found, return its LHS. Otherwise insert STMT in the table and
     return NULL_TREE.
*************** lookup_avail_expr (gimple stmt, bool ins
*** 2569,2583 ****
            print_expr_hash_elt (dump_file, element2);
          }
  
!       avail_exprs_stack.safe_push (element2);
        return NULL_TREE;
      }
!   else
!     free_expr_hash_elt_contents (&element);
  
    /* Extract the LHS of the assignment so that it can be used as the current
       definition of another variable.  */
!   lhs = ((struct expr_hash_elt *)*slot)->lhs;
  
    /* See if the LHS appears in the CONST_AND_COPIES table.  If it does, then
       use the value from the const_and_copies table.  */
--- 2613,2664 ----
            print_expr_hash_elt (dump_file, element2);
          }
  
!       avail_exprs_stack.safe_push
! 	(std::pair<expr_hash_elt_t, expr_hash_elt_t> (element2, NULL));
        return NULL_TREE;
      }
! 
!   /* If we found a redundant memory operation do an alias walk to
!      check if we can re-use it.  */
!   if (gimple_vuse (stmt) != (*slot)->vop)
!     {
!       tree vuse1 = (*slot)->vop;
!       tree vuse2 = gimple_vuse (stmt);
!       /* If we have a load of a register and a candidate in the
! 	 hash with vuse1 then try to reach its stmt by walking
! 	 up the virtual use-def chain using walk_non_aliased_vuses.
! 	 But don't do this when removing expressions from the hash.  */
!       ao_ref ref;
!       if (!(vuse1 && vuse2
! 	    && gimple_assign_single_p (stmt)
! 	    && TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
! 	    && (ao_ref_init (&ref, gimple_assign_rhs1 (stmt)), true)
! 	    && walk_non_aliased_vuses (&ref, vuse2,
! 				       vuse_eq, NULL, vuse1) != NULL))
! 	{
! 	  struct expr_hash_elt *element2 = XNEW (struct expr_hash_elt);
! 	  *element2 = element;
! 	  element2->stamp = element2;
! 
! 	  /* Insert the expr into the hash by replacing the current
! 	     entry and recording the value to restore in the
! 	     aval_exprs_stack.  */
! 	  avail_exprs_stack.safe_push (std::make_pair (element2, *slot));
! 	  *slot = element2;
! 	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    {
! 	      fprintf (dump_file, "2>>> ");
! 	      print_expr_hash_elt (dump_file, *slot);
! 	    }
! 	  return NULL_TREE;
! 	}
!     }
! 
!   free_expr_hash_elt_contents (&element);
  
    /* Extract the LHS of the assignment so that it can be used as the current
       definition of another variable.  */
!   lhs = (*slot)->lhs;
  
    /* See if the LHS appears in the CONST_AND_COPIES table.  If it does, then
       use the value from the const_and_copies table.  */
*************** lookup_avail_expr (gimple stmt, bool ins
*** 2605,2630 ****
  static hashval_t
  avail_expr_hash (const void *p)
  {
-   gimple stmt = ((const struct expr_hash_elt *)p)->stmt;
    const struct hashable_expr *expr = &((const struct expr_hash_elt *)p)->expr;
-   tree vuse;
    inchash::hash hstate;
  
    inchash::add_hashable_expr (expr, hstate);
  
-   /* If the hash table entry is not associated with a statement, then we
-      can just hash the expression and not worry about virtual operands
-      and such.  */
-   if (!stmt)
-     return hstate.end ();
- 
-   /* Add the SSA version numbers of the vuse operand.  This is important
-      because compound variables like arrays are not renamed in the
-      operands.  Rather, the rename is done on the virtual variable
-      representing all the elements of the array.  */
-   if ((vuse = gimple_vuse (stmt)))
-     inchash::add_expr (vuse, hstate);
- 
    return hstate.end ();
  }
  
--- 2686,2696 ----
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c	2014-11-19 13:54:38.073947829 +0100
***************
*** 0 ****
--- 1,21 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */
+ 
+ int
+ foo ()
+ {
+   const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
+   int i, sum;
+ 
+   sum = 0;
+   for (i = 0; i < sizeof (a) / sizeof (*a); i++)
+     sum += a[i];
+ 
+   return sum;
+ }
+ 
+ /* After late unrolling the above loop completely DOM should be
+    able to optimize this to return 28.  */
+ 
+ /* { dg-final { scan-tree-dump "return 28;" "optimized" } } */
+ /* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-3.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-3.c	2014-11-19 13:54:38.096947828 +0100
***************
*** 0 ****
--- 1,31 ----
+ /* { dg-do run } */
+ /* { dg-options "-O -fno-tree-fre -fdump-tree-dom1" } */
+ 
+ extern void abort (void);
+ 
+ int a;
+ int __attribute__((noinline))
+ foo (int b)
+ {
+   a = 0;
+   if (b)
+     {
+       a = 1;
+       return a;
+     }
+   /* DOM should be able to CSE both loads here, forwarding 0 and 1
+      to the PHI feeding the return.  */
+   return a;
+ }
+ 
+ int
+ main()
+ {
+   if (foo (0) != 0
+       || foo (1) != 1)
+     abort ();
+   return 0;
+ }
+ 
+ /* { dg-final { scan-tree-dump "= PHI <\[01\]\\\(.\\\), \[01\]\\\(.\\\)>" "dom1" } } */
+ /* { dg-final { cleanup-tree-dump "dom1" } } */


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]