
[patch, rfc] Make store motion use alias oracle


Hello,

I was recently asked about the possibility of using the alias oracle in
store motion (instead of relying purely on the information contained in
the virtual operands).  This patch implements that idea, using a trivial
base+offset analysis for the oracle.  Some of the cases handled by the
patch appear in the added loop-24.c testcase (without the patch, we fail
to perform store motion in all of its loops except the first one).  The
patch was bootstrapped & regtested on x86_64.

I have some doubts about its usefulness, though.  Using the alias
oracle makes sm more expensive -- for each "nice" memory reference that
would previously be refused because of a may-alias according to vops, we
now need to check, using the alias oracle, all the memory references
with which it might conflict; this includes even "non-nice" memory
references (those whose address is not loop invariant) that we did not
have to consider at all before (we just recorded their vops).  And the
gain seems to be pretty limited: on a gcc stage2-bubble build, store
motion considers 103970 memory references.  Out of these, 3033 have an
invariant address, and without the patch, we determine that store motion
can be performed for 52 of them.  With the patch, we call the oracle
(mem_refs_may_alias_p) 2770 times; in 492 of these calls, we determine
that the memory references cannot alias.  This lets us perform store
motion for 56 memory references (i.e., 4 more than without the patch).
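
To make the oracle's core test concrete, here is a minimal standalone
rendition of the patch's cannot_overlap_p check, using plain long long
in place of double_int (an illustrative sketch, not the patch code
itself):

    #include <stdbool.h>
    #include <stdio.h>

    /* A region of SIZE1 bytes at offset 0 and a region of SIZE2 bytes
       at offset DIFF cannot overlap iff the second region ends before
       the first one starts, or starts at or after the end of the
       first one.  */
    static bool
    cannot_overlap (long long diff, long long size1, long long size2)
    {
      if (diff < 0)
        return diff + size2 - 1 < 0;
      return size1 <= diff;
    }

    int
    main (void)
    {
      /* arr[b+8].X vs. arr[b+9].X from the testcase: the symbolic part
         (b * sizeof (struct a)) cancels when one affine combination is
         scaled by -1 and added to the other, leaving the constant
         difference 8; each access is 4 bytes, so provably disjoint.  */
      printf ("%d\n", cannot_overlap (8, 4, 4));  /* prints 1 */
      /* A zero difference means the same location -- may alias.  */
      printf ("%d\n", cannot_overlap (0, 4, 4));  /* prints 0 */
      return 0;
    }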

Now, gcc is obviously not the type of application where sm is most
useful; I guess that e.g. C++ applications that heavily use structure
fields as induction variables in loops might benefit more from this
change.  Also, the alias oracle I implemented is really trivial, so
perhaps better results are achievable.  But I think some more convincing
evidence of the usefulness of extending the store motion pass in this
way would be required.

Zdenek

	* tree-ssa-loop-im.c: Include tree-affine.h.
	(struct mem_ref): Remove next field.  Add is_movable field.
	(determine_lsm_ref, hoist_memory_references): Use is_movable instead of
	the set of clobbered vops.
	(gather_mem_refs, gather_mem_refs_stmt): Record memory references in a
	vector instead of a linked list.  Also record non-movable references.
	(find_more_ref_vops): Removed.
	(free_mem_refs): Work with a vector instead of a list.
	(vtoe_hash, vtoe_eq, vtoe_free, record_vop_access, get_vop_accesses,
	cannot_overlap_p, mem_refs_may_alias_p,
	disable_sm_for_conflicting_refs): New functions.
	(determine_lsm_loop): Use disable_sm_for_conflicting_refs instead of
	find_more_ref_vops.
	* tree-affine.c: Include tree-gimple.h and hashtab.h.
	(name_expansion_hash, name_expansion_eq, aff_combination_expand,
	tree_to_aff_combination_expand, get_inner_reference_aff): New
	functions.
	* tree-affine.h (aff_combination_expand,
	tree_to_aff_combination_expand, get_inner_reference_aff): Declare.
	* Makefile.in (tree-affine.o): Add $(TREE_GIMPLE_H) dependency.
	(tree-ssa-loop-im.o): Add tree-affine.h dependency.

	* gcc.dg/tree-ssa/loop-24.c: New test.

Index: tree-ssa-loop-im.c
===================================================================
*** tree-ssa-loop-im.c	(revision 121271)
--- tree-ssa-loop-im.c	(working copy)
*************** Software Foundation, 51 Franklin Street,
*** 39,44 ****
--- 39,45 ----
  #include "flags.h"
  #include "real.h"
  #include "hashtab.h"
+ #include "tree-affine.h"
  
  /* TODO:  Support for predicated code motion.  I.e.
  
*************** struct mem_ref_loc
*** 115,137 ****
  
  /* Description of a memory reference for store motion.  */
  
! struct mem_ref
  {
    tree mem;			/* The memory itself.  */
    hashval_t hash;		/* Its hash value.  */
    bool is_stored;		/* True if there is a store to the location
  				   in the loop.  */
    struct mem_ref_loc *locs;	/* The locations where it is found.  */
    bitmap vops;			/* Vops corresponding to this memory
  				   location.  */
!   struct mem_ref *next;		/* Next memory reference in the list.
! 				   Memory references are stored in a hash
! 				   table, but the hash function depends
! 				   on values of pointers. Thus we cannot use
! 				   htab_traverse, since then we would get
! 				   miscompares during bootstrap (although the
! 				   produced code would be correct).  */
! };
  
  /* Minimum cost of an expensive expression.  */
  #define LIM_EXPENSIVE ((unsigned) PARAM_VALUE (PARAM_LIM_EXPENSIVE))
--- 116,136 ----
  
  /* Description of a memory reference for store motion.  */
  
! typedef struct mem_ref
  {
    tree mem;			/* The memory itself.  */
    hashval_t hash;		/* Its hash value.  */
    bool is_stored;		/* True if there is a store to the location
  				   in the loop.  */
+   bool is_movable;		/* True if the reference can be moved out of
+ 				   the loop.  */
    struct mem_ref_loc *locs;	/* The locations where it is found.  */
    bitmap vops;			/* Vops corresponding to this memory
  				   location.  */
! } *mem_ref_p;
! 
! DEF_VEC_P(mem_ref_p);
! DEF_VEC_ALLOC_P(mem_ref_p, heap);
  
  /* Minimum cost of an expensive expression.  */
  #define LIM_EXPENSIVE ((unsigned) PARAM_VALUE (PARAM_LIM_EXPENSIVE))
*************** schedule_sm (struct loop *loop, VEC (edg
*** 1071,1083 ****
     is true, prepare the statements that load the value of the memory reference
     to a temporary variable in the loop preheader, store it back on the loop
     exits, and replace all the references inside LOOP by this temporary variable.
!    EXITS is the list of exits of LOOP.  CLOBBERED_VOPS is the bitmap of virtual
!    operands that are clobbered by a call or accessed through multiple references
!    in loop.  */
  
  static void
  determine_lsm_ref (struct loop *loop, VEC (edge, heap) *exits,
! 		   bitmap clobbered_vops, struct mem_ref *ref)
  {
    struct mem_ref_loc *aref;
    struct loop *must_exec;
--- 1070,1080 ----
     is true, prepare the statements that load the value of the memory reference
     to a temporary variable in the loop preheader, store it back on the loop
     exits, and replace all the references inside LOOP by this temporary variable.
!    EXITS is the list of exits of LOOP.  */
  
  static void
  determine_lsm_ref (struct loop *loop, VEC (edge, heap) *exits,
! 		   struct mem_ref *ref)
  {
    struct mem_ref_loc *aref;
    struct loop *must_exec;
*************** determine_lsm_ref (struct loop *loop, VE
*** 1088,1094 ****
  
    /* If the reference is aliased with any different ref, or killed by call
       in function, then fail.  */
!   if (bitmap_intersect_p (ref->vops, clobbered_vops))
      return;
  
    if (tree_could_trap_p (ref->mem))
--- 1085,1091 ----
  
    /* If the reference is aliased with any different ref, or killed by call
       in function, then fail.  */
!   if (!ref->is_movable)
      return;
  
    if (tree_could_trap_p (ref->mem))
*************** determine_lsm_ref (struct loop *loop, VE
*** 1125,1142 ****
    schedule_sm (loop, exits, ref->mem, ref->locs);
  }
  
! /* Hoists memory references MEM_REFS out of LOOP.  CLOBBERED_VOPS is the list
!    of vops clobbered by call in loop or accessed by multiple memory references.
!    EXITS is the list of exit edges of the LOOP.  */
  
  static void
! hoist_memory_references (struct loop *loop, struct mem_ref *mem_refs,
! 			 bitmap clobbered_vops, VEC (edge, heap) *exits)
  {
!   struct mem_ref *ref;
  
!   for (ref = mem_refs; ref; ref = ref->next)
!     determine_lsm_ref (loop, exits, clobbered_vops, ref);
  }
  
  /* Checks whether LOOP (with exits stored in EXITS array) is suitable
--- 1122,1139 ----
    schedule_sm (loop, exits, ref->mem, ref->locs);
  }
  
! /* Hoists memory references MEM_REFS out of LOOP.  EXITS is the list of exit
!    edges of the LOOP.  */
  
  static void
! hoist_memory_references (struct loop *loop, VEC (mem_ref_p, heap) *mem_refs,
! 			 VEC (edge, heap) *exits)
  {
!   mem_ref_p ref;
!   unsigned  i;
  
!   for (i = 0; VEC_iterate (mem_ref_p, mem_refs, i, ref); i++)
!     determine_lsm_ref (loop, exits, ref);
  }
  
  /* Checks whether LOOP (with exits stored in EXITS array) is suitable
*************** memref_eq (const void *obj1, const void 
*** 1186,1192 ****
  static void
  gather_mem_refs_stmt (struct loop *loop, htab_t mem_refs,
  		      bitmap clobbered_vops, tree stmt,
! 		      struct mem_ref **mem_ref_list)
  {
    tree *lhs, *rhs, *mem = NULL;
    hashval_t hash;
--- 1183,1189 ----
  static void
  gather_mem_refs_stmt (struct loop *loop, htab_t mem_refs,
  		      bitmap clobbered_vops, tree stmt,
! 		      VEC (mem_ref_p, heap) **mem_ref_list)
  {
    tree *lhs, *rhs, *mem = NULL;
    hashval_t hash;
*************** gather_mem_refs_stmt (struct loop *loop,
*** 1194,1200 ****
    struct mem_ref *ref = NULL;
    ssa_op_iter oi;
    tree vname;
!   bool is_stored;
  
    if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
      return;
--- 1191,1197 ----
    struct mem_ref *ref = NULL;
    ssa_op_iter oi;
    tree vname;
!   bool is_stored, is_movable = true;
  
    if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS))
      return;
*************** gather_mem_refs_stmt (struct loop *loop,
*** 1223,1242 ****
    else
      goto fail;
  
!   /* If we cannot create an SSA name for the result, give up.  */
    if (!is_gimple_reg_type (TREE_TYPE (*mem))
!       || TREE_THIS_VOLATILE (*mem))
!     goto fail;
! 
!   /* If we cannot move the reference out of the loop, fail.  */
!   if (!for_each_index (mem, may_move_till, loop))
!     goto fail;
  
    hash = iterative_hash_expr (*mem, 0);
    slot = htab_find_slot_with_hash (mem_refs, *mem, hash, INSERT);
  
    if (*slot)
!     ref = *slot;
    else
      {
        ref = XNEW (struct mem_ref);
--- 1220,1240 ----
    else
      goto fail;
  
!   /* If we cannot create an SSA name for the result, or the address is not
!      invariant enough to move it from the loop, mark the reference.  */
    if (!is_gimple_reg_type (TREE_TYPE (*mem))
!       || TREE_THIS_VOLATILE (*mem)
!       || !for_each_index (mem, may_move_till, loop))
!     is_movable = false;
  
    hash = iterative_hash_expr (*mem, 0);
    slot = htab_find_slot_with_hash (mem_refs, *mem, hash, INSERT);
  
    if (*slot)
!     {
!       ref = *slot;
!       gcc_assert (ref->is_movable == is_movable);
!     }
    else
      {
        ref = XNEW (struct mem_ref);
*************** gather_mem_refs_stmt (struct loop *loop,
*** 1244,1252 ****
        ref->hash = hash;
        ref->locs = NULL;
        ref->is_stored = false;
        ref->vops = BITMAP_ALLOC (NULL);
!       ref->next = *mem_ref_list;
!       *mem_ref_list = ref;
        *slot = ref;
      }
    ref->is_stored |= is_stored;
--- 1242,1250 ----
        ref->hash = hash;
        ref->locs = NULL;
        ref->is_stored = false;
+       ref->is_movable = is_movable;
        ref->vops = BITMAP_ALLOC (NULL);
!       VEC_safe_push (mem_ref_p, heap, *mem_ref_list, ref);
        *slot = ref;
      }
    ref->is_stored |= is_stored;
*************** fail:
*** 1265,1277 ****
     statements in CLOBBERED_VOPS.  The list of the references found by
     the function is returned.  */
  
! static struct mem_ref *
  gather_mem_refs (struct loop *loop, bitmap clobbered_vops)
  {
    basic_block *body = get_loop_body (loop);
    block_stmt_iterator bsi;
    unsigned i;
!   struct mem_ref *mem_ref_list = NULL;
    htab_t mem_refs = htab_create (100, memref_hash, memref_eq, NULL);
  
    for (i = 0; i < loop->num_nodes; i++)
--- 1263,1275 ----
     statements in CLOBBERED_VOPS.  The list of the references found by
     the function is returned.  */
  
! static VEC (mem_ref_p, heap) *
  gather_mem_refs (struct loop *loop, bitmap clobbered_vops)
  {
    basic_block *body = get_loop_body (loop);
    block_stmt_iterator bsi;
    unsigned i;
!   VEC (mem_ref_p, heap) *mem_ref_list = NULL;
    htab_t mem_refs = htab_create (100, memref_hash, memref_eq, NULL);
  
    for (i = 0; i < loop->num_nodes; i++)
*************** gather_mem_refs (struct loop *loop, bitm
*** 1287,1340 ****
    return mem_ref_list;
  }
  
! /* Finds the vops accessed by more than one of the memory references described
!    in MEM_REFS and marks them in CLOBBERED_VOPS.  */
  
  static void
! find_more_ref_vops (struct mem_ref *mem_refs, bitmap clobbered_vops)
  {
!   bitmap_head tmp, all_vops;
!   struct mem_ref *ref;
  
!   bitmap_initialize (&tmp, &bitmap_default_obstack);
!   bitmap_initialize (&all_vops, &bitmap_default_obstack);
  
!   for (ref = mem_refs; ref; ref = ref->next)
!     {
!       /* The vops that are already in all_vops are accessed by more than
! 	 one memory reference.  */
!       bitmap_and (&tmp, &all_vops, ref->vops);
!       bitmap_ior_into (clobbered_vops, &tmp);
!       bitmap_clear (&tmp);
  
!       bitmap_ior_into (&all_vops, ref->vops);
!     }
  
!   bitmap_clear (&all_vops);
  }
  
! /* Releases the memory occupied by REF.  */
  
  static void
! free_mem_ref (struct mem_ref *ref)
  {
!   free_mem_ref_locs (ref->locs);
!   BITMAP_FREE (ref->vops);
!   free (ref);
  }
  
! /* Releases the memory occupied by REFS.  */
  
  static void
! free_mem_refs (struct mem_ref *refs)
  {
!   struct mem_ref *ref, *next;
  
!   for (ref = refs; ref; ref = next)
      {
!       next = ref->next;
!       free_mem_ref (ref);
      }
  }
  
  /* Try to perform store motion for all memory references modified inside
--- 1285,1515 ----
    return mem_ref_list;
  }
  
! /* Releases the memory occupied by REF.  */
  
  static void
! free_mem_ref (struct mem_ref *ref)
  {
!   free_mem_ref_locs (ref->locs);
!   BITMAP_FREE (ref->vops);
!   free (ref);
! }
  
! /* Releases the memory occupied by REFS.  */
  
! static void
! free_mem_refs (VEC (mem_ref_p, heap) *refs)
! {
!   mem_ref_p ref;
!   unsigned i;
  
!   for (i = 0; VEC_iterate (mem_ref_p, refs, i, ref); i++)
!     free_mem_ref (ref);
!   VEC_free (mem_ref_p, heap, refs);
! }
! 
! /* Element of the hash table that maps vops to memory references.  */
! 
! struct vop_to_refs_elt
! {
!   /* DECL_UID of the vop.  */
!   unsigned uid;
! 
!   /* List of the references.  */
!   bitmap refs;
! };
  
! 
! /* A hash function for struct vop_to_refs_elt object OBJ.  */
! 
! static hashval_t
! vtoe_hash (const void *obj)
! {
!   const struct vop_to_refs_elt *vtoe = obj;
! 
!   return vtoe->uid;
  }
  
! /* An equality function for struct vop_to_refs_elt object OBJ1 with
!    uid of a vop OBJ2.  */
! 
! static int
! vtoe_eq (const void *obj1, const void *obj2)
! {
!   const struct vop_to_refs_elt *vtoe = obj1;
!   const unsigned *uid = obj2;
! 
!   return vtoe->uid == *uid;
! }
! 
! /* A function to free the struct vop_to_refs_elt object.  */
  
  static void
! vtoe_free (void *obj)
  {
!   struct vop_to_refs_elt *vtoe = obj;
! 
!   BITMAP_FREE (vtoe->refs);
!   free (vtoe);
  }
  
! /* Records REF to hashtable VOP_TO_REFS for the index VOP.  */
! 
! static void
! record_vop_access (htab_t vop_to_refs, unsigned vop, unsigned ref)
! {
!   void **slot = htab_find_slot_with_hash (vop_to_refs, &vop, vop, INSERT);
!   struct vop_to_refs_elt *vtoe;
! 
!   if (!*slot)
!     {
!       vtoe = XNEW (struct vop_to_refs_elt);
!       vtoe->uid = vop;
!       vtoe->refs = BITMAP_ALLOC (NULL);
!       *slot = vtoe;
!     }
!   else
!     vtoe = *slot;
! 
!   bitmap_set_bit (vtoe->refs, ref);
! }
! 
! /* Returns the bitmap associated with VOP in the table VOP_TO_REFS.  */
! 
! static bitmap
! get_vop_accesses (htab_t vop_to_refs, unsigned vop)
! {
!   struct vop_to_refs_elt *vtoe = htab_find_with_hash (vop_to_refs, &vop, vop);
!   return vtoe->refs;
! }
! 
! /* Returns true if a region of size SIZE1 at position 0 and a region of
!    size SIZE2 at position DIFF cannot overlap.  */
! 
! static bool
! cannot_overlap_p (aff_tree *diff, double_int size1, double_int size2)
! {
!   double_int d, bound;
! 
!   /* Unless the difference is a constant, we fail.  */
!   if (diff->n != 0)
!     return false;
! 
!   d = diff->offset;
!   if (double_int_negative_p (d))
!     {
!       /* The second object is before the first one, we succeed if the last
! 	 element of the second object is before the start of the first one.  */
!       bound = double_int_add (d, double_int_add (size2, double_int_minus_one));
!       return double_int_negative_p (bound);
!     }
!   else
!     {
!       /* We succeed if the second object starts after the first one ends.  */
!       return double_int_scmp (size1, d) <= 0;
!     }
! }
! 
! /* Returns true if MEM1 and MEM2 may alias.  TTAE_CACHE is used as a cache in
!    tree_to_aff_combination_expand.  */
! 
! static bool
! mem_refs_may_alias_p (tree mem1, tree mem2, htab_t *ttae_cache)
! {
!   /* Perform BASE + OFFSET analysis -- if MEM1 and MEM2 are based on the same
!      object and their offset differ in such a way that the locations cannot
!      overlap, then they cannot alias.  */
!   aff_tree off1, off2;
!   double_int size1, size2;
! 
!   /* The expansion of addresses may be a bit expensive, thus we only do
!      the check at -O2 and higher optimization levels.  TODO -- add handling
!      of common cases at -O1.  */
!   if (optimize < 2)
!     return true;
! 
!   get_inner_reference_aff (mem1, &off1, &size1);
!   get_inner_reference_aff (mem2, &off2, &size2);
!   aff_combination_expand (&off1, ttae_cache);
!   aff_combination_expand (&off2, ttae_cache);
!   aff_combination_scale (&off1, double_int_minus_one);
!   aff_combination_add (&off2, &off1);
! 
!   if (cannot_overlap_p (&off2, size1, size2))
!     return false;
! 
!   return true;
! }
! 
! /* Disables store motion for MEM_REFs that conflict with another memory
!    reference.  */
  
  static void
! disable_sm_for_conflicting_refs (VEC (mem_ref_p, heap) *mem_refs,
! 				 bitmap clobbered_vops)
  {
!   htab_t vop_to_refs = htab_create (VEC_length (mem_ref_p, mem_refs),
! 				    vtoe_hash, vtoe_eq, vtoe_free);
!   unsigned i, vid, rid;
!   bitmap_iterator bi;
!   mem_ref_p ref, ref1;
!   bitmap conflicts = BITMAP_ALLOC (NULL);
!   htab_t ttae_cache = NULL;
! 
!   for (i = 0; VEC_iterate (mem_ref_p, mem_refs, i, ref); i++)
!     {
!       /* First record for each vop all the memory references that touch it.
! 	 Also, we mark the clobbered references as not movable.  */
!       if (bitmap_and_compl_into (ref->vops, clobbered_vops))
! 	ref->is_movable = false;
!       EXECUTE_IF_SET_IN_BITMAP (ref->vops, 0, vid, bi)
! 	{
! 	  record_vop_access (vop_to_refs, vid, i);
! 	}
!     }
  
!   for (i = 0; VEC_iterate (mem_ref_p, mem_refs, i, ref); i++)
      {
!       if (!ref->is_movable)
! 	continue;
! 
!       /* Now, find all the references with that this one might conflict, and
! 	 verify that this is not the case.  */
!       bitmap_clear (conflicts);
!       EXECUTE_IF_SET_IN_BITMAP (ref->vops, 0, vid, bi)
! 	{
! 	  bitmap_ior_into (conflicts,
! 			   get_vop_accesses (vop_to_refs, vid));
! 	}
! 
!       
!       EXECUTE_IF_SET_IN_BITMAP (conflicts, 0, rid, bi)
! 	{
! 	  if (rid == i)
! 	    continue;
! 	  ref1 = VEC_index (mem_ref_p, mem_refs, rid);
! 
! 	  if (rid < i
! 	      && ref1->is_movable)
! 	    {
! 	      /* We already tested whether REF1 conflicts with anything, and
! 		 determined that it does not, so we do not need to check
! 		 again.  */
! 	      continue;
! 	    }
! 
! 	  if (mem_refs_may_alias_p (ref->mem, ref1->mem, &ttae_cache))
! 	    {
! 	      ref->is_movable = false;
! 	      ref1->is_movable = false;
! 	      break;
! 	    }
! 	}
      }
+   BITMAP_FREE (conflicts);
+   htab_delete (vop_to_refs);
+   if (ttae_cache)
+     htab_delete (ttae_cache);
  }
  
  /* Try to perform store motion for all memory references modified inside
*************** determine_lsm_loop (struct loop *loop)
*** 1345,1351 ****
  {
    VEC (edge, heap) *exits = get_loop_exit_edges (loop);
    bitmap clobbered_vops;
!   struct mem_ref *mem_refs;
  
    if (!loop_suitable_for_sm (loop, exits))
      {
--- 1520,1526 ----
  {
    VEC (edge, heap) *exits = get_loop_exit_edges (loop);
    bitmap clobbered_vops;
!   VEC (mem_ref_p, heap) *mem_refs;
  
    if (!loop_suitable_for_sm (loop, exits))
      {
*************** determine_lsm_loop (struct loop *loop)
*** 1357,1367 ****
    clobbered_vops = BITMAP_ALLOC (NULL);
    mem_refs = gather_mem_refs (loop, clobbered_vops);
  
!   /* Find the vops that are used for more than one reference.  */
!   find_more_ref_vops (mem_refs, clobbered_vops);
  
    /* Hoist all suitable memory references.  */
!   hoist_memory_references (loop, mem_refs, clobbered_vops, exits);
  
    free_mem_refs (mem_refs);
    VEC_free (edge, heap, exits);
--- 1532,1543 ----
    clobbered_vops = BITMAP_ALLOC (NULL);
    mem_refs = gather_mem_refs (loop, clobbered_vops);
  
!   /* Disable store motion for conflicting accesses to the same memory
!      location.  */
!   disable_sm_for_conflicting_refs (mem_refs, clobbered_vops);
  
    /* Hoist all suitable memory references.  */
!   hoist_memory_references (loop, mem_refs, exits);
  
    free_mem_refs (mem_refs);
    VEC_free (edge, heap, exits);
Index: testsuite/gcc.dg/tree-ssa/loop-24.c
===================================================================
*** testsuite/gcc.dg/tree-ssa/loop-24.c	(revision 0)
--- testsuite/gcc.dg/tree-ssa/loop-24.c	(revision 0)
***************
*** 0 ****
--- 1,53 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -fdump-tree-lim-details" } */
+ 
+ int x;
+ int a[100];
+ 
+ struct a
+ {
+   int X;
+   int Y;
+ };
+ 
+ struct a arr[100];
+ 
+ void bla(void);
+ 
+ void foo (struct a *A, unsigned b)
+ {
+   unsigned i;
+ 
+   /* We should perform store motion here.  */
+   for (x = 0; x < 100; x++)
+     a[x] = x;
+ 
+   /* But not here.  */
+   for (x = 0; x < 100; x++)
+     bla ();
+ 
+   /* But we should here (using base + offset analysis).  */
+   for (i = 0; i < 100; i++)
+     {
+       A[5].X += i;
+       A[5].Y += i;
+     }
+ 
+   /* And here.  */
+   for (i = 0; i < 100; i++)
+     {
+       arr[b+8].X += i;
+       arr[b+9].X += i;
+     }
+ 
+   /* And here as well.  */
+   for (i = 0; i < 100; i++)
+     {
+       A[b].X += i;
+       A[b+1].Y += i;
+     }
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "Executing store motion of" 7 "lim" } } */
+ 
+ /* { dg-final { cleanup-tree-dump "lim" } } */
Index: tree-affine.c
===================================================================
*** tree-affine.c	(revision 121271)
--- tree-affine.c	(working copy)
*************** Software Foundation, 51 Franklin Street,
*** 30,35 ****
--- 30,37 ----
  #include "diagnostic.h"
  #include "tree-dump.h"
  #include "tree-affine.h"
+ #include "tree-gimple.h"
+ #include "hashtab.h"
  
  /* Extends CST as appropriate for the affine combinations COMB.  */
  
*************** aff_combination_mult (aff_tree *c1, aff_
*** 493,495 ****
--- 495,649 ----
      aff_combination_add_product (c1, double_int_one, c2->rest, r);
    aff_combination_add_product (c1, c2->offset, NULL, r);
  }
+ 
+ /* Element of the cache that maps ssa name NAME to its expanded form
+    as an affine expression EXPANSION.  */
+ 
+ struct name_expansion
+ {
+   aff_tree expansion;
+   tree name;
+ 
+   /* True if the expansion for the name is just being generated.  */
+   unsigned in_progress : 1;
+ };
+ 
+ /* Hash function for struct name_expansion.  */
+ 
+ static hashval_t
+ name_expansion_hash (const void *e)
+ {
+   return SSA_NAME_VERSION (((struct name_expansion *) e)->name);
+ }
+ 
+ /* Equality function for struct name_expansion.  The second argument is an
+    SSA name.  */
+ 
+ static int
+ name_expansion_eq (const void *e, const void *n)
+ {
+   return ((struct name_expansion *) e)->name == n;
+ }
+ 
+ /* Expands SSA names in COMB recursively.  CACHE is used to cache the
+    results.  */
+ 
+ void
+ aff_combination_expand (aff_tree *comb, htab_t *cache)
+ {
+   unsigned i;
+   aff_tree to_add, current, curre;
+   tree e, def, rhs;
+   double_int scale;
+   void **slot;
+   struct name_expansion *exp;
+ 
+   aff_combination_zero (&to_add, comb->type);
+   for (i = 0; i < comb->n; i++)
+     {
+       e = comb->elts[i].val;
+       if (TREE_CODE (e) != SSA_NAME)
+ 	continue;
+       def = SSA_NAME_DEF_STMT (e);
+       if (TREE_CODE (def) != GIMPLE_MODIFY_STMT
+ 	  || GIMPLE_STMT_OPERAND (def, 0) != e)
+ 	continue;
+ 
+       rhs = GIMPLE_STMT_OPERAND (def, 1);
+       if (TREE_CODE (rhs) != SSA_NAME
+ 	  && !EXPR_P (rhs)
+ 	  && !is_gimple_min_invariant (rhs))
+ 	continue;
+ 
+       /* We do not know whether the reference retains its value at the
+ 	 place where the expansion is used.  */
+       if (REFERENCE_CLASS_P (rhs))
+ 	continue;
+ 
+       if (!*cache)
+ 	*cache = htab_create (10, name_expansion_hash, name_expansion_eq,
+ 			      free);
+       slot = htab_find_slot_with_hash (*cache, e, SSA_NAME_VERSION (e), INSERT);
+       exp = *slot;
+ 
+       if (exp)
+ 	{
+ 	  /* Since we follow the definitions in the SSA form, we should not
+ 	     enter a cycle unless we pass through a phi node.  */
+ 	  gcc_assert (!exp->in_progress);
+ 	  current = exp->expansion;
+ 	}
+       else
+ 	{
+ 	  exp = XNEW (struct name_expansion);
+ 	  exp->name = e;
+ 	  exp->in_progress = 1;
+ 	  *slot = exp;
+ 	  tree_to_aff_combination_expand (rhs, comb->type, &current, cache);
+ 	  exp->expansion = current;
+ 	  exp->in_progress = 0;
+ 	}
+ 
+       /* Accumulate the new terms to TO_ADD, so that we do not modify
+ 	 COMB while traversing it; include the term -coef * E, to remove
+          it from COMB.  */
+       scale = comb->elts[i].coef;
+       aff_combination_zero (&curre, comb->type);
+       aff_combination_add_elt (&curre, e, double_int_neg (scale));
+       aff_combination_scale (&current, scale);
+       aff_combination_add (&to_add, &current);
+       aff_combination_add (&to_add, &curre);
+     }
+   aff_combination_add (comb, &to_add);
+ }
+ 
+ /* Similar to tree_to_aff_combination, but follows SSA name definitions
+    and expands them recursively.  CACHE is used to cache the expansions
+    of the ssa names, to avoid exponential time complexity for cases
+    like
+  
+    a1 = a0 + a0;
+    a2 = a1 + a1;
+    a3 = a2 + a2;
+    ...  */
+ 
+ void
+ tree_to_aff_combination_expand (tree expr, tree type, aff_tree *comb,
+ 				htab_t *cache)
+ {
+   tree_to_aff_combination (expr, type, comb);
+   aff_combination_expand (comb, cache);
+ }
+ 
+ /* Returns address of the reference REF in ADDR.  The size of the accessed
+    location is stored to SIZE.  */
+ 
+ void
+ get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size)
+ {
+   HOST_WIDE_INT bitsize, bitpos;
+   tree toff;
+   enum machine_mode mode;
+   int uns, vol;
+   aff_tree tmp;
+   tree base = get_inner_reference (ref, &bitsize, &bitpos, &toff, &mode,
+ 				   &uns, &vol, false);
+   tree base_addr = build_fold_addr_expr (base);
+ 
+   /* ADDR = &BASE + TOFF + BITPOS / BITS_PER_UNIT.  */
+ 
+   tree_to_aff_combination (base_addr, sizetype, addr);
+ 
+   if (toff)
+     {
+       tree_to_aff_combination (toff, sizetype, &tmp);
+       aff_combination_add (addr, &tmp);
+     }
+ 
+   aff_combination_const (&tmp, sizetype,
+ 			 shwi_to_double_int (bitpos / BITS_PER_UNIT));
+   aff_combination_add (addr, &tmp);
+ 
+   *size = shwi_to_double_int ((bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT);
+ }
+ 
Index: tree-affine.h
===================================================================
*** tree-affine.h	(revision 121271)
--- tree-affine.h	(working copy)
*************** void aff_combination_convert (aff_tree *
*** 70,72 ****
--- 70,75 ----
  void tree_to_aff_combination (tree, tree, aff_tree *);
  tree aff_combination_to_tree (aff_tree *);
  void unshare_aff_combination (aff_tree *);
+ void aff_combination_expand (aff_tree *, htab_t *);
+ void tree_to_aff_combination_expand (tree, tree, aff_tree *, htab_t *);
+ void get_inner_reference_aff (tree, aff_tree *, double_int *);
Index: Makefile.in
===================================================================
*** Makefile.in	(revision 121271)
--- Makefile.in	(working copy)
*************** tree-ssa-loop-ivopts.o : tree-ssa-loop-i
*** 2142,2148 ****
     $(CFGLOOP_H) $(PARAMS_H) langhooks.h $(BASIC_BLOCK_H) hard-reg-set.h \
     tree-chrec.h $(VARRAY_H) tree-affine.h
  tree-affine.o : tree-affine.c tree-affine.h $(CONFIG_H) \
!    $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) \
     output.h $(DIAGNOSTIC_H) $(TM_H) coretypes.h $(TREE_DUMP_H)
  tree-ssa-loop-manip.o : tree-ssa-loop-manip.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
--- 2142,2148 ----
     $(CFGLOOP_H) $(PARAMS_H) langhooks.h $(BASIC_BLOCK_H) hard-reg-set.h \
     tree-chrec.h $(VARRAY_H) tree-affine.h
  tree-affine.o : tree-affine.c tree-affine.h $(CONFIG_H) \
!    $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(TREE_GIMPLE_H) \
     output.h $(DIAGNOSTIC_H) $(TM_H) coretypes.h $(TREE_DUMP_H)
  tree-ssa-loop-manip.o : tree-ssa-loop-manip.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
*************** tree-ssa-loop-im.o : tree-ssa-loop-im.c 
*** 2153,2159 ****
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) domwalk.h \
     $(PARAMS_H) output.h $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
     $(TREE_DUMP_H) tree-pass.h $(FLAGS_H) $(REAL_H) $(BASIC_BLOCK_H) \
!    hard-reg-set.h
  tree-ssa-math-opts.o : tree-ssa-math-opts.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(TREE_H) $(TIMEVAR_H) tree-pass.h $(TM_H) $(FLAGS_H) \
     alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H)
--- 2153,2159 ----
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) domwalk.h \
     $(PARAMS_H) output.h $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
     $(TREE_DUMP_H) tree-pass.h $(FLAGS_H) $(REAL_H) $(BASIC_BLOCK_H) \
!    hard-reg-set.h tree-affine.h
  tree-ssa-math-opts.o : tree-ssa-math-opts.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(TREE_H) $(TIMEVAR_H) tree-pass.h $(TM_H) $(FLAGS_H) \
     alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H)

