This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] Lno branch merge part 7 -- tree level invariant motion


Hello,

> On Tue, Jul 06, 2004 at 04:15:42PM +0200, Zdenek Dvorak wrote:
> > + bool
> > + for_each_index (tree *addr_p, bool (*cbck) (tree, tree *, void *), void *data)
> 
> This doesn't handle ARRAY_RANGE_REF or VIEW_CONVERT_EXPR.

because I don't know they exist (they never showed up when I used to
have default: abort (); clause in the switch).

> Does it
> need to do anything special for ARRAY_REF with variable lower bound

I don't think so; this function just iterates over all indices in the
memory reference.

> or varible stride?

What's that?

> > + /* Returns true if MEM is a memory access that might cause error.  */
> > + 
> > + static bool
> > + unsafe_memory_access_p (tree mem)
> 
> I believe this is tree_could_trap_p ...

As far as I see, tree_could_trap_p will never say you that ARRAY_REF
access traps; otherwise I also think so (as far as I remember, the
reason for having unsafe_memory_access_p was that in the time I wrote
this part, tree_could_trap_p did nothing but checking for div_exprs).

> > +       return !in_array_bounds_p (base, idx);
> 
> ... except for this.  Is this necessary?  It would be if you're
> moving a store out of a possibly-zero-trip loop, sure.  Assuming
> it is necessary, can we share code with tree_could_trap_p?

Not strictly necessary, but it enables you to move more invariants
out of the loops.

> > +       || tree_could_trap_p (rhs)
> > +       || unsafe_memory_access_p (rhs))
> 
> Hum.  You're already calling both.  In which case we definitely
> should figure out how to share code.

Sure; I think we just need to fix handling of array_refs in
tree_could_trap_p and it should be possible to use it, then.

> I think your handling of
> INDIRECT_REF, for instance, is incorrect.

It is definitely conservatively correct.

> > + determine_max_movement (tree stmt, bool must_preserve_exec)
> ...
> > +   for (i = 0; i < NUM_USES (uses); i++)
> > +   for (i = 0; i < NUM_VUSES (vuses); i++)
> > +   for (i = 0; i < NUM_V_MAY_DEFS (v_may_defs); i++)
> 
> Missing MUST_DEFS?  If not, add a comment why.

Because we obviously are interested only in uses of statement.

> > + static tree
> > + single_reachable_address (struct loop *loop, tree stmt,
> > + 			  struct mem_ref **mem_refs)
> ...
> > + 	  for (i = 0; i < NUM_V_MAY_DEFS (v_may_defs); i++)
> > + 	  for (i = 0; i < NUM_VUSES (vuses); i++)
> 
> MUST_DEFS again?

Since we are interested only in virtual uses (and comment is already there).

> > +   /* Set ALWAYS_EXECUTED_IN for each block.  Quadratic, can be improved.  */
> 
> You're not processing any statement more than once.
> How is that quadratic?

I am not really sure why this comment is there; the only quadratic
behavior I see is that fill_always_executed_in is executed for each loop
and it is linear in size of the loop body, but this should not be a
problem in practice.

Anyway, here is the (not fully tested yet) patch reflecting those of the
changes you suggest I was able to do without further discussion.

Zdenek

Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1321
diff -c -3 -p -r1.1321 Makefile.in
*** Makefile.in	8 Jul 2004 08:19:58 -0000	1.1321
--- Makefile.in	8 Jul 2004 15:52:50 -0000
*************** OBJS-common = \
*** 899,905 ****
   cfg.o cfganal.o cfgbuild.o cfgcleanup.o cfglayout.o cfgloop.o		   \
   cfgloopanal.o cfgloopmanip.o loop-init.o loop-unswitch.o loop-unroll.o	   \
   cfgrtl.o combine.o conflict.o convert.o coverage.o cse.o cselib.o 	   \
!  dbxout.o ddg.o tree-ssa-loop-ch.o loop-invariant.o			   \
   debug.o df.o diagnostic.o dojump.o dominance.o loop-doloop.o		   \
   dwarf2asm.o dwarf2out.o emit-rtl.o except.o explow.o loop-iv.o		   \
   expmed.o expr.o final.o flow.o fold-const.o function.o gcse.o		   \
--- 899,905 ----
   cfg.o cfganal.o cfgbuild.o cfgcleanup.o cfglayout.o cfgloop.o		   \
   cfgloopanal.o cfgloopmanip.o loop-init.o loop-unswitch.o loop-unroll.o	   \
   cfgrtl.o combine.o conflict.o convert.o coverage.o cse.o cselib.o 	   \
!  dbxout.o ddg.o tree-ssa-loop-ch.o loop-invariant.o tree-ssa-loop-im.o	   \
   debug.o df.o diagnostic.o dojump.o dominance.o loop-doloop.o		   \
   dwarf2asm.o dwarf2out.o emit-rtl.o except.o explow.o loop-iv.o		   \
   expmed.o expr.o final.o flow.o fold-const.o function.o gcse.o		   \
*************** tree-ssa-loop-ch.o : tree-ssa-loop-ch.c 
*** 1680,1685 ****
--- 1680,1689 ----
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) tree-inline.h \
     output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
     tree-pass.h flags.h
+ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \
+    $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) domwalk.h $(PARAMS_H)\
+    output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
+    tree-pass.h flags.h
  tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) tree-inline.h $(FLAGS_H) \
     function.h $(TIMEVAR_H) tree-alias-common.h convert.h $(TM_H) coretypes.h \
Index: cfgloop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cfgloop.c,v
retrieving revision 1.32
diff -c -3 -p -r1.32 cfgloop.c
*** cfgloop.c	13 May 2004 06:39:32 -0000	1.32
--- cfgloop.c	8 Jul 2004 15:52:50 -0000
*************** flow_loop_nested_p (const struct loop *o
*** 101,106 ****
--- 101,120 ----
  	 && loop->pred[outer->depth] == outer;
  }
  
+ /* Returns superloop of LOOP at given DEPTH.  */
+ 
+ struct loop *
+ superloop_at_depth (struct loop *loop, unsigned depth)
+ {
+   if (depth > (unsigned) loop->depth)
+     abort ();
+ 
+   if (depth == (unsigned) loop->depth)
+     return loop;
+ 
+   return loop->pred[depth];
+ }
+ 
  /* Dump the loop information specified by LOOP to the stream FILE
     using auxiliary dump callback function LOOP_DUMP_AUX if non null.  */
  
Index: cfgloop.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cfgloop.h,v
retrieving revision 1.20
diff -c -3 -p -r1.20 cfgloop.h
*** cfgloop.h	20 Jun 2004 21:31:28 -0000	1.20
--- cfgloop.h	8 Jul 2004 15:52:50 -0000
*************** extern bool flow_loop_outside_edge_p (co
*** 253,258 ****
--- 253,259 ----
  extern bool flow_loop_nested_p	(const struct loop *, const struct loop *);
  extern bool flow_bb_inside_loop_p (const struct loop *, const basic_block);
  extern struct loop * find_common_loop (struct loop *, struct loop *);
+ struct loop *superloop_at_depth (struct loop *, unsigned);
  extern int num_loop_insns (struct loop *);
  extern int average_num_loop_insns (struct loop *);
  extern unsigned get_loop_level (const struct loop *);
Index: common.opt
===================================================================
RCS file: /cvs/gcc/gcc/gcc/common.opt,v
retrieving revision 1.39
diff -c -3 -p -r1.39 common.opt
*** common.opt	30 Jun 2004 21:28:59 -0000	1.39
--- common.opt	8 Jul 2004 15:52:50 -0000
*************** ftree-fre
*** 777,782 ****
--- 777,786 ----
  Common Report Var(flag_tree_fre)
  Enable Full Redundancy Elimination (FRE) on trees
  
+ ftree-lim
+ Common Report Var(flag_tree_lim) Init(1)
+ Enable loop invariant motion on trees
+ 
  ftree-loop-optimize
  Common Report Var(flag_tree_loop_optimize) Init(1)
  Enable loop optimizations on tree level
Index: params.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/params.def,v
retrieving revision 1.41
diff -c -3 -p -r1.41 params.def
*** params.def	25 May 2004 12:54:54 -0000	1.41
--- params.def	8 Jul 2004 15:52:50 -0000
*************** DEFPARAM(PARAM_MAX_CSE_PATH_LENGTH,
*** 304,309 ****
--- 304,316 ----
  	 "The maximum length of path considered in cse",
  	 10)
  
+ /* The cost of expression in loop invariant motion that is considered
+    expensive.  */
+ DEFPARAM(PARAM_LIM_EXPENSIVE,
+ 	 "lim-expensive",
+ 	 "The minimum cost of an expensive expression in the loop invariant motion",
+ 	 20)
+ 
  /* The product of the next two is used to decide whether or not to
     use .GLOBAL_VAR.  See tree-dfa.c.  */
  DEFPARAM(PARAM_GLOBAL_VAR_THRESHOLD,
Index: timevar.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/timevar.def,v
retrieving revision 1.29
diff -c -3 -p -r1.29 timevar.def
*** timevar.def	29 Jun 2004 01:53:03 -0000	1.29
--- timevar.def	8 Jul 2004 15:52:50 -0000
*************** DEFTIMEVAR (TV_TREE_DCE		     , "tree co
*** 82,87 ****
--- 82,88 ----
  DEFTIMEVAR (TV_TREE_CD_DCE	     , "tree aggressive DCE")
  DEFTIMEVAR (TV_TREE_DSE		     , "tree DSE")
  DEFTIMEVAR (TV_TREE_LOOP	     , "tree loop optimization")
+ DEFTIMEVAR (TV_LIM                   , "loop invariant motion")
  DEFTIMEVAR (TV_TREE_CH		     , "tree copy headers")
  DEFTIMEVAR (TV_TREE_SSA_TO_NORMAL    , "tree SSA to normal")
  DEFTIMEVAR (TV_TREE_NRV		     , "tree NRV optimization")
Index: tree-dfa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-dfa.c,v
retrieving revision 2.15
diff -c -3 -p -r2.15 tree-dfa.c
*** tree-dfa.c	25 Jun 2004 18:30:09 -0000	2.15
--- tree-dfa.c	8 Jul 2004 15:52:50 -0000
*************** compute_immediate_uses (int flags, bool 
*** 178,184 ****
        tree phi;
  
        for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
! 	compute_immediate_uses_for_phi (phi, calc_for);
  
        for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si))
          {
--- 178,197 ----
        tree phi;
  
        for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
! 	{
! 	  if (is_gimple_reg (PHI_RESULT (phi)))
! 	    {
! 	      if (!(flags & TDFA_USE_OPS))
! 		continue;
! 	    }
! 	  else
! 	    {
! 	      if (!(flags & TDFA_USE_VOPS))
! 		continue;
! 	    }
! 
! 	  compute_immediate_uses_for_phi (phi, calc_for);
! 	}
  
        for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si))
          {
Index: tree-eh.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-eh.c,v
retrieving revision 2.8
diff -c -3 -p -r2.8 tree-eh.c
*** tree-eh.c	29 Jun 2004 06:59:34 -0000	2.8
--- tree-eh.c	8 Jul 2004 15:52:50 -0000
*************** tree_could_trap_p (tree expr)
*** 1702,1708 ****
    bool honor_nans = false;
    bool honor_snans = false;
    bool fp_operation = false;
!   tree t;
  
    if (TREE_CODE_CLASS (code) == '<'
        || TREE_CODE_CLASS (code) == '1'
--- 1702,1708 ----
    bool honor_nans = false;
    bool honor_snans = false;
    bool fp_operation = false;
!   tree t, base, idx;
  
    if (TREE_CODE_CLASS (code) == '<'
        || TREE_CODE_CLASS (code) == '1'
*************** tree_could_trap_p (tree expr)
*** 1719,1732 ****
  
    switch (code)
      {
-     case ARRAY_REF:
-     case ARRAY_RANGE_REF:
      case COMPONENT_REF:
      case REALPART_EXPR:
      case IMAGPART_EXPR:
      case BIT_FIELD_REF:
!       t = get_base_address (expr);
!       return !t || tree_could_trap_p (t);
  
      case INDIRECT_REF:
        return !TREE_THIS_NOTRAP (expr);
--- 1719,1742 ----
  
    switch (code)
      {
      case COMPONENT_REF:
      case REALPART_EXPR:
      case IMAGPART_EXPR:
      case BIT_FIELD_REF:
!       t = TREE_OPERAND (expr, 0);
!       return tree_could_trap_p (t);
! 
!     case ARRAY_REF:
!     case ARRAY_RANGE_REF:
!       base = TREE_OPERAND (expr, 0);
!       idx = TREE_OPERAND (expr, 1);
!       if (tree_could_trap_p (base))
! 	return true;
! 
!       if (TREE_CODE_CLASS (TREE_CODE (idx)) != 'c')
! 	return true;
! 
!       return !in_array_bounds_p (base, idx);
  
      case INDIRECT_REF:
        return !TREE_THIS_NOTRAP (expr);
Index: tree-flow.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-flow.h,v
retrieving revision 2.20
diff -c -3 -p -r2.20 tree-flow.h
*** tree-flow.h	8 Jul 2004 06:34:22 -0000	2.20
--- tree-flow.h	8 Jul 2004 15:52:50 -0000
*************** struct tree_ann_common_d GTY(())
*** 80,85 ****
--- 80,89 ----
    /* Annotation type.  */
    enum tree_ann_type type;
  
+  /* Auxiliary info specific to a pass.  At all times, this
+     should either point to valid data or be NULL.  */
+   PTR GTY ((skip (""))) aux;
+ 
    /* The value handle for this expression.  Used by GVN-PRE.  */
    tree GTY((skip)) value_handle;
  };
*************** extern void propagate_tree_value (tree *
*** 597,602 ****
--- 601,612 ----
  extern void replace_exp (use_operand_p, tree);
  extern bool may_propagate_copy (tree, tree);
  
+ /* In tree-ssa-loop-*.c  */
+ 
+ void loop_commit_inserts (void);
+ bool for_each_index (tree *, bool (*) (tree, tree *, void *), void *);
+ void tree_ssa_lim (struct loops *);
+ 
  /* In tree-flow-inline.h  */
  static inline int phi_arg_from_edge (tree, edge);
  static inline bool is_call_clobbered (tree);
Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 2.26
diff -c -3 -p -r2.26 tree-optimize.c
*** tree-optimize.c	7 Jul 2004 03:19:55 -0000	2.26
--- tree-optimize.c	8 Jul 2004 15:52:50 -0000
*************** init_tree_optimization_passes (void)
*** 327,332 ****
--- 327,333 ----
  
    p = &pass_loop.sub;
    NEXT_PASS (pass_loop_init);
+   NEXT_PASS (pass_lim);
    NEXT_PASS (pass_loop_done);
    *p = NULL;
  
Index: tree-pass.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-pass.h,v
retrieving revision 2.4
diff -c -3 -p -r2.4 tree-pass.h
*** tree-pass.h	30 Jun 2004 21:28:59 -0000	2.4
--- tree-pass.h	8 Jul 2004 15:52:50 -0000
*************** extern struct tree_opt_pass pass_tail_re
*** 107,112 ****
--- 107,113 ----
  extern struct tree_opt_pass pass_tail_calls;
  extern struct tree_opt_pass pass_loop;
  extern struct tree_opt_pass pass_loop_init;
+ extern struct tree_opt_pass pass_lim;
  extern struct tree_opt_pass pass_loop_done;
  extern struct tree_opt_pass pass_ch;
  extern struct tree_opt_pass pass_ccp;
Index: tree-ssa-loop-im.c
===================================================================
RCS file: tree-ssa-loop-im.c
diff -N tree-ssa-loop-im.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- tree-ssa-loop-im.c	8 Jul 2004 15:52:50 -0000
***************
*** 0 ****
--- 1,1217 ----
+ /* Loop invariant motion.
+    Copyright (C) 2003 Free Software Foundation, Inc.
+    
+ This file is part of GCC.
+    
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by the
+ Free Software Foundation; either version 2, or (at your option) any
+ later version.
+    
+ GCC is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+    
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING.  If not, write to the Free
+ Software Foundation, 59 Temple Place - Suite 330, Boston, MA
+ 02111-1307, USA.  */
+ 
+ #include "config.h"
+ #include "system.h"
+ #include "coretypes.h"
+ #include "tm.h"
+ #include "tree.h"
+ #include "rtl.h"
+ #include "tm_p.h"
+ #include "hard-reg-set.h"
+ #include "basic-block.h"
+ #include "output.h"
+ #include "diagnostic.h"
+ #include "tree-flow.h"
+ #include "tree-dump.h"
+ #include "timevar.h"
+ #include "cfgloop.h"
+ #include "domwalk.h"
+ #include "params.h"
+ #include "tree-pass.h"
+ #include "flags.h"
+ 
+ /* A type for the list of statements that have to be moved in order to be able
+    to hoist an invariant computation.  */
+ 
+ struct depend
+ {
+   tree stmt;
+   struct depend *next;
+ };
+ 
+ /* The possibilities of statement movement.  */
+ 
+ enum move_pos
+ {
+   MOVE_IMPOSSIBLE,		/* No movement -- side effect expression.  */
+   MOVE_PRESERVE_EXECUTION,	/* Must not cause the non-executed statement
+ 				   become executed -- memory accesses, ... */
+   MOVE_POSSIBLE			/* Unlimited movement.  */
+ };
+ 
+ /* The auxiliary data kept for each statement.  */
+ 
+ struct lim_aux_data
+ {
+   struct loop *max_loop;	/* The outermost loop in that the statement
+ 				   is invariant.  */
+ 
+   struct loop *tgt_loop;	/* The loop out of that we want to move the
+ 				   invariant.  */
+ 
+   struct loop *always_executed_in;
+ 				/* The outermost loop for that we are sure
+ 				   the statement is executed if the loop
+ 				   is entered.  */
+ 
+   bool sm_done;			/* True iff the store motion for a memory
+ 				   reference in the statement has already
+ 				   been executed.  */
+ 
+   unsigned cost;		/* Cost of the computation performed by the
+ 				   statement.  */
+ 
+   struct depend *depends;	/* List of statements that must be also hoisted
+ 				   out of the loop when this statement is
+ 				   hoisted; i.e. those that define the operands
+ 				   of the statement and are inside of the
+ 				   MAX_LOOP loop.  */
+ };
+ 
+ #define LIM_DATA(STMT) ((struct lim_aux_data *) (stmt_ann (STMT)->common.aux))
+ 
+ /* Description of a memory reference for store motion.  */
+ 
+ struct mem_ref
+ {
+   tree *ref;			/* The reference itself.  */
+   tree stmt;			/* The statement in that it occurs.  */
+   struct mem_ref *next;		/* Next use in the chain.  */
+ };
+ 
+ /* Minimum cost of an expensive expression.  */
+ #define LIM_EXPENSIVE ((unsigned) PARAM_VALUE (PARAM_LIM_EXPENSIVE))
+ 
+ /* The outermost loop for that execution of the header guarantees that the
+    block will be executed.  */
+ #define ALWAYS_EXECUTED_IN(BB) ((struct loop *) (BB)->aux)
+ 
+ /* Maximum uid in the statement in the function.  */
+ 
+ static unsigned max_uid;
+ 
+ /* Calls CBCK for each index in memory reference ADDR_P.  There are two
+    situations handled:
+    
+    Access to an array: ARRAY_REF (base, index).  In this case we pass the pointer
+    to the index, the base if it is an array and DATA to the callback.
+ 
+    Pointer dereference: INDIRECT_REF (index).  In this case we pass the pointer
+    to the index and DATA to the callback, and pass NULL_TREE instead of the
+    array base.
+    
+    If the callback returns false, the whole search stops and false is returned.
+    Otherwise the function returns true after traversing through the whole
+    reference *ADDR_P.  */
+ 
+ bool
+ for_each_index (tree *addr_p, bool (*cbck) (tree, tree *, void *), void *data)
+ {
+   tree *nxt;
+ 
+   for (; ; addr_p = nxt)
+     {
+       switch (TREE_CODE (*addr_p))
+ 	{
+ 	case SSA_NAME:
+ 	  return cbck (NULL, addr_p, data);
+ 
+ 	case INDIRECT_REF:
+ 	  nxt = &TREE_OPERAND (*addr_p, 0);
+ 	  return cbck (NULL, nxt, data);
+ 
+ 	case BIT_FIELD_REF:
+ 	case COMPONENT_REF:
+ 	case VIEW_CONVERT_EXPR:
+ 	  nxt = &TREE_OPERAND (*addr_p, 0);
+ 	  break;
+ 
+ 	case ARRAY_REF:
+ 	case ARRAY_RANGE_REF:
+ 	  nxt = &TREE_OPERAND (*addr_p, 0);
+ 	  if (!cbck (*nxt, &TREE_OPERAND (*addr_p, 1), data))
+ 	    return false;
+ 	  break;
+ 
+ 	case VAR_DECL:
+ 	case PARM_DECL:
+ 	case STRING_CST:
+ 	case RESULT_DECL:
+ 	  return true;
+ 
+ 	default:
+     	  abort ();
+ 	}
+     }
+ }
+ 
+ /* If it is possible to hoist the statement STMT unconditionally,
+    returns MOVE_POSSIBLE.
+    If it is possible to hoist the statement STMT, but we must avoid making
+    it executed if it would not be executed in the original program (e.g.
+    because it may trap), return MOVE_PRESERVE_EXECUTION.
+    Otherwise return MOVE_IMPOSSIBLE.  */
+ 
+ static enum move_pos
+ movement_possibility (tree stmt)
+ {
+   tree lhs, rhs;
+ 
+   if (flag_unswitch_loops
+       && TREE_CODE (stmt) == COND_EXPR)
+     {
+       /* If we perform unswitching, force the operands of the invariant
+ 	 condition to be moved out of the loop.  */
+       get_stmt_operands (stmt);
+ 
+       return MOVE_POSSIBLE;
+     }
+ 
+   if (TREE_CODE (stmt) != MODIFY_EXPR)
+     return MOVE_IMPOSSIBLE;
+ 
+   if (stmt_ends_bb_p (stmt))
+     return MOVE_IMPOSSIBLE;
+ 
+   get_stmt_operands (stmt);
+ 
+   if (stmt_ann (stmt)->has_volatile_ops)
+     return MOVE_IMPOSSIBLE;
+ 
+   lhs = TREE_OPERAND (stmt, 0);
+   if (TREE_CODE (lhs) == SSA_NAME
+       && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
+     return MOVE_IMPOSSIBLE;
+ 
+   rhs = TREE_OPERAND (stmt, 1);
+ 
+   if (TREE_SIDE_EFFECTS (rhs))
+     return MOVE_IMPOSSIBLE;
+ 
+   if (TREE_CODE (lhs) != SSA_NAME
+       || tree_could_trap_p (rhs))
+     return MOVE_PRESERVE_EXECUTION;
+ 
+   return MOVE_POSSIBLE;
+ }
+ 
+ /* Suppose that operand DEF is used inside the LOOP.  Returns the outermost
+    loop to that we could move the expresion using DEF if it did not have
+    other operands, i.e. the outermost loop enclosing LOOP in that the value
+    of DEF is invariant.  */
+ 
+ static struct loop *
+ outermost_invariant_loop (tree def, struct loop *loop)
+ {
+   tree def_stmt;
+   basic_block def_bb;
+   struct loop *max_loop;
+ 
+   if (is_gimple_min_invariant (def))
+     return superloop_at_depth (loop, 1);
+ 
+   def_stmt = SSA_NAME_DEF_STMT (def);
+   def_bb = bb_for_stmt (def_stmt);
+   if (!def_bb)
+     return superloop_at_depth (loop, 1);
+ 
+   max_loop = find_common_loop (loop, def_bb->loop_father);
+ 
+   if (LIM_DATA (def_stmt) && LIM_DATA (def_stmt)->max_loop)
+     max_loop = find_common_loop (max_loop,
+ 				 LIM_DATA (def_stmt)->max_loop->outer);
+   if (max_loop == loop)
+     return NULL;
+   max_loop = superloop_at_depth (loop, max_loop->depth + 1);
+ 
+   return max_loop;
+ }
+ 
+ /* DATA is a structure containing information associated with a statement
+    inside LOOP.  DEF is one of the operands of this statement.
+    
+    Find the outermost loop enclosing LOOP in that value of DEF is invariant
+    and record this in DATA->max_loop field.  If DEF itself is defined inside
+    this loop as well (i.e. we need to hoist it out of the loop if we want
+    to hoist the statement represented by DATA), record the statement in that
+    DEF is defined to the DATA->depends list.  Additionally if ADD_COST is true,
+    add the cost of the computation of DEF to the DATA->cost.
+    
+    If DEF is not invariant in LOOP, return false.  Otherwise return TRUE.  */
+ 
+ static bool
+ add_dependency (tree def, struct lim_aux_data *data, struct loop *loop,
+ 		bool add_cost)
+ {
+   tree def_stmt = SSA_NAME_DEF_STMT (def);
+   basic_block def_bb = bb_for_stmt (def_stmt);
+   struct loop *max_loop;
+   struct depend *dep;
+ 
+   if (!def_bb)
+     return true;
+ 
+   max_loop = outermost_invariant_loop (def, loop);
+   if (!max_loop)
+     return false;
+ 
+   if (flow_loop_nested_p (data->max_loop, max_loop))
+     data->max_loop = max_loop;
+ 
+   if (!LIM_DATA (def_stmt))
+     return true;
+ 
+   if (add_cost
+       /* Only add the cost if the statement defining DEF is inside LOOP,
+ 	 i.e. if it is likely that by moving the invariants dependent
+ 	 on it, we will be able to avoid creating a new register for
+ 	 it (since it will be only used in these dependent invariants).  */
+       && def_bb->loop_father == loop)
+     data->cost += LIM_DATA (def_stmt)->cost;
+ 
+   dep = xmalloc (sizeof (struct depend));
+   dep->stmt = def_stmt;
+   dep->next = data->depends;
+   data->depends = dep;
+ 
+   return true;
+ }
+ 
+ /* Returns an estimate for a cost of statement STMT.  TODO -- the values here
+    are just ad-hoc constants.  The estimates should be based on target-specific
+    values.  */
+ 
+ static unsigned
+ stmt_cost (tree stmt)
+ {
+   tree lhs, rhs;
+   unsigned cost = 1;
+ 
+   /* Always try to create possibilities for unswitching.  */
+   if (TREE_CODE (stmt) == COND_EXPR)
+     return LIM_EXPENSIVE;
+ 
+   lhs = TREE_OPERAND (stmt, 0);
+   rhs = TREE_OPERAND (stmt, 1);
+ 
+   /* Hoisting memory references out should almost surely be a win.  */
+   if (!is_gimple_variable (lhs))
+     cost += 20;
+   if (is_gimple_addr_expr_arg (rhs) && !is_gimple_variable (rhs))
+     cost += 20;
+ 
+   switch (TREE_CODE (rhs))
+     {
+     case CALL_EXPR:
+       /* We should be hoisting calls if possible.  */
+ 
+       /* Unless the call is a builtin_constant_p; this always folds to a
+ 	 constant, so moving it is useless.  */
+       rhs = get_callee_fndecl (rhs);
+       if (DECL_BUILT_IN (rhs)
+ 	  && DECL_FUNCTION_CODE (rhs) == BUILT_IN_CONSTANT_P)
+ 	return 0;
+ 
+       cost += 20;
+       break;
+ 
+     case MULT_EXPR:
+     case TRUNC_DIV_EXPR:
+     case CEIL_DIV_EXPR:
+     case FLOOR_DIV_EXPR:
+     case ROUND_DIV_EXPR:
+     case EXACT_DIV_EXPR:
+     case CEIL_MOD_EXPR:
+     case FLOOR_MOD_EXPR:
+     case ROUND_MOD_EXPR:
+     case TRUNC_MOD_EXPR:
+       /* Division and multiplication are usually expensive.  */
+       cost += 20;
+       break;
+ 
+     default:
+       break;
+     }
+ 
+   return cost;
+ }
+ 
+ /* Determine the outermost loop to that it is possible to hoist a statement
+    STMT and store it to LIM_DATA (STMT)->max_loop.  To do this we determine
+    the outermost loop in that the value computed by STMT is invariant.
+    If MUST_PRESERVE_EXEC is true, additionally choose such a loop that
+    we preserve the fact whether STMT is executed.  It also fills other related
+    information to LIM_DATA (STMT).
+    
+    The function returns false if STMT cannot be hoisted outside of the loop it
+    is defined in, and true otherwise.  */
+ 
+ static bool
+ determine_max_movement (tree stmt, bool must_preserve_exec)
+ {
+   basic_block bb = bb_for_stmt (stmt);
+   struct loop *loop = bb->loop_father;
+   struct loop *level;
+   struct lim_aux_data *lim_data = LIM_DATA (stmt);
+   use_optype uses;
+   vuse_optype vuses;
+   v_may_def_optype v_may_defs;
+   stmt_ann_t ann = stmt_ann (stmt);
+   unsigned i;
+   
+   if (must_preserve_exec)
+     level = ALWAYS_EXECUTED_IN (bb);
+   else
+     level = superloop_at_depth (loop, 1);
+   lim_data->max_loop = level;
+ 
+   uses = USE_OPS (ann);
+   for (i = 0; i < NUM_USES (uses); i++)
+     if (!add_dependency (USE_OP (uses, i), lim_data, loop, true))
+       return false;
+ 
+   vuses = VUSE_OPS (ann);
+   for (i = 0; i < NUM_VUSES (vuses); i++)
+     if (!add_dependency (VUSE_OP (vuses, i), lim_data, loop, false))
+       return false;
+ 
+   v_may_defs = V_MAY_DEF_OPS (ann);
+   for (i = 0; i < NUM_V_MAY_DEFS (v_may_defs); i++)
+     if (!add_dependency (V_MAY_DEF_OP (v_may_defs, i), lim_data, loop, false))
+       return false;
+ 
+   lim_data->cost += stmt_cost (stmt);
+ 
+   return true;
+ }
+ 
+ /* Suppose that some statement in ORIG_LOOP is hoisted to the loop LEVEL,
+    and that one of the operands of this statement is computed by STMT.
+    Ensure that STMT (together with all the statements that define its
+    operands) is hoisted at least out of the loop LEVEL.  */
+ 
+ static void
+ set_level (tree stmt, struct loop *orig_loop, struct loop *level)
+ {
+   struct loop *stmt_loop = bb_for_stmt (stmt)->loop_father;
+   struct depend *dep;
+ 
+   stmt_loop = find_common_loop (orig_loop, stmt_loop);
+   if (LIM_DATA (stmt) && LIM_DATA (stmt)->tgt_loop)
+     stmt_loop = find_common_loop (stmt_loop,
+ 				  LIM_DATA (stmt)->tgt_loop->outer);
+   if (flow_loop_nested_p (stmt_loop, level))
+     return;
+ 
+   if (!LIM_DATA (stmt))
+     abort ();
+ 
+   if (level != LIM_DATA (stmt)->max_loop
+       && !flow_loop_nested_p (LIM_DATA (stmt)->max_loop, level))
+     abort ();
+ 
+   LIM_DATA (stmt)->tgt_loop = level;
+   for (dep = LIM_DATA (stmt)->depends; dep; dep = dep->next)
+     set_level (dep->stmt, orig_loop, level);
+ }
+ 
+ /* Determines an outermost loop from that we want to hoist the statement STMT.
+    For now we chose the outermost possible loop.  TODO -- use profiling
+    information to set it more sanely.  */
+ 
+ static void
+ set_profitable_level (tree stmt)
+ {
+   set_level (stmt, bb_for_stmt (stmt)->loop_father, LIM_DATA (stmt)->max_loop);
+ }
+ 
+ /* Returns true if STMT is not a pure call.  */
+ 
+ static bool
+ nonpure_call_p (tree stmt)
+ {
+   tree call = get_call_expr_in (stmt);
+ 
+   if (!call)
+     return false;
+ 
+   return TREE_SIDE_EFFECTS (call) != 0;
+ }
+ 
+ /* Releases the memory occupied by DATA.  */
+ 
+ static void
+ free_lim_aux_data (struct lim_aux_data *data)
+ {
+   struct depend *dep, *next;
+ 
+   for (dep = data->depends; dep; dep = next)
+     {
+       next = dep->next;
+       free (dep);
+     }
+   free (data);
+ }
+ 
+ /* Determine the outermost loops in that statements in basic block BB are
+    invariant, and record them to the LIM_DATA associated with the statements.
+    Callback for walk_dominator_tree.  */
+ 
+ static void
+ determine_invariantness_stmt (struct dom_walk_data *dw_data ATTRIBUTE_UNUSED,
+ 			      basic_block bb)
+ {
+   enum move_pos pos;
+   block_stmt_iterator bsi;
+   tree stmt;
+   bool maybe_never = ALWAYS_EXECUTED_IN (bb) == NULL;
+   struct loop *outermost = ALWAYS_EXECUTED_IN (bb);
+ 
+   if (!bb->loop_father->outer)
+     return;
+ 
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     fprintf (dump_file, "Basic block %d (loop %d -- depth %d):\n\n",
+ 	     bb->index, bb->loop_father->num, bb->loop_father->depth);
+ 
+   for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
+     {
+       stmt = bsi_stmt (bsi);
+ 
+       pos = movement_possibility (stmt);
+       if (pos == MOVE_IMPOSSIBLE)
+ 	{
+ 	  if (nonpure_call_p (stmt))
+ 	    {
+ 	      maybe_never = true;
+ 	      outermost = NULL;
+ 	    }
+ 	  continue;
+ 	}
+ 
+       stmt_ann (stmt)->common.aux = xcalloc (1, sizeof (struct lim_aux_data));
+       LIM_DATA (stmt)->always_executed_in = outermost;
+ 
+       if (maybe_never && pos == MOVE_PRESERVE_EXECUTION)
+ 	continue;
+ 
+       if (!determine_max_movement (stmt, pos == MOVE_PRESERVE_EXECUTION))
+ 	{
+ 	  LIM_DATA (stmt)->max_loop = NULL;
+ 	  continue;
+ 	}
+ 
+       if (dump_file && (dump_flags & TDF_DETAILS))
+ 	{
+ 	  print_generic_stmt_indented (dump_file, stmt, 0, 2);
+ 	  fprintf (dump_file, "  invariant up to level %d, cost %d.\n\n",
+ 		   LIM_DATA (stmt)->max_loop->depth,
+ 		   LIM_DATA (stmt)->cost);
+ 	}
+ 
+       if (LIM_DATA (stmt)->cost >= LIM_EXPENSIVE)
+ 	set_profitable_level (stmt);
+     }
+ }
+ 
+ /* For each statement determines the outermost loop in that it is invariant,
+    statements on whose motion it depends and the cost of the computation.
+    This information is stored to the LIM_DATA structure associated with
+    each statement.  */
+ 
+ static void
+ determine_invariantness (void)
+ {
+   struct dom_walk_data walk_data;
+ 
+   memset (&walk_data, 0, sizeof (struct dom_walk_data));
+   walk_data.before_dom_children_before_stmts = determine_invariantness_stmt;
+ 
+   init_walk_dominator_tree (&walk_data);
+   walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR);
+   fini_walk_dominator_tree (&walk_data);
+ }
+ 
+ /* Commits edge insertions and updates loop structures.  */
+ 
+ void
+ loop_commit_inserts (void)
+ {
+   unsigned old_last_basic_block, i;
+   basic_block bb;
+ 
+   old_last_basic_block = last_basic_block;
+   bsi_commit_edge_inserts (NULL);
+   for (i = old_last_basic_block; i < (unsigned) last_basic_block; i++)
+     {
+       bb = BASIC_BLOCK (i);
+       add_bb_to_loop (bb,
+ 		      find_common_loop (bb->succ->dest->loop_father,
+ 					bb->pred->src->loop_father));
+     }
+ }
+ 
+ /* Hoist the statements in basic block BB out of the loops prescribed by
+    data stored in LIM_DATA structres associated with each statement.  Callback
+    for walk_dominator_tree.  */
+ 
+ static void
+ move_computations_stmt (struct dom_walk_data *dw_data ATTRIBUTE_UNUSED,
+ 			basic_block bb)
+ {
+   struct loop *level;
+   block_stmt_iterator bsi;
+   tree stmt;
+   unsigned cost = 0;
+ 
+   if (!bb->loop_father->outer)
+     return;
+ 
+   for (bsi = bsi_start (bb); !bsi_end_p (bsi); )
+     {
+       stmt = bsi_stmt (bsi);
+ 
+       if (!LIM_DATA (stmt))
+ 	{
+ 	  bsi_next (&bsi);
+ 	  continue;
+ 	}
+ 
+       cost = LIM_DATA (stmt)->cost;
+       level = LIM_DATA (stmt)->tgt_loop;
+       free_lim_aux_data (LIM_DATA (stmt));
+       stmt_ann (stmt)->common.aux = NULL;
+ 
+       if (!level)
+ 	{
+ 	  bsi_next (&bsi);
+ 	  continue;
+ 	}
+ 
+       /* We do not really want to move conditionals out of the loop; we just
+ 	 placed it here to force its operands to be moved if necessary.  */
+       if (TREE_CODE (stmt) == COND_EXPR)
+ 	continue;
+ 
+       if (dump_file && (dump_flags & TDF_DETAILS))
+ 	{
+ 	  fprintf (dump_file, "Moving statement\n");
+ 	  print_generic_stmt (dump_file, stmt, 0);
+ 	  fprintf (dump_file, "(cost %u) out of loop %d.\n\n",
+ 		   cost, level->num);
+ 	}
+       bsi_insert_on_edge (loop_preheader_edge (level), stmt);
+       bsi_remove (&bsi);
+     }
+ }
+ 
+ /* Hoist the statements out of the loops prescribed by data stored in
+    LIM_DATA structres associated with each statement.*/
+ 
+ static void
+ move_computations (void)
+ {
+   struct dom_walk_data walk_data;
+ 
+   memset (&walk_data, 0, sizeof (struct dom_walk_data));
+   walk_data.before_dom_children_before_stmts = move_computations_stmt;
+ 
+   init_walk_dominator_tree (&walk_data);
+   walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR);
+   fini_walk_dominator_tree (&walk_data);
+ 
+   loop_commit_inserts ();
+   rewrite_into_ssa (false);
+   bitmap_clear (vars_to_rename);
+ }
+ 
+ /* Checks whether the statement defining variable *INDEX can be hoisted
+    out of the loop passed in DATA.  Callback for for_each_index.  */
+ 
+ static bool
+ may_move_till (tree base ATTRIBUTE_UNUSED, tree *index, void *data)
+ {
+   struct loop *loop = data, *max_loop;
+ 
+   if (TREE_CODE (*index) != SSA_NAME)
+     return true;
+ 
+   max_loop = outermost_invariant_loop (*index, loop);
+ 
+   if (!max_loop)
+     return false;
+ 
+   if (loop == max_loop
+       || flow_loop_nested_p (max_loop, loop))
+     return true;
+ 
+   return false;
+ }
+ 
+ /* Forces statement defining variable *INDEX to be moved out of the
+    loop passed in DATA.  Callback for for_each_index.  */
+ 
+ static bool
+ force_move_till (tree base ATTRIBUTE_UNUSED, tree *index, void *data)
+ {
+   tree stmt;
+ 
+   if (TREE_CODE (*index) != SSA_NAME)
+     return true;
+ 
+   stmt = SSA_NAME_DEF_STMT (*index);
+   if (IS_EMPTY_STMT (stmt))
+     return true;
+ 
+   set_level (stmt, bb_for_stmt (stmt)->loop_father, data);
+ 
+   return true;
+ }
+ 
+ /* Records memory reference *REF (that occurs in statement STMT)
+    to the list MEM_REFS.  */
+ 
+ static void
+ record_mem_ref (struct mem_ref **mem_refs, tree stmt, tree *ref)
+ {
+   struct mem_ref *aref = xmalloc (sizeof (struct mem_ref));
+ 
+   aref->stmt = stmt;
+   aref->ref = ref;
+ 
+   aref->next = *mem_refs;
+   *mem_refs = aref;
+ }
+ 
+ /* Releases list of memory references MEM_REFS.  */
+ 
+ static void
+ free_mem_refs (struct mem_ref *mem_refs)
+ {
+   struct mem_ref *act;
+ 
+   while (mem_refs)
+     {
+       act = mem_refs;
+       mem_refs = mem_refs->next;
+       free (act);
+     }
+ }
+ 
+ /* If VAR is defined in LOOP and the statement it is defined in does not belong
+    to the set SEEN, add the statement to QUEUE of length IN_QUEUE and
+    to the set SEEN.  */
+ 
+ static void
+ maybe_queue_var (tree var, struct loop *loop,
+ 		 sbitmap seen, tree *queue, unsigned *in_queue)
+ {
+   tree stmt = SSA_NAME_DEF_STMT (var);
+   basic_block def_bb = bb_for_stmt (stmt);
+ 	      
+   if (!def_bb
+       || !flow_bb_inside_loop_p (loop, def_bb)
+       || TEST_BIT (seen, stmt_ann (stmt)->uid))
+     return;
+ 	  
+   SET_BIT (seen, stmt_ann (stmt)->uid);
+   queue[(*in_queue)++] = stmt;
+ }
+ 
+ /* Determine whether all memory references inside the LOOP that correspond
+    to virtual ssa names defined in statement STMT are equal.
+    If so, store the list of the references to MEM_REFS, and return one
+    of them.  Otherwise store NULL to MEM_REFS and return NULL_TREE.  */
+ 
+ static tree
+ single_reachable_address (struct loop *loop, tree stmt,
+ 			  struct mem_ref **mem_refs)
+ {
+   tree *queue = xmalloc (sizeof (tree) * max_uid);
+   sbitmap seen = sbitmap_alloc (max_uid);
+   tree common_ref = NULL, *aref;
+   unsigned in_queue = 1;
+   dataflow_t df;
+   unsigned i, n;
+   v_may_def_optype v_may_defs;
+   vuse_optype vuses;
+ 
+   sbitmap_zero (seen);
+ 
+   *mem_refs = NULL;
+ 
+   queue[0] = stmt;
+   SET_BIT (seen, stmt_ann (stmt)->uid);
+ 
+   while (in_queue)
+     {
+       stmt = queue[--in_queue];
+ 
+       if (LIM_DATA (stmt)
+ 	  && LIM_DATA (stmt)->sm_done)
+ 	goto fail;
+ 
+       switch (TREE_CODE (stmt))
+ 	{
+ 	case MODIFY_EXPR:
+ 	  aref = &TREE_OPERAND (stmt, 0);
+ 	  if (is_gimple_reg (*aref)
+ 	      || !is_gimple_lvalue (*aref))
+ 	    aref = &TREE_OPERAND (stmt, 1);
+ 	  if (is_gimple_reg (*aref)
+ 	      || !is_gimple_lvalue (*aref)
+ 	      || (common_ref && !operand_equal_p (*aref, common_ref, 0)))
+ 	    goto fail;
+ 	  common_ref = *aref;
+ 
+ 	  record_mem_ref (mem_refs, stmt, aref);
+ 
+ 	  /* Traverse also definitions of the VUSES (there may be other
+ 	     distinct from the one we used to get to this statement).  */
+ 	  v_may_defs = STMT_V_MAY_DEF_OPS (stmt);
+ 	  for (i = 0; i < NUM_V_MAY_DEFS (v_may_defs); i++)
+ 	    maybe_queue_var (V_MAY_DEF_OP (v_may_defs, i), loop,
+ 			     seen, queue, &in_queue);
+ 
+ 	  vuses = STMT_VUSE_OPS (stmt);
+ 	  for (i = 0; i < NUM_VUSES (vuses); i++)
+ 	    maybe_queue_var (VUSE_OP (vuses, i), loop,
+ 			     seen, queue, &in_queue);
+ 	  break;
+ 
+ 	case PHI_NODE:
+ 	  for (i = 0; i < (unsigned) PHI_NUM_ARGS (stmt); i++)
+ 	    maybe_queue_var (PHI_ARG_DEF (stmt, i), loop,
+ 			     seen, queue, &in_queue);
+ 	  break;
+ 
+ 	default:
+ 	  goto fail;
+ 	}
+ 
+       /* Find uses of virtual names.  */
+       df = get_immediate_uses (stmt);
+       n = num_immediate_uses (df);
+ 
+       for (i = 0; i < n; i++)
+ 	{
+ 	  stmt = immediate_use (df, i);
+ 
+ 	  if (!flow_bb_inside_loop_p (loop, bb_for_stmt (stmt)))
+ 	    continue;
+ 
+ 	  if (TEST_BIT (seen, stmt_ann (stmt)->uid))
+ 	    continue;
+ 	  SET_BIT (seen, stmt_ann (stmt)->uid);
+ 
+ 	  queue[in_queue++] = stmt;
+ 	}
+     }
+ 
+   free (queue);
+   sbitmap_free (seen);
+ 
+   return common_ref;
+ 
+ fail:
+   free_mem_refs (*mem_refs);
+   *mem_refs = NULL;
+   free (queue);
+   sbitmap_free (seen);
+ 
+   return NULL;
+ }
+ 
+ /* Rewrites memory references in list MEM_REFS by variable TMP_VAR.  */
+ 
+ static void
+ rewrite_mem_refs (tree tmp_var, struct mem_ref *mem_refs)
+ {
+   v_may_def_optype v_may_defs;
+   v_must_def_optype v_must_defs;
+   vuse_optype vuses;
+   unsigned i;
+   tree var;
+ 
+   for (; mem_refs; mem_refs = mem_refs->next)
+     {
+       v_may_defs = STMT_V_MAY_DEF_OPS (mem_refs->stmt);
+       for (i = 0; i < NUM_V_MAY_DEFS (v_may_defs); i++)
+ 	{
+ 	  var = SSA_NAME_VAR (V_MAY_DEF_RESULT (v_may_defs, i));
+ 	  bitmap_set_bit (vars_to_rename, var_ann (var)->uid);
+ 	}
+ 
+       v_must_defs = STMT_V_MUST_DEF_OPS (mem_refs->stmt);
+       for (i = 0; i < NUM_V_MUST_DEFS (v_must_defs); i++)
+ 	{
+ 	  var = SSA_NAME_VAR (V_MUST_DEF_OP (v_must_defs, i));
+ 	  bitmap_set_bit (vars_to_rename, var_ann (var)->uid);
+ 	}
+ 
+       vuses = STMT_VUSE_OPS (mem_refs->stmt);
+       for (i = 0; i < NUM_VUSES (vuses); i++)
+ 	{
+ 	  var = SSA_NAME_VAR (VUSE_OP (vuses, i));
+ 	  bitmap_set_bit (vars_to_rename, var_ann (var)->uid);
+ 	}
+ 
+       *mem_refs->ref = tmp_var;
+       modify_stmt (mem_refs->stmt);
+     }
+ }
+ 
+ /* Records request for store motion of memory reference REF from LOOP.
+    MEM_REFS is the list of occurences of the reference REF inside LOOP;
+    these references are rewritten by a new temporary variable.
+    Exits from the LOOP are stored in EXITS, there are N_EXITS of them.
+    The initialization of the temporary variable is put to the preheader
+    of the loop, and assignments to the reference from the temporary variable
+    are emitted to exits.  */
+ 
+ static void
+ schedule_sm (struct loop *loop, edge *exits, unsigned n_exits, tree ref,
+ 	     struct mem_ref *mem_refs)
+ {
+   struct mem_ref *aref;
+   tree tmp_var;
+   unsigned i;
+   tree load, store;
+ 
+   tmp_var = make_rename_temp (TREE_TYPE (ref), "lsm_tmp");
+ 
+   for_each_index (&ref, force_move_till, loop);
+ 
+   rewrite_mem_refs (tmp_var, mem_refs);
+   for (aref = mem_refs; aref; aref = aref->next)
+     if (LIM_DATA (aref->stmt))
+       LIM_DATA (aref->stmt)->sm_done = true;
+ 
+   /* Emit the load & stores.  */
+   load = build (MODIFY_EXPR, void_type_node, tmp_var, ref);
+   modify_stmt (load);
+   stmt_ann (load)->common.aux = xcalloc (1, sizeof (struct lim_aux_data));
+   LIM_DATA (load)->max_loop = loop;
+   LIM_DATA (load)->tgt_loop = loop;
+ 
+   /* Put this into the latch, so that we are sure it will be processed after
+      all dependencies.  */
+   bsi_insert_on_edge (loop_latch_edge (loop), load);
+ 
+   for (i = 0; i < n_exits; i++)
+     {
+       store = build (MODIFY_EXPR, void_type_node,
+ 		     unshare_expr (ref), tmp_var);
+       bsi_insert_on_edge (exits[i], store);
+     }
+ }
+ 
+ /* Determine whether all memory references inside LOOP corresponding to the
+    virtual ssa name REG are equal to each other, and whether the address of
+    this common reference can be hoisted outside of the loop.  If this is true,
+    prepare the statements that load the value of the memory reference to a
+    temporary variable in the loop preheader, store it back on the loop exits,
+    and replace all the references inside LOOP by this temporary variable.
+    LOOP has N_EXITS stored in EXITS.  */
+ 
+ static void
+ determine_lsm_reg (struct loop *loop, edge *exits, unsigned n_exits, tree reg)
+ {
+   tree ref;
+   struct mem_ref *mem_refs, *aref;
+   struct loop *must_exec;
+   
+   if (is_gimple_reg (reg))
+     return;
+   
+   ref = single_reachable_address (loop, SSA_NAME_DEF_STMT (reg), &mem_refs);
+   if (!ref)
+     return;
+ 
+   if (!for_each_index (&ref, may_move_till, loop))
+     {
+       free_mem_refs (mem_refs);
+       return;
+     }
+ 
+   if (tree_could_trap_p (ref))
+     {
+       /* If the memory access is unsafe (i.e. it might trap), ensure that some
+ 	 of the statements in that it occurs is always executed when the loop
+ 	 is entered.  This way we know that by moving the load from the
+ 	 reference out of the loop we will not cause the error that would not
+ 	 occur otherwise.
+ 
+ 	 TODO -- in fact we would like to check for anticipability of the
+ 	 reference, i.e. that on each path from loop entry to loop exit at
+ 	 least one of the statements containing the memory reference is
+ 	 executed.  */
+ 
+       for (aref = mem_refs; aref; aref = aref->next)
+ 	{
+ 	  if (!LIM_DATA (aref->stmt))
+ 	    continue;
+ 
+ 	  must_exec = LIM_DATA (aref->stmt)->always_executed_in;
+ 	  if (!must_exec)
+ 	    continue;
+ 
+ 	  if (must_exec == loop
+ 	      || flow_loop_nested_p (must_exec, loop))
+ 	    break;
+ 	}
+ 
+       if (!aref)
+ 	{
+ 	  free_mem_refs (mem_refs);
+ 	  return;
+ 	}
+     }
+ 
+   schedule_sm (loop, exits, n_exits, ref, mem_refs);
+   free_mem_refs (mem_refs);
+ }
+ 
+ /* Checks whether LOOP (with N_EXITS exits stored in EXITS array) is suitable
+    for a store motion optimization (i.e. whether we can insert statement
+    on its exits).  */
+ 
+ static bool
+ loop_suitable_for_sm (struct loop *loop ATTRIBUTE_UNUSED, edge *exits,
+ 		      unsigned n_exits)
+ {
+   unsigned i;
+ 
+   for (i = 0; i < n_exits; i++)
+     if (exits[i]->flags & EDGE_ABNORMAL)
+       return false;
+ 
+   return true;
+ }
+ 
+ /* Try to perform store motion for all memory references modified inside
+    LOOP.  */
+ 
+ static void
+ determine_lsm_loop (struct loop *loop)
+ {
+   tree phi;
+   unsigned n_exits;
+   edge *exits = get_loop_exit_edges (loop, &n_exits);
+ 
+   if (!loop_suitable_for_sm (loop, exits, n_exits))
+     {
+       free (exits);
+       return;
+     }
+ 
+   for (phi = phi_nodes (loop->header); phi; phi = TREE_CHAIN (phi))
+     determine_lsm_reg (loop, exits, n_exits, PHI_RESULT (phi));
+ 
+   free (exits);
+ }
+ 
+ /* Try to perform store motion for all memory references modified inside
+    any of LOOPS.  */
+ 
+ static void
+ determine_lsm (struct loops *loops)
+ {
+   struct loop *loop;
+   basic_block bb;
+ 
+   /* Create a UID for each statement in the function.  Ordering of the
+      UIDs is not important for this pass.  */
+   max_uid = 0;
+   FOR_EACH_BB (bb)
+     {
+       block_stmt_iterator bsi;
+       tree phi;
+ 
+       for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
+ 	stmt_ann (bsi_stmt (bsi))->uid = max_uid++;
+ 
+       for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+ 	stmt_ann (phi)->uid = max_uid++;
+     }
+ 
+   compute_immediate_uses (TDFA_USE_VOPS, NULL);
+ 
+   /* Pass the loops from the outermost.  For each virtual operand loop phi node
+      check whether all the references inside the loop correspond to a single
+      address, and if so, move them.  */
+ 
+   loop = loops->tree_root->inner;
+   while (1)
+     {
+       determine_lsm_loop (loop);
+ 
+       if (loop->inner)
+ 	{
+ 	  loop = loop->inner;
+ 	  continue;
+ 	}
+       while (!loop->next)
+ 	{
+ 	  loop = loop->outer;
+ 	  if (loop == loops->tree_root)
+ 	    {
+ 	      free_df ();
+ 	      loop_commit_inserts ();
+ 	      return;
+ 	    }
+ 	}
+       loop = loop->next;
+     }
+ }
+ 
+ /* Fills ALWAYS_EXECUTED_IN information for basic blocks of LOOP, i.e.
+    for each such basic block bb records the outermost loop for that execution
+    of its header implies execution of bb.  CONTAINS_CALL is the bitmap of
+    blocks that contain a nonpure call.  */
+ 
+ static void
+ fill_always_executed_in (struct loop *loop, sbitmap contains_call)
+ {
+   basic_block bb = NULL, *bbs, last = NULL;
+   unsigned i;
+   edge e;
+   struct loop *inn_loop = loop;
+ 
+   if (!loop->header->aux)
+     {
+       bbs = get_loop_body_in_dom_order (loop);
+ 
+       for (i = 0; i < loop->num_nodes; i++)
+ 	{
+ 	  bb = bbs[i];
+ 
+ 	  if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
+ 	    last = bb;
+ 
+ 	  if (TEST_BIT (contains_call, bb->index))
+ 	    break;
+ 
+ 	  for (e = bb->succ; e; e = e->succ_next)
+ 	    if (!flow_bb_inside_loop_p (loop, e->dest))
+ 	      break;
+ 	  if (e)
+ 	    break;
+ 
+ 	  /* A loop might be infinite (TODO use simple loop analysis
+ 	     to disprove this if possible).  */
+ 	  if (bb->flags & BB_IRREDUCIBLE_LOOP)
+ 	    break;
+ 
+ 	  if (!flow_bb_inside_loop_p (inn_loop, bb))
+ 	    break;
+ 
+ 	  if (bb->loop_father->header == bb)
+ 	    {
+ 	      if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
+ 		break;
+ 
+ 	      /* In a loop that is always entered we may proceed anyway.
+ 		 But record that we entered it and stop once we leave it.  */
+ 	      inn_loop = bb->loop_father;
+ 	    }
+ 	}
+ 
+       while (1)
+ 	{
+ 	  last->aux = loop;
+ 	  if (last == loop->header)
+ 	    break;
+ 	  last = get_immediate_dominator (CDI_DOMINATORS, last);
+ 	}
+ 
+       free (bbs);
+     }
+ 
+   for (loop = loop->inner; loop; loop = loop->next)
+     fill_always_executed_in (loop, contains_call);
+ }
+ 
+ /* Compute the global information needed by the loop invariant motion pass.
+    LOOPS is the loop tree.  */
+ 
+ static void
+ tree_ssa_lim_initialize (struct loops *loops)
+ {
+   sbitmap contains_call = sbitmap_alloc (last_basic_block);
+   block_stmt_iterator bsi;
+   struct loop *loop;
+   basic_block bb;
+ 
+   sbitmap_zero (contains_call);
+   FOR_EACH_BB (bb)
+     {
+       for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
+ 	{
+ 	  if (nonpure_call_p (bsi_stmt (bsi)))
+ 	    break;
+ 	}
+ 
+       if (!bsi_end_p (bsi))
+ 	SET_BIT (contains_call, bb->index);
+     }
+ 
+   for (loop = loops->tree_root->inner; loop; loop = loop->next)
+     fill_always_executed_in (loop, contains_call);
+ 
+   sbitmap_free (contains_call);
+ }
+ 
+ /* Cleans up after the invariant motion pass.  */
+ 
+ static void
+ tree_ssa_lim_finalize (void)
+ {
+   basic_block bb;
+ 
+   FOR_EACH_BB (bb)
+     {
+       bb->aux = NULL;
+     }
+ }
+ 
+ /* Moves invariants from LOOPS.  Only "expensive" invariants are moved out --
+    i.e. those that are likely to be win regardless of the register pressure.  */
+ 
+ void
+ tree_ssa_lim (struct loops *loops)
+ {
+   tree_ssa_lim_initialize (loops);
+ 
+   /* For each statement determine the outermost loop in that it is
+      invariant and cost for computing the invariant.  */
+   determine_invariantness ();
+ 
+   /* For each memory reference determine whether it is possible to hoist it
+      out of the loop.  Force the necessary invariants to be moved out of the
+      loops as well.  */
+   determine_lsm (loops);
+ 
+   /* Move the expressions that are expensive enough.  */
+   move_computations ();
+ 
+   tree_ssa_lim_finalize ();
+ }
Index: tree-ssa-loop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop.c,v
retrieving revision 2.7
diff -c -3 -p -r2.7 tree-ssa-loop.c
*** tree-ssa-loop.c	30 Jun 2004 21:28:59 -0000	2.7
--- tree-ssa-loop.c	8 Jul 2004 15:52:50 -0000
*************** struct tree_opt_pass pass_loop_init = 
*** 111,116 ****
--- 111,149 ----
    0					/* todo_flags_finish */
  };
  
+ /* Loop invariant motion pass.  */
+ 
+ static void
+ tree_ssa_loop_im (void)
+ {
+   if (!current_loops)
+     return;
+ 
+   tree_ssa_lim (current_loops);
+ }
+ 
+ static bool
+ gate_tree_ssa_loop_im (void)
+ {
+   return flag_tree_lim != 0;
+ }
+ 
+ struct tree_opt_pass pass_lim = 
+ {
+   "lim",				/* name */
+   gate_tree_ssa_loop_im,		/* gate */
+   tree_ssa_loop_im,			/* execute */
+   NULL,					/* sub */
+   NULL,					/* next */
+   0,					/* static_pass_number */
+   TV_LIM,				/* tv_id */
+   PROP_cfg,				/* properties_required */
+   0,					/* properties_provided */
+   0,					/* properties_destroyed */
+   0,					/* todo_flags_start */
+   TODO_dump_func                	/* todo_flags_finish */
+ };
+ 
  /* Loop optimizer finalization.  */
  
  static void
Index: tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.c,v
retrieving revision 1.392
diff -c -3 -p -r1.392 tree.c
*** tree.c	8 Jul 2004 08:44:09 -0000	1.392
--- tree.c	8 Jul 2004 15:52:51 -0000
*************** build_empty_stmt (void)
*** 5662,5667 ****
--- 5662,5697 ----
  }
  
  
+ /* Returns true if it is possible to prove that IDX is inside bounds of
+  * ARRAY.  */
+ 
+ bool
+ in_array_bounds_p (tree array, tree idx)
+ {
+   tree dom = TYPE_DOMAIN (TREE_TYPE (array));
+   tree min, max;
+ 
+   if (TREE_CODE (idx) != INTEGER_CST)
+     return false;
+ 	    
+   if (!dom)
+     return false;
+ 
+   min = TYPE_MIN_VALUE (dom);
+   max = TYPE_MAX_VALUE (dom);
+   if (!min
+       || !max
+       || TREE_CODE (min) != INTEGER_CST
+       || TREE_CODE (max) != INTEGER_CST)
+     return false;
+ 
+   if (tree_int_cst_lt (idx, min)
+       || tree_int_cst_lt (max, idx))
+     return false;
+ 
+   return true;
+ }
+ 
  /* Return true if T (assumed to be a DECL) must be assigned a memory
     location.  */
  
Index: tree.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.h,v
retrieving revision 1.547
diff -c -3 -p -r1.547 tree.h
*** tree.h	8 Jul 2004 08:44:10 -0000	1.547
--- tree.h	8 Jul 2004 15:52:51 -0000
*************** extern tree build_method_type (tree, tre
*** 2722,2727 ****
--- 2722,2728 ----
  extern tree build_offset_type (tree, tree);
  extern tree build_complex_type (tree);
  extern tree array_type_nelts (tree);
+ extern bool in_array_bounds_p (tree, tree);
  
  extern tree value_member (tree, tree);
  extern tree purpose_member (tree, tree);
Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.480
diff -c -3 -p -r1.480 invoke.texi
*** doc/invoke.texi	7 Jul 2004 19:24:57 -0000	1.480
--- doc/invoke.texi	8 Jul 2004 15:52:51 -0000
*************** in the following sections.
*** 313,318 ****
--- 313,319 ----
  -funroll-all-loops  -funroll-loops  -fpeel-loops @gol
  -funswitch-loops  -fold-unroll-loops  -fold-unroll-all-loops @gol
  -ftree-pre  -ftree-ccp  -ftree-dce -ftree-loop-optimize @gol
+ -ftree-lim @gol
  -ftree-dominator-opts -ftree-dse -ftree-copyrename @gol
  -ftree-ch -ftree-sra -ftree-ter -ftree-lrs -ftree-fre @gol
  --param @var{name}=@var{value}
*************** usually increases code size.
*** 4401,4406 ****
--- 4402,4415 ----
  Perform loop optimizations on trees.  This flag is enabled by default at -O
  and higher.
  
+ @item -ftree-lim
+ Perform loop invariant motion on trees.  This pass moves only invartiants that
+ would be hard to handle on rtl level (function calls, operations that expand to
+ nontrivial sequences of insns).  With @option{-funswitch-loops} it also moves
+ operands of conditions that are invariant out of the loop, so that we can use
+ just trivial invariantness analysis in loop unswitching.  The pass also includes
+ store motion.
+ 
  @item -ftree-sra
  Perform scalar replacement of aggregates.  This pass replaces structure
  references with scalars to prevent committing structures to memory too
*************** The maximum number of insns of an unswit
*** 5174,5179 ****
--- 5183,5191 ----
  @item max-unswitch-level
  The maximum number of branches unswitched in a single loop.
  
+ @item lim-expensive
+ The minimum cost of an expensive expression in the loop invariant motion.
+ 
  @item hot-bb-count-fraction
  Select fraction of the maximal count of repetitions of basic block in program
  given basic block needs to have to be considered hot.
Index: doc/passes.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/passes.texi,v
retrieving revision 1.34
diff -c -3 -p -r1.34 passes.texi
*** doc/passes.texi	20 Jun 2004 21:31:32 -0000	1.34
--- doc/passes.texi	8 Jul 2004 15:52:51 -0000
*************** and is described by @code{pass_pre}.
*** 363,371 ****
  
  @item Loop optimization
  
! TODO: Presumably we're going to do something with loops here.  At
! present we don't, and this is a placeholder.  The pass is located
! in @file{tree-ssa-loop.c} and is described by @code{pass_loop}.
  
  @item Conditional constant propagation
  
--- 363,382 ----
  
  @item Loop optimization
  
! The main driver of the pass is placed in @file{tree-ssa-loop.c}
! and described by @code{pass_loop}.
! 
! The optimizations performed by this pass are:
! 
! Loop invariant motion.  This pass moves only invariants that
! would be hard to handle on rtl level (function calls, operations that expand to
! nontrivial sequences of insns).  With @option{-funswitch-loops} it also moves
! operands of conditions that are invariant out of the loop, so that we can use
! just trivial invariantness analysis in loop unswitching.  The pass also includes
! store motion.  The pass is implemented in @file{tree-ssa-loop-im.c}.
! 
! The optimizations also use various utility functions contained in
! @file{cfgloop.c}, @file{cfgloopanal.c} and @file{cfgloopmanip.c}.
  
  @item Conditional constant propagation
  


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]