This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch] for PR 18316 (updated)


Hello,

> The rest of the patch looks reasonable.  Do you have any benchmark
> figures on the impact of this patch?  We've had problems in the past
> with the number of induction variables we find and their effect on
> register pressure.  If there's no (large) performance degradation on
> SPEC with this change, I'll approve it for mainline.

here is the patch that includes the changes you requested, and also an
improvement to fix the huge regressions the old patch caused on some
testcases (mgrid).

The reason for the regressions was the following: ivopts creates
expressions like

iv + &a[x + y + 8] - &a[x + y]

and expects fold to take care of turning this into

iv + 8 * sizeof(*a)

However, fold is not clever enough to do this in many cases.  This did
not cause many problems before, since the operations were cleaned up
after expansion to RTL.  However, in mgrid there are several nested
loops, and "x" and "y" happen to be induction variables (with nonconstant
step) of those outer loops.  The attempts of ivopts to optimize these
uses of course make a complete mess, and even the RTL optimizers are no
longer able to recover the "8 * sizeof(*a)" constants; the result
is a huge performance regression.
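
For illustration only (this is not the actual mgrid source), a loop nest
of roughly the following shape runs into the problem: x and y are
induction variables of the outer loops whose steps nx and ny are not
compile-time constants, and the inner loop contains accesses whose
addresses differ only by a constant multiple of the element size:

void
foo (double *a, double *b, int n, int nx, int ny)
{
  int x, y, i;

  for (x = 0; x < n * nx; x += nx)        /* outer iv, nonconstant step */
    for (y = 0; y < n * ny; y += ny)      /* outer iv, nonconstant step */
      for (i = 0; i < n; i++)
	/* ivopts wants to express the second load relative to an iv
	   based on the first one; computing that offset requires folding
	   &a[x + y + 8] - &a[x + y] down to 8 * sizeof (*a).  */
	b[i] += a[x + y + i] + a[x + y + i + 8];
}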

The patch fixes this problem by implementing improved folding of this
type of expression (basically for affine combinations of variables).
This may be relatively expensive in compile time, so I have not added
it to fold, but kept it as a separate set of functions.
(On a side note, the decomposition into affine expressions can be used
to significantly simplify the generation of TARGET_MEM_REFs in the patch
http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01360.html;
I would like to work on that once this patch is approved.)
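
To sketch the idea behind the new affine-combination machinery
(tree_to_aff_combination, fold_affine_sum and friends), here is a
simplified, standalone model in plain C.  The structure and function
names below are mine, not the ones from the patch, and small integer
ids stand in for the subtrees that the real code keeps as trees; it only
shows why the symbolic terms cancel and the constant offset survives:

#include <stdio.h>

#define MAX_AFF_ELTS 8

/* A simplified affine combination: OFFSET + sum of COEFS[i] * ELTS[i].  */
struct aff_comb
{
  long offset;
  unsigned n;
  int elts[MAX_AFF_ELTS];
  long coefs[MAX_AFF_ELTS];
};

/* Adds SCALE * ELT to COMB, merging it with an existing element and
   dropping terms whose coefficient becomes zero.  */
static void
aff_add_elt (struct aff_comb *comb, int elt, long scale)
{
  unsigned i;

  if (scale == 0)
    return;

  for (i = 0; i < comb->n; i++)
    if (comb->elts[i] == elt)
      {
	comb->coefs[i] += scale;
	if (comb->coefs[i] != 0)
	  return;

	comb->n--;
	comb->elts[i] = comb->elts[comb->n];
	comb->coefs[i] = comb->coefs[comb->n];
	return;
      }

  comb->elts[comb->n] = elt;
  comb->coefs[comb->n] = scale;
  comb->n++;
}

/* Adds RATIO * COMB2 to COMB1, i.e. scaling followed by addition,
   which is the core of folding X + RATIO * Y.  */
static void
aff_add_scaled (struct aff_comb *comb1, const struct aff_comb *comb2,
		long ratio)
{
  unsigned i;

  comb1->offset += ratio * comb2->offset;
  for (i = 0; i < comb2->n; i++)
    aff_add_elt (comb1, comb2->elts[i], ratio * comb2->coefs[i]);
}

int
main (void)
{
  enum { ELT_A = 1, ELT_X = 2, ELT_Y = 3 };
  long size = sizeof (double);

  /* &a[x + y + 8] decomposes to &a + size * x + size * y + 8 * size.  */
  struct aff_comb addr1 = { 8 * size, 3,
			    { ELT_A, ELT_X, ELT_Y }, { 1, size, size } };
  /* &a[x + y] decomposes to &a + size * x + size * y.  */
  struct aff_comb addr2 = { 0, 3,
			    { ELT_A, ELT_X, ELT_Y }, { 1, size, size } };

  /* addr1 - addr2: the symbolic terms cancel and only the constant
     8 * sizeof (*a) remains.  */
  aff_add_scaled (&addr1, &addr2, -1);

  printf ("offset = %ld, remaining terms = %u\n", addr1.offset, addr1.n);
  return 0;
}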

With this version of the patch, we get no significant regressions on
SPEC on i686: the overall SPECint score is unchanged and SPECfp improves
slightly.  In the results below, the columns are SPEC reference time,
run time in seconds and ratio; the left group is the base run without
the patch, the right group the peak run with the patch, both at -O2.

Bootstrapped & regtested on i686, x86_64 and ia64.

Zdenek

   164.gzip          1400   199       704    *     1400   199       704    *
   175.vpr           1400   393       356    *     1400   392       357    *
   176.gcc                                   X                             X
   181.mcf           1800   739       244    *     1800   736       245    *
   186.crafty        1000   120       834    *     1000   118       846    *
   197.parser        1800   395       456    *     1800   394       457    *
   252.eon           1300   143       909    *     1300   142       913    *
   253.perlbmk       1800   203       887    *     1800   204       884    *
   254.gap           1100   174       634    *     1100   175       629    *
   255.vortex        1900   232       818    *     1900   232       817    *
   256.bzip2         1500   335       447    *     1500   338       444    *
   300.twolf         3000   786       382    *     3000   786       382    *
   Est. SPECint_base2000              559    
   Est. SPECint2000                                                 559    

   168.wupwise       1600   283       565    *     1600   283       566    *
   171.swim          3100   673       461    *     3100   676       458    *
   172.mgrid         1800   507       355    *     1800   480       375    *
   173.applu         2100   578       363    *     2100   542       388    *
   177.mesa          1400   201       696    *     1400   202       695    *
   178.galgel                                X                             X
   179.art           2600  1472       177    *     2600  1465       178    *
   183.equake        1300   291       446    *     1300   294       443    *
   187.facerec       1900   478       397    *     1900   468       406    *
   188.ammp          2200   610       361    *     2200   621       354    *
   189.lucas         2000   428       467    *     2000   429       467    *
   191.fma3d         2100   396       531    *     2100   396       530    *
   200.sixtrack      1100   237       463    *     1100   238       461    *
   301.apsi          2600   688       378    *     2600   660       394    *
   Est. SPECfp_base2000               417    
   Est. SPECfp2000                                                  422    

	PR tree-optimization/18316
	PR tree-optimization/19126
	* tree.c (build_int_cst_type): Avoid shift by size of type.
	* tree-scalar-evolution.c (simple_iv): Add allow_nonconstant_step
	argument.
	* tree-scalar-evolution.h (simple_iv): Declaration changed.
	* tree-ssa-loop-ivopts.c (struct iv_cand): Add depends_on
	field.
	(dump_cand): Dump depends_on information.
	(determine_biv_step): Add argument to simple_iv call.
	(contains_abnormal_ssa_name_p): Handle case expr == NULL.
	(find_bivs, find_givs_in_stmt_scev): Do not require step to be a
	constant.
	(add_candidate_1): Record depends_on for candidates.
	(tree_int_cst_sign_bit, constant_multiple_of): New functions.
	(get_computation_at, get_computation_cost_at, may_eliminate_iv):
	Handle ivs with nonconstant step.
	(iv_ca_set_remove_invariants, iv_ca_set_add_invariants): New functions.
	(iv_ca_set_no_cp, iv_ca_set_cp): Handle cand->depends_on.
	(create_new_iv): Unshare the step before passing it to create_iv.
	(free_loop_data): Free cand->depends_on.
	(build_addr_strip_iref): New function.
	(find_interesting_uses_address): Use build_addr_strip_iref.
	(strip_offset_1): Split the recursive part from strip_offset.
	Strip constant offset component_refs and array_refs.
	(strip_offset): Split the recursive part to strip_offset_1.
	(add_address_candidates): Removed.
	(add_derived_ivs_candidates): Do not use add_address_candidates.
	(add_iv_value_candidates): Add candidates with stripped constant
	offset.  Consider all candidates with initial value 0 important.
	(struct affine_tree_combination): New.
	(aff_combination_const, aff_combination_elt, aff_combination_scale,
	aff_combination_add_elt, aff_combination_add,
	tree_to_aff_combination, add_elt_to_tree, aff_combination_to_tree,
	fold_affine_sum): New functions.
	(get_computation_at): Use fold_affine_sum.
	* tree-ssa-loop-manip.c (create_iv): Handle ivs with nonconstant step.
	* tree-ssa-loop-niter.c (number_of_iterations_exit): Add argument
	to simple_iv call.

Index: tree-scalar-evolution.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.21
diff -c -3 -p -r2.21 tree-scalar-evolution.c
*** tree-scalar-evolution.c	21 Apr 2005 08:48:53 -0000	2.21
--- tree-scalar-evolution.c	22 Apr 2005 21:23:04 -0000
*************** scev_reset (void)
*** 2547,2556 ****
  }
  
  /* Checks whether OP behaves as a simple affine iv of LOOP in STMT and returns
!    its BASE and STEP if possible.  */
  
  bool
! simple_iv (struct loop *loop, tree stmt, tree op, tree *base, tree *step)
  {
    basic_block bb = bb_for_stmt (stmt);
    tree type, ev;
--- 2547,2559 ----
  }
  
  /* Checks whether OP behaves as a simple affine iv of LOOP in STMT and returns
!    its BASE and STEP if possible.  If ALLOW_NONCONSTANT_STEP is true, we
!    only require STEP to be invariant in LOOP.  Otherwise we require it to
!    be an integer constant.  */
  
  bool
! simple_iv (struct loop *loop, tree stmt, tree op, tree *base, tree *step,
! 	   bool allow_nonconstant_step)
  {
    basic_block bb = bb_for_stmt (stmt);
    tree type, ev;
*************** simple_iv (struct loop *loop, tree stmt,
*** 2579,2586 ****
      return false;
  
    *step = CHREC_RIGHT (ev);
!   if (TREE_CODE (*step) != INTEGER_CST)
      return false;
    *base = CHREC_LEFT (ev);
    if (tree_contains_chrecs (*base, NULL)
        || chrec_contains_symbols_defined_in_loop (*base, loop->num))
--- 2582,2596 ----
      return false;
  
    *step = CHREC_RIGHT (ev);
!   if (allow_nonconstant_step)
!     {
!       if (tree_contains_chrecs (*step, NULL)
! 	  || chrec_contains_symbols_defined_in_loop (*step, loop->num))
! 	return false;
!     }
!   else if (TREE_CODE (*step) != INTEGER_CST)
      return false;
+ 
    *base = CHREC_LEFT (ev);
    if (tree_contains_chrecs (*base, NULL)
        || chrec_contains_symbols_defined_in_loop (*base, loop->num))
Index: tree-scalar-evolution.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.h,v
retrieving revision 2.3
diff -c -3 -p -r2.3 tree-scalar-evolution.h
*** tree-scalar-evolution.h	17 Nov 2004 22:06:00 -0000	2.3
--- tree-scalar-evolution.h	22 Apr 2005 21:23:04 -0000
*************** extern tree analyze_scalar_evolution (st
*** 32,37 ****
  extern tree instantiate_parameters (struct loop *, tree);
  extern void gather_stats_on_scev_database (void);
  extern void scev_analysis (void);
! extern bool simple_iv (struct loop *, tree, tree, tree *, tree *);
  
  #endif  /* GCC_TREE_SCALAR_EVOLUTION_H  */
--- 32,37 ----
  extern tree instantiate_parameters (struct loop *, tree);
  extern void gather_stats_on_scev_database (void);
  extern void scev_analysis (void);
! extern bool simple_iv (struct loop *, tree, tree, tree *, tree *, bool);
  
  #endif  /* GCC_TREE_SCALAR_EVOLUTION_H  */
Index: tree-ssa-loop-ivopts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
retrieving revision 2.60
diff -c -3 -p -r2.60 tree-ssa-loop-ivopts.c
*** tree-ssa-loop-ivopts.c	17 Apr 2005 06:41:53 -0000	2.60
--- tree-ssa-loop-ivopts.c	22 Apr 2005 21:23:04 -0000
*************** struct iv_cand
*** 188,193 ****
--- 188,195 ----
  			   to replace the final value of an iv by direct
  			   computation of the value.  */
    unsigned cost;	/* Cost of the candidate.  */
+   bitmap depends_on;	/* The list of invariants that are used in step of the
+ 			   biv.  */
  };
  
  /* The data used by the induction variable optimizations.  */
*************** dump_cand (FILE *file, struct iv_cand *c
*** 481,486 ****
--- 483,494 ----
    fprintf (file, "candidate %d%s\n",
  	   cand->id, cand->important ? " (important)" : "");
  
+   if (cand->depends_on)
+     {
+       fprintf (file, "  depends on ");
+       dump_bitmap (file, cand->depends_on);
+     }
+ 
    if (!iv)
      {
        fprintf (file, "  final value replacement\n");
*************** get_iv (struct ivopts_data *data, tree v
*** 853,875 ****
    return name_info (data, var)->iv;
  }
  
! /* Determines the step of a biv defined in PHI.  */
  
  static tree
  determine_biv_step (tree phi)
  {
    struct loop *loop = bb_for_stmt (phi)->loop_father;
    tree name = PHI_RESULT (phi), base, step;
-   tree type = TREE_TYPE (name);
  
    if (!is_gimple_reg (name))
      return NULL_TREE;
  
!   if (!simple_iv (loop, phi, name, &base, &step))
      return NULL_TREE;
  
!   if (!step)
!     return build_int_cst (type, 0);
  
    return step;
  }
--- 861,883 ----
    return name_info (data, var)->iv;
  }
  
! /* Determines the step of a biv defined in PHI.  Returns NULL if PHI does
!    not define a simple affine biv with nonzero step.  */
  
  static tree
  determine_biv_step (tree phi)
  {
    struct loop *loop = bb_for_stmt (phi)->loop_father;
    tree name = PHI_RESULT (phi), base, step;
  
    if (!is_gimple_reg (name))
      return NULL_TREE;
  
!   if (!simple_iv (loop, phi, name, &base, &step, true))
      return NULL_TREE;
  
!   if (zero_p (step))
!     return NULL_TREE;
  
    return step;
  }
*************** idx_contains_abnormal_ssa_name_p (tree b
*** 912,920 ****
  static bool
  contains_abnormal_ssa_name_p (tree expr)
  {
!   enum tree_code code = TREE_CODE (expr);
!   enum tree_code_class class = TREE_CODE_CLASS (code);
!     
    if (code == SSA_NAME)
      return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
  
--- 920,934 ----
  static bool
  contains_abnormal_ssa_name_p (tree expr)
  {
!   enum tree_code code;
!   enum tree_code_class class;
! 
!   if (!expr)
!     return false;
! 
!   code = TREE_CODE (expr);
!   class = TREE_CODE_CLASS (code);
! 
    if (code == SSA_NAME)
      return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
  
*************** find_bivs (struct ivopts_data *data)
*** 963,987 ****
  	continue;
  
        step = determine_biv_step (phi);
- 
        if (!step)
  	continue;
-       if (cst_and_fits_in_hwi (step)
- 	  && int_cst_value (step) == 0)
- 	continue;
  
        base = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
!       if (contains_abnormal_ssa_name_p (base))
  	continue;
  
        type = TREE_TYPE (PHI_RESULT (phi));
        base = fold_convert (type, base);
!       step = fold_convert (type, step);
! 
!       /* FIXME: We do not handle induction variables whose step does
! 	 not satisfy cst_and_fits_in_hwi.  */
!       if (!cst_and_fits_in_hwi (step))
! 	continue;
  
        set_iv (data, PHI_RESULT (phi), base, step);
        found = true;
--- 977,994 ----
  	continue;
  
        step = determine_biv_step (phi);
        if (!step)
  	continue;
  
        base = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
!       if (contains_abnormal_ssa_name_p (base)
! 	  || contains_abnormal_ssa_name_p (step))
  	continue;
  
        type = TREE_TYPE (PHI_RESULT (phi));
        base = fold_convert (type, base);
!       if (step)
! 	step = fold_convert (type, step);
  
        set_iv (data, PHI_RESULT (phi), base, step);
        found = true;
*************** find_givs_in_stmt_scev (struct ivopts_da
*** 1042,1057 ****
    if (TREE_CODE (lhs) != SSA_NAME)
      return false;
  
!   if (!simple_iv (loop, stmt, TREE_OPERAND (stmt, 1), base, step))
!     return false;
! 
!   /* FIXME: We do not handle induction variables whose step does
!      not satisfy cst_and_fits_in_hwi.  */
!   if (!zero_p (*step)
!       && !cst_and_fits_in_hwi (*step))
      return false;
  
!   if (contains_abnormal_ssa_name_p (*base))
      return false;
  
    return true;
--- 1049,1059 ----
    if (TREE_CODE (lhs) != SSA_NAME)
      return false;
  
!   if (!simple_iv (loop, stmt, TREE_OPERAND (stmt, 1), base, step, true))
      return false;
  
!   if (contains_abnormal_ssa_name_p (*base)
!       || contains_abnormal_ssa_name_p (*step))
      return false;
  
    return true;
*************** idx_find_step (tree base, tree *idx, voi
*** 1440,1452 ****
        return false;
      }
  
!   step = fold_binary_to_constant (MULT_EXPR, type, step, iv_step);
  
    if (!*dta->step_p)
      *dta->step_p = step;
    else
!     *dta->step_p = fold_binary_to_constant (PLUS_EXPR, type,
! 					    *dta->step_p, step);
  
    return true;
  }
--- 1442,1453 ----
        return false;
      }
  
!   step = fold_build2 (MULT_EXPR, type, step, iv_step);
  
    if (!*dta->step_p)
      *dta->step_p = step;
    else
!     *dta->step_p = fold_build2 (PLUS_EXPR, type, *dta->step_p, step);
  
    return true;
  }
*************** may_be_unaligned_p (tree ref)
*** 1498,1503 ****
--- 1499,1523 ----
    return false;
  }
  
+ /* Builds ADDR_EXPR of object OBJ.  If OBJ is an INDIRECT_REF, the indirect_ref
+    is stripped instead.  */
+ 
+ static tree
+ build_addr_strip_iref (tree obj)
+ {
+   tree type;
+ 
+   if (TREE_CODE (obj) == INDIRECT_REF)
+     {
+       type = build_pointer_type (TREE_TYPE (obj));
+       obj = fold_convert (type, TREE_OPERAND (obj, 0));
+     }
+   else
+     obj = build_addr (obj);
+ 
+   return obj;
+ }
+ 
  /* Finds addresses in *OP_P inside STMT.  */
  
  static void
*************** find_interesting_uses_address (struct iv
*** 1527,1536 ****
    gcc_assert (TREE_CODE (base) != ALIGN_INDIRECT_REF);
    gcc_assert (TREE_CODE (base) != MISALIGNED_INDIRECT_REF);
  
!   if (TREE_CODE (base) == INDIRECT_REF)
!     base = TREE_OPERAND (base, 0);
!   else
!     base = build_addr (base);
  
    civ = alloc_iv (base, step);
    record_use (data, op_p, civ, stmt, USE_ADDRESS);
--- 1547,1553 ----
    gcc_assert (TREE_CODE (base) != ALIGN_INDIRECT_REF);
    gcc_assert (TREE_CODE (base) != MISALIGNED_INDIRECT_REF);
  
!   base = build_addr_strip_iref (base);
  
    civ = alloc_iv (base, step);
    record_use (data, op_p, civ, stmt, USE_ADDRESS);
*************** find_interesting_uses (struct ivopts_dat
*** 1742,1759 ****
  }
  
  /* Strips constant offsets from EXPR and stores them to OFFSET.  If INSIDE_ADDR
!    is true, assume we are inside an address.  */
  
  static tree
! strip_offset (tree expr, bool inside_addr, unsigned HOST_WIDE_INT *offset)
  {
!   tree op0 = NULL_TREE, op1 = NULL_TREE, step;
    enum tree_code code;
    tree type, orig_type = TREE_TYPE (expr);
    unsigned HOST_WIDE_INT off0, off1, st;
    tree orig_expr = expr;
  
    STRIP_NOPS (expr);
    type = TREE_TYPE (expr);
    code = TREE_CODE (expr);
    *offset = 0;
--- 1759,1779 ----
  }
  
  /* Strips constant offsets from EXPR and stores them to OFFSET.  If INSIDE_ADDR
!    is true, assume we are inside an address.  If TOP_COMPREF is true, assume
!    we are at the top-level of the processed address.  */
  
  static tree
! strip_offset_1 (tree expr, bool inside_addr, bool top_compref,
! 		unsigned HOST_WIDE_INT *offset)
  {
!   tree op0 = NULL_TREE, op1 = NULL_TREE, tmp, step;
    enum tree_code code;
    tree type, orig_type = TREE_TYPE (expr);
    unsigned HOST_WIDE_INT off0, off1, st;
    tree orig_expr = expr;
  
    STRIP_NOPS (expr);
+ 
    type = TREE_TYPE (expr);
    code = TREE_CODE (expr);
    *offset = 0;
*************** strip_offset (tree expr, bool inside_add
*** 1773,1780 ****
        op0 = TREE_OPERAND (expr, 0);
        op1 = TREE_OPERAND (expr, 1);
  
!       op0 = strip_offset (op0, false, &off0);
!       op1 = strip_offset (op1, false, &off1);
  
        *offset = (code == PLUS_EXPR ? off0 + off1 : off0 - off1);
        if (op0 == TREE_OPERAND (expr, 0)
--- 1793,1800 ----
        op0 = TREE_OPERAND (expr, 0);
        op1 = TREE_OPERAND (expr, 1);
  
!       op0 = strip_offset_1 (op0, false, false, &off0);
!       op1 = strip_offset_1 (op1, false, false, &off1);
  
        *offset = (code == PLUS_EXPR ? off0 + off1 : off0 - off1);
        if (op0 == TREE_OPERAND (expr, 0)
*************** strip_offset (tree expr, bool inside_add
*** 1788,1797 ****
  	  if (code == PLUS_EXPR)
  	    expr = op1;
  	  else
! 	    expr = build1 (NEGATE_EXPR, type, op1);
  	}
        else
! 	expr = build2 (code, type, op0, op1);
  
        return fold_convert (orig_type, expr);
  
--- 1808,1817 ----
  	  if (code == PLUS_EXPR)
  	    expr = op1;
  	  else
! 	    expr = fold_build1 (NEGATE_EXPR, type, op1);
  	}
        else
! 	expr = fold_build2 (code, type, op0, op1);
  
        return fold_convert (orig_type, expr);
  
*************** strip_offset (tree expr, bool inside_add
*** 1805,1821 ****
  
        st = int_cst_value (step);
        op1 = TREE_OPERAND (expr, 1);
!       op1 = strip_offset (op1, false, &off1);
        *offset = off1 * st;
        break;
  
      case COMPONENT_REF:
        if (!inside_addr)
  	return orig_expr;
        break;
  
      case ADDR_EXPR:
!       inside_addr = true;
        break;
  
      default:
--- 1825,1873 ----
  
        st = int_cst_value (step);
        op1 = TREE_OPERAND (expr, 1);
!       op1 = strip_offset_1 (op1, false, false, &off1);
        *offset = off1 * st;
+ 
+       if (top_compref
+ 	  && zero_p (op1))
+ 	{
+ 	  /* Strip the component reference completely.  */
+ 	  op0 = TREE_OPERAND (expr, 0);
+ 	  op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
+ 	  *offset += off0;
+ 	  return op0;
+ 	}
        break;
  
      case COMPONENT_REF:
        if (!inside_addr)
  	return orig_expr;
+ 
+       tmp = component_ref_field_offset (expr);
+       if (top_compref
+ 	  && cst_and_fits_in_hwi (tmp))
+ 	{
+ 	  /* Strip the component reference completely.  */
+ 	  op0 = TREE_OPERAND (expr, 0);
+ 	  op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
+ 	  *offset = off0 + int_cst_value (tmp);
+ 	  return op0;
+ 	}
        break;
  
      case ADDR_EXPR:
!       op0 = TREE_OPERAND (expr, 0);
!       op0 = strip_offset_1 (op0, true, true, &off0);
!       *offset += off0;
! 
!       if (op0 == TREE_OPERAND (expr, 0))
! 	return orig_expr;
! 
!       expr = build_addr_strip_iref (op0);
!       return fold_convert (orig_type, expr);
! 
!     case INDIRECT_REF:
!       inside_addr = false;
        break;
  
      default:
*************** strip_offset (tree expr, bool inside_add
*** 1825,1831 ****
    /* Default handling of expressions for that we want to recurse into
       the first operand.  */
    op0 = TREE_OPERAND (expr, 0);
!   op0 = strip_offset (op0, inside_addr, &off0);
    *offset += off0;
  
    if (op0 == TREE_OPERAND (expr, 0)
--- 1877,1883 ----
    /* Default handling of expressions for that we want to recurse into
       the first operand.  */
    op0 = TREE_OPERAND (expr, 0);
!   op0 = strip_offset_1 (op0, inside_addr, false, &off0);
    *offset += off0;
  
    if (op0 == TREE_OPERAND (expr, 0)
*************** strip_offset (tree expr, bool inside_add
*** 1837,1843 ****
    if (op1)
      TREE_OPERAND (expr, 1) = op1;
  
!   return fold_convert (orig_type, expr);
  }
  
  /* Returns variant of TYPE that can be used as base for different uses.
--- 1889,1908 ----
    if (op1)
      TREE_OPERAND (expr, 1) = op1;
  
!   /* Inside address, we might strip the top level component references,
!      thus changing type of the expresion.  Handling of ADDR_EXPR
!      will fix that.  */
!   expr = fold_convert (orig_type, expr);
! 
!   return expr;
! }
! 
! /* Strips constant offsets from EXPR and stores them to OFFSET.  */
! 
! static tree
! strip_offset (tree expr, unsigned HOST_WIDE_INT *offset)
! {
!   return strip_offset_1 (expr, false, false, offset);
  }
  
  /* Returns variant of TYPE that can be used as base for different uses.
*************** generic_type_for (tree type)
*** 1856,1861 ****
--- 1921,1950 ----
    return unsigned_type_for (type);
  }
  
+ /* Records invariants in *EXPR_P.  Callback for walk_tree.  DATA contains
+    the bitmap in which we should store them.  */
+ 
+ static struct ivopts_data *fd_ivopts_data;
+ static tree
+ find_depends (tree *expr_p, int *ws ATTRIBUTE_UNUSED, void *data)
+ {
+   bitmap *depends_on = data;
+   struct version_info *info;
+ 
+   if (TREE_CODE (*expr_p) != SSA_NAME)
+     return NULL_TREE;
+   info = name_info (fd_ivopts_data, *expr_p);
+ 
+   if (!info->inv_id || info->has_nonlin_use)
+     return NULL_TREE;
+ 
+   if (!*depends_on)
+     *depends_on = BITMAP_ALLOC (NULL);
+   bitmap_set_bit (*depends_on, info->inv_id);
+ 
+   return NULL_TREE;
+ }
+ 
  /* Adds a candidate BASE + STEP * i.  Important field is set to IMPORTANT and
     position to POS.  If USE is not NULL, the candidate is set as related to
     it.  If both BASE and STEP are NULL, we add a pseudocandidate for the
*************** add_candidate_1 (struct ivopts_data *dat
*** 1938,1943 ****
--- 2027,2039 ----
        cand->incremented_at = incremented_at;
        VARRAY_PUSH_GENERIC_PTR_NOGC (data->iv_candidates, cand);
  
+       if (step
+ 	  && TREE_CODE (step) != INTEGER_CST)
+ 	{
+ 	  fd_ivopts_data = data;
+ 	  walk_tree (&step, find_depends, &cand->depends_on, NULL);
+ 	}
+ 
        if (dump_file && (dump_flags & TDF_DETAILS))
  	dump_cand (dump_file, cand);
      }
*************** static void
*** 2073,2122 ****
  add_iv_value_candidates (struct ivopts_data *data,
  			 struct iv *iv, struct iv_use *use)
  {
-   add_candidate (data, iv->base, iv->step, false, use);
- 
-   /* The same, but with initial value zero.  */
-   add_candidate (data, build_int_cst (TREE_TYPE (iv->base), 0),
- 		 iv->step, false, use);
- }
- 
- /* Adds candidates based on the address IV and USE.  */
- 
- static void
- add_address_candidates (struct ivopts_data *data,
- 			struct iv *iv, struct iv_use *use)
- {
-   tree base, abase;
    unsigned HOST_WIDE_INT offset;
  
!   /* First, the trivial choices.  */
!   add_iv_value_candidates (data, iv, use);
! 
!   /* Second, try removing the COMPONENT_REFs.  */
!   if (TREE_CODE (iv->base) == ADDR_EXPR)
!     {
!       base = TREE_OPERAND (iv->base, 0);
!       while (TREE_CODE (base) == COMPONENT_REF
! 	     || (TREE_CODE (base) == ARRAY_REF
! 		 && TREE_CODE (TREE_OPERAND (base, 1)) == INTEGER_CST))
! 	base = TREE_OPERAND (base, 0);
! 
!       if (base != TREE_OPERAND (iv->base, 0))
! 	{ 
! 	  gcc_assert (TREE_CODE (base) != ALIGN_INDIRECT_REF);
! 	  gcc_assert (TREE_CODE (base) != MISALIGNED_INDIRECT_REF);
  
! 	  if (TREE_CODE (base) == INDIRECT_REF)
! 	    base = TREE_OPERAND (base, 0);
! 	  else
! 	    base = build_addr (base);
! 	  add_candidate (data, base, iv->step, false, use);
! 	}
!     }
  
    /* Third, try removing the constant offset.  */
!   abase = iv->base;
!   base = strip_offset (abase, false, &offset);
    if (offset)
      add_candidate (data, base, iv->step, false, use);
  }
--- 2169,2187 ----
  add_iv_value_candidates (struct ivopts_data *data,
  			 struct iv *iv, struct iv_use *use)
  {
    unsigned HOST_WIDE_INT offset;
+   tree base;
  
!   add_candidate (data, iv->base, iv->step, false, use);
  
!   /* The same, but with initial value zero.  Make such a variable important,
!      since it is generic enough that possibly many uses may be based
!      on it.  */
!   add_candidate (data, build_int_cst (TREE_TYPE (iv->base), 0),
! 		 iv->step, true, use);
  
    /* Third, try removing the constant offset.  */
!   base = strip_offset (iv->base, &offset);
    if (offset)
      add_candidate (data, base, iv->step, false, use);
  }
*************** add_derived_ivs_candidates (struct ivopt
*** 2156,2161 ****
--- 2221,2227 ----
  	{
  	case USE_NONLINEAR_EXPR:
  	case USE_COMPARE:
+ 	case USE_ADDRESS:
  	  /* Just add the ivs based on the value of the iv used here.  */
  	  add_iv_value_candidates (data, use->iv, use);
  	  break;
*************** add_derived_ivs_candidates (struct ivopt
*** 2168,2177 ****
  	  add_iv_outer_candidates (data, use);
  	  break;
  
- 	case USE_ADDRESS:
- 	  add_address_candidates (data, use->iv, use);
- 	  break;
- 
  	default:
  	  gcc_unreachable ();
  	}
--- 2234,2239 ----
*************** var_at_stmt (struct loop *loop, struct i
*** 2475,2480 ****
--- 2537,2988 ----
      return cand->var_before;
  }
  
+ /* Return the most significant (sign) bit of T.  Similar to tree_int_cst_msb,
+    but the bit is determined from TYPE_PRECISION, not MODE_BITSIZE.  */
+ 
+ static int
+ tree_int_cst_sign_bit (tree t)
+ {
+   unsigned bitno = TYPE_PRECISION (TREE_TYPE (t)) - 1;
+   unsigned HOST_WIDE_INT w;
+ 
+   if (bitno < HOST_BITS_PER_WIDE_INT)
+     w = TREE_INT_CST_LOW (t);
+   else
+     {
+       w = TREE_INT_CST_HIGH (t);
+       bitno -= HOST_BITS_PER_WIDE_INT;
+     }
+ 
+   return (w >> bitno) & 1;
+ }
+ 
+ /* If we can prove that TOP = cst * BOT for some constant cst in TYPE,
+    return cst.  Otherwise return NULL_TREE.  */
+ 
+ static tree
+ constant_multiple_of (tree type, tree top, tree bot)
+ {
+   tree res, mby, p0, p1;
+   enum tree_code code;
+   bool negate;
+ 
+   STRIP_NOPS (top);
+   STRIP_NOPS (bot);
+ 
+   if (operand_equal_p (top, bot, 0))
+     return build_int_cst (type, 1);
+ 
+   code = TREE_CODE (top);
+   switch (code)
+     {
+     case MULT_EXPR:
+       mby = TREE_OPERAND (top, 1);
+       if (TREE_CODE (mby) != INTEGER_CST)
+ 	return NULL_TREE;
+ 
+       res = constant_multiple_of (type, TREE_OPERAND (top, 0), bot);
+       if (!res)
+ 	return NULL_TREE;
+ 
+       return fold_binary_to_constant (MULT_EXPR, type, res,
+ 				      fold_convert (type, mby));
+ 
+     case PLUS_EXPR:
+     case MINUS_EXPR:
+       p0 = constant_multiple_of (type, TREE_OPERAND (top, 0), bot);
+       if (!p0)
+ 	return NULL_TREE;
+       p1 = constant_multiple_of (type, TREE_OPERAND (top, 1), bot);
+       if (!p1)
+ 	return NULL_TREE;
+ 
+       return fold_binary_to_constant (code, type, p0, p1);
+ 
+     case INTEGER_CST:
+       if (TREE_CODE (bot) != INTEGER_CST)
+ 	return NULL_TREE;
+ 
+       bot = fold_convert (type, bot);
+       top = fold_convert (type, top);
+ 
+       /* If BOT seems to be negative, try dividing by -BOT instead, and negate
+ 	 the result afterwards.  */
+       if (tree_int_cst_sign_bit (bot))
+ 	{
+ 	  negate = true;
+ 	  bot = fold_unary_to_constant (NEGATE_EXPR, type, bot);
+ 	}
+       else
+ 	negate = false;
+ 
+       /* Ditto for TOP.  */
+       if (tree_int_cst_sign_bit (top))
+ 	{
+ 	  negate = !negate;
+ 	  top = fold_unary_to_constant (NEGATE_EXPR, type, top);
+ 	}
+ 
+       if (!zero_p (fold_binary_to_constant (TRUNC_MOD_EXPR, type, top, bot)))
+ 	return NULL_TREE;
+ 
+       res = fold_binary_to_constant (EXACT_DIV_EXPR, type, top, bot);
+       if (negate)
+ 	res = fold_unary_to_constant (NEGATE_EXPR, type, res);
+       return res;
+ 
+     default:
+       return NULL_TREE;
+     }
+ }
+ 
+ /* Affine combination of trees.  We keep track of at most MAX_AFF_ELTS elements
+    to make things simpler; this is sufficient in most cases.  */
+ 
+ #define MAX_AFF_ELTS 8
+ 
+ struct affine_tree_combination
+ {
+   /* Type of the result of the combination.  */
+   tree type;
+ 
+   /* Mask modulo that the operations are performed.  */
+   unsigned HOST_WIDE_INT mask;
+ 
+   /* Constant offset.  */
+   unsigned HOST_WIDE_INT offset;
+ 
+   /* Number of elements of the combination.  */
+   unsigned n;
+ 
+   /* Elements and their coefficients.  */
+   tree elts[MAX_AFF_ELTS];
+   unsigned HOST_WIDE_INT coefs[MAX_AFF_ELTS];
+ 
+   /* Remainder of the expression.  */
+   tree rest;
+ };
+ 
+ /* Sets COMB to CST.  */
+ 
+ static void
+ aff_combination_const (struct affine_tree_combination *comb, tree type,
+ 		       unsigned HOST_WIDE_INT cst)
+ {
+   comb->type = type;
+   comb->mask = 2;
+   comb->mask <<= TYPE_PRECISION (type) - 1;
+   comb->mask--;
+ 
+   comb->n = 0;
+   comb->rest = NULL_TREE;
+   comb->offset = cst & comb->mask;
+ }
+ 
+ /* Sets COMB to single element ELT.  */
+ 
+ static void
+ aff_combination_elt (struct affine_tree_combination *comb, tree type, tree elt)
+ {
+   comb->type = type;
+   comb->mask = 2;
+   comb->mask <<= TYPE_PRECISION (type) - 1;
+   comb->mask--;
+ 
+   comb->n = 1;
+   comb->elts[0] = elt;
+   comb->coefs[0] = 1;
+   comb->rest = NULL_TREE;
+   comb->offset = 0;
+ }
+ 
+ /* Scales COMB by SCALE.  */
+ 
+ static void
+ aff_combination_scale (struct affine_tree_combination *comb,
+ 		       unsigned HOST_WIDE_INT scale)
+ {
+   unsigned i, j;
+ 
+   if (scale == 1)
+     return;
+ 
+   if (scale == 0)
+     {
+       aff_combination_const (comb, comb->type, 0);
+       return;
+     }
+ 
+   comb->offset = (scale * comb->offset) & comb->mask;
+   for (i = 0, j = 0; i < comb->n; i++)
+     {
+       comb->coefs[j] = (scale * comb->coefs[i]) & comb->mask;
+       comb->elts[j] = comb->elts[i];
+       if (comb->coefs[j] != 0)
+ 	j++;
+     }
+   comb->n = j;
+ 
+   if (comb->rest)
+     {
+       if (comb->n < MAX_AFF_ELTS)
+ 	{
+ 	  comb->coefs[comb->n] = scale;
+ 	  comb->elts[comb->n] = comb->rest;
+ 	  comb->rest = NULL_TREE;
+ 	  comb->n++;
+ 	}
+       else
+ 	comb->rest = fold_build2 (MULT_EXPR, comb->type, comb->rest,
+ 				  build_int_cst_type (comb->type, scale));
+     }
+ }
+ 
+ /* Adds ELT * SCALE to COMB.  */
+ 
+ static void
+ aff_combination_add_elt (struct affine_tree_combination *comb, tree elt,
+ 			 unsigned HOST_WIDE_INT scale)
+ {
+   unsigned i;
+ 
+   if (scale == 0)
+     return;
+ 
+   for (i = 0; i < comb->n; i++)
+     if (operand_equal_p (comb->elts[i], elt, 0))
+       {
+ 	comb->coefs[i] = (comb->coefs[i] + scale) & comb->mask;
+ 	if (comb->coefs[i])
+ 	  return;
+ 
+ 	comb->n--;
+ 	comb->coefs[i] = comb->coefs[comb->n];
+ 	comb->elts[i] = comb->elts[comb->n];
+ 	return;
+       }
+   if (comb->n < MAX_AFF_ELTS)
+     {
+       comb->coefs[comb->n] = scale;
+       comb->elts[comb->n] = elt;
+       comb->n++;
+       return;
+     }
+ 
+   if (scale == 1)
+     elt = fold_convert (comb->type, elt);
+   else
+     elt = fold_build2 (MULT_EXPR, comb->type,
+ 		       fold_convert (comb->type, elt),
+ 		       build_int_cst_type (comb->type, scale)); 
+ 
+   if (comb->rest)
+     comb->rest = fold_build2 (PLUS_EXPR, comb->type, comb->rest, elt);
+   else
+     comb->rest = elt;
+ }
+ 
+ /* Adds COMB2 to COMB1.  */
+ 
+ static void
+ aff_combination_add (struct affine_tree_combination *comb1,
+ 		     struct affine_tree_combination *comb2)
+ {
+   unsigned i;
+ 
+   comb1->offset = (comb1->offset + comb2->offset) & comb1->mask;
+   for (i = 0; i < comb2-> n; i++)
+     aff_combination_add_elt (comb1, comb2->elts[i], comb2->coefs[i]);
+   if (comb2->rest)
+     aff_combination_add_elt (comb1, comb2->rest, 1);
+ }
+ 
+ /* Splits EXPR into an affine combination of parts.  */
+ 
+ static void
+ tree_to_aff_combination (tree expr, tree type,
+ 			 struct affine_tree_combination *comb)
+ {
+   struct affine_tree_combination tmp;
+   enum tree_code code;
+   tree cst, core, toffset;
+   HOST_WIDE_INT bitpos, bitsize;
+   enum machine_mode mode;
+   int unsignedp, volatilep;
+ 
+   STRIP_NOPS (expr);
+ 
+   code = TREE_CODE (expr);
+   switch (code)
+     {
+     case INTEGER_CST:
+       aff_combination_const (comb, type, int_cst_value (expr));
+       return;
+ 
+     case PLUS_EXPR:
+     case MINUS_EXPR:
+       tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+       tree_to_aff_combination (TREE_OPERAND (expr, 1), type, &tmp);
+       if (code == MINUS_EXPR)
+ 	aff_combination_scale (&tmp, -1);
+       aff_combination_add (comb, &tmp);
+       return;
+ 
+     case MULT_EXPR:
+       cst = TREE_OPERAND (expr, 1);
+       if (TREE_CODE (cst) != INTEGER_CST)
+ 	break;
+       tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+       aff_combination_scale (comb, int_cst_value (cst));
+       return;
+ 
+     case NEGATE_EXPR:
+       tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+       aff_combination_scale (comb, -1);
+       return;
+ 
+     case ADDR_EXPR:
+       core = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize, &bitpos,
+ 				  &toffset, &mode, &unsignedp, &volatilep,
+ 				  false);
+       if (bitpos % BITS_PER_UNIT != 0)
+ 	break;
+       aff_combination_const (comb, type, bitpos / BITS_PER_UNIT);
+       core = build_addr_strip_iref (core);
+       if (TREE_CODE (core) == ADDR_EXPR)
+ 	aff_combination_add_elt (comb, core, 1);
+       else
+ 	{
+ 	  tree_to_aff_combination (core, type, &tmp);
+ 	  aff_combination_add (comb, &tmp);
+ 	}
+       if (toffset)
+ 	{
+ 	  tree_to_aff_combination (toffset, type, &tmp);
+ 	  aff_combination_add (comb, &tmp);
+ 	}
+       return;
+ 
+     default:
+       break;
+     }
+ 
+   aff_combination_elt (comb, type, expr);
+ }
+ 
+ /* Creates EXPR + ELT * SCALE in TYPE.  MASK is the mask for width of TYPE.  */
+ 
+ static tree
+ add_elt_to_tree (tree expr, tree type, tree elt, unsigned HOST_WIDE_INT scale,
+ 		 unsigned HOST_WIDE_INT mask)
+ {
+   enum tree_code code;
+ 
+   scale &= mask;
+   elt = fold_convert (type, elt);
+ 
+   if (scale == 1)
+     {
+       if (!expr)
+ 	return elt;
+ 
+       return fold_build2 (PLUS_EXPR, type, expr, elt);
+     }
+ 
+   if (scale == mask)
+     {
+       if (!expr)
+ 	return fold_build1 (NEGATE_EXPR, type, elt);
+ 
+       return fold_build2 (MINUS_EXPR, type, expr, elt);
+     }
+ 
+   if (!expr)
+     return fold_build2 (MULT_EXPR, type, elt,
+ 			build_int_cst_type (type, scale));
+ 
+   if ((scale | (mask >> 1)) == mask)
+     {
+       /* Scale is negative.  */
+       code = MINUS_EXPR;
+       scale = (-scale) & mask;
+     }
+   else
+     code = PLUS_EXPR;
+ 
+   elt = fold_build2 (MULT_EXPR, type, elt,
+ 		     build_int_cst_type (type, scale));
+   return fold_build2 (code, type, expr, elt);
+ }
+ 
+ /* Makes tree from the affine combination COMB.  */
+ 
+ static tree
+ aff_combination_to_tree (struct affine_tree_combination *comb)
+ {
+   tree type = comb->type;
+   tree expr = comb->rest;
+   unsigned i;
+   unsigned HOST_WIDE_INT off, sgn;
+ 
+   gcc_assert (comb->n == MAX_AFF_ELTS || comb->rest == NULL_TREE);
+ 
+   for (i = 0; i < comb->n; i++)
+     expr = add_elt_to_tree (expr, type, comb->elts[i], comb->coefs[i],
+ 			    comb->mask);
+ 
+   if ((comb->offset | (comb->mask >> 1)) == comb->mask)
+     {
+       /* Offset is negative.  */
+       off = (-comb->offset) & comb->mask;
+       sgn = comb->mask;
+     }
+   else
+     {
+       off = comb->offset;
+       sgn = 1;
+     }
+   return add_elt_to_tree (expr, type, build_int_cst_type (type, off), sgn,
+ 			  comb->mask);
+ }
+ 
+ /* Folds X + RATIO * Y in TYPE.  */
+ 
+ static tree
+ fold_affine_sum (tree type, tree x, tree y, HOST_WIDE_INT ratio)
+ {
+   enum tree_code code;
+   tree cst;
+   struct affine_tree_combination cx, cy;
+ 
+   if (TYPE_PRECISION (type) > HOST_BITS_PER_WIDE_INT)
+     {
+       if (ratio == 1)
+ 	return fold_build2 (PLUS_EXPR, type, x, y);
+       if (ratio == -1)
+ 	return fold_build2 (MINUS_EXPR, type, x, y);
+ 
+       if (ratio < 0)
+ 	{
+ 	  code = MINUS_EXPR;
+ 	  ratio = -ratio;
+ 	}
+       else
+ 	code = PLUS_EXPR;
+ 
+       cst = build_int_cst_type (type, ratio);
+       y = fold_build2 (MULT_EXPR, type, y, cst);
+       return fold_build2 (code, type, x, y);
+     }
+ 
+   tree_to_aff_combination (x, type, &cx);
+   tree_to_aff_combination (y, type, &cy);
+   aff_combination_scale (&cy, ratio);
+   aff_combination_add (&cx, &cy);
+ 
+   return aff_combination_to_tree (&cx);
+ }
+ 
  /* Determines the expression by that USE is expressed from induction variable
     CAND at statement AT in LOOP.  */
  
*************** get_computation_at (struct loop *loop,
*** 2523,2541 ****
        cstep = fold_convert (uutype, cstep);
      }
  
!   if (!cst_and_fits_in_hwi (cstep)
!       || !cst_and_fits_in_hwi (ustep))
!     return NULL_TREE;
  
!   ustepi = int_cst_value (ustep);
!   cstepi = int_cst_value (cstep);
  
!   if (!divide (TYPE_PRECISION (uutype), ustepi, cstepi, &ratioi))
      {
!       /* TODO maybe consider case when ustep divides cstep and the ratio is
! 	 a power of 2 (so that the division is fast to execute)?  We would
! 	 need to be much more careful with overflows etc. then.  */
!       return NULL_TREE;
      }
  
    /* We may need to shift the value if we are after the increment.  */
--- 3031,3071 ----
        cstep = fold_convert (uutype, cstep);
      }
  
!   if (cst_and_fits_in_hwi (cstep)
!       && cst_and_fits_in_hwi (ustep))
!     {
!       ustepi = int_cst_value (ustep);
!       cstepi = int_cst_value (cstep);
  
!       if (!divide (TYPE_PRECISION (uutype), ustepi, cstepi, &ratioi))
! 	{
! 	  /* TODO maybe consider case when ustep divides cstep and the ratio is
! 	     a power of 2 (so that the division is fast to execute)?  We would
! 	     need to be much more careful with overflows etc. then.  */
! 	  return NULL_TREE;
! 	}
  
!       ratio = build_int_cst_type (uutype, ratioi);
!     }
!   else
      {
!       ratio = constant_multiple_of (uutype, ustep, cstep);
!       if (!ratio)
! 	return NULL_TREE;
! 
!       /* Ratioi is only used to detect special cases when the multiplicative
! 	 factor is 1 or -1, so if we cannot convert ratio to HOST_WIDE_INT,
! 	 we may set it to 0.  We prefer cst_and_fits_in_hwi/int_cst_value
! 	 to integer_onep/integer_all_onesp, since the former ignores
! 	 TREE_OVERFLOW.  */
!       if (cst_and_fits_in_hwi (ratio))
! 	ratioi = int_cst_value (ratio);
!       else if (integer_onep (ratio))
! 	ratioi = 1;
!       else if (integer_all_onesp (ratio))
! 	ratioi = -1;
!       else
! 	ratioi = 0;
      }
  
    /* We may need to shift the value if we are after the increment.  */
*************** get_computation_at (struct loop *loop,
*** 2552,2572 ****
  
    if (ratioi == 1)
      {
!       delta = fold (build2 (MINUS_EXPR, uutype, ubase, cbase));
!       expr = fold (build2 (PLUS_EXPR, uutype, expr, delta));
      }
    else if (ratioi == -1)
      {
!       delta = fold (build2 (PLUS_EXPR, uutype, ubase, cbase));
!       expr = fold (build2 (MINUS_EXPR, uutype, delta, expr));
      }
    else
      {
!       ratio = build_int_cst_type (uutype, ratioi);
!       delta = fold (build2 (MULT_EXPR, uutype, ratio, cbase));
!       delta = fold (build2 (MINUS_EXPR, uutype, ubase, delta));
!       expr = fold (build2 (MULT_EXPR, uutype, ratio, expr));
!       expr = fold (build2 (PLUS_EXPR, uutype, delta, expr));
      }
  
    return fold_convert (utype, expr);
--- 3082,3106 ----
  
    if (ratioi == 1)
      {
!       delta = fold_affine_sum (uutype, ubase, cbase, -1);
!       expr = fold_build2 (PLUS_EXPR, uutype, expr, delta);
      }
    else if (ratioi == -1)
      {
!       delta = fold_affine_sum (uutype, ubase, cbase, 1);
!       expr = fold_build2 (MINUS_EXPR, uutype, delta, expr);
      }
    else
      {
!       if (ratioi)
! 	delta = fold_affine_sum (uutype, ubase, cbase, -ratioi);
!       else
! 	{
! 	  delta = fold_build2 (MULT_EXPR, uutype, ratio, cbase);
! 	  delta = fold_affine_sum (uutype, ubase, delta, -1);
! 	}
!       expr = fold_build2 (MULT_EXPR, uutype, ratio, expr);
!       expr = fold_build2 (PLUS_EXPR, uutype, delta, expr);
      }
  
    return fold_convert (utype, expr);
*************** get_address_cost (bool symbol_present, b
*** 2830,2860 ****
  
    return cost + acost;
  }
- 
- /* Records invariants in *EXPR_P.  Callback for walk_tree.  DATA contains
-    the bitmap to that we should store it.  */
- 
- static struct ivopts_data *fd_ivopts_data;
- static tree
- find_depends (tree *expr_p, int *ws ATTRIBUTE_UNUSED, void *data)
- {
-   bitmap *depends_on = data;
-   struct version_info *info;
- 
-   if (TREE_CODE (*expr_p) != SSA_NAME)
-     return NULL_TREE;
-   info = name_info (fd_ivopts_data, *expr_p);
- 
-   if (!info->inv_id || info->has_nonlin_use)
-     return NULL_TREE;
- 
-   if (!*depends_on)
-     *depends_on = BITMAP_ALLOC (NULL);
-   bitmap_set_bit (*depends_on, info->inv_id);
- 
-   return NULL_TREE;
- }
- 
  /* Estimates cost of forcing EXPR into a variable.  DEPENDS_ON is a set of the
     invariants the computation depends on.  */
  
--- 3364,3369 ----
*************** difference_cost (struct ivopts_data *dat
*** 3088,3095 ****
    enum machine_mode mode = TYPE_MODE (TREE_TYPE (e1));
    unsigned HOST_WIDE_INT off1, off2;
  
!   e1 = strip_offset (e1, false, &off1);
!   e2 = strip_offset (e2, false, &off2);
    *offset += off1 - off2;
  
    STRIP_NOPS (e1);
--- 3597,3604 ----
    enum machine_mode mode = TYPE_MODE (TREE_TYPE (e1));
    unsigned HOST_WIDE_INT off1, off2;
  
!   e1 = strip_offset (e1, &off1);
!   e2 = strip_offset (e2, &off2);
    *offset += off1 - off2;
  
    STRIP_NOPS (e1);
*************** get_computation_cost_at (struct ivopts_d
*** 3172,3200 ****
  	return INFTY;
      }
  
-   if (!cst_and_fits_in_hwi (ustep)
-       || !cst_and_fits_in_hwi (cstep))
-     return INFTY;
- 
-   if (TREE_CODE (ubase) == INTEGER_CST
-       && !cst_and_fits_in_hwi (ubase))
-     goto fallback;
- 
-   if (TREE_CODE (cbase) == INTEGER_CST
-       && !cst_and_fits_in_hwi (cbase))
-     goto fallback;
-     
-   ustepi = int_cst_value (ustep);
-   cstepi = int_cst_value (cstep);
- 
    if (TYPE_PRECISION (utype) != TYPE_PRECISION (ctype))
      {
        /* TODO -- add direct handling of this case.  */
        goto fallback;
      }
  
!   if (!divide (TYPE_PRECISION (utype), ustepi, cstepi, &ratio))
!     return INFTY;
  
    /* use = ubase + ratio * (var - cbase).  If either cbase is a constant
       or ratio == 1, it is better to handle this like
--- 3681,3729 ----
  	return INFTY;
      }
  
    if (TYPE_PRECISION (utype) != TYPE_PRECISION (ctype))
      {
        /* TODO -- add direct handling of this case.  */
        goto fallback;
      }
  
!   /* CSTEPI is removed from the offset in case the statement is after the
!      increment.  If the step is not constant, we use zero instead.
!      This is a bit imprecise (there is the extra addition), but
!      redundancy elimination is likely to transform the code so that
!      it uses the value of the variable before the increment anyway,
!      so it is not that unrealistic.  */
!   if (cst_and_fits_in_hwi (cstep))
!     cstepi = int_cst_value (cstep);
!   else
!     cstepi = 0;
! 
!   if (cst_and_fits_in_hwi (ustep)
!       && cst_and_fits_in_hwi (cstep))
!     {
!       ustepi = int_cst_value (ustep);
! 
!       if (!divide (TYPE_PRECISION (utype), ustepi, cstepi, &ratio))
! 	return INFTY;
!     }
!   else
!     {
!       tree rat;
!       
!       rat = constant_multiple_of (utype, ustep, cstep);
!     
!       if (!rat)
! 	return INFTY;
! 
!       if (cst_and_fits_in_hwi (rat))
! 	ratio = int_cst_value (rat);
!       else if (integer_onep (rat))
! 	ratio = 1;
!       else if (integer_all_onesp (rat))
! 	ratio = -1;
!       else
! 	return INFTY;
!     }
  
    /* use = ubase + ratio * (var - cbase).  If either cbase is a constant
       or ratio == 1, it is better to handle this like
*************** get_computation_cost_at (struct ivopts_d
*** 3203,3209 ****
       
       (also holds in the case ratio == -1, TODO.  */
  
!   if (TREE_CODE (cbase) == INTEGER_CST)
      {
        offset = - ratio * int_cst_value (cbase); 
        cost += difference_cost (data,
--- 3732,3738 ----
       
       (also holds in the case ratio == -1, TODO.  */
  
!   if (cst_and_fits_in_hwi (cbase))
      {
        offset = - ratio * int_cst_value (cbase); 
        cost += difference_cost (data,
*************** may_eliminate_iv (struct ivopts_data *da
*** 3398,3403 ****
--- 3927,3935 ----
    tree wider_type, period, per_type;
    struct loop *loop = data->current_loop;
    
+   if (TREE_CODE (cand->iv->step) != INTEGER_CST)
+     return false;
+ 
    /* For now works only for exits that dominate the loop latch.  TODO -- extend
       for other conditions inside loop body.  */
    ex_bb = bb_for_stmt (use->stmt);
*************** iv_ca_recount_cost (struct ivopts_data *
*** 3867,3882 ****
    ivs->cost = cost;
  }
  
  /* Set USE not to be expressed by any candidate in IVS.  */
  
  static void
  iv_ca_set_no_cp (struct ivopts_data *data, struct iv_ca *ivs,
  		 struct iv_use *use)
  {
!   unsigned uid = use->id, cid, iid;
!   bitmap deps;
    struct cost_pair *cp;
-   bitmap_iterator bi;
  
    cp = ivs->cand_for_use[uid];
    if (!cp)
--- 4399,4431 ----
    ivs->cost = cost;
  }
  
+ /* Remove invariants in set INVS from set IVS.  */
+ 
+ static void
+ iv_ca_set_remove_invariants (struct iv_ca *ivs, bitmap invs)
+ {
+   bitmap_iterator bi;
+   unsigned iid;
+ 
+   if (!invs)
+     return;
+ 
+   EXECUTE_IF_SET_IN_BITMAP (invs, 0, iid, bi)
+     {
+       ivs->n_invariant_uses[iid]--;
+       if (ivs->n_invariant_uses[iid] == 0)
+ 	ivs->n_regs--;
+     }
+ }
+ 
  /* Set USE not to be expressed by any candidate in IVS.  */
  
  static void
  iv_ca_set_no_cp (struct ivopts_data *data, struct iv_ca *ivs,
  		 struct iv_use *use)
  {
!   unsigned uid = use->id, cid;
    struct cost_pair *cp;
  
    cp = ivs->cand_for_use[uid];
    if (!cp)
*************** iv_ca_set_no_cp (struct ivopts_data *dat
*** 3895,3917 ****
  	ivs->n_regs--;
        ivs->n_cands--;
        ivs->cand_cost -= cp->cand->cost;
      }
  
    ivs->cand_use_cost -= cp->cost;
  
!   deps = cp->depends_on;
  
!   if (deps)
      {
!       EXECUTE_IF_SET_IN_BITMAP (deps, 0, iid, bi)
! 	{
! 	  ivs->n_invariant_uses[iid]--;
! 	  if (ivs->n_invariant_uses[iid] == 0)
! 	    ivs->n_regs--;
! 	}
      }
- 
-   iv_ca_recount_cost (data, ivs);
  }
  
  /* Set cost pair for USE in set IVS to CP.  */
--- 4444,4476 ----
  	ivs->n_regs--;
        ivs->n_cands--;
        ivs->cand_cost -= cp->cand->cost;
+ 
+       iv_ca_set_remove_invariants (ivs, cp->cand->depends_on);
      }
  
    ivs->cand_use_cost -= cp->cost;
  
!   iv_ca_set_remove_invariants (ivs, cp->depends_on);
!   iv_ca_recount_cost (data, ivs);
! }
! 
! /* Add invariants in set INVS to set IVS.  */
! 
! static void
! iv_ca_set_add_invariants (struct iv_ca *ivs, bitmap invs)
! {
!   bitmap_iterator bi;
!   unsigned iid;
  
!   if (!invs)
!     return;
! 
!   EXECUTE_IF_SET_IN_BITMAP (invs, 0, iid, bi)
      {
!       ivs->n_invariant_uses[iid]++;
!       if (ivs->n_invariant_uses[iid] == 1)
! 	ivs->n_regs++;
      }
  }
  
  /* Set cost pair for USE in set IVS to CP.  */
*************** static void
*** 3920,3928 ****
  iv_ca_set_cp (struct ivopts_data *data, struct iv_ca *ivs,
  	      struct iv_use *use, struct cost_pair *cp)
  {
!   unsigned uid = use->id, cid, iid;
!   bitmap deps;
!   bitmap_iterator bi;
  
    if (ivs->cand_for_use[uid] == cp)
      return;
--- 4479,4485 ----
  iv_ca_set_cp (struct ivopts_data *data, struct iv_ca *ivs,
  	      struct iv_use *use, struct cost_pair *cp)
  {
!   unsigned uid = use->id, cid;
  
    if (ivs->cand_for_use[uid] == cp)
      return;
*************** iv_ca_set_cp (struct ivopts_data *data, 
*** 3945,3966 ****
  	    ivs->n_regs++;
  	  ivs->n_cands++;
  	  ivs->cand_cost += cp->cand->cost;
- 	}
- 
-       ivs->cand_use_cost += cp->cost;
- 
-       deps = cp->depends_on;
  
!       if (deps)
! 	{
! 	  EXECUTE_IF_SET_IN_BITMAP (deps, 0, iid, bi)
! 	    {
! 	      ivs->n_invariant_uses[iid]++;
! 	      if (ivs->n_invariant_uses[iid] == 1)
! 		ivs->n_regs++;
! 	    }
  	}
  
        iv_ca_recount_cost (data, ivs);
      }
  }
--- 4502,4513 ----
  	    ivs->n_regs++;
  	  ivs->n_cands++;
  	  ivs->cand_cost += cp->cand->cost;
  
! 	  iv_ca_set_add_invariants (ivs, cp->cand->depends_on);
  	}
  
+       ivs->cand_use_cost += cp->cost;
+       iv_ca_set_add_invariants (ivs, cp->depends_on);
        iv_ca_recount_cost (data, ivs);
      }
  }
*************** create_new_iv (struct ivopts_data *data,
*** 4643,4649 ****
  
    base = unshare_expr (cand->iv->base);
  
!   create_iv (base, cand->iv->step, cand->var_before, data->current_loop,
  	     &incr_pos, after, &cand->var_before, &cand->var_after);
  }
  
--- 5190,5197 ----
  
    base = unshare_expr (cand->iv->base);
  
!   create_iv (base, unshare_expr (cand->iv->step),
! 	     cand->var_before, data->current_loop,
  	     &incr_pos, after, &cand->var_before, &cand->var_after);
  }
  
*************** free_loop_data (struct ivopts_data *data
*** 5245,5250 ****
--- 5793,5800 ----
  
        if (cand->iv)
  	free (cand->iv);
+       if (cand->depends_on)
+ 	BITMAP_FREE (cand->depends_on);
        free (cand);
      }
    VARRAY_POP_ALL (data->iv_candidates);
Index: tree-ssa-loop-manip.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-manip.c,v
retrieving revision 2.30
diff -c -3 -p -r2.30 tree-ssa-loop-manip.c
*** tree-ssa-loop-manip.c	17 Apr 2005 06:41:54 -0000	2.30
--- tree-ssa-loop-manip.c	22 Apr 2005 21:23:04 -0000
*************** create_iv (tree base, tree step, tree va
*** 54,59 ****
--- 54,60 ----
    tree stmt, initial, step1, stmts;
    tree vb, va;
    enum tree_code incr_op = PLUS_EXPR;
+   edge pe = loop_preheader_edge (loop);
  
    if (!var)
      {
*************** create_iv (tree base, tree step, tree va
*** 92,97 ****
--- 93,104 ----
  	}
      }
  
+   /* Gimplify the step if necessary.  We put the computations in front of the
+      loop (i.e. the step should be loop invariant).  */
+   step = force_gimple_operand (step, &stmts, true, var);
+   if (stmts)
+     bsi_insert_on_edge_immediate_loop (pe, stmts);
+ 
    stmt = build2 (MODIFY_EXPR, void_type_node, va,
  		 build2 (incr_op, TREE_TYPE (base),
  			 vb, step));
*************** create_iv (tree base, tree step, tree va
*** 103,113 ****
  
    initial = force_gimple_operand (base, &stmts, true, var);
    if (stmts)
!     {
!       edge pe = loop_preheader_edge (loop);
! 
!       bsi_insert_on_edge_immediate_loop (pe, stmts);
!     }
  
    stmt = create_phi_node (vb, loop->header);
    SSA_NAME_DEF_STMT (vb) = stmt;
--- 110,116 ----
  
    initial = force_gimple_operand (base, &stmts, true, var);
    if (stmts)
!     bsi_insert_on_edge_immediate_loop (pe, stmts);
  
    stmt = create_phi_node (vb, loop->header);
    SSA_NAME_DEF_STMT (vb) = stmt;
Index: tree-ssa-loop-niter.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-niter.c,v
retrieving revision 2.23
diff -c -3 -p -r2.23 tree-ssa-loop-niter.c
*** tree-ssa-loop-niter.c	17 Apr 2005 06:41:55 -0000	2.23
--- tree-ssa-loop-niter.c	22 Apr 2005 21:23:04 -0000
*************** number_of_iterations_exit (struct loop *
*** 881,889 ****
        && !POINTER_TYPE_P (type))
      return false;
       
!   if (!simple_iv (loop, stmt, op0, &base0, &step0))
      return false;
!   if (!simple_iv (loop, stmt, op1, &base1, &step1))
      return false;
  
    niter->niter = NULL_TREE;
--- 881,889 ----
        && !POINTER_TYPE_P (type))
      return false;
       
!   if (!simple_iv (loop, stmt, op0, &base0, &step0, false))
      return false;
!   if (!simple_iv (loop, stmt, op1, &base1, &step1, false))
      return false;
  
    niter->niter = NULL_TREE;
Index: tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.c,v
retrieving revision 1.474
diff -c -3 -p -r1.474 tree.c
*** tree.c	22 Apr 2005 10:56:57 -0000	1.474
--- tree.c	22 Apr 2005 21:23:04 -0000
*************** tree
*** 518,524 ****
  build_int_cst_type (tree type, HOST_WIDE_INT low)
  {
    unsigned HOST_WIDE_INT val = (unsigned HOST_WIDE_INT) low;
!   unsigned HOST_WIDE_INT hi;
    unsigned bits;
    bool signed_p;
    bool negative;
--- 518,524 ----
  build_int_cst_type (tree type, HOST_WIDE_INT low)
  {
    unsigned HOST_WIDE_INT val = (unsigned HOST_WIDE_INT) low;
!   unsigned HOST_WIDE_INT hi, mask;
    unsigned bits;
    bool signed_p;
    bool negative;
*************** build_int_cst_type (tree type, HOST_WIDE
*** 538,547 ****
        negative = ((val >> (bits - 1)) & 1) != 0;
  
        /* Mask out the bits outside of the precision of the constant.  */
        if (signed_p && negative)
! 	val = val | ((~(unsigned HOST_WIDE_INT) 0) << bits);
        else
! 	val = val & ~((~(unsigned HOST_WIDE_INT) 0) << bits);
      }
  
    /* Determine the high bits.  */
--- 538,549 ----
        negative = ((val >> (bits - 1)) & 1) != 0;
  
        /* Mask out the bits outside of the precision of the constant.  */
+       mask = (((unsigned HOST_WIDE_INT) 2) << (bits - 1)) - 1;
+ 
        if (signed_p && negative)
! 	val |= ~mask;
        else
! 	val &= mask;
      }
  
    /* Determine the high bits.  */
*************** build_int_cst_type (tree type, HOST_WIDE
*** 556,562 ****
        else
  	{
  	  bits -= HOST_BITS_PER_WIDE_INT;
! 	  hi = hi & ~((~(unsigned HOST_WIDE_INT) 0) << bits);
  	}
      }
  
--- 558,565 ----
        else
  	{
  	  bits -= HOST_BITS_PER_WIDE_INT;
! 	  mask = (((unsigned HOST_WIDE_INT) 2) << (bits - 1)) - 1;
! 	  hi &= mask;
  	}
      }
  

