This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[patch] Improvements to work with affine combinations (from killloop-branch)
- From: Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
- To: gcc-patches at gcc dot gnu dot org
- Date: Wed, 30 Nov 2005 12:35:43 +0100
- Subject: [patch] Improvements to work with affine combinations (from killloop-branch)
Hello,
this patch improves the representation of affine combinations, most
importantly, the coefficients are no longer limited to HOST_WIDE_INTs.
This enables some significant cleanups in ivopts -- the cases when
coefficients do not fit into HOST_WIDE_INT do no longer need to be
handled specially.
This in turn enables us to get rid of computation_cost function that
was used to determine cost of a computation by expanding it to RTL in
these rare cases, and the related hack -- decl_rtl_to_reset -- which
was used to make us forget the effects of the expansion.
Bootstrapped & regtested on i686, ia64 and x86_64.
Zdenek
* tree-affine.c: New file.
* tree-affine.h: New file.
* tree-ssa-loop-niter.c (zero_p, nonzero_p): Moved to ...
* tree.c (zero_p, nonzero_p): ... here.
(signed_type_for): Generate a signed integer type for pointers.
* tree.h (nonzero_p, fixed_address_object_p): Declare.
* tree-ssa-loop-ivopts.c: Include tree-affine.h.
(AVG_LOOP_NITER): Use expected_loop_iterations.
(struct iv_use, struct iv_cand): Add base_aff field.
(decl_rtl_to_reset): Removed.
(single_dom_exit): Make the argument const.
(niter_for_exit): Expand simple operations in number of iterations.
(tree_ssa_iv_optimize_init, free_loop_data, tree_ssa_iv_optimize_finalize):
Do not handle decl_rtl_to_reset.
(determine_base_object): Handle MULT_EXPR.
(record_use, add_candidate_1): Initialize base_aff.
(extract_cond_operands, before_loop_cost): New functions.
(find_interesting_uses_cond): Use extract_cond_operands.
(add_old_iv_candidates): Use add_iv_value_candidates.
(add_iv_value_candidates): Add all_important parameter.
(add_derived_ivs_candidates): Pass false as all_important to
add_iv_value_candidates.
(set_use_iv_cost): Simplify.
(set_use_iv_cost_infinity, set_use_iv_cost_zero,
set_use_iv_cost_generic, set_use_iv_cost_outer,
set_use_iv_cost_compare): New functions.
(produce_memory_decl_rtl, prepare_decl_rtl, computation_cost): Removed.
(aff_combination_const, aff_combination_elt, aff_combination_scale,
aff_combination_add_elt, aff_combination_add, tree_to_aff_combination,
add_elt_to_tree, unshare_aff_combination, aff_combination_to_tree):
Moved to tree-affine.c.
(get_computation_aff): Do not perform any computations in trees.
(get_computation_at): Changed due to get_computation_aff changes.
(get_address_cost): Handle arbitrary offsets.
(force_expr_to_var_cost): Do not use computation_cost. Handle
conversions.
(convert_cost, integer_cost, fixed_address_cost): New functions.
(split_address_cost, ptr_difference_cost, difference_cost): Removed.
(aff_combination_cost, aff_combination_cost_address): New functions.
(get_computation_cost_at): Use aff_combination_cost functions.
(compare_cost): New function.
(determine_use_iv_cost_generic, determine_use_iv_cost_address,
determine_use_iv_cost_condition, determine_use_iv_cost_outer):
Use new set_use_iv_cost functions.
(determine_iv_cost): Use before_loop_cost.
(rewrite_use_address): Changed due to get_computation_aff changes.
* tree-ssa-address.c: Include tree-affine.h.
(fixed_address_object_p): Export.
(add_to_parts): Do not take coef argument.
(most_expensive_mult_to_index, addr_to_parts): Work with double_ints.
* tree-flow.h (single_dom_exit): Make parameter const.
(MAX_AFF_ELTS, struct affine_tree_combination): Moved to tree-affine.h.
* Makefile.in (tree-affine.o): Add.
(tree-ssa-loop-ivopts.o, tree-ssa-address.o): Add tree-affine.h
dependency.
* hwint.h (double_int): Define.
Index: tree-ssa-loop-niter.c
===================================================================
*** tree-ssa-loop-niter.c (revision 107712)
--- tree-ssa-loop-niter.c (working copy)
*************** Software Foundation, 51 Franklin Street,
*** 52,87 ****
*/
- /* Returns true if ARG is either NULL_TREE or constant zero. Unlike
- integer_zerop, it does not care about overflow flags. */
-
- bool
- zero_p (tree arg)
- {
- if (!arg)
- return true;
-
- if (TREE_CODE (arg) != INTEGER_CST)
- return false;
-
- return (TREE_INT_CST_LOW (arg) == 0 && TREE_INT_CST_HIGH (arg) == 0);
- }
-
- /* Returns true if ARG a nonzero constant. Unlike integer_nonzerop, it does
- not care about overflow flags. */
-
- static bool
- nonzero_p (tree arg)
- {
- if (!arg)
- return false;
-
- if (TREE_CODE (arg) != INTEGER_CST)
- return false;
-
- return (TREE_INT_CST_LOW (arg) != 0 || TREE_INT_CST_HIGH (arg) != 0);
- }
-
/* Returns inverse of X modulo 2^s, where MASK = 2^s-1. */
static tree
--- 52,57 ----
Index: tree.c
===================================================================
*** tree.c (revision 107712)
--- tree.c (working copy)
*************** integer_zerop (tree expr)
*** 1156,1161 ****
--- 1156,1191 ----
&& integer_zerop (TREE_IMAGPART (expr))));
}
+ /* Returns true if ARG is either NULL_TREE or constant zero. Unlike
+ integer_zerop, it does not care about overflow flags. */
+
+ bool
+ zero_p (tree arg)
+ {
+ if (!arg)
+ return true;
+
+ if (TREE_CODE (arg) != INTEGER_CST)
+ return false;
+
+ return (TREE_INT_CST_LOW (arg) == 0 && TREE_INT_CST_HIGH (arg) == 0);
+ }
+
+ /* Returns true if ARG a nonzero constant. Unlike integer_nonzerop, it does
+ not care about overflow flags. */
+
+ bool
+ nonzero_p (tree arg)
+ {
+ if (!arg)
+ return false;
+
+ if (TREE_CODE (arg) != INTEGER_CST)
+ return false;
+
+ return (TREE_INT_CST_LOW (arg) != 0 || TREE_INT_CST_HIGH (arg) != 0);
+ }
+
/* Return 1 if EXPR is the integer constant one or the corresponding
complex constant. */
*************** unsigned_type_for (tree type)
*** 6817,6822 ****
--- 6847,6855 ----
tree
signed_type_for (tree type)
{
+ if (POINTER_TYPE_P (type))
+ return lang_hooks.types.type_for_size (TYPE_PRECISION (type), 0);
+
return lang_hooks.types.signed_type (type);
}
Index: tree.h
===================================================================
*** tree.h (revision 107712)
--- tree.h (working copy)
*************** extern int integer_pow2p (tree);
*** 3600,3605 ****
--- 3600,3606 ----
extern int integer_nonzerop (tree);
extern bool zero_p (tree);
+ extern bool nonzero_p (tree);
extern bool cst_and_fits_in_hwi (tree);
extern tree num_ending_zeros (tree);
*************** extern int tree_map_eq (const void *, co
*** 4202,4207 ****
--- 4203,4209 ----
/* In tree-ssa-address.c. */
extern tree tree_mem_ref_addr (tree, tree);
extern void copy_mem_ref_info (tree, tree);
+ extern bool fixed_address_object_p (tree);
/* In tree-object-size.c. */
extern void init_object_sizes (void);
Index: tree-ssa-loop-ivopts.c
===================================================================
*** tree-ssa-loop-ivopts.c (revision 107712)
--- tree-ssa-loop-ivopts.c (working copy)
*************** Software Foundation, 51 Franklin Street,
*** 89,102 ****
#include "cfgloop.h"
#include "params.h"
#include "langhooks.h"
/* The infinite cost. */
#define INFTY 10000000
! /* The expected number of loop iterations. TODO -- use profiling instead of
! this. */
! #define AVG_LOOP_NITER(LOOP) 5
!
/* Representation of the induction variable. */
struct iv
--- 89,103 ----
#include "cfgloop.h"
#include "params.h"
#include "langhooks.h"
+ #include "tree-affine.h"
/* The infinite cost. */
#define INFTY 10000000
! /* The expected number of loop iterations. The plus one is because
! expected_loop_iterations returns number of executions of latch
! edge, not of loop body. */
! #define AVG_LOOP_NITER(LOOP) (expected_loop_iterations (LOOP) + 1)
/* Representation of the induction variable. */
struct iv
*************** struct iv_use
*** 154,159 ****
--- 155,162 ----
unsigned id; /* The id of the use. */
enum use_type type; /* Type of the use. */
struct iv *iv; /* The induction variable it is based on. */
+ aff_tree base_aff; /* Base of the induction variable, as an affine
+ combination. */
tree stmt; /* Statement in that it occurs. */
tree *op_p; /* The place where it occurs. */
bitmap related_cands; /* The set of "related" iv candidates, plus the common
*************** struct iv_cand
*** 190,195 ****
--- 193,200 ----
"pseudocandidate" used to indicate the possibility
to replace the final value of an iv by direct
computation of the value. */
+ aff_tree base_aff; /* Base of the induction variable, as an affine
+ combination. */
unsigned cost; /* Cost of the candidate. */
bitmap depends_on; /* The list of invariants that are used in step of the
biv. */
*************** struct iv_ca_delta
*** 311,321 ****
#define ALWAYS_PRUNE_CAND_SET_BOUND \
((unsigned) PARAM_VALUE (PARAM_IV_ALWAYS_PRUNE_CAND_SET_BOUND))
- /* The list of trees for that the decl_rtl field must be reset is stored
- here. */
-
- static VEC(tree,heap) *decl_rtl_to_reset;
-
/* Number of uses recorded in DATA. */
static inline unsigned
--- 316,321 ----
*************** loop_data (struct loop *loop)
*** 356,365 ****
return loop->aux;
}
/* The single loop exit if it dominates the latch, NULL otherwise. */
edge
! single_dom_exit (struct loop *loop)
{
edge exit = loop->single_exit;
--- 356,376 ----
return loop->aux;
}
+ /* Computes cost accounted for computations whose cost is COST that are
+ performed before the LOOP. */
+
+ static unsigned
+ before_loop_cost (struct loop *loop, unsigned cost)
+ {
+ unsigned niter = AVG_LOOP_NITER (loop);
+
+ return (cost + niter - 1) / niter;
+ }
+
/* The single loop exit if it dominates the latch, NULL otherwise. */
edge
! single_dom_exit (const struct loop *loop)
{
edge exit = loop->single_exit;
*************** niter_for_exit (struct ivopts_data *data
*** 715,720 ****
--- 726,735 ----
exit, &nfe_desc->niter,
true);
*slot = nfe_desc;
+
+ if (nfe_desc->valid_p)
+ nfe_desc->niter.niter
+ = expand_simple_operations (nfe_desc->niter.niter);
}
else
nfe_desc = *slot;
*************** tree_ssa_iv_optimize_init (struct loops
*** 762,768 ****
data->iv_uses = VEC_alloc (iv_use_p, heap, 20);
data->iv_candidates = VEC_alloc (iv_cand_p, heap, 20);
- decl_rtl_to_reset = VEC_alloc (tree, heap, 20);
}
/* Returns a memory object to that EXPR points. In case we are able to
--- 777,782 ----
*************** determine_base_object (tree expr)
*** 780,785 ****
--- 794,800 ----
switch (code)
{
case INTEGER_CST:
+ case MULT_EXPR:
return NULL_TREE;
case ADDR_EXPR:
*************** record_use (struct ivopts_data *data, tr
*** 1168,1173 ****
--- 1183,1189 ----
use->id = n_iv_uses (data);
use->type = use_type;
use->iv = iv;
+ tree_to_aff_combination (iv->base, TREE_TYPE (iv->base), &use->base_aff);
use->stmt = stmt;
use->op_p = use_p;
use->related_cands = BITMAP_ALLOC (NULL);
*************** find_interesting_uses_outer (struct ivop
*** 1279,1343 ****
return find_interesting_uses_outer_or_nonlin (data, op, USE_OUTER);
}
/* Checks whether the condition *COND_P in STMT is interesting
and if so, records it. */
static void
find_interesting_uses_cond (struct ivopts_data *data, tree stmt, tree *cond_p)
{
! tree *op0_p;
! tree *op1_p;
! struct iv *iv0 = NULL, *iv1 = NULL, *civ;
! struct iv const_iv;
! tree zero = integer_zero_node;
!
! const_iv.step = NULL_TREE;
!
! if (TREE_CODE (*cond_p) != SSA_NAME
! && !COMPARISON_CLASS_P (*cond_p))
! return;
! if (TREE_CODE (*cond_p) == SSA_NAME)
{
! op0_p = cond_p;
! op1_p = &zero;
! }
! else
! {
! op0_p = &TREE_OPERAND (*cond_p, 0);
! op1_p = &TREE_OPERAND (*cond_p, 1);
! }
!
! if (TREE_CODE (*op0_p) == SSA_NAME)
! iv0 = get_iv (data, *op0_p);
! else
! iv0 = &const_iv;
!
! if (TREE_CODE (*op1_p) == SSA_NAME)
! iv1 = get_iv (data, *op1_p);
! else
! iv1 = &const_iv;
!
! if (/* When comparing with non-invariant value, we may not do any senseful
! induction variable elimination. */
! (!iv0 || !iv1)
! /* Eliminating condition based on two ivs would be nontrivial.
! ??? TODO -- it is not really important to handle this case. */
! || (!zero_p (iv0->step) && !zero_p (iv1->step)))
! {
! find_interesting_uses_op (data, *op0_p);
! find_interesting_uses_op (data, *op1_p);
! return;
! }
!
! if (zero_p (iv0->step) && zero_p (iv1->step))
! {
! /* If both are invariants, this is a work for unswitching. */
return;
}
civ = xmalloc (sizeof (struct iv));
! *civ = zero_p (iv0->step) ? *iv1: *iv0;
record_use (data, cond_p, civ, stmt, USE_COMPARE);
}
--- 1295,1388 ----
return find_interesting_uses_outer_or_nonlin (data, op, USE_OUTER);
}
+ /* Given a condition *COND_P, checks whether it is a compare of an induction
+ variable and an invariant. If this is the case, CONTROL_VAR is set
+ to location of the iv, BOUND to the location of the invariant,
+ IV_VAR and IV_BOUND are set to the corresponding induction variable
+ descriptions, and true is returned. If this is not the case,
+ CONTROL_VAR and BOUND are set to the arguments of the condition and
+ false is returned. */
+
+ static bool
+ extract_cond_operands (struct ivopts_data *data, tree *cond_p,
+ tree **control_var, tree **bound,
+ struct iv **iv_var, struct iv **iv_bound)
+ {
+ /* The nodes returned when COND has just one operand. Note that you should
+ not modify anything in BOUND or IV_BOUND because of this. */
+ static struct iv const_iv;
+ static tree zero;
+ tree cond = *cond_p;
+ tree *op0 = &zero, *op1 = &zero, *tmp_op;
+ struct iv *iv0 = &const_iv, *iv1 = &const_iv, *tmp_iv;
+ bool ret = false;
+
+ zero = integer_zero_node;
+ if (TREE_CODE (cond) == SSA_NAME)
+ {
+ op0 = cond_p;
+ iv0 = get_iv (data, cond);
+ ret = (iv0 && !zero_p (iv0->step));
+ goto end;
+ }
+
+ if (!COMPARISON_CLASS_P (cond))
+ {
+ op0 = cond_p;
+ goto end;
+ }
+
+ op0 = &TREE_OPERAND (cond, 0);
+ op1 = &TREE_OPERAND (cond, 1);
+ if (TREE_CODE (*op0) == SSA_NAME)
+ iv0 = get_iv (data, *op0);
+ if (TREE_CODE (*op1) == SSA_NAME)
+ iv1 = get_iv (data, *op1);
+
+ /* Exactly one of the compared values must be an iv, and the other one must
+ be an invariant. */
+ if (!iv0 || !iv1)
+ goto end;
+
+ if (zero_p (iv0->step))
+ {
+ /* Control variable may be on the other side. */
+ tmp_op = op0; op0 = op1; op1 = tmp_op;
+ tmp_iv = iv0; iv0 = iv1; iv1 = tmp_iv;
+ }
+ ret = !zero_p (iv0->step) && zero_p (iv1->step);
+
+ end:
+ if (control_var)
+ *control_var = op0;;
+ if (iv_var)
+ *iv_var = iv0;;
+ if (bound)
+ *bound = op1;
+ if (iv_bound)
+ *iv_bound = iv1;
+
+ return ret;
+ }
+
/* Checks whether the condition *COND_P in STMT is interesting
and if so, records it. */
static void
find_interesting_uses_cond (struct ivopts_data *data, tree stmt, tree *cond_p)
{
! tree *var_p, *bound_p;
! struct iv *var_iv, *civ;
! if (!extract_cond_operands (data, cond_p, &var_p, &bound_p, &var_iv, NULL))
{
! find_interesting_uses_op (data, *var_p);
! find_interesting_uses_op (data, *bound_p);
return;
}
civ = xmalloc (sizeof (struct iv));
! *civ = *var_iv;
record_use (data, cond_p, civ, stmt, USE_COMPARE);
}
*************** add_candidate_1 (struct ivopts_data *dat
*** 1984,1989 ****
--- 2029,2037 ----
unsigned i;
struct iv_cand *cand = NULL;
tree type, orig_type;
+
+ /* Important candidate should not be specific to a single use. */
+ gcc_assert (!use || !important);
if (base)
{
*************** add_candidate_1 (struct ivopts_data *dat
*** 2041,2047 ****
if (!base && !step)
cand->iv = NULL;
else
! cand->iv = alloc_iv (base, step);
cand->pos = pos;
if (pos != IP_ORIGINAL && cand->iv)
--- 2089,2098 ----
if (!base && !step)
cand->iv = NULL;
else
! {
! cand->iv = alloc_iv (base, step);
! tree_to_aff_combination (base, TREE_TYPE (base), &cand->base_aff);
! }
cand->pos = pos;
if (pos != IP_ORIGINAL && cand->iv)
*************** add_standard_iv_candidates (struct ivopt
*** 2142,2147 ****
--- 2193,2236 ----
add_standard_iv_candidates_for_size (data, INT_TYPE_SIZE * 2);
}
+ /* Adds candidates based on the value of the induction variable IV and USE.
+ If ALL_IMPORTANT is true, mark all the candidates as important. */
+
+ static void
+ add_iv_value_candidates (struct ivopts_data *data,
+ struct iv *iv, struct iv_use *use,
+ bool all_important)
+ {
+ unsigned HOST_WIDE_INT offset;
+ tree base;
+ struct tree_niter_desc *niter = niter_for_single_dom_exit (data);
+ tree type = TREE_TYPE (iv->base);
+
+ add_candidate (data, iv->base, iv->step, all_important, use);
+
+ /* The same, but with initial value zero. If the iv is decreasing,
+ try using a variable with final value zero instead. Always make
+ such variable important, since it is generic enough so that possibly
+ many uses may be based on it. */
+ if (TREE_CODE (iv->step) == INTEGER_CST
+ && tree_int_cst_sign_bit (iv->step)
+ && niter)
+ {
+ base = fold_build2 (MULT_EXPR, type,
+ fold_convert (type, niter->niter),
+ fold_build1 (NEGATE_EXPR, type, iv->step));
+ add_candidate (data, base, iv->step, true, NULL);
+ }
+ else
+ add_candidate (data, build_int_cst (type, 0),
+ iv->step, true, NULL);
+
+ /* Third, try removing the constant offset. */
+ base = strip_offset (iv->base, &offset);
+ if (offset)
+ add_candidate (data, base, iv->step, all_important, use);
+ }
+
/* Adds candidates bases on the old induction variable IV. */
*************** add_old_iv_candidates (struct ivopts_dat
*** 2151,2162 ****
tree phi, def;
struct iv_cand *cand;
! add_candidate (data, iv->base, iv->step, true, NULL);
!
! /* The same, but with initial value zero. */
! add_candidate (data,
! build_int_cst (TREE_TYPE (iv->base), 0),
! iv->step, true, NULL);
phi = SSA_NAME_DEF_STMT (iv->ssa_name);
if (TREE_CODE (phi) == PHI_NODE)
--- 2240,2246 ----
tree phi, def;
struct iv_cand *cand;
! add_iv_value_candidates (data, iv, NULL, true);
phi = SSA_NAME_DEF_STMT (iv->ssa_name);
if (TREE_CODE (phi) == PHI_NODE)
*************** add_old_ivs_candidates (struct ivopts_da
*** 2189,2217 ****
}
}
- /* Adds candidates based on the value of the induction variable IV and USE. */
-
- static void
- add_iv_value_candidates (struct ivopts_data *data,
- struct iv *iv, struct iv_use *use)
- {
- unsigned HOST_WIDE_INT offset;
- tree base;
-
- add_candidate (data, iv->base, iv->step, false, use);
-
- /* The same, but with initial value zero. Make such variable important,
- since it is generic enough so that possibly many uses may be based
- on it. */
- add_candidate (data, build_int_cst (TREE_TYPE (iv->base), 0),
- iv->step, true, use);
-
- /* Third, try removing the constant offset. */
- base = strip_offset (iv->base, &offset);
- if (offset)
- add_candidate (data, base, iv->step, false, use);
- }
-
/* Possibly adds pseudocandidate for replacing the final value of USE by
a direct computation. */
--- 2273,2278 ----
*************** add_derived_ivs_candidates (struct ivopt
*** 2249,2259 ****
case USE_COMPARE:
case USE_ADDRESS:
/* Just add the ivs based on the value of the iv used here. */
! add_iv_value_candidates (data, use->iv, use);
break;
case USE_OUTER:
! add_iv_value_candidates (data, use->iv, use);
/* Additionally, add the pseudocandidate for the possibility to
replace the final value by a direct computation. */
--- 2310,2320 ----
case USE_COMPARE:
case USE_ADDRESS:
/* Just add the ivs based on the value of the iv used here. */
! add_iv_value_candidates (data, use->iv, use, false);
break;
case USE_OUTER:
! add_iv_value_candidates (data, use->iv, use, false);
/* Additionally, add the pseudocandidate for the possibility to
replace the final value by a direct computation. */
*************** set_use_iv_cost (struct ivopts_data *dat
*** 2376,2386 ****
if (data->consider_all_candidates)
{
! use->cost_map[cand->id].cand = cand;
! use->cost_map[cand->id].cost = cost;
! use->cost_map[cand->id].depends_on = depends_on;
! use->cost_map[cand->id].value = value;
! return;
}
/* n_map_members is a power of two, so this computes modulo. */
--- 2437,2444 ----
if (data->consider_all_candidates)
{
! i = cand->id;
! goto found;
}
/* n_map_members is a power of two, so this computes modulo. */
*************** found:
*** 2401,2406 ****
--- 2459,2521 ----
use->cost_map[i].value = value;
}
+ /* Sets cost of using candidate CAND for USE to infinity. Empty,
+ just to make code more readable and also for the case that
+ sometimes in future we wanted to do something in this case. */
+
+ static void
+ set_use_iv_cost_infinity (struct ivopts_data *data ATTRIBUTE_UNUSED,
+ struct iv_use *use ATTRIBUTE_UNUSED,
+ struct iv_cand *cand ATTRIBUTE_UNUSED)
+ {
+ }
+
+ /* Sets cost of using candidate CAND for USE to zero. */
+
+ static void
+ set_use_iv_cost_zero (struct ivopts_data *data, struct iv_use *use,
+ struct iv_cand *cand)
+ {
+ set_use_iv_cost (data, use, cand, 0, NULL, NULL_TREE);
+ }
+
+ /* Sets cost of using candidate CAND for USE to COST and record
+ that it depends on invariants in DEPENDS_ON bitmap. */
+
+ static void
+ set_use_iv_cost_generic (struct ivopts_data *data,
+ struct iv_use *use, struct iv_cand *cand,
+ unsigned cost, bitmap depends_on)
+ {
+ set_use_iv_cost (data, use, cand, cost, depends_on, NULL_TREE);
+ }
+
+ /* Sets cost of using candidate CAND for USE (that is of type USE_OUTER) to
+ COST and record that it depends on invariants in DEPENDS_ON bitmap. Record
+ that the final value is VALUE. */
+
+ static void
+ set_use_iv_cost_outer (struct ivopts_data *data,
+ struct iv_use *use, struct iv_cand *cand,
+ unsigned cost, bitmap depends_on, tree value)
+ {
+ gcc_assert (use->type == USE_OUTER);
+ set_use_iv_cost (data, use, cand, cost, depends_on, value);
+ }
+
+ /* Sets cost of using candidate CAND for USE (that is of type USE_COMPARE)
+ to COST and record that it depends on invariants in DEPENDS_ON bitmap.
+ Record that the replacement is to compare with BOUND. */
+
+ static void
+ set_use_iv_cost_compare (struct ivopts_data *data,
+ struct iv_use *use, struct iv_cand *cand,
+ unsigned cost, bitmap depends_on, tree bound)
+ {
+ gcc_assert (use->type == USE_COMPARE);
+ set_use_iv_cost (data, use, cand, cost, depends_on, bound);
+ }
+
/* Gets cost of (USE, CANDIDATE) pair. */
static struct cost_pair *
*************** seq_cost (rtx seq)
*** 2455,2560 ****
return cost;
}
- /* Produce DECL_RTL for object obj so it looks like it is stored in memory. */
- static rtx
- produce_memory_decl_rtl (tree obj, int *regno)
- {
- rtx x;
-
- gcc_assert (obj);
- if (TREE_STATIC (obj) || DECL_EXTERNAL (obj))
- {
- const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (obj));
- x = gen_rtx_SYMBOL_REF (Pmode, name);
- }
- else
- x = gen_raw_REG (Pmode, (*regno)++);
-
- return gen_rtx_MEM (DECL_MODE (obj), x);
- }
-
- /* Prepares decl_rtl for variables referred in *EXPR_P. Callback for
- walk_tree. DATA contains the actual fake register number. */
-
- static tree
- prepare_decl_rtl (tree *expr_p, int *ws, void *data)
- {
- tree obj = NULL_TREE;
- rtx x = NULL_RTX;
- int *regno = data;
-
- switch (TREE_CODE (*expr_p))
- {
- case ADDR_EXPR:
- for (expr_p = &TREE_OPERAND (*expr_p, 0);
- handled_component_p (*expr_p);
- expr_p = &TREE_OPERAND (*expr_p, 0))
- continue;
- obj = *expr_p;
- if (DECL_P (obj))
- x = produce_memory_decl_rtl (obj, regno);
- break;
-
- case SSA_NAME:
- *ws = 0;
- obj = SSA_NAME_VAR (*expr_p);
- if (!DECL_RTL_SET_P (obj))
- x = gen_raw_REG (DECL_MODE (obj), (*regno)++);
- break;
-
- case VAR_DECL:
- case PARM_DECL:
- case RESULT_DECL:
- *ws = 0;
- obj = *expr_p;
-
- if (DECL_RTL_SET_P (obj))
- break;
-
- if (DECL_MODE (obj) == BLKmode)
- x = produce_memory_decl_rtl (obj, regno);
- else
- x = gen_raw_REG (DECL_MODE (obj), (*regno)++);
-
- break;
-
- default:
- break;
- }
-
- if (x)
- {
- VEC_safe_push (tree, heap, decl_rtl_to_reset, obj);
- SET_DECL_RTL (obj, x);
- }
-
- return NULL_TREE;
- }
-
- /* Determines cost of the computation of EXPR. */
-
- static unsigned
- computation_cost (tree expr)
- {
- rtx seq, rslt;
- tree type = TREE_TYPE (expr);
- unsigned cost;
- /* Avoid using hard regs in ways which may be unsupported. */
- int regno = LAST_VIRTUAL_REGISTER + 1;
-
- walk_tree (&expr, prepare_decl_rtl, ®no, NULL);
- start_sequence ();
- rslt = expand_expr (expr, NULL_RTX, TYPE_MODE (type), EXPAND_NORMAL);
- seq = get_insns ();
- end_sequence ();
-
- cost = seq_cost (seq);
- if (MEM_P (rslt))
- cost += address_cost (XEXP (rslt, 0), TYPE_MODE (type));
-
- return cost;
- }
-
/* Returns variable containing the value of candidate CAND at statement AT. */
static tree
--- 2570,2575 ----
*************** constant_multiple_of (tree type, tree to
*** 2647,2988 ****
/* Ditto for TOP. */
if (tree_int_cst_sign_bit (top))
! {
! negate = !negate;
! top = fold_unary_to_constant (NEGATE_EXPR, type, top);
! }
!
! if (!zero_p (fold_binary_to_constant (TRUNC_MOD_EXPR, type, top, bot)))
! return NULL_TREE;
!
! res = fold_binary_to_constant (EXACT_DIV_EXPR, type, top, bot);
! if (negate)
! res = fold_unary_to_constant (NEGATE_EXPR, type, res);
! return res;
!
! default:
! return NULL_TREE;
! }
! }
!
! /* Sets COMB to CST. */
!
! static void
! aff_combination_const (struct affine_tree_combination *comb, tree type,
! unsigned HOST_WIDE_INT cst)
! {
! unsigned prec = TYPE_PRECISION (type);
!
! comb->type = type;
! comb->mask = (((unsigned HOST_WIDE_INT) 2 << (prec - 1)) - 1);
!
! comb->n = 0;
! comb->rest = NULL_TREE;
! comb->offset = cst & comb->mask;
! }
!
! /* Sets COMB to single element ELT. */
!
! static void
! aff_combination_elt (struct affine_tree_combination *comb, tree type, tree elt)
! {
! unsigned prec = TYPE_PRECISION (type);
!
! comb->type = type;
! comb->mask = (((unsigned HOST_WIDE_INT) 2 << (prec - 1)) - 1);
!
! comb->n = 1;
! comb->elts[0] = elt;
! comb->coefs[0] = 1;
! comb->rest = NULL_TREE;
! comb->offset = 0;
! }
!
! /* Scales COMB by SCALE. */
!
! static void
! aff_combination_scale (struct affine_tree_combination *comb,
! unsigned HOST_WIDE_INT scale)
! {
! unsigned i, j;
!
! if (scale == 1)
! return;
!
! if (scale == 0)
! {
! aff_combination_const (comb, comb->type, 0);
! return;
! }
!
! comb->offset = (scale * comb->offset) & comb->mask;
! for (i = 0, j = 0; i < comb->n; i++)
! {
! comb->coefs[j] = (scale * comb->coefs[i]) & comb->mask;
! comb->elts[j] = comb->elts[i];
! if (comb->coefs[j] != 0)
! j++;
! }
! comb->n = j;
!
! if (comb->rest)
! {
! if (comb->n < MAX_AFF_ELTS)
! {
! comb->coefs[comb->n] = scale;
! comb->elts[comb->n] = comb->rest;
! comb->rest = NULL_TREE;
! comb->n++;
! }
! else
! comb->rest = fold_build2 (MULT_EXPR, comb->type, comb->rest,
! build_int_cst_type (comb->type, scale));
! }
! }
!
! /* Adds ELT * SCALE to COMB. */
!
! static void
! aff_combination_add_elt (struct affine_tree_combination *comb, tree elt,
! unsigned HOST_WIDE_INT scale)
! {
! unsigned i;
!
! if (scale == 0)
! return;
!
! for (i = 0; i < comb->n; i++)
! if (operand_equal_p (comb->elts[i], elt, 0))
! {
! comb->coefs[i] = (comb->coefs[i] + scale) & comb->mask;
! if (comb->coefs[i])
! return;
!
! comb->n--;
! comb->coefs[i] = comb->coefs[comb->n];
! comb->elts[i] = comb->elts[comb->n];
!
! if (comb->rest)
! {
! gcc_assert (comb->n == MAX_AFF_ELTS - 1);
! comb->coefs[comb->n] = 1;
! comb->elts[comb->n] = comb->rest;
! comb->rest = NULL_TREE;
! comb->n++;
! }
! return;
! }
! if (comb->n < MAX_AFF_ELTS)
! {
! comb->coefs[comb->n] = scale;
! comb->elts[comb->n] = elt;
! comb->n++;
! return;
! }
!
! if (scale == 1)
! elt = fold_convert (comb->type, elt);
! else
! elt = fold_build2 (MULT_EXPR, comb->type,
! fold_convert (comb->type, elt),
! build_int_cst_type (comb->type, scale));
!
! if (comb->rest)
! comb->rest = fold_build2 (PLUS_EXPR, comb->type, comb->rest, elt);
! else
! comb->rest = elt;
! }
!
! /* Adds COMB2 to COMB1. */
!
! static void
! aff_combination_add (struct affine_tree_combination *comb1,
! struct affine_tree_combination *comb2)
! {
! unsigned i;
!
! comb1->offset = (comb1->offset + comb2->offset) & comb1->mask;
! for (i = 0; i < comb2->n; i++)
! aff_combination_add_elt (comb1, comb2->elts[i], comb2->coefs[i]);
! if (comb2->rest)
! aff_combination_add_elt (comb1, comb2->rest, 1);
! }
!
! /* Splits EXPR into an affine combination of parts. */
!
! static void
! tree_to_aff_combination (tree expr, tree type,
! struct affine_tree_combination *comb)
! {
! struct affine_tree_combination tmp;
! enum tree_code code;
! tree cst, core, toffset;
! HOST_WIDE_INT bitpos, bitsize;
! enum machine_mode mode;
! int unsignedp, volatilep;
!
! STRIP_NOPS (expr);
!
! code = TREE_CODE (expr);
! switch (code)
! {
! case INTEGER_CST:
! aff_combination_const (comb, type, int_cst_value (expr));
! return;
!
! case PLUS_EXPR:
! case MINUS_EXPR:
! tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
! tree_to_aff_combination (TREE_OPERAND (expr, 1), type, &tmp);
! if (code == MINUS_EXPR)
! aff_combination_scale (&tmp, -1);
! aff_combination_add (comb, &tmp);
! return;
!
! case MULT_EXPR:
! cst = TREE_OPERAND (expr, 1);
! if (TREE_CODE (cst) != INTEGER_CST)
! break;
! tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
! aff_combination_scale (comb, int_cst_value (cst));
! return;
!
! case NEGATE_EXPR:
! tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
! aff_combination_scale (comb, -1);
! return;
!
! case ADDR_EXPR:
! core = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize, &bitpos,
! &toffset, &mode, &unsignedp, &volatilep,
! false);
! if (bitpos % BITS_PER_UNIT != 0)
! break;
! aff_combination_const (comb, type, bitpos / BITS_PER_UNIT);
! core = build_fold_addr_expr (core);
! if (TREE_CODE (core) == ADDR_EXPR)
! aff_combination_add_elt (comb, core, 1);
! else
! {
! tree_to_aff_combination (core, type, &tmp);
! aff_combination_add (comb, &tmp);
! }
! if (toffset)
! {
! tree_to_aff_combination (toffset, type, &tmp);
! aff_combination_add (comb, &tmp);
! }
! return;
!
! default:
! break;
! }
!
! aff_combination_elt (comb, type, expr);
! }
!
! /* Creates EXPR + ELT * SCALE in TYPE. MASK is the mask for width of TYPE. */
!
! static tree
! add_elt_to_tree (tree expr, tree type, tree elt, unsigned HOST_WIDE_INT scale,
! unsigned HOST_WIDE_INT mask)
! {
! enum tree_code code;
!
! scale &= mask;
! elt = fold_convert (type, elt);
!
! if (scale == 1)
! {
! if (!expr)
! return elt;
!
! return fold_build2 (PLUS_EXPR, type, expr, elt);
! }
!
! if (scale == mask)
! {
! if (!expr)
! return fold_build1 (NEGATE_EXPR, type, elt);
!
! return fold_build2 (MINUS_EXPR, type, expr, elt);
! }
!
! if (!expr)
! return fold_build2 (MULT_EXPR, type, elt,
! build_int_cst_type (type, scale));
!
! if ((scale | (mask >> 1)) == mask)
! {
! /* Scale is negative. */
! code = MINUS_EXPR;
! scale = (-scale) & mask;
! }
! else
! code = PLUS_EXPR;
!
! elt = fold_build2 (MULT_EXPR, type, elt,
! build_int_cst_type (type, scale));
! return fold_build2 (code, type, expr, elt);
! }
!
! /* Copies the tree elements of COMB to ensure that they are not shared. */
!
! static void
! unshare_aff_combination (struct affine_tree_combination *comb)
! {
! unsigned i;
!
! for (i = 0; i < comb->n; i++)
! comb->elts[i] = unshare_expr (comb->elts[i]);
! if (comb->rest)
! comb->rest = unshare_expr (comb->rest);
! }
!
! /* Makes tree from the affine combination COMB. */
!
! static tree
! aff_combination_to_tree (struct affine_tree_combination *comb)
! {
! tree type = comb->type;
! tree expr = comb->rest;
! unsigned i;
! unsigned HOST_WIDE_INT off, sgn;
!
! /* Handle the special case produced by get_computation_aff when
! the type does not fit in HOST_WIDE_INT. */
! if (comb->n == 0 && comb->offset == 0)
! return fold_convert (type, expr);
! gcc_assert (comb->n == MAX_AFF_ELTS || comb->rest == NULL_TREE);
! for (i = 0; i < comb->n; i++)
! expr = add_elt_to_tree (expr, type, comb->elts[i], comb->coefs[i],
! comb->mask);
! if ((comb->offset | (comb->mask >> 1)) == comb->mask)
! {
! /* Offset is negative. */
! off = (-comb->offset) & comb->mask;
! sgn = comb->mask;
! }
! else
! {
! off = comb->offset;
! sgn = 1;
}
- return add_elt_to_tree (expr, type, build_int_cst_type (type, off), sgn,
- comb->mask);
}
/* Determines the expression by that USE is expressed from induction variable
CAND at statement AT in LOOP. The expression is stored in a decomposed
! form into AFF. Returns false if USE cannot be expressed using CAND. */
static bool
get_computation_aff (struct loop *loop,
struct iv_use *use, struct iv_cand *cand, tree at,
! struct affine_tree_combination *aff)
{
tree ubase = use->iv->base;
tree ustep = use->iv->step;
--- 2662,2694 ----
/* Ditto for TOP. */
if (tree_int_cst_sign_bit (top))
! {
! negate = !negate;
! top = fold_unary_to_constant (NEGATE_EXPR, type, top);
! }
! if (!zero_p (fold_binary_to_constant (TRUNC_MOD_EXPR, type, top, bot)))
! return NULL_TREE;
! res = fold_binary_to_constant (EXACT_DIV_EXPR, type, top, bot);
! if (negate)
! res = fold_unary_to_constant (NEGATE_EXPR, type, res);
! return res;
! default:
! return NULL_TREE;
}
}
/* Determines the expression by that USE is expressed from induction variable
CAND at statement AT in LOOP. The expression is stored in a decomposed
! form into AFF. Returns false if USE cannot be expressed using CAND.
! The ratio by that the candidate is multiplied is returned in MUL_RAT. */
static bool
get_computation_aff (struct loop *loop,
struct iv_use *use, struct iv_cand *cand, tree at,
! aff_tree *aff, double_int *mul_rat)
{
tree ubase = use->iv->base;
tree ustep = use->iv->step;
*************** get_computation_aff (struct loop *loop,
*** 2990,3001 ****
tree cstep = cand->iv->step;
tree utype = TREE_TYPE (ubase), ctype = TREE_TYPE (cbase);
tree uutype;
! tree expr, delta;
! tree ratio;
unsigned HOST_WIDE_INT ustepi, cstepi;
! HOST_WIDE_INT ratioi;
! struct affine_tree_combination cbase_aff, expr_aff;
! tree cstep_orig = cstep, ustep_orig = ustep;
if (TYPE_PRECISION (utype) > TYPE_PRECISION (ctype))
{
--- 2696,2704 ----
tree cstep = cand->iv->step;
tree utype = TREE_TYPE (ubase), ctype = TREE_TYPE (cbase);
tree uutype;
! double_int rat;
unsigned HOST_WIDE_INT ustepi, cstepi;
! aff_tree cbase_aff, expr_aff;
if (TYPE_PRECISION (utype) > TYPE_PRECISION (ctype))
{
*************** get_computation_aff (struct loop *loop,
*** 3003,3077 ****
return false;
}
! expr = var_at_stmt (loop, cand, at);
! if (TREE_TYPE (expr) != ctype)
! {
! /* This may happen with the original ivs. */
! expr = fold_convert (ctype, expr);
! }
!
! if (TYPE_UNSIGNED (utype))
! uutype = utype;
! else
! {
! uutype = unsigned_type_for (utype);
! ubase = fold_convert (uutype, ubase);
! ustep = fold_convert (uutype, ustep);
! }
!
! if (uutype != ctype)
! {
! expr = fold_convert (uutype, expr);
! cbase = fold_convert (uutype, cbase);
! cstep = fold_convert (uutype, cstep);
! /* If the conversion is not noop, we must take it into account when
! considering the value of the step. */
! if (TYPE_PRECISION (utype) < TYPE_PRECISION (ctype))
! cstep_orig = cstep;
! }
!
! if (cst_and_fits_in_hwi (cstep_orig)
! && cst_and_fits_in_hwi (ustep_orig))
{
! ustepi = int_cst_value (ustep_orig);
! cstepi = int_cst_value (cstep_orig);
if (!divide (TYPE_PRECISION (uutype), ustepi, cstepi, &ratioi))
! {
! /* TODO maybe consider case when ustep divides cstep and the ratio is
! a power of 2 (so that the division is fast to execute)? We would
! need to be much more careful with overflows etc. then. */
! return false;
! }
! ratio = build_int_cst_type (uutype, ratioi);
}
else
{
! ratio = constant_multiple_of (uutype, ustep_orig, cstep_orig);
if (!ratio)
return false;
!
! /* Ratioi is only used to detect special cases when the multiplicative
! factor is 1 or -1, so if we cannot convert ratio to HOST_WIDE_INT,
! we may set it to 0. We prefer cst_and_fits_in_hwi/int_cst_value
! to integer_onep/integer_all_onesp, since the former ignores
! TREE_OVERFLOW. */
! if (cst_and_fits_in_hwi (ratio))
! ratioi = int_cst_value (ratio);
! else if (integer_onep (ratio))
! ratioi = 1;
! else if (integer_all_onesp (ratio))
! ratioi = -1;
! else
! ratioi = 0;
}
! /* We may need to shift the value if we are after the increment. */
if (stmt_after_increment (loop, cand, at))
! cbase = fold_build2 (PLUS_EXPR, uutype, cbase, cstep);
/* use = ubase - ratio * cbase + ratio * var.
--- 2706,2753 ----
return false;
}
! uutype = unsigned_type_for (utype);
! /* If the conversion is not noop, we must take it into account when
! considering the value of the step. */
! if (TYPE_PRECISION (utype) < TYPE_PRECISION (ctype))
! cstep = fold_convert (uutype, cstep);
! if (cst_and_fits_in_hwi (cstep)
! && cst_and_fits_in_hwi (ustep))
{
! HOST_WIDE_INT ratioi;
! ustepi = int_cst_value (ustep);
! cstepi = int_cst_value (cstep);
if (!divide (TYPE_PRECISION (uutype), ustepi, cstepi, &ratioi))
! return false;
! rat = hwi_to_double_int (ratioi);
}
else
{
! tree ratio = constant_multiple_of (uutype, ustep, cstep);
if (!ratio)
return false;
! rat = tree_to_double_int (ratio);
}
! *aff = use->base_aff;
! cbase_aff = cand->base_aff;
! tree_to_aff_combination (var_at_stmt (loop, cand, at), ctype, &expr_aff);
! aff_combination_convert (aff, uutype);
! aff_combination_convert (&cbase_aff, uutype);
! aff_combination_convert (&expr_aff, uutype);
!
! /* We need to shift the value if we are after the increment. */
if (stmt_after_increment (loop, cand, at))
! {
! aff_tree cstep_aff;
!
! tree_to_aff_combination (cstep, uutype, &cstep_aff);
! aff_combination_add (&cbase_aff, &cstep_aff);
! }
/* use = ubase - ratio * cbase + ratio * var.
*************** get_computation_aff (struct loop *loop,
*** 3081,3130 ****
happen, fold is able to apply the distributive law to obtain this form
anyway. */
! if (TYPE_PRECISION (uutype) > HOST_BITS_PER_WIDE_INT)
! {
! /* Let's compute in trees and just return the result in AFF. This case
! should not be very common, and fold itself is not that bad either,
! so making the aff. functions more complicated to handle this case
! is not that urgent. */
! if (ratioi == 1)
! {
! delta = fold_build2 (MINUS_EXPR, uutype, ubase, cbase);
! expr = fold_build2 (PLUS_EXPR, uutype, expr, delta);
! }
! else if (ratioi == -1)
! {
! delta = fold_build2 (PLUS_EXPR, uutype, ubase, cbase);
! expr = fold_build2 (MINUS_EXPR, uutype, delta, expr);
! }
! else
! {
! delta = fold_build2 (MULT_EXPR, uutype, cbase, ratio);
! delta = fold_build2 (MINUS_EXPR, uutype, ubase, delta);
! expr = fold_build2 (MULT_EXPR, uutype, ratio, expr);
! expr = fold_build2 (PLUS_EXPR, uutype, delta, expr);
! }
!
! aff->type = uutype;
! aff->n = 0;
! aff->offset = 0;
! aff->mask = 0;
! aff->rest = expr;
! return true;
! }
!
! /* If we got here, the types fits in HOST_WIDE_INT, thus it must be
! possible to compute ratioi. */
! gcc_assert (ratioi);
!
! tree_to_aff_combination (ubase, uutype, aff);
! tree_to_aff_combination (cbase, uutype, &cbase_aff);
! tree_to_aff_combination (expr, uutype, &expr_aff);
! aff_combination_scale (&cbase_aff, -ratioi);
! aff_combination_scale (&expr_aff, ratioi);
aff_combination_add (aff, &cbase_aff);
aff_combination_add (aff, &expr_aff);
return true;
}
--- 2757,2770 ----
happen, fold is able to apply the distributive law to obtain this form
anyway. */
! aff_combination_scale (&cbase_aff, double_int_negate (cbase_aff.mask, rat));
! aff_combination_scale (&expr_aff, rat);
aff_combination_add (aff, &cbase_aff);
aff_combination_add (aff, &expr_aff);
+ if (mul_rat)
+ *mul_rat = rat;
+
return true;
}
*************** static tree
*** 3135,3144 ****
get_computation_at (struct loop *loop,
struct iv_use *use, struct iv_cand *cand, tree at)
{
! struct affine_tree_combination aff;
tree type = TREE_TYPE (use->iv->base);
! if (!get_computation_aff (loop, use, cand, at, &aff))
return NULL_TREE;
unshare_aff_combination (&aff);
return fold_convert (type, aff_combination_to_tree (&aff));
--- 2775,2784 ----
get_computation_at (struct loop *loop,
struct iv_use *use, struct iv_cand *cand, tree at)
{
! aff_tree aff;
tree type = TREE_TYPE (use->iv->base);
! if (!get_computation_aff (loop, use, cand, at, &aff, NULL))
return NULL_TREE;
unshare_aff_combination (&aff);
return fold_convert (type, aff_combination_to_tree (&aff));
*************** get_address_cost (bool symbol_present, b
*** 3320,3347 ****
if (!initialized)
{
! HOST_WIDE_INT i;
initialized = true;
-
reg1 = gen_raw_REG (Pmode, LAST_VIRTUAL_REGISTER + 1);
-
addr = gen_rtx_fmt_ee (PLUS, Pmode, reg1, NULL_RTX);
! for (i = 1; i <= 1 << 20; i <<= 1)
{
XEXP (addr, 1) = gen_int_mode (i, Pmode);
if (!memory_address_p (Pmode, addr))
break;
}
- max_offset = i >> 1;
off = max_offset;
! for (i = 1; i <= 1 << 20; i <<= 1)
{
! XEXP (addr, 1) = gen_int_mode (-i, Pmode);
if (!memory_address_p (Pmode, addr))
break;
}
- min_offset = -(i >> 1);
if (dump_file && (dump_flags & TDF_DETAILS))
{
--- 2960,2995 ----
if (!initialized)
{
! HOST_WIDE_INT i, j;
! HOST_WIDE_INT bits = HOST_BITS_PER_WIDE_INT;
! if (GET_MODE_BITSIZE (Pmode) < bits)
! bits = GET_MODE_BITSIZE (Pmode);
initialized = true;
reg1 = gen_raw_REG (Pmode, LAST_VIRTUAL_REGISTER + 1);
addr = gen_rtx_fmt_ee (PLUS, Pmode, reg1, NULL_RTX);
!
! max_offset = 0;
! i = 0;
! for (j = 0; j < bits - 1; j++)
{
+ i = (i << 1) + 1;
XEXP (addr, 1) = gen_int_mode (i, Pmode);
if (!memory_address_p (Pmode, addr))
break;
+ max_offset = i;
}
off = max_offset;
! min_offset = 0;
! i = -1;
! for (j = 0; j < bits; j++)
{
! XEXP (addr, 1) = gen_int_mode (i, Pmode);
if (!memory_address_p (Pmode, addr))
break;
+ min_offset = i;
+ i <<= 1;
}
if (dump_file && (dump_flags & TDF_DETAILS))
{
*************** get_address_cost (bool symbol_present, b
*** 3433,3505 ****
return cost + acost;
}
! /* Estimates cost of forcing expression EXPR into a variable. */
! unsigned
! force_expr_to_var_cost (tree expr)
{
! static bool costs_initialized = false;
! static unsigned integer_cost;
! static unsigned symbol_cost;
! static unsigned address_cost;
! tree op0, op1;
! unsigned cost0, cost1, cost;
! enum machine_mode mode;
! if (!costs_initialized)
{
! tree var = create_tmp_var_raw (integer_type_node, "test_var");
! rtx x = gen_rtx_MEM (DECL_MODE (var),
! gen_rtx_SYMBOL_REF (Pmode, "test_var"));
! tree addr;
! tree type = build_pointer_type (integer_type_node);
!
! integer_cost = computation_cost (build_int_cst_type (integer_type_node,
! 2000));
!
! SET_DECL_RTL (var, x);
! TREE_STATIC (var) = 1;
! addr = build1 (ADDR_EXPR, type, var);
! symbol_cost = computation_cost (addr) + 1;
!
! address_cost
! = computation_cost (build2 (PLUS_EXPR, type,
! addr,
! build_int_cst_type (type, 2000))) + 1;
if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file, "force_expr_to_var_cost:\n");
! fprintf (dump_file, " integer %d\n", (int) integer_cost);
! fprintf (dump_file, " symbol %d\n", (int) symbol_cost);
! fprintf (dump_file, " address %d\n", (int) address_cost);
! fprintf (dump_file, " other %d\n", (int) target_spill_cost);
! fprintf (dump_file, "\n");
! }
! costs_initialized = true;
}
STRIP_NOPS (expr);
if (SSA_VAR_P (expr))
return 0;
if (TREE_INVARIANT (expr))
{
if (TREE_CODE (expr) == INTEGER_CST)
! return integer_cost;
!
! if (TREE_CODE (expr) == ADDR_EXPR)
! {
! tree obj = TREE_OPERAND (expr, 0);
!
! if (TREE_CODE (obj) == VAR_DECL
! || TREE_CODE (obj) == PARM_DECL
! || TREE_CODE (obj) == RESULT_DECL)
! return symbol_cost;
! }
! return address_cost;
}
switch (TREE_CODE (expr))
--- 3081,3189 ----
return cost + acost;
}
! /* Returs cost of conversion from mode MODE_FROM to mode MODE_TO. UNSIGNED_P
! is true if the conversion is unsigned. */
! static unsigned
! convert_cost (enum machine_mode mode_to, enum machine_mode mode_from,
! bool unsigned_p)
{
! static unsigned costs[NUM_MACHINE_MODES][NUM_MACHINE_MODES][2];
! unsigned cost;
! rtx seq;
!
! if (mode_to == mode_from)
! return 0;
! cost = costs[mode_to][mode_from][unsigned_p];
! if (cost)
! return cost;
!
! start_sequence ();
! convert_move (gen_raw_REG (mode_to, LAST_VIRTUAL_REGISTER + 1),
! gen_raw_REG (mode_from, LAST_VIRTUAL_REGISTER + 2),
! unsigned_p);
! seq = get_insns ();
! end_sequence ();
!
! cost = seq_cost (seq);
! if (!cost)
! cost = 1;
!
! costs[mode_to][mode_from][unsigned_p] = cost;
!
! if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, "Conversion from %sto %s costs %d\n",
! GET_MODE_NAME (mode_from), GET_MODE_NAME (mode_to), cost);
! return cost;
! }
!
! /* Returns the cost of forcing an integer constant in MODE to be
! an operand. Not implemented yet, for now we assume all constants
! are cheap (which is not always the case). */
!
! static unsigned
! integer_cost (enum machine_mode mode)
! {
! static unsigned costs[NUM_MACHINE_MODES];
!
! if (!costs[mode])
{
! /* TODO -- determine the cost. */
! costs[mode] = 1;
!
if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, "Cost of integer constant in %s is 0\n",
! GET_MODE_NAME (mode));
! }
! return costs[mode] - 1;
! }
!
! /* Retunrs the cost of using a fixed address as an operand. Not implemented
! yet, for now we assume all such expressions are cheap (which is not always
! the case). */
!
! static unsigned
! fixed_address_cost (void)
! {
! static unsigned cost = 0;
!
! if (!cost)
! {
! /* TODO -- determine the cost. */
! cost = 1;
! if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, "Cost of taking a fixed address is 0\n");
}
+ return cost - 1;
+ }
+
+ /* Estimates cost of forcing expression EXPR into a variable. */
+
+ unsigned
+ force_expr_to_var_cost (tree expr)
+ {
+ tree op0 = NULL_TREE, op1 = NULL_TREE, inner_type;
+ unsigned cost0 = 0, cost1 = 0, cost;
+ enum machine_mode mode;
+
STRIP_NOPS (expr);
if (SSA_VAR_P (expr))
return 0;
+ mode = TYPE_MODE (TREE_TYPE (expr));
+
if (TREE_INVARIANT (expr))
{
if (TREE_CODE (expr) == INTEGER_CST)
! return integer_cost (mode);
! /* TODO -- this might also be address of parameter, or other invariants
! that do not really qualify as object with fixed address. */
! return fixed_address_cost ();
}
switch (TREE_CODE (expr))
*************** force_expr_to_var_cost (tree expr)
*** 3512,3535 ****
STRIP_NOPS (op0);
STRIP_NOPS (op1);
! if (is_gimple_val (op0))
! cost0 = 0;
! else
cost0 = force_expr_to_var_cost (op0);
! if (is_gimple_val (op1))
! cost1 = 0;
! else
cost1 = force_expr_to_var_cost (op1);
break;
default:
/* Just an arbitrary value, FIXME. */
return target_spill_cost;
}
- mode = TYPE_MODE (TREE_TYPE (expr));
switch (TREE_CODE (expr))
{
case PLUS_EXPR:
--- 3196,3221 ----
STRIP_NOPS (op0);
STRIP_NOPS (op1);
! if (!is_gimple_val (op0))
cost0 = force_expr_to_var_cost (op0);
! if (!is_gimple_val (op1))
cost1 = force_expr_to_var_cost (op1);
break;
+ case CONVERT_EXPR:
+ case NOP_EXPR:
+ op0 = TREE_OPERAND (expr, 0);
+ STRIP_NOPS (op0);
+ cost0 = force_expr_to_var_cost (op0);
+ break;
+
default:
/* Just an arbitrary value, FIXME. */
return target_spill_cost;
}
switch (TREE_CODE (expr))
{
case PLUS_EXPR:
*************** force_expr_to_var_cost (tree expr)
*** 3546,3551 ****
--- 3232,3244 ----
return target_spill_cost;
break;
+ case CONVERT_EXPR:
+ case NOP_EXPR:
+ inner_type = TREE_TYPE (op0);
+ cost = convert_cost (mode, TYPE_MODE (inner_type),
+ TYPE_UNSIGNED (inner_type));
+ break;
+
default:
gcc_unreachable ();
}
*************** force_var_cost (struct ivopts_data *data
*** 3576,3713 ****
return force_expr_to_var_cost (expr);
}
! /* Estimates cost of expressing address ADDR as var + symbol + offset. The
! value of offset is added to OFFSET, SYMBOL_PRESENT and VAR_PRESENT are set
! to false if the corresponding part is missing. DEPENDS_ON is a set of the
! invariants the computation depends on. */
static unsigned
! split_address_cost (struct ivopts_data *data,
! tree addr, bool *symbol_present, bool *var_present,
! unsigned HOST_WIDE_INT *offset, bitmap *depends_on)
{
! tree core;
! HOST_WIDE_INT bitsize;
! HOST_WIDE_INT bitpos;
! tree toffset;
! enum machine_mode mode;
! int unsignedp, volatilep;
!
! core = get_inner_reference (addr, &bitsize, &bitpos, &toffset, &mode,
! &unsignedp, &volatilep, false);
! if (toffset != 0
! || bitpos % BITS_PER_UNIT != 0
! || TREE_CODE (core) != VAR_DECL)
{
! *symbol_present = false;
! *var_present = true;
! fd_ivopts_data = data;
! walk_tree (&addr, find_depends, depends_on, NULL);
! return target_spill_cost;
}
! *offset += bitpos / BITS_PER_UNIT;
! if (TREE_STATIC (core)
! || DECL_EXTERNAL (core))
! {
! *symbol_present = true;
! *var_present = false;
! return 0;
}
-
- *symbol_present = false;
- *var_present = true;
- return 0;
- }
-
- /* Estimates cost of expressing difference of addresses E1 - E2 as
- var + symbol + offset. The value of offset is added to OFFSET,
- SYMBOL_PRESENT and VAR_PRESENT are set to false if the corresponding
- part is missing. DEPENDS_ON is a set of the invariants the computation
- depends on. */
! static unsigned
! ptr_difference_cost (struct ivopts_data *data,
! tree e1, tree e2, bool *symbol_present, bool *var_present,
! unsigned HOST_WIDE_INT *offset, bitmap *depends_on)
! {
! HOST_WIDE_INT diff = 0;
! unsigned cost;
! gcc_assert (TREE_CODE (e1) == ADDR_EXPR);
! if (ptr_difference_const (e1, e2, &diff))
{
! *offset += diff;
! *symbol_present = false;
! *var_present = false;
! return 0;
}
! if (e2 == integer_zero_node)
! return split_address_cost (data, TREE_OPERAND (e1, 0),
! symbol_present, var_present, offset, depends_on);
! *symbol_present = false;
! *var_present = true;
!
! cost = force_var_cost (data, e1, depends_on);
! cost += force_var_cost (data, e2, depends_on);
! cost += add_cost (Pmode);
! return cost;
}
! /* Estimates cost of expressing difference E1 - E2 as
! var + symbol + offset. The value of offset is added to OFFSET,
! SYMBOL_PRESENT and VAR_PRESENT are set to false if the corresponding
! part is missing. DEPENDS_ON is a set of the invariants the computation
! depends on. */
static unsigned
! difference_cost (struct ivopts_data *data,
! tree e1, tree e2, bool *symbol_present, bool *var_present,
! unsigned HOST_WIDE_INT *offset, bitmap *depends_on)
{
! unsigned cost;
! enum machine_mode mode = TYPE_MODE (TREE_TYPE (e1));
! unsigned HOST_WIDE_INT off1, off2;
! e1 = strip_offset (e1, &off1);
! e2 = strip_offset (e2, &off2);
! *offset += off1 - off2;
! STRIP_NOPS (e1);
! STRIP_NOPS (e2);
! if (TREE_CODE (e1) == ADDR_EXPR)
! return ptr_difference_cost (data, e1, e2, symbol_present, var_present, offset,
! depends_on);
! *symbol_present = false;
! if (operand_equal_p (e1, e2, 0))
{
! *var_present = false;
! return 0;
}
! *var_present = true;
! if (zero_p (e2))
! return force_var_cost (data, e1, depends_on);
! if (zero_p (e1))
{
! cost = force_var_cost (data, e2, depends_on);
! cost += multiply_by_cost (-1, mode);
! return cost;
}
! cost = force_var_cost (data, e1, depends_on);
! cost += force_var_cost (data, e2, depends_on);
! cost += add_cost (mode);
!
! return cost;
}
/* Determines the cost of the computation by that USE is expressed
--- 3269,3469 ----
return force_expr_to_var_cost (expr);
}
! /* Determines the cost of the computation COMP. A set of invariants
! the expression depends on is stored in DEPENDS_ON. COMP is modified
! in the progress. OUTER_ADDITION is set to true if there is an
! addition outside any multiplication in the final expression. */
static unsigned
! aff_combination_cost (struct ivopts_data *data, aff_tree *comp,
! bitmap *depends_on, bool *outer_addition)
{
! int n_adds = comp->n - 1;
! unsigned i, j, cost = 0;
! unsigned positive_terms = 0;
! enum machine_mode mode = TYPE_MODE (comp->type);
! double_int positive, negative;
! bool seen_positive;
! if (outer_addition)
! *outer_addition = false;
!
! if (comp->n == 0 && !comp->rest)
{
! /* Computation is an integer constant. */
! return force_expr_to_var_cost (double_int_to_tree (comp->type,
! comp->offset));
}
! if (comp->rest)
! {
! n_adds++;
! cost += force_var_cost (data, comp->rest, depends_on);
! positive_terms++;
! }
! if (!double_int_zero_p (comp->offset))
! {
! n_adds++;
! positive_terms++;
}
! for (i = 0; i < comp->n; i++)
! cost += force_var_cost (data, comp->elts[i], depends_on);
! /* Compute the cost of multiplications and number of additions. Using
! distributivity, each coefficient needs to be considered only once.
! Furthermore, we do not need to consider both constant and its negation.
! If all elements are multiplied by negative number, we must perform one
! multiplication by a negative constant (and use MINUS_EXPR for the rest).
! Otherwise, it is possible to always multiply be a positive constant,
! which should usually be cheaper. */
! for (i = 0; i < comp->n; i++)
{
! if (double_int_zero_p (comp->coefs[i]))
! continue;
!
! if (double_int_negative_p (comp->mask, comp->coefs[i]))
! {
! negative = comp->coefs[i];
! positive = double_int_negate (comp->mask, comp->coefs[i]);
! seen_positive = false;
! }
! else
! {
! positive = comp->coefs[i];
! negative = double_int_negate (comp->mask, comp->coefs[i]);
! seen_positive = true;
! }
!
! for (j = i + 1; j < comp->n; j++)
! {
! if (double_int_equal_p (positive, comp->coefs[j]))
! seen_positive = true;
! else if (!double_int_equal_p (negative, comp->coefs[j]))
! continue;
! comp->coefs[j] = hwi_to_double_int (0);
! }
!
! if (!double_int_fits_in_hwi_p (comp->mask, comp->coefs[i]))
! {
! /* comp->coefs[i] almost always fits into HOST_WIDE_INT, so do not
! worry about this this case. */
! cost += target_spill_cost;
! }
! else
! cost += multiply_by_cost (double_int_to_hwi (comp->mask, comp->coefs[i]),
! mode);
!
! if (seen_positive)
! positive_terms++;
}
! if (positive_terms == 0)
! {
! /* This is not really precise -- multiplying by a negative constant could
! be cheaper than multiplyiing by its negation and negating. */
! cost += multiply_by_cost (-1, mode);
! }
! if (outer_addition)
! *outer_addition = (positive_terms >= 2);
! return cost + n_adds * add_cost (mode);
}
! /* Determines the cost of the computation COMP used as a memory address.
! A set of invariants the expression depends on is stored in DEPENDS_ON.
! COMP is modified in the progress. RATIO is the ratio by that iv is
! multiplied, as a good candidate for step in an addressing mode. */
static unsigned
! aff_combination_cost_address (struct ivopts_data *data, aff_tree *comp,
! bitmap *depends_on, double_int ratio)
{
! HOST_WIDE_INT rat = 1, off = 0;
! unsigned cost = 0, i, nelts;
! bool symbol_present = false, var_present = false, index_used = false;
! enum machine_mode mode = TYPE_MODE (comp->type);
! bool outer_addition;
!
! /* Try finding a symbol. */
! for (i = 0; i < comp->n; i++)
! {
! tree elt = comp->elts[i];
! if (double_int_one_p (comp->coefs[i])
! && TREE_CODE (elt) == ADDR_EXPR
! && fixed_address_object_p (TREE_OPERAND (elt, 0)))
! {
! symbol_present = true;
! aff_combination_remove_elt (comp, i);
! break;
! }
! }
!
! if (double_int_fits_in_hwi_p (comp->mask, ratio)
! && !double_int_one_p (ratio))
! {
! /* Try separating index from comp. */
! for (i = 0; i < comp->n; i++)
! if (double_int_equal_p (ratio, comp->coefs[i]))
! break;
!
! if (i < comp->n)
! {
! double_int ratio_neg = double_int_negate (comp->mask, ratio);
! int n_add = -1;
! rat = double_int_to_hwi (comp->mask, ratio);
! for (; i < comp->n; )
! {
! if (!double_int_equal_p (ratio, comp->coefs[i])
! && !double_int_equal_p (ratio_neg, comp->coefs[i]))
! {
! i++;
! continue;
! }
! n_add++;
! cost += force_var_cost (data, comp->elts[i], depends_on);
! aff_combination_remove_elt (comp, i);
! }
! cost += n_add * add_cost (mode);
! index_used = true;
! }
! }
! nelts = comp->n;
! if (!double_int_zero_p (comp->offset))
{
! if (double_int_fits_in_hwi_p (comp->mask, comp->offset))
! {
! off = double_int_to_hwi (comp->mask, comp->offset);
! comp->offset = hwi_to_double_int (0);
! }
! else
! nelts++;
}
! if (comp->rest)
! nelts++;
! if (nelts)
{
! cost += aff_combination_cost (data, comp, depends_on, &outer_addition);
! /* We can save one addition by using BASE + INDEX addressing mode,
! if present and if index is not used already. */
! if (index_used)
! var_present = true;
! else if (outer_addition)
! {
! var_present = true;
! cost -= add_cost (mode);
! }
}
! return cost + get_address_cost (symbol_present, var_present, off, rat);
}
/* Determines the cost of the computation by that USE is expressed
*************** get_computation_cost_at (struct ivopts_d
*** 3721,3733 ****
struct iv_use *use, struct iv_cand *cand,
bool address_p, bitmap *depends_on, tree at)
{
! tree ubase = use->iv->base, ustep = use->iv->step;
! tree cbase, cstep;
tree utype = TREE_TYPE (ubase), ctype;
! unsigned HOST_WIDE_INT ustepi, cstepi, offset = 0;
! HOST_WIDE_INT ratio, aratio;
! bool var_present, symbol_present;
! unsigned cost = 0, n_sums;
*depends_on = NULL;
--- 3477,3488 ----
struct iv_use *use, struct iv_cand *cand,
bool address_p, bitmap *depends_on, tree at)
{
! tree ubase = use->iv->base;
! tree cbase;
tree utype = TREE_TYPE (ubase), ctype;
! aff_tree comp;
! double_int rat;
! unsigned cost;
*depends_on = NULL;
*************** get_computation_cost_at (struct ivopts_d
*** 3736,3742 ****
return INFTY;
cbase = cand->iv->base;
- cstep = cand->iv->step;
ctype = TREE_TYPE (cbase);
if (TYPE_PRECISION (utype) > TYPE_PRECISION (ctype))
--- 3491,3496 ----
*************** get_computation_cost_at (struct ivopts_d
*** 3758,3884 ****
return INFTY;
}
! if (TYPE_PRECISION (utype) != TYPE_PRECISION (ctype))
! {
! /* TODO -- add direct handling of this case. */
! goto fallback;
! }
!
! /* CSTEPI is removed from the offset in case statement is after the
! increment. If the step is not constant, we use zero instead.
! This is a bit imprecise (there is the extra addition), but
! redundancy elimination is likely to transform the code so that
! it uses value of the variable before increment anyway,
! so it is not that much unrealistic. */
! if (cst_and_fits_in_hwi (cstep))
! cstepi = int_cst_value (cstep);
! else
! cstepi = 0;
!
! if (cst_and_fits_in_hwi (ustep)
! && cst_and_fits_in_hwi (cstep))
! {
! ustepi = int_cst_value (ustep);
!
! if (!divide (TYPE_PRECISION (utype), ustepi, cstepi, &ratio))
! return INFTY;
! }
! else
! {
! tree rat;
!
! rat = constant_multiple_of (utype, ustep, cstep);
!
! if (!rat)
! return INFTY;
!
! if (cst_and_fits_in_hwi (rat))
! ratio = int_cst_value (rat);
! else if (integer_onep (rat))
! ratio = 1;
! else if (integer_all_onesp (rat))
! ratio = -1;
! else
! return INFTY;
! }
!
! /* use = ubase + ratio * (var - cbase). If either cbase is a constant
! or ratio == 1, it is better to handle this like
!
! ubase - ratio * cbase + ratio * var
!
! (also holds in the case ratio == -1, TODO. */
- if (cst_and_fits_in_hwi (cbase))
- {
- offset = - ratio * int_cst_value (cbase);
- cost += difference_cost (data,
- ubase, integer_zero_node,
- &symbol_present, &var_present, &offset,
- depends_on);
- }
- else if (ratio == 1)
- {
- cost += difference_cost (data,
- ubase, cbase,
- &symbol_present, &var_present, &offset,
- depends_on);
- }
- else
- {
- cost += force_var_cost (data, cbase, depends_on);
- cost += add_cost (TYPE_MODE (ctype));
- cost += difference_cost (data,
- ubase, integer_zero_node,
- &symbol_present, &var_present, &offset,
- depends_on);
- }
-
- /* If we are after the increment, the value of the candidate is higher by
- one iteration. */
- if (stmt_after_increment (data->current_loop, cand, at))
- offset -= ratio * cstepi;
-
- /* Now the computation is in shape symbol + var1 + const + ratio * var2.
- (symbol/var/const parts may be omitted). If we are looking for an address,
- find the cost of addressing this. */
if (address_p)
! return cost + get_address_cost (symbol_present, var_present, offset, ratio);
!
! /* Otherwise estimate the costs for computing the expression. */
! aratio = ratio > 0 ? ratio : -ratio;
! if (!symbol_present && !var_present && !offset)
! {
! if (ratio != 1)
! cost += multiply_by_cost (ratio, TYPE_MODE (ctype));
!
! return cost;
! }
!
! if (aratio != 1)
! cost += multiply_by_cost (aratio, TYPE_MODE (ctype));
!
! n_sums = 1;
! if (var_present
! /* Symbol + offset should be compile-time computable. */
! && (symbol_present || offset))
! n_sums++;
!
! return cost + n_sums * add_cost (TYPE_MODE (ctype));
!
! fallback:
! {
! /* Just get the expression, expand it and measure the cost. */
! tree comp = get_computation_at (data->current_loop, use, cand, at);
!
! if (!comp)
! return INFTY;
!
! if (address_p)
! comp = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (comp)), comp);
! return computation_cost (comp);
! }
}
/* Determines the cost of the computation by that USE is expressed
--- 3512,3527 ----
return INFTY;
}
! if (!get_computation_aff (data->current_loop, use, cand, at, &comp, &rat))
! return INFTY;
if (address_p)
! cost = aff_combination_cost_address (data, &comp, depends_on, rat);
! else
! cost = aff_combination_cost (data, &comp, depends_on, NULL);
! gcc_assert ((signed) cost >= 0);
! return cost;
}
/* Determines the cost of the computation by that USE is expressed
*************** determine_use_iv_cost_generic (struct iv
*** 3913,3924 ****
if (cand->pos == IP_ORIGINAL
&& cand->incremented_at == use->stmt)
{
! set_use_iv_cost (data, use, cand, 0, NULL, NULL_TREE);
return true;
}
cost = get_computation_cost (data, use, cand, false, &depends_on);
! set_use_iv_cost (data, use, cand, cost, depends_on, NULL_TREE);
return cost != INFTY;
}
--- 3556,3567 ----
if (cand->pos == IP_ORIGINAL
&& cand->incremented_at == use->stmt)
{
! set_use_iv_cost_zero (data, use, cand);
return true;
}
cost = get_computation_cost (data, use, cand, false, &depends_on);
! set_use_iv_cost_generic (data, use, cand, cost, depends_on);
return cost != INFTY;
}
*************** determine_use_iv_cost_address (struct iv
*** 3932,3938 ****
bitmap depends_on;
unsigned cost = get_computation_cost (data, use, cand, true, &depends_on);
! set_use_iv_cost (data, use, cand, cost, depends_on, NULL_TREE);
return cost != INFTY;
}
--- 3575,3581 ----
bitmap depends_on;
unsigned cost = get_computation_cost (data, use, cand, true, &depends_on);
! set_use_iv_cost_generic (data, use, cand, cost, depends_on);
return cost != INFTY;
}
*************** may_eliminate_iv (struct ivopts_data *da
*** 4068,4118 ****
return true;
}
/* Determines cost of basing replacement of USE on CAND in a condition. */
static bool
determine_use_iv_cost_condition (struct ivopts_data *data,
struct iv_use *use, struct iv_cand *cand)
{
! tree bound = NULL_TREE, op, cond;
! bitmap depends_on = NULL;
! unsigned cost;
/* Only consider real candidates. */
if (!cand->iv)
{
! set_use_iv_cost (data, use, cand, INFTY, NULL, NULL_TREE);
return false;
}
if (may_eliminate_iv (data, use, cand, &bound))
! {
! cost = force_var_cost (data, bound, &depends_on);
! set_use_iv_cost (data, use, cand, cost, depends_on, bound);
! return cost != INFTY;
}
!
! /* The induction variable elimination failed; just express the original
! giv. If it is compared with an invariant, note that we cannot get
! rid of it. */
! cost = get_computation_cost (data, use, cand, false, &depends_on);
!
! cond = *use->op_p;
! if (TREE_CODE (cond) != SSA_NAME)
{
! op = TREE_OPERAND (cond, 0);
! if (TREE_CODE (op) == SSA_NAME && !zero_p (get_iv (data, op)->step))
! op = TREE_OPERAND (cond, 1);
! if (TREE_CODE (op) == SSA_NAME)
! {
! op = get_iv (data, op)->base;
! fd_ivopts_data = data;
! walk_tree (&op, find_depends, &depends_on, NULL);
! }
}
! set_use_iv_cost (data, use, cand, cost, depends_on, NULL);
return cost != INFTY;
}
--- 3711,3789 ----
return true;
}
+ /* Estimate cost of comparing with BOUND. */
+
+ static unsigned
+ compare_cost (tree bound)
+ {
+ /* Prefer costants, and prefer zero even more. */
+ if (zero_p (bound))
+ return 0;
+ else if (TREE_CODE (bound) == INTEGER_CST)
+ return 1;
+ else
+ return 2;
+ }
+
/* Determines cost of basing replacement of USE on CAND in a condition. */
static bool
determine_use_iv_cost_condition (struct ivopts_data *data,
struct iv_use *use, struct iv_cand *cand)
{
! tree bound = NULL_TREE, *cmp_inv;
! struct iv *cmp_iv;
! bitmap depends_on_elim = NULL, depends_on_express = NULL, depends_on;
! unsigned elim_cost, express_cost, cost;
! bool ok;
/* Only consider real candidates. */
if (!cand->iv)
{
! set_use_iv_cost_infinity (data, use, cand);
return false;
}
+ /* Try iv elimination. */
if (may_eliminate_iv (data, use, cand, &bound))
! elim_cost = (force_var_cost (data, bound, &depends_on_elim)
! + compare_cost (bound));
! else
! elim_cost = INFTY;
! /* Try expressing the original giv. If it is compared with an invariant,
! note that we cannot get rid of it. */
! ok = extract_cond_operands (data, use->op_p, NULL, &cmp_inv, NULL, &cmp_iv);
! gcc_assert (ok);
!
! express_cost = (get_computation_cost (data, use, cand, false,
! &depends_on_express)
! + compare_cost (*cmp_inv));
! fd_ivopts_data = data;
! walk_tree (&cmp_iv->base, find_depends, &depends_on_express, NULL);
!
! /* Choose the better approach. */
! if (elim_cost < express_cost)
! {
! cost = elim_cost;
! depends_on = depends_on_elim;
! depends_on_elim = NULL;
}
! else
{
! cost = express_cost;
! depends_on = depends_on_express;
! depends_on_express = NULL;
! bound = NULL_TREE;
}
! set_use_iv_cost_compare (data, use, cand, cost, depends_on, bound);
!
! if (depends_on_elim)
! BITMAP_FREE (depends_on_elim);
! if (depends_on_express)
! BITMAP_FREE (depends_on_express);
!
return cost != INFTY;
}
*************** determine_use_iv_cost_outer (struct ivop
*** 4163,4169 ****
if (cand->pos == IP_ORIGINAL
&& cand->incremented_at == use->stmt)
{
! set_use_iv_cost (data, use, cand, 0, NULL, NULL_TREE);
return true;
}
--- 3834,3840 ----
if (cand->pos == IP_ORIGINAL
&& cand->incremented_at == use->stmt)
{
! set_use_iv_cost_zero (data, use, cand);
return true;
}
*************** determine_use_iv_cost_outer (struct ivop
*** 4171,4186 ****
{
if (!may_replace_final_value (data, use, &value))
{
! set_use_iv_cost (data, use, cand, INFTY, NULL, NULL_TREE);
return false;
}
depends_on = NULL;
cost = force_var_cost (data, value, &depends_on);
! cost /= AVG_LOOP_NITER (loop);
!
! set_use_iv_cost (data, use, cand, cost, depends_on, value);
return cost != INFTY;
}
--- 3842,3856 ----
{
if (!may_replace_final_value (data, use, &value))
{
! set_use_iv_cost_infinity (data, use, cand);
return false;
}
depends_on = NULL;
cost = force_var_cost (data, value, &depends_on);
+ cost = before_loop_cost (loop, cost);
! set_use_iv_cost_outer (data, use, cand, cost, depends_on, value);
return cost != INFTY;
}
*************** determine_use_iv_cost_outer (struct ivop
*** 4192,4198 ****
cost = get_computation_cost_at (data, use, cand, false, &depends_on,
last_stmt (exit->src));
if (cost != INFTY)
! cost /= AVG_LOOP_NITER (loop);
}
else
{
--- 3862,3868 ----
cost = get_computation_cost_at (data, use, cand, false, &depends_on,
last_stmt (exit->src));
if (cost != INFTY)
! cost = before_loop_cost (loop, cost);
}
else
{
*************** determine_use_iv_cost_outer (struct ivop
*** 4200,4206 ****
cost = get_computation_cost (data, use, cand, false, &depends_on);
}
! set_use_iv_cost (data, use, cand, cost, depends_on, NULL_TREE);
return cost != INFTY;
}
--- 3870,3876 ----
cost = get_computation_cost (data, use, cand, false, &depends_on);
}
! set_use_iv_cost_generic (data, use, cand, cost, depends_on);
return cost != INFTY;
}
*************** determine_iv_cost (struct ivopts_data *d
*** 4328,4334 ****
cost_base = force_var_cost (data, base, NULL);
cost_step = add_cost (TYPE_MODE (TREE_TYPE (base)));
! cand->cost = cost_step + cost_base / AVG_LOOP_NITER (current_loop);
/* Prefer the original iv unless we may gain something by replacing it;
this is not really relevant for artificial ivs created by other
--- 3998,4004 ----
cost_base = force_var_cost (data, base, NULL);
cost_step = add_cost (TYPE_MODE (TREE_TYPE (base)));
! cand->cost = cost_step + before_loop_cost (data->current_loop, cost_base);
/* Prefer the original iv unless we may gain something by replacing it;
this is not really relevant for artificial ivs created by other
*************** static void
*** 5551,5561 ****
rewrite_use_address (struct ivopts_data *data,
struct iv_use *use, struct iv_cand *cand)
{
! struct affine_tree_combination aff;
block_stmt_iterator bsi = bsi_for_stmt (use->stmt);
tree ref;
! get_computation_aff (data->current_loop, use, cand, use->stmt, &aff);
unshare_aff_combination (&aff);
ref = create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff);
--- 5221,5231 ----
rewrite_use_address (struct ivopts_data *data,
struct iv_use *use, struct iv_cand *cand)
{
! aff_tree aff;
block_stmt_iterator bsi = bsi_for_stmt (use->stmt);
tree ref;
! get_computation_aff (data->current_loop, use, cand, use->stmt, &aff, NULL);
unshare_aff_combination (&aff);
ref = create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff);
*************** free_loop_data (struct ivopts_data *data
*** 5865,5871 ****
{
unsigned i, j;
bitmap_iterator bi;
- tree obj;
htab_empty (data->niters);
--- 5535,5540 ----
*************** free_loop_data (struct ivopts_data *data
*** 5919,5929 ****
}
data->max_inv_id = 0;
-
- for (i = 0; VEC_iterate (tree, decl_rtl_to_reset, i, obj); i++)
- SET_DECL_RTL (obj, NULL_RTX);
-
- VEC_truncate (tree, decl_rtl_to_reset, 0);
}
/* Finalizes data structures used by the iv optimization pass. LOOPS is the
--- 5588,5593 ----
*************** tree_ssa_iv_optimize_finalize (struct lo
*** 5947,5953 ****
BITMAP_FREE (data->important_candidates);
htab_delete (data->niters);
- VEC_free (tree, heap, decl_rtl_to_reset);
VEC_free (iv_use_p, heap, data->iv_uses);
VEC_free (iv_cand_p, heap, data->iv_candidates);
}
--- 5611,5616 ----
Index: tree-ssa-address.c
===================================================================
*** tree-ssa-address.c (revision 107712)
--- tree-ssa-address.c (working copy)
*************** Software Foundation, 51 Franklin Street,
*** 42,47 ****
--- 42,48 ----
#include "recog.h"
#include "expr.h"
#include "ggc.h"
+ #include "tree-affine.h"
/* TODO -- handling of symbols (according to Richard Hendersons
comments, http://gcc.gnu.org/ml/gcc-patches/2005-04/msg00949.html):
*************** create_mem_ref_raw (tree type, struct me
*** 334,340 ****
/* Returns true if OBJ is an object whose address is a link time constant. */
! static bool
fixed_address_object_p (tree obj)
{
return (TREE_CODE (obj) == VAR_DECL
--- 335,341 ----
/* Returns true if OBJ is an object whose address is a link time constant. */
! bool
fixed_address_object_p (tree obj)
{
return (TREE_CODE (obj) == VAR_DECL
*************** fixed_address_object_p (tree obj)
*** 346,370 ****
construct. */
static void
! add_to_parts (struct mem_address *parts, tree type, tree elt,
! unsigned HOST_WIDE_INT coef)
{
/* Check if this is a symbol. */
if (!parts->symbol
! && coef == 1
! && TREE_CODE (elt) == ADDR_EXPR
! && fixed_address_object_p (TREE_OPERAND (elt, 0)))
{
! parts->symbol = TREE_OPERAND (elt, 0);
return;
}
- if (coef != 1)
- elt = fold_build2 (MULT_EXPR, type, fold_convert (type, elt),
- build_int_cst_type (type, coef));
- else
- elt = fold_convert (type, elt);
-
if (!parts->base)
{
parts->base = elt;
--- 347,366 ----
construct. */
static void
! add_to_parts (struct mem_address *parts, tree type, tree elt)
{
+ tree elt_core = elt;
+ STRIP_NOPS (elt_core);
+
/* Check if this is a symbol. */
if (!parts->symbol
! && TREE_CODE (elt_core) == ADDR_EXPR
! && fixed_address_object_p (TREE_OPERAND (elt_core, 0)))
{
! parts->symbol = TREE_OPERAND (elt_core, 0);
return;
}
if (!parts->base)
{
parts->base = elt;
*************** add_to_parts (struct mem_address *parts,
*** 388,407 ****
static void
most_expensive_mult_to_index (struct mem_address *parts, tree type,
! struct affine_tree_combination *addr)
{
! unsigned HOST_WIDE_INT best_mult = 0;
unsigned best_mult_cost = 0, acost;
tree mult_elt = NULL_TREE, elt;
unsigned i, j;
for (i = 0; i < addr->n; i++)
{
! if (addr->coefs[i] == 1
! || !multiplier_allowed_in_address_p (addr->coefs[i]))
continue;
! acost = multiply_by_cost (addr->coefs[i], Pmode);
if (acost > best_mult_cost)
{
--- 384,409 ----
static void
most_expensive_mult_to_index (struct mem_address *parts, tree type,
! aff_tree *addr)
{
! HOST_WIDE_INT coef;
! double_int best_mult = {0, 0}, amult, amult_neg;
unsigned best_mult_cost = 0, acost;
tree mult_elt = NULL_TREE, elt;
unsigned i, j;
+ enum tree_code op_code;
for (i = 0; i < addr->n; i++)
{
! if (!double_int_fits_in_hwi_p (addr->mask, addr->coefs[i]))
! continue;
!
! coef = double_int_to_hwi (addr->mask, addr->coefs[i]);
! if (coef == 1
! || !multiplier_allowed_in_address_p (coef))
continue;
! acost = multiply_by_cost (coef, Pmode);
if (acost > best_mult_cost)
{
*************** most_expensive_mult_to_index (struct mem
*** 410,421 ****
}
}
! if (!best_mult)
return;
for (i = j = 0; i < addr->n; i++)
{
! if (addr->coefs[i] != best_mult)
{
addr->coefs[j] = addr->coefs[i];
addr->elts[j] = addr->elts[i];
--- 412,431 ----
}
}
! if (!best_mult_cost)
return;
+ /* Collect elements multiplied by best_mult. */
for (i = j = 0; i < addr->n; i++)
{
! amult = addr->coefs[i];
! amult_neg = double_int_negate (addr->mask, amult);
!
! if (double_int_equal_p (amult, best_mult))
! op_code = PLUS_EXPR;
! else if (double_int_equal_p (amult_neg, best_mult))
! op_code = MINUS_EXPR;
! else
{
addr->coefs[j] = addr->coefs[i];
addr->elts[j] = addr->elts[i];
*************** most_expensive_mult_to_index (struct mem
*** 424,438 ****
}
elt = fold_convert (type, addr->elts[i]);
! if (!mult_elt)
! mult_elt = elt;
else
! mult_elt = fold_build2 (PLUS_EXPR, type, mult_elt, elt);
}
addr->n = j;
parts->index = mult_elt;
! parts->step = build_int_cst_type (type, best_mult);
}
/* Splits address ADDR into PARTS.
--- 434,450 ----
}
elt = fold_convert (type, addr->elts[i]);
! if (mult_elt)
! mult_elt = fold_build2 (op_code, type, mult_elt, elt);
! else if (op_code == PLUS_EXPR)
! mult_elt = elt;
else
! mult_elt = fold_build1 (NEGATE_EXPR, type, elt);
}
addr->n = j;
parts->index = mult_elt;
! parts->step = double_int_to_tree (type, best_mult);
}
/* Splits address ADDR into PARTS.
*************** most_expensive_mult_to_index (struct mem
*** 445,453 ****
for complicated addressing modes is useless. */
static void
! addr_to_parts (struct affine_tree_combination *addr, tree type,
! struct mem_address *parts)
{
unsigned i;
parts->symbol = NULL_TREE;
--- 457,465 ----
for complicated addressing modes is useless. */
static void
! addr_to_parts (aff_tree *addr, tree type, struct mem_address *parts)
{
+ tree part;
unsigned i;
parts->symbol = NULL_TREE;
*************** addr_to_parts (struct affine_tree_combin
*** 455,462 ****
parts->index = NULL_TREE;
parts->step = NULL_TREE;
! if (addr->offset)
! parts->offset = build_int_cst_type (type, addr->offset);
else
parts->offset = NULL_TREE;
--- 467,474 ----
parts->index = NULL_TREE;
parts->step = NULL_TREE;
! if (!double_int_zero_p (addr->offset))
! parts->offset = double_int_to_tree (type, addr->offset);
else
parts->offset = NULL_TREE;
*************** addr_to_parts (struct affine_tree_combin
*** 466,474 ****
/* Then try to process the remaining elements. */
for (i = 0; i < addr->n; i++)
! add_to_parts (parts, type, addr->elts[i], addr->coefs[i]);
if (addr->rest)
! add_to_parts (parts, type, addr->rest, 1);
}
/* Force the PARTS to register. */
--- 478,492 ----
/* Then try to process the remaining elements. */
for (i = 0; i < addr->n; i++)
! {
! part = fold_convert (type, addr->elts[i]);
! if (!double_int_one_p (addr->coefs[i]))
! part = fold_build2 (MULT_EXPR, type, part,
! double_int_to_tree (type, addr->coefs[i]));
! add_to_parts (parts, type, part);
! }
if (addr->rest)
! add_to_parts (parts, type, addr->rest);
}
/* Force the PARTS to register. */
*************** gimplify_mem_ref_parts (block_stmt_itera
*** 489,496 ****
of created memory reference. */
tree
! create_mem_ref (block_stmt_iterator *bsi, tree type,
! struct affine_tree_combination *addr)
{
tree mem_ref, tmp;
tree addr_type = build_pointer_type (type);
--- 507,513 ----
of created memory reference. */
tree
! create_mem_ref (block_stmt_iterator *bsi, tree type, aff_tree *addr)
{
tree mem_ref, tmp;
tree addr_type = build_pointer_type (type);
Index: tree-affine.c
===================================================================
*** tree-affine.c (revision 0)
--- tree-affine.c (revision 0)
***************
*** 0 ****
--- 1,642 ----
+ /* Operations with affine combinations of trees.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by the
+ Free Software Foundation; either version 2, or (at your option) any
+ later version.
+
+ GCC is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING. If not, write to the Free
+ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+ 02110-1301, USA. */
+
+ #include "config.h"
+ #include "system.h"
+ #include "coretypes.h"
+ #include "tm.h"
+ #include "tree.h"
+ #include "rtl.h"
+ #include "tm_p.h"
+ #include "hard-reg-set.h"
+ #include "output.h"
+ #include "diagnostic.h"
+ #include "tree-dump.h"
+ #include "tree-affine.h"
+
+ /* Restricts constant CST according to precision of combination COMB. */
+
+ static inline double_int
+ restrict_cst_to_precision (double_int mask, double_int cst)
+ {
+ double_int r;
+
+ r.low = cst.low & mask.low;
+ r.high = cst.high & mask.high;
+
+ return r;
+ }
+
+ /* Returns true if X fits in HOST_WIDE_INT, with respect to precision
+ of MASK. */
+
+ bool
+ double_int_fits_in_hwi_p (double_int mask, double_int x)
+ {
+ if (double_int_negative_p (mask, x))
+ return x.high == mask.high;
+ else
+ return x.high == 0 && (HOST_WIDE_INT) x.low >= 0;
+ }
+
+ /* Returns true if X fits in unsigned HOST_WIDE_INT. */
+
+ bool
+ double_int_fits_in_unsigned_hwi_p (double_int x)
+ {
+ return x.high == 0;
+ }
+
+ /* Returns true if SCALE is negative in precision of MASK. */
+
+ bool
+ double_int_negative_p (double_int mask, double_int scale)
+ {
+ if (mask.high)
+ return (scale.high & ~(mask.high >> 1)) != 0;
+ else
+ return (scale.low & ~(mask.low >> 1)) != 0;
+ }
+
+ /* Returns value of X with respect to precision of MASK. */
+
+ HOST_WIDE_INT
+ double_int_to_hwi (double_int mask, double_int x)
+ {
+ if (mask.high || !double_int_negative_p (mask, x))
+ return x.low;
+ else
+ return x.low | ~mask.low;
+ }
+
+ /* Returns value of X. */
+
+ unsigned HOST_WIDE_INT
+ double_int_to_unsigned_hwi (double_int x)
+ {
+ return x.low;
+ }
+
+ /* Returns A * B, truncated so that it fits MASK. */
+
+ double_int
+ double_int_mul (double_int mask, double_int a, double_int b)
+ {
+ unsigned HOST_WIDE_INT lo;
+ HOST_WIDE_INT hi;
+ double_int ret;
+
+ mul_double (a.low, a.high, b.low, b.high, &lo, &hi);
+ ret.low = lo;
+ ret.high = hi;
+
+ return restrict_cst_to_precision (mask, ret);
+ }
+
+ /* Returns A + B, truncated so that it fits MASK. */
+
+ double_int
+ double_int_add (double_int mask, double_int a, double_int b)
+ {
+ unsigned HOST_WIDE_INT lo;
+ HOST_WIDE_INT hi;
+ double_int ret;
+
+ add_double (a.low, a.high, b.low, b.high, &lo, &hi);
+ ret.low = lo;
+ ret.high = hi;
+
+ return restrict_cst_to_precision (mask, ret);
+ }
+
+ /* Returns -A, truncated so that it fits MASK. */
+
+ double_int
+ double_int_negate (double_int mask, double_int a)
+ {
+ unsigned HOST_WIDE_INT lo;
+ HOST_WIDE_INT hi;
+ double_int ret;
+
+ neg_double (a.low, a.high, &lo, &hi);
+ ret.low = lo;
+ ret.high = hi;
+
+ return restrict_cst_to_precision (mask, ret);
+ }
+
+ /* Returns A / B (computed as unsigned, rounded down). */
+
+ double_int
+ double_int_divide (double_int a, double_int b)
+ {
+ unsigned HOST_WIDE_INT lo, rem_lo;
+ HOST_WIDE_INT hi, rem_hi;
+ double_int ret;
+
+ div_and_round_double (FLOOR_DIV_EXPR, true, a.low, a.high, b.low, b.high,
+ &lo, &hi, &rem_lo, &rem_hi);
+ ret.low = lo;
+ ret.high = hi;
+
+ return ret;
+ }
+ /* Constructs tree in type TYPE from double_int CST. */
+
+ tree
+ double_int_to_tree (tree type, double_int cst)
+ {
+ unsigned HOST_WIDE_INT lo = cst.low;
+ HOST_WIDE_INT hi = cst.high;
+ unsigned prec = TYPE_PRECISION (type);
+ bool negative;
+
+ if (prec > HOST_BITS_PER_WIDE_INT)
+ {
+ prec -= HOST_BITS_PER_WIDE_INT;
+ negative = (hi >> (prec - 1)) & 1;
+ if (negative)
+ hi |= (~(unsigned HOST_WIDE_INT) 0) << (prec - 1) << 1;
+ }
+ else
+ {
+ negative = (cst.low >> (prec - 1)) & 1;
+ if (negative)
+ {
+ lo |= (~(unsigned HOST_WIDE_INT) 0) << (prec - 1) << 1;
+ hi = ~(unsigned HOST_WIDE_INT) 0;
+ }
+ }
+
+ return build_int_cst_wide (type, lo, hi);
+ }
+
+ /* Returns true if VAL is smaller or equal to the maximal value
+ representable in TYPE. */
+
+ bool
+ double_int_fits_to_type_p (tree type, double_int val)
+ {
+ unsigned prec = TYPE_PRECISION (type);
+ double_int mask = double_int_mask (TYPE_UNSIGNED (type) ? prec : prec - 1);
+
+ return (((val.low & mask.low) == val.low)
+ && ((val.high & mask.high) == val.high));
+ }
+
+ /* Returns true if A < B, unsigned comparison. */
+
+ bool
+ double_int_smaller_p (double_int a, double_int b)
+ {
+ if (a.high < b.high)
+ return true;
+ if (a.high > b.high)
+ return false;
+ return a.low < b.low;
+ }
+
+ /* Splits last digit of *X in BASE and returns it. */
+
+ static unsigned
+ double_int_split_digit (double_int *x, unsigned base)
+ {
+ unsigned HOST_WIDE_INT resl, reml;
+ HOST_WIDE_INT resh, remh;
+
+ div_and_round_double (FLOOR_DIV_EXPR, true, x->low, x->high, base, 0,
+ &resl, &resh, &reml, &remh);
+ x->high = resh;
+ x->low = resl;
+
+ return reml;
+ }
+
+ /* Dumps X (in precision MASK) to FILE. If SIGN is true, X is considered to
+ be signed. */
+
+ void
+ dump_double_int (FILE *file, double_int mask, double_int x, bool sign)
+ {
+ unsigned digits[100], n;
+ int i;
+
+ if (double_int_zero_p (x))
+ {
+ fprintf (file, "0");
+ return;
+ }
+
+ if (sign && double_int_negative_p (mask, x))
+ {
+ fprintf (file, "-");
+ x = double_int_negate (mask, x);
+ }
+
+ for (n = 0; !double_int_zero_p (x); n++)
+ digits[n] = double_int_split_digit (&x, 10);
+ for (i = n - 1; i >= 0; i--)
+ fprintf (file, "%u", digits[i]);
+ }
+
+ /* Returns a MASK for PREC bytes. */
+
+ double_int
+ double_int_mask (unsigned prec)
+ {
+ double_int mask;
+
+ if (prec > HOST_BITS_PER_WIDE_INT)
+ {
+ prec -= HOST_BITS_PER_WIDE_INT;
+ mask.high = (((unsigned HOST_WIDE_INT) 2 << (prec - 1)) - 1);
+ mask.low = ~(unsigned HOST_WIDE_INT) 0;
+ }
+ else
+ {
+ mask.high = 0;
+ mask.low = (((unsigned HOST_WIDE_INT) 2 << (prec - 1)) - 1);
+ }
+
+ return mask;
+ }
+
+ /* Initializes affine combination COMB so that its value is zero in TYPE. */
+
+ static void
+ aff_combination_zero (aff_tree *comb, tree type)
+ {
+ comb->type = type;
+ comb->mask = double_int_mask (TYPE_PRECISION (type));
+
+ comb->offset.low = 0;
+ comb->offset.high = 0;
+ comb->n = 0;
+ comb->rest = NULL_TREE;
+ }
+
+ /* Sets COMB to CST. */
+
+ void
+ aff_combination_const (aff_tree *comb, tree type, double_int cst)
+ {
+ aff_combination_zero (comb, type);
+ comb->offset = restrict_cst_to_precision (comb->mask, cst);
+ }
+
+ /* Sets COMB to single element ELT. */
+
+ void
+ aff_combination_elt (aff_tree *comb, tree type, tree elt)
+ {
+ aff_combination_zero (comb, type);
+
+ comb->n = 1;
+ comb->elts[0] = elt;
+ comb->coefs[0] = hwi_to_double_int (1);
+ }
+
+ /* Scales COMB by SCALE. */
+
+ void
+ aff_combination_scale (aff_tree *comb, double_int scale)
+ {
+ unsigned i, j;
+
+ scale = restrict_cst_to_precision (comb->mask, scale);
+ if (double_int_one_p (scale))
+ return;
+
+ if (double_int_zero_p (scale))
+ {
+ aff_combination_zero (comb, comb->type);
+ return;
+ }
+
+ comb->offset = double_int_mul (comb->mask, scale, comb->offset);
+ for (i = 0, j = 0; i < comb->n; i++)
+ {
+ comb->coefs[j] = double_int_mul (comb->mask, scale, comb->coefs[i]);
+ comb->elts[j] = comb->elts[i];
+ /* A coefficient may become zero due to overflow. Remove the zero
+ elements. */
+ if (!double_int_zero_p (comb->coefs[j]))
+ j++;
+ }
+ comb->n = j;
+
+ if (comb->rest)
+ {
+ if (comb->n < MAX_AFF_ELTS)
+ {
+ comb->coefs[comb->n] = scale;
+ comb->elts[comb->n] = comb->rest;
+ comb->rest = NULL_TREE;
+ comb->n++;
+ }
+ else
+ comb->rest = fold_build2 (MULT_EXPR, comb->type, comb->rest,
+ double_int_to_tree (comb->type, scale));
+ }
+ }
+
+ /* Adds ELT * SCALE to COMB. */
+
+ void
+ aff_combination_add_elt (aff_tree *comb, tree elt, double_int scale)
+ {
+ unsigned i;
+
+ if (double_int_zero_p (scale))
+ return;
+
+ for (i = 0; i < comb->n; i++)
+ if (operand_equal_p (comb->elts[i], elt, 0))
+ {
+ comb->coefs[i] = double_int_add (comb->mask, comb->coefs[i], scale);
+ if (!double_int_zero_p (comb->coefs[i]))
+ return;
+
+ comb->n--;
+ comb->coefs[i] = comb->coefs[comb->n];
+ comb->elts[i] = comb->elts[comb->n];
+
+ if (comb->rest)
+ {
+ gcc_assert (comb->n == MAX_AFF_ELTS - 1);
+ comb->coefs[comb->n] = hwi_to_double_int (1);
+ comb->elts[comb->n] = comb->rest;
+ comb->rest = NULL_TREE;
+ comb->n++;
+ }
+ return;
+ }
+ if (comb->n < MAX_AFF_ELTS)
+ {
+ comb->coefs[comb->n] = scale;
+ comb->elts[comb->n] = elt;
+ comb->n++;
+ return;
+ }
+
+ if (double_int_one_p (scale))
+ elt = fold_convert (comb->type, elt);
+ else
+ elt = fold_build2 (MULT_EXPR, comb->type,
+ fold_convert (comb->type, elt),
+ double_int_to_tree (comb->type, scale));
+
+ if (comb->rest)
+ comb->rest = fold_build2 (PLUS_EXPR, comb->type, comb->rest, elt);
+ else
+ comb->rest = elt;
+ }
+
+ /* Adds COMB2 to COMB1. */
+
+ void
+ aff_combination_add (aff_tree *comb1, aff_tree *comb2)
+ {
+ unsigned i;
+
+ comb1->offset = double_int_add (comb1->mask, comb1->offset, comb2->offset);
+ for (i = 0; i < comb2->n; i++)
+ aff_combination_add_elt (comb1, comb2->elts[i], comb2->coefs[i]);
+ if (comb2->rest)
+ aff_combination_add_elt (comb1, comb2->rest, hwi_to_double_int (1));
+ }
+
+ /* Converts affine combination COMB to TYPE. */
+
+ void
+ aff_combination_convert (aff_tree *comb, tree type)
+ {
+ unsigned i, j;
+ tree comb_type = comb->type;
+
+ gcc_assert (TYPE_PRECISION (type) <= TYPE_PRECISION (comb_type));
+ comb->type = type;
+ if (comb->rest)
+ comb->rest = fold_convert (type, comb->rest);
+
+ if (TYPE_PRECISION (type) == TYPE_PRECISION (comb_type))
+ return;
+ comb->mask = double_int_mask (TYPE_PRECISION (type));
+
+ comb->offset = restrict_cst_to_precision (comb->mask, comb->offset);
+ for (i = j = 0; i < comb->n; i++)
+ {
+ comb->coefs[j] = restrict_cst_to_precision (comb->mask, comb->coefs[i]);
+ if (double_int_zero_p (comb->coefs[j]))
+ continue;
+ comb->elts[j] = fold_convert (type, comb->elts[i]);
+ j++;
+ }
+
+ comb->n = j;
+ if (comb->n < MAX_AFF_ELTS && comb->rest)
+ {
+ comb->coefs[comb->n] = hwi_to_double_int (1);
+ comb->elts[comb->n] = comb->rest;
+ comb->rest = NULL_TREE;
+ comb->n++;
+ }
+ }
+
+ /* Splits EXPR into an affine combination of parts. */
+
+ void
+ tree_to_aff_combination (tree expr, tree type, aff_tree *comb)
+ {
+ aff_tree tmp;
+ enum tree_code code;
+ tree cst, core, toffset;
+ HOST_WIDE_INT bitpos, bitsize;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ STRIP_NOPS (expr);
+
+ code = TREE_CODE (expr);
+ switch (code)
+ {
+ case INTEGER_CST:
+ aff_combination_const (comb, type, tree_to_double_int (expr));
+ return;
+
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+ tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+ tree_to_aff_combination (TREE_OPERAND (expr, 1), type, &tmp);
+ if (code == MINUS_EXPR)
+ aff_combination_scale (&tmp, hwi_to_double_int (-1));
+ aff_combination_add (comb, &tmp);
+ return;
+
+ case MULT_EXPR:
+ cst = TREE_OPERAND (expr, 1);
+ if (TREE_CODE (cst) != INTEGER_CST)
+ break;
+ tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+ aff_combination_scale (comb, tree_to_double_int (cst));
+ return;
+
+ case NEGATE_EXPR:
+ tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+ aff_combination_scale (comb, hwi_to_double_int (-1));
+ return;
+
+ case ADDR_EXPR:
+ core = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize, &bitpos,
+ &toffset, &mode, &unsignedp, &volatilep,
+ false);
+ if (bitpos % BITS_PER_UNIT != 0)
+ break;
+ aff_combination_const (comb, type,
+ hwi_to_double_int (bitpos / BITS_PER_UNIT));
+ core = build_fold_addr_expr (core);
+ if (TREE_CODE (core) == ADDR_EXPR)
+ aff_combination_add_elt (comb, core, hwi_to_double_int (1));
+ else
+ {
+ tree_to_aff_combination (core, type, &tmp);
+ aff_combination_add (comb, &tmp);
+ }
+ if (toffset)
+ {
+ tree_to_aff_combination (toffset, type, &tmp);
+ aff_combination_add (comb, &tmp);
+ }
+ return;
+
+ default:
+ break;
+ }
+
+ aff_combination_elt (comb, type, expr);
+ }
+
+ /* Creates EXPR + ELT * SCALE in TYPE. EXPR is taken from affine
+ combination COMB. */
+
+ static tree
+ add_elt_to_tree (tree expr, tree type, tree elt, double_int scale,
+ aff_tree *comb)
+ {
+ enum tree_code code;
+
+ scale = restrict_cst_to_precision (comb->mask, scale);
+ elt = fold_convert (type, elt);
+
+ if (double_int_one_p (scale))
+ {
+ if (!expr)
+ return elt;
+
+ return fold_build2 (PLUS_EXPR, type, expr, elt);
+ }
+
+ if (double_int_minus_one_p (comb->mask, scale))
+ {
+ if (!expr)
+ return fold_build1 (NEGATE_EXPR, type, elt);
+
+ return fold_build2 (MINUS_EXPR, type, expr, elt);
+ }
+
+ if (!expr)
+ return fold_build2 (MULT_EXPR, type, elt,
+ double_int_to_tree (type, scale));
+
+ if (double_int_negative_p (comb->mask, scale))
+ {
+ code = MINUS_EXPR;
+ scale = double_int_negate (comb->mask, scale);
+ }
+ else
+ code = PLUS_EXPR;
+
+ elt = fold_build2 (MULT_EXPR, type, elt,
+ double_int_to_tree (type, scale));
+ return fold_build2 (code, type, expr, elt);
+ }
+
+ /* Makes tree from the affine combination COMB. */
+
+ tree
+ aff_combination_to_tree (aff_tree *comb)
+ {
+ tree type = comb->type;
+ tree expr = comb->rest;
+ unsigned i;
+ double_int off, sgn;
+
+ gcc_assert (comb->n == MAX_AFF_ELTS || comb->rest == NULL_TREE);
+
+ for (i = 0; i < comb->n; i++)
+ expr = add_elt_to_tree (expr, type, comb->elts[i], comb->coefs[i], comb);
+
+ if (double_int_negative_p (comb->mask, comb->offset))
+ {
+ /* Offset is negative. */
+ off = double_int_negate (comb->mask, comb->offset);
+ sgn = comb->mask;
+ }
+ else
+ {
+ off = comb->offset;
+ sgn = hwi_to_double_int (1);
+ }
+ return add_elt_to_tree (expr, type, double_int_to_tree (type, off), sgn,
+ comb);
+ }
+
+ /* Copies the tree elements of COMB to ensure that they are not shared. */
+
+ void
+ unshare_aff_combination (aff_tree *comb)
+ {
+ unsigned i;
+
+ for (i = 0; i < comb->n; i++)
+ comb->elts[i] = unshare_expr (comb->elts[i]);
+ if (comb->rest)
+ comb->rest = unshare_expr (comb->rest);
+ }
+
+ /* Remove M-th element from COMB. */
+
+ void
+ aff_combination_remove_elt (aff_tree *comb, unsigned m)
+ {
+ comb->n--;
+ if (m <= comb->n)
+ {
+ comb->coefs[m] = comb->coefs[comb->n];
+ comb->elts[m] = comb->elts[comb->n];
+ }
+ if (comb->rest)
+ {
+ comb->coefs[comb->n] = hwi_to_double_int (1);
+ comb->elts[comb->n] = comb->rest;
+ comb->rest = NULL_TREE;
+ comb->n++;
+ }
+ }
Index: tree-affine.h
===================================================================
*** tree-affine.h (revision 0)
--- tree-affine.h (revision 0)
***************
*** 0 ****
--- 1,141 ----
+ /* Operations with affine combinations of trees.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by the
+ Free Software Foundation; either version 2, or (at your option) any
+ later version.
+
+ GCC is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING. If not, write to the Free
+ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+ 02110-1301, USA. */
+
+ /* Affine combination of trees. We keep track of at most MAX_AFF_ELTS elements
+ to make things simpler; this is sufficient in most cases. */
+
+ #define MAX_AFF_ELTS 8
+
+ typedef struct affine_tree_combination
+ {
+ /* Type of the result of the combination. */
+ tree type;
+
+ /* Mask modulo that the operations are performed. */
+ double_int mask;
+
+ /* Constant offset. */
+ double_int offset;
+
+ /* Number of elements of the combination. */
+ unsigned n;
+
+ /* Elements and their coefficients. Type of elements may be different from
+ TYPE, but their sizes must be the same (STRIP_NOPS is applied to the
+ elements). */
+ tree elts[MAX_AFF_ELTS];
+ double_int coefs[MAX_AFF_ELTS];
+
+ /* Remainder of the expression. Usually NULL, used only if there are more
+ than MAX_AFF_ELTS elements. Type of REST must be TYPE. */
+ tree rest;
+ } aff_tree;
+
+ void aff_combination_const (aff_tree *, tree, double_int);
+ void aff_combination_elt (aff_tree *, tree, tree);
+ void aff_combination_scale (aff_tree *, double_int);
+ void aff_combination_add (aff_tree *, aff_tree *);
+ void aff_combination_add_elt (aff_tree *, tree, double_int);
+ void aff_combination_remove_elt (aff_tree *, unsigned);
+ void aff_combination_convert (aff_tree *, tree);
+ void tree_to_aff_combination (tree, tree, aff_tree *);
+ tree aff_combination_to_tree (aff_tree *);
+ void unshare_aff_combination (aff_tree *);
+ tree double_int_to_tree (tree, double_int);
+ bool double_int_fits_to_type_p (tree, double_int);
+ bool double_int_fits_in_hwi_p (double_int, double_int);
+ HOST_WIDE_INT double_int_to_hwi (double_int, double_int);
+ bool double_int_fits_in_unsigned_hwi_p (double_int);
+ unsigned HOST_WIDE_INT double_int_to_unsigned_hwi (double_int);
+ double_int double_int_mul (double_int, double_int, double_int);
+ double_int double_int_add (double_int, double_int, double_int);
+ double_int double_int_negate (double_int, double_int);
+ double_int double_int_divide (double_int, double_int);
+ bool double_int_negative_p (double_int, double_int);
+ bool double_int_smaller_p (double_int, double_int);
+ void dump_double_int (FILE *, double_int, double_int, bool);
+ double_int double_int_mask (unsigned);
+
+ /* Constructs double_int from tree CST. */
+
+ static inline double_int
+ tree_to_double_int (tree cst)
+ {
+ double_int r;
+
+ r.low = TREE_INT_CST_LOW (cst);
+ r.high = TREE_INT_CST_HIGH (cst);
+
+ return r;
+ }
+
+ /* Constructs double_int from integer CST. */
+
+ static inline double_int
+ hwi_to_double_int (HOST_WIDE_INT cst)
+ {
+ double_int r;
+
+ r.low = cst;
+ r.high = cst < 0 ? ~(unsigned HOST_WIDE_INT) 0 : 0;
+
+ return r;
+ }
+
+ /* Constructs mask with all bits 1. */
+
+ static inline double_int
+ double_int_all (void)
+ {
+ return hwi_to_double_int (-1);
+ }
+
+ /* Returns true if CST is zero. */
+
+ static inline bool
+ double_int_zero_p (double_int cst)
+ {
+ return cst.low == 0 && cst.high == 0;
+ }
+
+ /* Returns true if CST is one. */
+
+ static inline bool
+ double_int_one_p (double_int cst)
+ {
+ return cst.low == 1 && cst.high == 0;
+ }
+
+ /* Returns true if CST is minus one in precision of MASK. */
+
+ static inline bool
+ double_int_minus_one_p (double_int mask, double_int cst)
+ {
+ return cst.low == mask.low && cst.high == mask.high;
+ }
+
+ /* Returns true if CST1 == CST2. */
+
+ static inline bool
+ double_int_equal_p (double_int cst1, double_int cst2)
+ {
+ return cst1.low == cst2.low && cst1.high == cst2.high;
+ }
+
Index: tree-flow.h
===================================================================
*** tree-flow.h (revision 107712)
--- tree-flow.h (working copy)
*************** struct loop *tree_ssa_loop_version (stru
*** 759,765 ****
basic_block *);
tree expand_simple_operations (tree);
void substitute_in_loop_info (struct loop *, tree, tree);
! edge single_dom_exit (struct loop *);
/* In tree-ssa-loop-im.c */
/* The possibilities of statement movement. */
--- 759,765 ----
basic_block *);
tree expand_simple_operations (tree);
void substitute_in_loop_info (struct loop *, tree, tree);
! edge single_dom_exit (const struct loop *);
/* In tree-ssa-loop-im.c */
/* The possibilities of statement movement. */
*************** bool find_what_p_points_to (tree);
*** 836,868 ****
/* In tree-ssa-address.c */
- /* Affine combination of trees. We keep track of at most MAX_AFF_ELTS elements
- to make things simpler; this is sufficient in most cases. */
-
- #define MAX_AFF_ELTS 8
-
- struct affine_tree_combination
- {
- /* Type of the result of the combination. */
- tree type;
-
- /* Mask modulo that the operations are performed. */
- unsigned HOST_WIDE_INT mask;
-
- /* Constant offset. */
- unsigned HOST_WIDE_INT offset;
-
- /* Number of elements of the combination. */
- unsigned n;
-
- /* Elements and their coefficients. */
- tree elts[MAX_AFF_ELTS];
- unsigned HOST_WIDE_INT coefs[MAX_AFF_ELTS];
-
- /* Remainder of the expression. */
- tree rest;
- };
-
/* Description of a memory address. */
struct mem_address
--- 836,841 ----
*************** struct mem_address
*** 870,875 ****
--- 843,849 ----
tree symbol, base, index, step, offset;
};
+ struct affine_tree_combination;
tree create_mem_ref (block_stmt_iterator *, tree,
struct affine_tree_combination *);
rtx addr_for_mem_ref (struct mem_address *, bool);
Index: Makefile.in
===================================================================
*** Makefile.in (revision 107712)
--- Makefile.in (working copy)
*************** OBJS-common = \
*** 956,962 ****
tree-ssa-loop-manip.o tree-ssa-threadupdate.o \
tree-vectorizer.o tree-vect-analyze.o tree-vect-transform.o \
tree-ssa-loop-ivcanon.o tree-ssa-propagate.o tree-ssa-address.o \
! tree-ssa-math-opts.o \
tree-ssa-loop-ivopts.o tree-if-conv.o tree-ssa-loop-unswitch.o \
alias.o bb-reorder.o bitmap.o builtins.o caller-save.o calls.o \
cfg.o cfganal.o cfgbuild.o cfgcleanup.o cfglayout.o cfgloop.o \
--- 956,962 ----
tree-ssa-loop-manip.o tree-ssa-threadupdate.o \
tree-vectorizer.o tree-vect-analyze.o tree-vect-transform.o \
tree-ssa-loop-ivcanon.o tree-ssa-propagate.o tree-ssa-address.o \
! tree-ssa-math-opts.o tree-affine.o \
tree-ssa-loop-ivopts.o tree-if-conv.o tree-ssa-loop-unswitch.o \
alias.o bb-reorder.o bitmap.o builtins.o caller-save.o calls.o \
cfg.o cfganal.o cfgbuild.o cfgcleanup.o cfglayout.o cfgloop.o \
*************** tree-ssa-address.o : tree-ssa-address.c
*** 1891,1897 ****
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) \
output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
tree-pass.h $(FLAGS_H) tree-inline.h $(RECOG_H) insn-config.h $(EXPR_H) \
! gt-tree-ssa-address.h $(GGC_H)
tree-ssa-loop-niter.o : tree-ssa-loop-niter.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) $(PARAMS_H) \
tree-inline.h output.h $(DIAGNOSTIC_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
--- 1891,1897 ----
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) \
output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
tree-pass.h $(FLAGS_H) tree-inline.h $(RECOG_H) insn-config.h $(EXPR_H) \
! gt-tree-ssa-address.h $(GGC_H) tree-affine.h
tree-ssa-loop-niter.o : tree-ssa-loop-niter.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) $(PARAMS_H) \
tree-inline.h output.h $(DIAGNOSTIC_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
*************** tree-ssa-loop-ivopts.o : tree-ssa-loop-i
*** 1911,1917 ****
output.h $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
tree-pass.h $(GGC_H) $(RECOG_H) insn-config.h $(HASHTAB_H) $(SCEV_H) \
$(CFGLOOP_H) $(PARAMS_H) langhooks.h $(BASIC_BLOCK_H) hard-reg-set.h \
! tree-chrec.h $(VARRAY_H)
tree-ssa-loop-manip.o : tree-ssa-loop-manip.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
output.h $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
--- 1911,1920 ----
output.h $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
tree-pass.h $(GGC_H) $(RECOG_H) insn-config.h $(HASHTAB_H) $(SCEV_H) \
$(CFGLOOP_H) $(PARAMS_H) langhooks.h $(BASIC_BLOCK_H) hard-reg-set.h \
! tree-chrec.h $(VARRAY_H) tree-affine.h
! tree-affine.o : tree-affine.c tree-affine.h $(CONFIG_H) \
! $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) \
! output.h $(DIAGNOSTIC_H) $(TM_H) coretypes.h $(TREE_DUMP_H)
tree-ssa-loop-manip.o : tree-ssa-loop-manip.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
output.h $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
Index: hwint.h
===================================================================
*** hwint.h (revision 107712)
--- hwint.h (working copy)
*************** extern char sizeof_long_long_must_be_8[s
*** 147,150 ****
--- 147,157 ----
# define HOST_BITS_PER_WIDEST_FAST_INT HOST_BITS_PER_LONG
#endif
+ /* A structure containing a large integer. */
+
+ typedef struct
+ {
+ unsigned HOST_WIDE_INT low, high;
+ } double_int;
+
#endif /* ! GCC_HWINT_H */