Re: [patch] for PR 18316 (updated)
Hello,
> The rest of the patch looks reasonable. Do you have any benchmark
> figures on the impact of this patch? We've had problems in the past
> with the number of induction variables we find and their effect on
> register pressure. If there's no (large) performance degradation on
> SPEC with this change, I'll approve it for mainline.
Here is the patch with the changes you requested, and also an
improvement that fixes the huge regressions the old patch caused on
some testcases (mgrid).
The reason for the regressions was the following: ivopts creates
expressions like

    iv + &a[x + y + 8] - &a[x + y]

and expects fold to take care of turning this into

    iv + 8 * sizeof(*a)
However, fold is not clever enough to do this in many cases. This did
not cause many problems before, since the operations were cleaned up
after expansion to rtl. However, in mgrid there are several nested
loops, and "x" and "y" happen to be induction variables (with
nonconstant step) of those outer loops. The attempts of ivopts to
optimize these uses then make a complete mess of the code; even the rtl
optimizers are no longer able to recover the "8 * sizeof(*a)"
constants, and the result is a huge performance regression.
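For concreteness, here is a minimal, hypothetical sketch (not the
actual mgrid source) of the kind of loop nest involved; the outer
induction variables x and y have the nonconstant step n:

    double a[1000];

    void
    sketch (int n, int m)
    {
      int x, y, i;

      for (x = 0; x < m; x += n)      /* outer iv with nonconstant step */
        for (y = 0; y < m; y += n)    /* another such iv */
          for (i = 0; i < m; i++)
            /* ivopts sees the addresses &a[x + y + i + 8]
               and &a[x + y + i] here.  */
            a[x + y + i + 8] += a[x + y + i];
    }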
The patch fixes this problem by implementing improved folding of this
type of expression (basically for affine combinations of variables).
This may be relatively expensive at compile time, so I have not added
it to fold, but kept it as a separate set of functions.
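To illustrate the decomposition on the example above (writing S for
sizeof(*a)), the two addresses roughly decompose to

    &a[x + y + 8]  ->  &a + S*x + S*y + 8*S
    &a[x + y]      ->  &a + S*x + S*y

Adding the second combination scaled by -1 to the first cancels the
&a, S*x and S*y elements, leaving only the constant offset 8*S, which
is then converted back to a tree.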
(On a side note, the decomposition to an affine expression can be used
to significantly simplify the generation of TARGET_MEM_REFs in the
patch http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01360.html;
I would like to work on that once this patch is approved.)
With this version of the patch, we get no significant regressions on
SPEC on i686: the SPECint score is unchanged overall, and SPECfp shows
a minor improvement (see the results below; measured at -O2, with the
patched compiler as peak).
Bootstrapped & regtested on i686, x86_64 and ia64.
Zdenek
                      base (unpatched)         peak (patched)
Benchmark           ref   time  ratio        ref   time  ratio
164.gzip           1400    199    704 *     1400    199    704 *
175.vpr            1400    393    356 *     1400    392    357 *
176.gcc               X                        X
181.mcf            1800    739    244 *     1800    736    245 *
186.crafty         1000    120    834 *     1000    118    846 *
197.parser         1800    395    456 *     1800    394    457 *
252.eon            1300    143    909 *     1300    142    913 *
253.perlbmk        1800    203    887 *     1800    204    884 *
254.gap            1100    174    634 *     1100    175    629 *
255.vortex         1900    232    818 *     1900    232    817 *
256.bzip2          1500    335    447 *     1500    338    444 *
300.twolf          3000    786    382 *     3000    786    382 *
Est. SPECint_base2000   559
Est. SPECint2000        559
                      base (unpatched)         peak (patched)
Benchmark           ref   time  ratio        ref   time  ratio
168.wupwise        1600    283    565 *     1600    283    566 *
171.swim           3100    673    461 *     3100    676    458 *
172.mgrid          1800    507    355 *     1800    480    375 *
173.applu          2100    578    363 *     2100    542    388 *
177.mesa           1400    201    696 *     1400    202    695 *
178.galgel            X                        X
179.art            2600   1472    177 *     2600   1465    178 *
183.equake         1300    291    446 *     1300    294    443 *
187.facerec        1900    478    397 *     1900    468    406 *
188.ammp           2200    610    361 *     2200    621    354 *
189.lucas          2000    428    467 *     2000    429    467 *
191.fma3d          2100    396    531 *     2100    396    530 *
200.sixtrack       1100    237    463 *     1100    238    461 *
301.apsi           2600    688    378 *     2600    660    394 *
Est. SPECfp_base2000    417
Est. SPECfp2000         422
PR tree-optimization/18316
PR tree-optimization/19126
* tree.c (build_int_cst_type): Avoid shift by size of type.
* tree-scalar-evolution.c (simple_iv): Add allow_nonconstant_step
argument.
* tree-scalar-evolution.h (simple_iv): Declaration changed.
* tree-ssa-loop-ivopts.c (struct iv_cand): Add depends_on
field.
(dump_cand): Dump depends_on information.
(determine_biv_step): Add argument to simple_iv call.
(contains_abnormal_ssa_name_p): Handle case expr == NULL.
(find_bivs, find_givs_in_stmt_scev): Do not require step to be a
constant.
(add_candidate_1): Record depends_on for candidates.
(tree_int_cst_sign_bit, constant_multiple_of): New functions.
(get_computation_at, get_computation_cost_at, may_eliminate_iv):
Handle ivs with nonconstant step.
(iv_ca_set_remove_invariants, iv_ca_set_add_invariants): New functions.
(iv_ca_set_no_cp, iv_ca_set_cp): Handle cand->depends_on.
(create_new_iv): Unshare the step before passing it to create_iv.
(free_loop_data): Free cand->depends_on.
(build_addr_strip_iref): New function.
(find_interesting_uses_address): Use build_addr_strip_iref.
(strip_offset_1): Split the recursive part from strip_offset.
Strip constant offset component_refs and array_refs.
(strip_offset): Split the recursive part to strip_offset_1.
(add_address_candidates): Removed.
(add_derived_ivs_candidates): Do not use add_address_candidates.
(add_iv_value_candidates): Add candidates with stripped constant
offset. Consider all candidates with initial value 0 important.
(struct affine_tree_combination): New.
(aff_combination_const, aff_combination_elt, aff_combination_scale,
aff_combination_add_elt, aff_combination_add,
tree_to_aff_combination, add_elt_to_tree, aff_combination_to_tree,
fold_affine_sum): New functions.
(get_computation_at): Use fold_affine_sum.
* tree-ssa-loop-manip.c (create_iv): Handle ivs with nonconstant step.
* tree-ssa-loop-niter.c (number_of_iterations_exit): Add argument
to simple_iv call.
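A note on the tree.c change above: the old code computed the mask as
"(~(unsigned HOST_WIDE_INT) 0) << bits", which is undefined when "bits"
equals the width of the type (a shift by the full operand width). The
following standalone sketch (names are mine, for illustration only)
shows why the replacement form is safe:

    #include <limits.h>
    #include <stdio.h>

    /* Mask with the low BITS bits set.  "2 << (bits - 1)" shifts by at
       most width - 1, so it is well defined for any bits >= 1; when
       BITS equals the width, the shift wraps to 0 and the "- 1" yields
       an all-ones mask.  */
    static unsigned long
    low_bits_mask (unsigned bits)
    {
      return (((unsigned long) 2) << (bits - 1)) - 1;
    }

    int
    main (void)
    {
      printf ("%lx\n", low_bits_mask (8));  /* prints ff */
      printf ("%lx\n", low_bits_mask (CHAR_BIT * sizeof (unsigned long)));
      return 0;
    }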
Index: tree-scalar-evolution.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.21
diff -c -3 -p -r2.21 tree-scalar-evolution.c
*** tree-scalar-evolution.c 21 Apr 2005 08:48:53 -0000 2.21
--- tree-scalar-evolution.c 22 Apr 2005 21:23:04 -0000
*************** scev_reset (void)
*** 2547,2556 ****
}
/* Checks whether OP behaves as a simple affine iv of LOOP in STMT and returns
! its BASE and STEP if possible. */
bool
! simple_iv (struct loop *loop, tree stmt, tree op, tree *base, tree *step)
{
basic_block bb = bb_for_stmt (stmt);
tree type, ev;
--- 2547,2559 ----
}
/* Checks whether OP behaves as a simple affine iv of LOOP in STMT and returns
! its BASE and STEP if possible. If ALLOW_NONCONSTANT_STEP is true, we
! only require STEP to be invariant in LOOP. Otherwise we require it to
! be an integer constant. */
bool
! simple_iv (struct loop *loop, tree stmt, tree op, tree *base, tree *step,
! bool allow_nonconstant_step)
{
basic_block bb = bb_for_stmt (stmt);
tree type, ev;
*************** simple_iv (struct loop *loop, tree stmt,
*** 2579,2586 ****
return false;
*step = CHREC_RIGHT (ev);
! if (TREE_CODE (*step) != INTEGER_CST)
return false;
*base = CHREC_LEFT (ev);
if (tree_contains_chrecs (*base, NULL)
|| chrec_contains_symbols_defined_in_loop (*base, loop->num))
--- 2582,2596 ----
return false;
*step = CHREC_RIGHT (ev);
! if (allow_nonconstant_step)
! {
! if (tree_contains_chrecs (*step, NULL)
! || chrec_contains_symbols_defined_in_loop (*step, loop->num))
! return false;
! }
! else if (TREE_CODE (*step) != INTEGER_CST)
return false;
+
*base = CHREC_LEFT (ev);
if (tree_contains_chrecs (*base, NULL)
|| chrec_contains_symbols_defined_in_loop (*base, loop->num))
Index: tree-scalar-evolution.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.h,v
retrieving revision 2.3
diff -c -3 -p -r2.3 tree-scalar-evolution.h
*** tree-scalar-evolution.h 17 Nov 2004 22:06:00 -0000 2.3
--- tree-scalar-evolution.h 22 Apr 2005 21:23:04 -0000
*************** extern tree analyze_scalar_evolution (st
*** 32,37 ****
extern tree instantiate_parameters (struct loop *, tree);
extern void gather_stats_on_scev_database (void);
extern void scev_analysis (void);
! extern bool simple_iv (struct loop *, tree, tree, tree *, tree *);
#endif /* GCC_TREE_SCALAR_EVOLUTION_H */
--- 32,37 ----
extern tree instantiate_parameters (struct loop *, tree);
extern void gather_stats_on_scev_database (void);
extern void scev_analysis (void);
! extern bool simple_iv (struct loop *, tree, tree, tree *, tree *, bool);
#endif /* GCC_TREE_SCALAR_EVOLUTION_H */
Index: tree-ssa-loop-ivopts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
retrieving revision 2.60
diff -c -3 -p -r2.60 tree-ssa-loop-ivopts.c
*** tree-ssa-loop-ivopts.c 17 Apr 2005 06:41:53 -0000 2.60
--- tree-ssa-loop-ivopts.c 22 Apr 2005 21:23:04 -0000
*************** struct iv_cand
*** 188,193 ****
--- 188,195 ----
to replace the final value of an iv by direct
computation of the value. */
unsigned cost; /* Cost of the candidate. */
+ bitmap depends_on; /* The list of invariants that are used in the step of
+ the biv. */
};
/* The data used by the induction variable optimizations. */
*************** dump_cand (FILE *file, struct iv_cand *c
*** 481,486 ****
--- 483,494 ----
fprintf (file, "candidate %d%s\n",
cand->id, cand->important ? " (important)" : "");
+ if (cand->depends_on)
+ {
+ fprintf (file, " depends on ");
+ dump_bitmap (file, cand->depends_on);
+ }
+
if (!iv)
{
fprintf (file, " final value replacement\n");
*************** get_iv (struct ivopts_data *data, tree v
*** 853,875 ****
return name_info (data, var)->iv;
}
! /* Determines the step of a biv defined in PHI. */
static tree
determine_biv_step (tree phi)
{
struct loop *loop = bb_for_stmt (phi)->loop_father;
tree name = PHI_RESULT (phi), base, step;
- tree type = TREE_TYPE (name);
if (!is_gimple_reg (name))
return NULL_TREE;
! if (!simple_iv (loop, phi, name, &base, &step))
return NULL_TREE;
! if (!step)
! return build_int_cst (type, 0);
return step;
}
--- 861,883 ----
return name_info (data, var)->iv;
}
! /* Determines the step of a biv defined in PHI. Returns NULL if PHI does
! not define a simple affine biv with nonzero step. */
static tree
determine_biv_step (tree phi)
{
struct loop *loop = bb_for_stmt (phi)->loop_father;
tree name = PHI_RESULT (phi), base, step;
if (!is_gimple_reg (name))
return NULL_TREE;
! if (!simple_iv (loop, phi, name, &base, &step, true))
return NULL_TREE;
! if (zero_p (step))
! return NULL_TREE;
return step;
}
*************** idx_contains_abnormal_ssa_name_p (tree b
*** 912,920 ****
static bool
contains_abnormal_ssa_name_p (tree expr)
{
! enum tree_code code = TREE_CODE (expr);
! enum tree_code_class class = TREE_CODE_CLASS (code);
!
if (code == SSA_NAME)
return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
--- 920,934 ----
static bool
contains_abnormal_ssa_name_p (tree expr)
{
! enum tree_code code;
! enum tree_code_class class;
!
! if (!expr)
! return false;
!
! code = TREE_CODE (expr);
! class = TREE_CODE_CLASS (code);
!
if (code == SSA_NAME)
return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
*************** find_bivs (struct ivopts_data *data)
*** 963,987 ****
continue;
step = determine_biv_step (phi);
-
if (!step)
continue;
- if (cst_and_fits_in_hwi (step)
- && int_cst_value (step) == 0)
- continue;
base = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
! if (contains_abnormal_ssa_name_p (base))
continue;
type = TREE_TYPE (PHI_RESULT (phi));
base = fold_convert (type, base);
! step = fold_convert (type, step);
!
! /* FIXME: We do not handle induction variables whose step does
! not satisfy cst_and_fits_in_hwi. */
! if (!cst_and_fits_in_hwi (step))
! continue;
set_iv (data, PHI_RESULT (phi), base, step);
found = true;
--- 977,994 ----
continue;
step = determine_biv_step (phi);
if (!step)
continue;
base = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
! if (contains_abnormal_ssa_name_p (base)
! || contains_abnormal_ssa_name_p (step))
continue;
type = TREE_TYPE (PHI_RESULT (phi));
base = fold_convert (type, base);
! if (step)
! step = fold_convert (type, step);
set_iv (data, PHI_RESULT (phi), base, step);
found = true;
*************** find_givs_in_stmt_scev (struct ivopts_da
*** 1042,1057 ****
if (TREE_CODE (lhs) != SSA_NAME)
return false;
! if (!simple_iv (loop, stmt, TREE_OPERAND (stmt, 1), base, step))
! return false;
!
! /* FIXME: We do not handle induction variables whose step does
! not satisfy cst_and_fits_in_hwi. */
! if (!zero_p (*step)
! && !cst_and_fits_in_hwi (*step))
return false;
! if (contains_abnormal_ssa_name_p (*base))
return false;
return true;
--- 1049,1059 ----
if (TREE_CODE (lhs) != SSA_NAME)
return false;
! if (!simple_iv (loop, stmt, TREE_OPERAND (stmt, 1), base, step, true))
return false;
! if (contains_abnormal_ssa_name_p (*base)
! || contains_abnormal_ssa_name_p (*step))
return false;
return true;
*************** idx_find_step (tree base, tree *idx, voi
*** 1440,1452 ****
return false;
}
! step = fold_binary_to_constant (MULT_EXPR, type, step, iv_step);
if (!*dta->step_p)
*dta->step_p = step;
else
! *dta->step_p = fold_binary_to_constant (PLUS_EXPR, type,
! *dta->step_p, step);
return true;
}
--- 1442,1453 ----
return false;
}
! step = fold_build2 (MULT_EXPR, type, step, iv_step);
if (!*dta->step_p)
*dta->step_p = step;
else
! *dta->step_p = fold_build2 (PLUS_EXPR, type, *dta->step_p, step);
return true;
}
*************** may_be_unaligned_p (tree ref)
*** 1498,1503 ****
--- 1499,1523 ----
return false;
}
+ /* Builds ADDR_EXPR of object OBJ. If OBJ is an INDIRECT_REF, the
+ INDIRECT_REF is stripped instead. */
+
+ static tree
+ build_addr_strip_iref (tree obj)
+ {
+ tree type;
+
+ if (TREE_CODE (obj) == INDIRECT_REF)
+ {
+ type = build_pointer_type (TREE_TYPE (obj));
+ obj = fold_convert (type, TREE_OPERAND (obj, 0));
+ }
+ else
+ obj = build_addr (obj);
+
+ return obj;
+ }
+
/* Finds addresses in *OP_P inside STMT. */
static void
*************** find_interesting_uses_address (struct iv
*** 1527,1536 ****
gcc_assert (TREE_CODE (base) != ALIGN_INDIRECT_REF);
gcc_assert (TREE_CODE (base) != MISALIGNED_INDIRECT_REF);
! if (TREE_CODE (base) == INDIRECT_REF)
! base = TREE_OPERAND (base, 0);
! else
! base = build_addr (base);
civ = alloc_iv (base, step);
record_use (data, op_p, civ, stmt, USE_ADDRESS);
--- 1547,1553 ----
gcc_assert (TREE_CODE (base) != ALIGN_INDIRECT_REF);
gcc_assert (TREE_CODE (base) != MISALIGNED_INDIRECT_REF);
! base = build_addr_strip_iref (base);
civ = alloc_iv (base, step);
record_use (data, op_p, civ, stmt, USE_ADDRESS);
*************** find_interesting_uses (struct ivopts_dat
*** 1742,1759 ****
}
/* Strips constant offsets from EXPR and stores them to OFFSET. If INSIDE_ADDR
! is true, assume we are inside an address. */
static tree
! strip_offset (tree expr, bool inside_addr, unsigned HOST_WIDE_INT *offset)
{
! tree op0 = NULL_TREE, op1 = NULL_TREE, step;
enum tree_code code;
tree type, orig_type = TREE_TYPE (expr);
unsigned HOST_WIDE_INT off0, off1, st;
tree orig_expr = expr;
STRIP_NOPS (expr);
type = TREE_TYPE (expr);
code = TREE_CODE (expr);
*offset = 0;
--- 1759,1779 ----
}
/* Strips constant offsets from EXPR and stores them to OFFSET. If INSIDE_ADDR
! is true, assume we are inside an address. If TOP_COMPREF is true, assume
! we are at the top-level of the processed address. */
static tree
! strip_offset_1 (tree expr, bool inside_addr, bool top_compref,
! unsigned HOST_WIDE_INT *offset)
{
! tree op0 = NULL_TREE, op1 = NULL_TREE, tmp, step;
enum tree_code code;
tree type, orig_type = TREE_TYPE (expr);
unsigned HOST_WIDE_INT off0, off1, st;
tree orig_expr = expr;
STRIP_NOPS (expr);
+
type = TREE_TYPE (expr);
code = TREE_CODE (expr);
*offset = 0;
*************** strip_offset (tree expr, bool inside_add
*** 1773,1780 ****
op0 = TREE_OPERAND (expr, 0);
op1 = TREE_OPERAND (expr, 1);
! op0 = strip_offset (op0, false, &off0);
! op1 = strip_offset (op1, false, &off1);
*offset = (code == PLUS_EXPR ? off0 + off1 : off0 - off1);
if (op0 == TREE_OPERAND (expr, 0)
--- 1793,1800 ----
op0 = TREE_OPERAND (expr, 0);
op1 = TREE_OPERAND (expr, 1);
! op0 = strip_offset_1 (op0, false, false, &off0);
! op1 = strip_offset_1 (op1, false, false, &off1);
*offset = (code == PLUS_EXPR ? off0 + off1 : off0 - off1);
if (op0 == TREE_OPERAND (expr, 0)
*************** strip_offset (tree expr, bool inside_add
*** 1788,1797 ****
if (code == PLUS_EXPR)
expr = op1;
else
! expr = build1 (NEGATE_EXPR, type, op1);
}
else
! expr = build2 (code, type, op0, op1);
return fold_convert (orig_type, expr);
--- 1808,1817 ----
if (code == PLUS_EXPR)
expr = op1;
else
! expr = fold_build1 (NEGATE_EXPR, type, op1);
}
else
! expr = fold_build2 (code, type, op0, op1);
return fold_convert (orig_type, expr);
*************** strip_offset (tree expr, bool inside_add
*** 1805,1821 ****
st = int_cst_value (step);
op1 = TREE_OPERAND (expr, 1);
! op1 = strip_offset (op1, false, &off1);
*offset = off1 * st;
break;
case COMPONENT_REF:
if (!inside_addr)
return orig_expr;
break;
case ADDR_EXPR:
! inside_addr = true;
break;
default:
--- 1825,1873 ----
st = int_cst_value (step);
op1 = TREE_OPERAND (expr, 1);
! op1 = strip_offset_1 (op1, false, false, &off1);
*offset = off1 * st;
+
+ if (top_compref
+ && zero_p (op1))
+ {
+ /* Strip the component reference completely. */
+ op0 = TREE_OPERAND (expr, 0);
+ op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
+ *offset += off0;
+ return op0;
+ }
break;
case COMPONENT_REF:
if (!inside_addr)
return orig_expr;
+
+ tmp = component_ref_field_offset (expr);
+ if (top_compref
+ && cst_and_fits_in_hwi (tmp))
+ {
+ /* Strip the component reference completely. */
+ op0 = TREE_OPERAND (expr, 0);
+ op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
+ *offset = off0 + int_cst_value (tmp);
+ return op0;
+ }
break;
case ADDR_EXPR:
! op0 = TREE_OPERAND (expr, 0);
! op0 = strip_offset_1 (op0, true, true, &off0);
! *offset += off0;
!
! if (op0 == TREE_OPERAND (expr, 0))
! return orig_expr;
!
! expr = build_addr_strip_iref (op0);
! return fold_convert (orig_type, expr);
!
! case INDIRECT_REF:
! inside_addr = false;
break;
default:
*************** strip_offset (tree expr, bool inside_add
*** 1825,1831 ****
/* Default handling of expressions for that we want to recurse into
the first operand. */
op0 = TREE_OPERAND (expr, 0);
! op0 = strip_offset (op0, inside_addr, &off0);
*offset += off0;
if (op0 == TREE_OPERAND (expr, 0)
--- 1877,1883 ----
/* Default handling of expressions for that we want to recurse into
the first operand. */
op0 = TREE_OPERAND (expr, 0);
! op0 = strip_offset_1 (op0, inside_addr, false, &off0);
*offset += off0;
if (op0 == TREE_OPERAND (expr, 0)
*************** strip_offset (tree expr, bool inside_add
*** 1837,1843 ****
if (op1)
TREE_OPERAND (expr, 1) = op1;
! return fold_convert (orig_type, expr);
}
/* Returns variant of TYPE that can be used as base for different uses.
--- 1889,1908 ----
if (op1)
TREE_OPERAND (expr, 1) = op1;
! /* Inside an address, we might strip the top-level component references,
! thus changing the type of the expression. The handling of ADDR_EXPR
! will fix that. */
! expr = fold_convert (orig_type, expr);
!
! return expr;
! }
!
! /* Strips constant offsets from EXPR and stores them to OFFSET. */
!
! static tree
! strip_offset (tree expr, unsigned HOST_WIDE_INT *offset)
! {
! return strip_offset_1 (expr, false, false, offset);
}
/* Returns variant of TYPE that can be used as base for different uses.
*************** generic_type_for (tree type)
*** 1856,1861 ****
--- 1921,1950 ----
return unsigned_type_for (type);
}
+ /* Records invariants in *EXPR_P. Callback for walk_tree. DATA points to
+ the bitmap in which we should store them. */
+
+ static struct ivopts_data *fd_ivopts_data;
+ static tree
+ find_depends (tree *expr_p, int *ws ATTRIBUTE_UNUSED, void *data)
+ {
+ bitmap *depends_on = data;
+ struct version_info *info;
+
+ if (TREE_CODE (*expr_p) != SSA_NAME)
+ return NULL_TREE;
+ info = name_info (fd_ivopts_data, *expr_p);
+
+ if (!info->inv_id || info->has_nonlin_use)
+ return NULL_TREE;
+
+ if (!*depends_on)
+ *depends_on = BITMAP_ALLOC (NULL);
+ bitmap_set_bit (*depends_on, info->inv_id);
+
+ return NULL_TREE;
+ }
+
/* Adds a candidate BASE + STEP * i. Important field is set to IMPORTANT and
position to POS. If USE is not NULL, the candidate is set as related to
it. If both BASE and STEP are NULL, we add a pseudocandidate for the
*************** add_candidate_1 (struct ivopts_data *dat
*** 1938,1943 ****
--- 2027,2039 ----
cand->incremented_at = incremented_at;
VARRAY_PUSH_GENERIC_PTR_NOGC (data->iv_candidates, cand);
+ if (step
+ && TREE_CODE (step) != INTEGER_CST)
+ {
+ fd_ivopts_data = data;
+ walk_tree (&step, find_depends, &cand->depends_on, NULL);
+ }
+
if (dump_file && (dump_flags & TDF_DETAILS))
dump_cand (dump_file, cand);
}
*************** static void
*** 2073,2122 ****
add_iv_value_candidates (struct ivopts_data *data,
struct iv *iv, struct iv_use *use)
{
- add_candidate (data, iv->base, iv->step, false, use);
-
- /* The same, but with initial value zero. */
- add_candidate (data, build_int_cst (TREE_TYPE (iv->base), 0),
- iv->step, false, use);
- }
-
- /* Adds candidates based on the address IV and USE. */
-
- static void
- add_address_candidates (struct ivopts_data *data,
- struct iv *iv, struct iv_use *use)
- {
- tree base, abase;
unsigned HOST_WIDE_INT offset;
! /* First, the trivial choices. */
! add_iv_value_candidates (data, iv, use);
!
! /* Second, try removing the COMPONENT_REFs. */
! if (TREE_CODE (iv->base) == ADDR_EXPR)
! {
! base = TREE_OPERAND (iv->base, 0);
! while (TREE_CODE (base) == COMPONENT_REF
! || (TREE_CODE (base) == ARRAY_REF
! && TREE_CODE (TREE_OPERAND (base, 1)) == INTEGER_CST))
! base = TREE_OPERAND (base, 0);
!
! if (base != TREE_OPERAND (iv->base, 0))
! {
! gcc_assert (TREE_CODE (base) != ALIGN_INDIRECT_REF);
! gcc_assert (TREE_CODE (base) != MISALIGNED_INDIRECT_REF);
! if (TREE_CODE (base) == INDIRECT_REF)
! base = TREE_OPERAND (base, 0);
! else
! base = build_addr (base);
! add_candidate (data, base, iv->step, false, use);
! }
! }
/* Third, try removing the constant offset. */
! abase = iv->base;
! base = strip_offset (abase, false, &offset);
if (offset)
add_candidate (data, base, iv->step, false, use);
}
--- 2169,2187 ----
add_iv_value_candidates (struct ivopts_data *data,
struct iv *iv, struct iv_use *use)
{
unsigned HOST_WIDE_INT offset;
+ tree base;
! add_candidate (data, iv->base, iv->step, false, use);
! /* The same, but with initial value zero. Make such a variable important,
! since it is generic enough that many uses may possibly be based
! on it. */
! add_candidate (data, build_int_cst (TREE_TYPE (iv->base), 0),
! iv->step, true, use);
/* Third, try removing the constant offset. */
! base = strip_offset (iv->base, &offset);
if (offset)
add_candidate (data, base, iv->step, false, use);
}
*************** add_derived_ivs_candidates (struct ivopt
*** 2156,2161 ****
--- 2221,2227 ----
{
case USE_NONLINEAR_EXPR:
case USE_COMPARE:
+ case USE_ADDRESS:
/* Just add the ivs based on the value of the iv used here. */
add_iv_value_candidates (data, use->iv, use);
break;
*************** add_derived_ivs_candidates (struct ivopt
*** 2168,2177 ****
add_iv_outer_candidates (data, use);
break;
- case USE_ADDRESS:
- add_address_candidates (data, use->iv, use);
- break;
-
default:
gcc_unreachable ();
}
--- 2234,2239 ----
*************** var_at_stmt (struct loop *loop, struct i
*** 2475,2480 ****
--- 2537,2988 ----
return cand->var_before;
}
+ /* Return the most significant (sign) bit of T. Similar to tree_int_cst_msb,
+ but the bit is determined from TYPE_PRECISION, not MODE_BITSIZE. */
+
+ static int
+ tree_int_cst_sign_bit (tree t)
+ {
+ unsigned bitno = TYPE_PRECISION (TREE_TYPE (t)) - 1;
+ unsigned HOST_WIDE_INT w;
+
+ if (bitno < HOST_BITS_PER_WIDE_INT)
+ w = TREE_INT_CST_LOW (t);
+ else
+ {
+ w = TREE_INT_CST_HIGH (t);
+ bitno -= HOST_BITS_PER_WIDE_INT;
+ }
+
+ return (w >> bitno) & 1;
+ }
+
+ /* If we can prove that TOP = cst * BOT for some constant cst in TYPE,
+ return cst. Otherwise return NULL_TREE. */
+
+ static tree
+ constant_multiple_of (tree type, tree top, tree bot)
+ {
+ tree res, mby, p0, p1;
+ enum tree_code code;
+ bool negate;
+
+ STRIP_NOPS (top);
+ STRIP_NOPS (bot);
+
+ if (operand_equal_p (top, bot, 0))
+ return build_int_cst (type, 1);
+
+ code = TREE_CODE (top);
+ switch (code)
+ {
+ case MULT_EXPR:
+ mby = TREE_OPERAND (top, 1);
+ if (TREE_CODE (mby) != INTEGER_CST)
+ return NULL_TREE;
+
+ res = constant_multiple_of (type, TREE_OPERAND (top, 0), bot);
+ if (!res)
+ return NULL_TREE;
+
+ return fold_binary_to_constant (MULT_EXPR, type, res,
+ fold_convert (type, mby));
+
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+ p0 = constant_multiple_of (type, TREE_OPERAND (top, 0), bot);
+ if (!p0)
+ return NULL_TREE;
+ p1 = constant_multiple_of (type, TREE_OPERAND (top, 1), bot);
+ if (!p1)
+ return NULL_TREE;
+
+ return fold_binary_to_constant (code, type, p0, p1);
+
+ case INTEGER_CST:
+ if (TREE_CODE (bot) != INTEGER_CST)
+ return NULL_TREE;
+
+ bot = fold_convert (type, bot);
+ top = fold_convert (type, top);
+
+ /* If BOT seems to be negative, try dividing by -BOT instead, and negate
+ the result afterwards. */
+ if (tree_int_cst_sign_bit (bot))
+ {
+ negate = true;
+ bot = fold_unary_to_constant (NEGATE_EXPR, type, bot);
+ }
+ else
+ negate = false;
+
+ /* Ditto for TOP. */
+ if (tree_int_cst_sign_bit (top))
+ {
+ negate = !negate;
+ top = fold_unary_to_constant (NEGATE_EXPR, type, top);
+ }
+
+ if (!zero_p (fold_binary_to_constant (TRUNC_MOD_EXPR, type, top, bot)))
+ return NULL_TREE;
+
+ res = fold_binary_to_constant (EXACT_DIV_EXPR, type, top, bot);
+ if (negate)
+ res = fold_unary_to_constant (NEGATE_EXPR, type, res);
+ return res;
+
+ default:
+ return NULL_TREE;
+ }
+ }
+
+ /* Affine combination of trees. We keep track of at most MAX_AFF_ELTS elements
+ to make things simpler; this is sufficient in most cases. */
+
+ #define MAX_AFF_ELTS 8
+
+ struct affine_tree_combination
+ {
+ /* Type of the result of the combination. */
+ tree type;
+
+ /* Mask modulo which the operations are performed. */
+ unsigned HOST_WIDE_INT mask;
+
+ /* Constant offset. */
+ unsigned HOST_WIDE_INT offset;
+
+ /* Number of elements of the combination. */
+ unsigned n;
+
+ /* Elements and their coefficients. */
+ tree elts[MAX_AFF_ELTS];
+ unsigned HOST_WIDE_INT coefs[MAX_AFF_ELTS];
+
+ /* Remainder of the expression. */
+ tree rest;
+ };
+
+ /* Sets COMB to CST. */
+
+ static void
+ aff_combination_const (struct affine_tree_combination *comb, tree type,
+ unsigned HOST_WIDE_INT cst)
+ {
+ comb->type = type;
+ comb->mask = 2;
+ comb->mask <<= TYPE_PRECISION (type) - 1;
+ comb->mask--;
+
+ comb->n = 0;
+ comb->rest = NULL_TREE;
+ comb->offset = cst & comb->mask;
+ }
+
+ /* Sets COMB to single element ELT. */
+
+ static void
+ aff_combination_elt (struct affine_tree_combination *comb, tree type, tree elt)
+ {
+ comb->type = type;
+ comb->mask = 2;
+ comb->mask <<= TYPE_PRECISION (type) - 1;
+ comb->mask--;
+
+ comb->n = 1;
+ comb->elts[0] = elt;
+ comb->coefs[0] = 1;
+ comb->rest = NULL_TREE;
+ comb->offset = 0;
+ }
+
+ /* Scales COMB by SCALE. */
+
+ static void
+ aff_combination_scale (struct affine_tree_combination *comb,
+ unsigned HOST_WIDE_INT scale)
+ {
+ unsigned i, j;
+
+ if (scale == 1)
+ return;
+
+ if (scale == 0)
+ {
+ aff_combination_const (comb, comb->type, 0);
+ return;
+ }
+
+ comb->offset = (scale * comb->offset) & comb->mask;
+ for (i = 0, j = 0; i < comb->n; i++)
+ {
+ comb->coefs[j] = (scale * comb->coefs[i]) & comb->mask;
+ comb->elts[j] = comb->elts[i];
+ if (comb->coefs[j] != 0)
+ j++;
+ }
+ comb->n = j;
+
+ if (comb->rest)
+ {
+ if (comb->n < MAX_AFF_ELTS)
+ {
+ comb->coefs[comb->n] = scale;
+ comb->elts[comb->n] = comb->rest;
+ comb->rest = NULL_TREE;
+ comb->n++;
+ }
+ else
+ comb->rest = fold_build2 (MULT_EXPR, comb->type, comb->rest,
+ build_int_cst_type (comb->type, scale));
+ }
+ }
+
+ /* Adds ELT * SCALE to COMB. */
+
+ static void
+ aff_combination_add_elt (struct affine_tree_combination *comb, tree elt,
+ unsigned HOST_WIDE_INT scale)
+ {
+ unsigned i;
+
+ if (scale == 0)
+ return;
+
+ for (i = 0; i < comb->n; i++)
+ if (operand_equal_p (comb->elts[i], elt, 0))
+ {
+ comb->coefs[i] = (comb->coefs[i] + scale) & comb->mask;
+ if (comb->coefs[i])
+ return;
+
+ comb->n--;
+ comb->coefs[i] = comb->coefs[comb->n];
+ comb->elts[i] = comb->elts[comb->n];
+ return;
+ }
+ if (comb->n < MAX_AFF_ELTS)
+ {
+ comb->coefs[comb->n] = scale;
+ comb->elts[comb->n] = elt;
+ comb->n++;
+ return;
+ }
+
+ if (scale == 1)
+ elt = fold_convert (comb->type, elt);
+ else
+ elt = fold_build2 (MULT_EXPR, comb->type,
+ fold_convert (comb->type, elt),
+ build_int_cst_type (comb->type, scale));
+
+ if (comb->rest)
+ comb->rest = fold_build2 (PLUS_EXPR, comb->type, comb->rest, elt);
+ else
+ comb->rest = elt;
+ }
+
+ /* Adds COMB2 to COMB1. */
+
+ static void
+ aff_combination_add (struct affine_tree_combination *comb1,
+ struct affine_tree_combination *comb2)
+ {
+ unsigned i;
+
+ comb1->offset = (comb1->offset + comb2->offset) & comb1->mask;
+ for (i = 0; i < comb2->n; i++)
+ aff_combination_add_elt (comb1, comb2->elts[i], comb2->coefs[i]);
+ if (comb2->rest)
+ aff_combination_add_elt (comb1, comb2->rest, 1);
+ }
+
+ /* Splits EXPR into an affine combination of parts. */
+
+ static void
+ tree_to_aff_combination (tree expr, tree type,
+ struct affine_tree_combination *comb)
+ {
+ struct affine_tree_combination tmp;
+ enum tree_code code;
+ tree cst, core, toffset;
+ HOST_WIDE_INT bitpos, bitsize;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ STRIP_NOPS (expr);
+
+ code = TREE_CODE (expr);
+ switch (code)
+ {
+ case INTEGER_CST:
+ aff_combination_const (comb, type, int_cst_value (expr));
+ return;
+
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+ tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+ tree_to_aff_combination (TREE_OPERAND (expr, 1), type, &tmp);
+ if (code == MINUS_EXPR)
+ aff_combination_scale (&tmp, -1);
+ aff_combination_add (comb, &tmp);
+ return;
+
+ case MULT_EXPR:
+ cst = TREE_OPERAND (expr, 1);
+ if (TREE_CODE (cst) != INTEGER_CST)
+ break;
+ tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+ aff_combination_scale (comb, int_cst_value (cst));
+ return;
+
+ case NEGATE_EXPR:
+ tree_to_aff_combination (TREE_OPERAND (expr, 0), type, comb);
+ aff_combination_scale (comb, -1);
+ return;
+
+ case ADDR_EXPR:
+ core = get_inner_reference (TREE_OPERAND (expr, 0), &bitsize, &bitpos,
+ &toffset, &mode, &unsignedp, &volatilep,
+ false);
+ if (bitpos % BITS_PER_UNIT != 0)
+ break;
+ aff_combination_const (comb, type, bitpos / BITS_PER_UNIT);
+ core = build_addr_strip_iref (core);
+ if (TREE_CODE (core) == ADDR_EXPR)
+ aff_combination_add_elt (comb, core, 1);
+ else
+ {
+ tree_to_aff_combination (core, type, &tmp);
+ aff_combination_add (comb, &tmp);
+ }
+ if (toffset)
+ {
+ tree_to_aff_combination (toffset, type, &tmp);
+ aff_combination_add (comb, &tmp);
+ }
+ return;
+
+ default:
+ break;
+ }
+
+ aff_combination_elt (comb, type, expr);
+ }
+
+ /* Creates EXPR + ELT * SCALE in TYPE. MASK is the mask for the width of TYPE. */
+
+ static tree
+ add_elt_to_tree (tree expr, tree type, tree elt, unsigned HOST_WIDE_INT scale,
+ unsigned HOST_WIDE_INT mask)
+ {
+ enum tree_code code;
+
+ scale &= mask;
+ elt = fold_convert (type, elt);
+
+ if (scale == 1)
+ {
+ if (!expr)
+ return elt;
+
+ return fold_build2 (PLUS_EXPR, type, expr, elt);
+ }
+
+ if (scale == mask)
+ {
+ if (!expr)
+ return fold_build1 (NEGATE_EXPR, type, elt);
+
+ return fold_build2 (MINUS_EXPR, type, expr, elt);
+ }
+
+ if (!expr)
+ return fold_build2 (MULT_EXPR, type, elt,
+ build_int_cst_type (type, scale));
+
+ if ((scale | (mask >> 1)) == mask)
+ {
+ /* Scale is negative. */
+ code = MINUS_EXPR;
+ scale = (-scale) & mask;
+ }
+ else
+ code = PLUS_EXPR;
+
+ elt = fold_build2 (MULT_EXPR, type, elt,
+ build_int_cst_type (type, scale));
+ return fold_build2 (code, type, expr, elt);
+ }
+
+ /* Makes a tree from the affine combination COMB. */
+
+ static tree
+ aff_combination_to_tree (struct affine_tree_combination *comb)
+ {
+ tree type = comb->type;
+ tree expr = comb->rest;
+ unsigned i;
+ unsigned HOST_WIDE_INT off, sgn;
+
+ gcc_assert (comb->n == MAX_AFF_ELTS || comb->rest == NULL_TREE);
+
+ for (i = 0; i < comb->n; i++)
+ expr = add_elt_to_tree (expr, type, comb->elts[i], comb->coefs[i],
+ comb->mask);
+
+ if ((comb->offset | (comb->mask >> 1)) == comb->mask)
+ {
+ /* Offset is negative. */
+ off = (-comb->offset) & comb->mask;
+ sgn = comb->mask;
+ }
+ else
+ {
+ off = comb->offset;
+ sgn = 1;
+ }
+ return add_elt_to_tree (expr, type, build_int_cst_type (type, off), sgn,
+ comb->mask);
+ }
+
+ /* Folds X + RATIO * Y in TYPE. */
+
+ static tree
+ fold_affine_sum (tree type, tree x, tree y, HOST_WIDE_INT ratio)
+ {
+ enum tree_code code;
+ tree cst;
+ struct affine_tree_combination cx, cy;
+
+ if (TYPE_PRECISION (type) > HOST_BITS_PER_WIDE_INT)
+ {
+ if (ratio == 1)
+ return fold_build2 (PLUS_EXPR, type, x, y);
+ if (ratio == -1)
+ return fold_build2 (MINUS_EXPR, type, x, y);
+
+ if (ratio < 0)
+ {
+ code = MINUS_EXPR;
+ ratio = -ratio;
+ }
+ else
+ code = PLUS_EXPR;
+
+ cst = build_int_cst_type (type, ratio);
+ y = fold_build2 (MULT_EXPR, type, y, cst);
+ return fold_build2 (code, type, x, y);
+ }
+
+ tree_to_aff_combination (x, type, &cx);
+ tree_to_aff_combination (y, type, &cy);
+ aff_combination_scale (&cy, ratio);
+ aff_combination_add (&cx, &cy);
+
+ return aff_combination_to_tree (&cx);
+ }
+
/* Determines the expression by that USE is expressed from induction variable
CAND at statement AT in LOOP. */
*************** get_computation_at (struct loop *loop,
*** 2523,2541 ****
cstep = fold_convert (uutype, cstep);
}
! if (!cst_and_fits_in_hwi (cstep)
! || !cst_and_fits_in_hwi (ustep))
! return NULL_TREE;
! ustepi = int_cst_value (ustep);
! cstepi = int_cst_value (cstep);
! if (!divide (TYPE_PRECISION (uutype), ustepi, cstepi, &ratioi))
{
! /* TODO maybe consider case when ustep divides cstep and the ratio is
! a power of 2 (so that the division is fast to execute)? We would
! need to be much more careful with overflows etc. then. */
! return NULL_TREE;
}
/* We may need to shift the value if we are after the increment. */
--- 3031,3071 ----
cstep = fold_convert (uutype, cstep);
}
! if (cst_and_fits_in_hwi (cstep)
! && cst_and_fits_in_hwi (ustep))
! {
! ustepi = int_cst_value (ustep);
! cstepi = int_cst_value (cstep);
! if (!divide (TYPE_PRECISION (uutype), ustepi, cstepi, &ratioi))
! {
! /* TODO maybe consider case when ustep divides cstep and the ratio is
! a power of 2 (so that the division is fast to execute)? We would
! need to be much more careful with overflows etc. then. */
! return NULL_TREE;
! }
! ratio = build_int_cst_type (uutype, ratioi);
! }
! else
{
! ratio = constant_multiple_of (uutype, ustep, cstep);
! if (!ratio)
! return NULL_TREE;
!
! /* Ratioi is only used to detect special cases when the multiplicative
! factor is 1 or -1, so if we cannot convert ratio to HOST_WIDE_INT,
! we may set it to 0. We prefer cst_and_fits_in_hwi/int_cst_value
! to integer_onep/integer_all_onesp, since the former ignores
! TREE_OVERFLOW. */
! if (cst_and_fits_in_hwi (ratio))
! ratioi = int_cst_value (ratio);
! else if (integer_onep (ratio))
! ratioi = 1;
! else if (integer_all_onesp (ratio))
! ratioi = -1;
! else
! ratioi = 0;
}
/* We may need to shift the value if we are after the increment. */
*************** get_computation_at (struct loop *loop,
*** 2552,2572 ****
if (ratioi == 1)
{
! delta = fold (build2 (MINUS_EXPR, uutype, ubase, cbase));
! expr = fold (build2 (PLUS_EXPR, uutype, expr, delta));
}
else if (ratioi == -1)
{
! delta = fold (build2 (PLUS_EXPR, uutype, ubase, cbase));
! expr = fold (build2 (MINUS_EXPR, uutype, delta, expr));
}
else
{
! ratio = build_int_cst_type (uutype, ratioi);
! delta = fold (build2 (MULT_EXPR, uutype, ratio, cbase));
! delta = fold (build2 (MINUS_EXPR, uutype, ubase, delta));
! expr = fold (build2 (MULT_EXPR, uutype, ratio, expr));
! expr = fold (build2 (PLUS_EXPR, uutype, delta, expr));
}
return fold_convert (utype, expr);
--- 3082,3106 ----
if (ratioi == 1)
{
! delta = fold_affine_sum (uutype, ubase, cbase, -1);
! expr = fold_build2 (PLUS_EXPR, uutype, expr, delta);
}
else if (ratioi == -1)
{
! delta = fold_affine_sum (uutype, ubase, cbase, 1);
! expr = fold_build2 (MINUS_EXPR, uutype, delta, expr);
}
else
{
! if (ratioi)
! delta = fold_affine_sum (uutype, ubase, cbase, -ratioi);
! else
! {
! delta = fold_build2 (MULT_EXPR, uutype, ratio, cbase);
! delta = fold_affine_sum (uutype, ubase, delta, -1);
! }
! expr = fold_build2 (MULT_EXPR, uutype, ratio, expr);
! expr = fold_build2 (PLUS_EXPR, uutype, delta, expr);
}
return fold_convert (utype, expr);
*************** get_address_cost (bool symbol_present, b
*** 2830,2860 ****
return cost + acost;
}
-
- /* Records invariants in *EXPR_P. Callback for walk_tree. DATA contains
- the bitmap to that we should store it. */
-
- static struct ivopts_data *fd_ivopts_data;
- static tree
- find_depends (tree *expr_p, int *ws ATTRIBUTE_UNUSED, void *data)
- {
- bitmap *depends_on = data;
- struct version_info *info;
-
- if (TREE_CODE (*expr_p) != SSA_NAME)
- return NULL_TREE;
- info = name_info (fd_ivopts_data, *expr_p);
-
- if (!info->inv_id || info->has_nonlin_use)
- return NULL_TREE;
-
- if (!*depends_on)
- *depends_on = BITMAP_ALLOC (NULL);
- bitmap_set_bit (*depends_on, info->inv_id);
-
- return NULL_TREE;
- }
-
/* Estimates cost of forcing EXPR into a variable. DEPENDS_ON is a set of the
invariants the computation depends on. */
--- 3364,3369 ----
*************** difference_cost (struct ivopts_data *dat
*** 3088,3095 ****
enum machine_mode mode = TYPE_MODE (TREE_TYPE (e1));
unsigned HOST_WIDE_INT off1, off2;
! e1 = strip_offset (e1, false, &off1);
! e2 = strip_offset (e2, false, &off2);
*offset += off1 - off2;
STRIP_NOPS (e1);
--- 3597,3604 ----
enum machine_mode mode = TYPE_MODE (TREE_TYPE (e1));
unsigned HOST_WIDE_INT off1, off2;
! e1 = strip_offset (e1, &off1);
! e2 = strip_offset (e2, &off2);
*offset += off1 - off2;
STRIP_NOPS (e1);
*************** get_computation_cost_at (struct ivopts_d
*** 3172,3200 ****
return INFTY;
}
- if (!cst_and_fits_in_hwi (ustep)
- || !cst_and_fits_in_hwi (cstep))
- return INFTY;
-
- if (TREE_CODE (ubase) == INTEGER_CST
- && !cst_and_fits_in_hwi (ubase))
- goto fallback;
-
- if (TREE_CODE (cbase) == INTEGER_CST
- && !cst_and_fits_in_hwi (cbase))
- goto fallback;
-
- ustepi = int_cst_value (ustep);
- cstepi = int_cst_value (cstep);
-
if (TYPE_PRECISION (utype) != TYPE_PRECISION (ctype))
{
/* TODO -- add direct handling of this case. */
goto fallback;
}
! if (!divide (TYPE_PRECISION (utype), ustepi, cstepi, &ratio))
! return INFTY;
/* use = ubase + ratio * (var - cbase). If either cbase is a constant
or ratio == 1, it is better to handle this like
--- 3681,3729 ----
return INFTY;
}
if (TYPE_PRECISION (utype) != TYPE_PRECISION (ctype))
{
/* TODO -- add direct handling of this case. */
goto fallback;
}
! /* CSTEPI is removed from the offset in case the statement is after the
! increment. If the step is not constant, we use zero instead.
! This is a bit imprecise (there is the extra addition), but
! redundancy elimination is likely to transform the code so that
! it uses the value of the variable before the increment anyway,
! so this is not that unrealistic. */
! if (cst_and_fits_in_hwi (cstep))
! cstepi = int_cst_value (cstep);
! else
! cstepi = 0;
!
! if (cst_and_fits_in_hwi (ustep)
! && cst_and_fits_in_hwi (cstep))
! {
! ustepi = int_cst_value (ustep);
!
! if (!divide (TYPE_PRECISION (utype), ustepi, cstepi, &ratio))
! return INFTY;
! }
! else
! {
! tree rat;
!
! rat = constant_multiple_of (utype, ustep, cstep);
!
! if (!rat)
! return INFTY;
!
! if (cst_and_fits_in_hwi (rat))
! ratio = int_cst_value (rat);
! else if (integer_onep (rat))
! ratio = 1;
! else if (integer_all_onesp (rat))
! ratio = -1;
! else
! return INFTY;
! }
/* use = ubase + ratio * (var - cbase). If either cbase is a constant
or ratio == 1, it is better to handle this like
*************** get_computation_cost_at (struct ivopts_d
*** 3203,3209 ****
(also holds in the case ratio == -1, TODO. */
! if (TREE_CODE (cbase) == INTEGER_CST)
{
offset = - ratio * int_cst_value (cbase);
cost += difference_cost (data,
--- 3732,3738 ----
(also holds in the case ratio == -1, TODO. */
! if (cst_and_fits_in_hwi (cbase))
{
offset = - ratio * int_cst_value (cbase);
cost += difference_cost (data,
*************** may_eliminate_iv (struct ivopts_data *da
*** 3398,3403 ****
--- 3927,3935 ----
tree wider_type, period, per_type;
struct loop *loop = data->current_loop;
+ if (TREE_CODE (cand->iv->step) != INTEGER_CST)
+ return false;
+
/* For now works only for exits that dominate the loop latch. TODO -- extend
for other conditions inside loop body. */
ex_bb = bb_for_stmt (use->stmt);
*************** iv_ca_recount_cost (struct ivopts_data *
*** 3867,3882 ****
ivs->cost = cost;
}
/* Set USE not to be expressed by any candidate in IVS. */
static void
iv_ca_set_no_cp (struct ivopts_data *data, struct iv_ca *ivs,
struct iv_use *use)
{
! unsigned uid = use->id, cid, iid;
! bitmap deps;
struct cost_pair *cp;
- bitmap_iterator bi;
cp = ivs->cand_for_use[uid];
if (!cp)
--- 4399,4431 ----
ivs->cost = cost;
}
+ /* Remove invariants in set INVS from set IVS. */
+
+ static void
+ iv_ca_set_remove_invariants (struct iv_ca *ivs, bitmap invs)
+ {
+ bitmap_iterator bi;
+ unsigned iid;
+
+ if (!invs)
+ return;
+
+ EXECUTE_IF_SET_IN_BITMAP (invs, 0, iid, bi)
+ {
+ ivs->n_invariant_uses[iid]--;
+ if (ivs->n_invariant_uses[iid] == 0)
+ ivs->n_regs--;
+ }
+ }
+
/* Set USE not to be expressed by any candidate in IVS. */
static void
iv_ca_set_no_cp (struct ivopts_data *data, struct iv_ca *ivs,
struct iv_use *use)
{
! unsigned uid = use->id, cid;
struct cost_pair *cp;
cp = ivs->cand_for_use[uid];
if (!cp)
*************** iv_ca_set_no_cp (struct ivopts_data *dat
*** 3895,3917 ****
ivs->n_regs--;
ivs->n_cands--;
ivs->cand_cost -= cp->cand->cost;
}
ivs->cand_use_cost -= cp->cost;
! deps = cp->depends_on;
! if (deps)
{
! EXECUTE_IF_SET_IN_BITMAP (deps, 0, iid, bi)
! {
! ivs->n_invariant_uses[iid]--;
! if (ivs->n_invariant_uses[iid] == 0)
! ivs->n_regs--;
! }
}
-
- iv_ca_recount_cost (data, ivs);
}
/* Set cost pair for USE in set IVS to CP. */
--- 4444,4476 ----
ivs->n_regs--;
ivs->n_cands--;
ivs->cand_cost -= cp->cand->cost;
+
+ iv_ca_set_remove_invariants (ivs, cp->cand->depends_on);
}
ivs->cand_use_cost -= cp->cost;
! iv_ca_set_remove_invariants (ivs, cp->depends_on);
! iv_ca_recount_cost (data, ivs);
! }
!
! /* Add invariants in set INVS to set IVS. */
!
! static void
! iv_ca_set_add_invariants (struct iv_ca *ivs, bitmap invs)
! {
! bitmap_iterator bi;
! unsigned iid;
! if (!invs)
! return;
!
! EXECUTE_IF_SET_IN_BITMAP (invs, 0, iid, bi)
{
! ivs->n_invariant_uses[iid]++;
! if (ivs->n_invariant_uses[iid] == 1)
! ivs->n_regs++;
}
}
/* Set cost pair for USE in set IVS to CP. */
*************** static void
*** 3920,3928 ****
iv_ca_set_cp (struct ivopts_data *data, struct iv_ca *ivs,
struct iv_use *use, struct cost_pair *cp)
{
! unsigned uid = use->id, cid, iid;
! bitmap deps;
! bitmap_iterator bi;
if (ivs->cand_for_use[uid] == cp)
return;
--- 4479,4485 ----
iv_ca_set_cp (struct ivopts_data *data, struct iv_ca *ivs,
struct iv_use *use, struct cost_pair *cp)
{
! unsigned uid = use->id, cid;
if (ivs->cand_for_use[uid] == cp)
return;
*************** iv_ca_set_cp (struct ivopts_data *data,
*** 3945,3966 ****
ivs->n_regs++;
ivs->n_cands++;
ivs->cand_cost += cp->cand->cost;
- }
-
- ivs->cand_use_cost += cp->cost;
-
- deps = cp->depends_on;
! if (deps)
! {
! EXECUTE_IF_SET_IN_BITMAP (deps, 0, iid, bi)
! {
! ivs->n_invariant_uses[iid]++;
! if (ivs->n_invariant_uses[iid] == 1)
! ivs->n_regs++;
! }
}
iv_ca_recount_cost (data, ivs);
}
}
--- 4502,4513 ----
ivs->n_regs++;
ivs->n_cands++;
ivs->cand_cost += cp->cand->cost;
! iv_ca_set_add_invariants (ivs, cp->cand->depends_on);
}
+ ivs->cand_use_cost += cp->cost;
+ iv_ca_set_add_invariants (ivs, cp->depends_on);
iv_ca_recount_cost (data, ivs);
}
}
*************** create_new_iv (struct ivopts_data *data,
*** 4643,4649 ****
base = unshare_expr (cand->iv->base);
! create_iv (base, cand->iv->step, cand->var_before, data->current_loop,
&incr_pos, after, &cand->var_before, &cand->var_after);
}
--- 5190,5197 ----
base = unshare_expr (cand->iv->base);
! create_iv (base, unshare_expr (cand->iv->step),
! cand->var_before, data->current_loop,
&incr_pos, after, &cand->var_before, &cand->var_after);
}
*************** free_loop_data (struct ivopts_data *data
*** 5245,5250 ****
--- 5793,5800 ----
if (cand->iv)
free (cand->iv);
+ if (cand->depends_on)
+ BITMAP_FREE (cand->depends_on);
free (cand);
}
VARRAY_POP_ALL (data->iv_candidates);
Index: tree-ssa-loop-manip.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-manip.c,v
retrieving revision 2.30
diff -c -3 -p -r2.30 tree-ssa-loop-manip.c
*** tree-ssa-loop-manip.c 17 Apr 2005 06:41:54 -0000 2.30
--- tree-ssa-loop-manip.c 22 Apr 2005 21:23:04 -0000
*************** create_iv (tree base, tree step, tree va
*** 54,59 ****
--- 54,60 ----
tree stmt, initial, step1, stmts;
tree vb, va;
enum tree_code incr_op = PLUS_EXPR;
+ edge pe = loop_preheader_edge (loop);
if (!var)
{
*************** create_iv (tree base, tree step, tree va
*** 92,97 ****
--- 93,104 ----
}
}
+ /* Gimplify the step if necessary. We put the computations in front of the
+ loop (i.e. the step should be loop invariant). */
+ step = force_gimple_operand (step, &stmts, true, var);
+ if (stmts)
+ bsi_insert_on_edge_immediate_loop (pe, stmts);
+
stmt = build2 (MODIFY_EXPR, void_type_node, va,
build2 (incr_op, TREE_TYPE (base),
vb, step));
*************** create_iv (tree base, tree step, tree va
*** 103,113 ****
initial = force_gimple_operand (base, &stmts, true, var);
if (stmts)
! {
! edge pe = loop_preheader_edge (loop);
!
! bsi_insert_on_edge_immediate_loop (pe, stmts);
! }
stmt = create_phi_node (vb, loop->header);
SSA_NAME_DEF_STMT (vb) = stmt;
--- 110,116 ----
initial = force_gimple_operand (base, &stmts, true, var);
if (stmts)
! bsi_insert_on_edge_immediate_loop (pe, stmts);
stmt = create_phi_node (vb, loop->header);
SSA_NAME_DEF_STMT (vb) = stmt;
Index: tree-ssa-loop-niter.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-niter.c,v
retrieving revision 2.23
diff -c -3 -p -r2.23 tree-ssa-loop-niter.c
*** tree-ssa-loop-niter.c 17 Apr 2005 06:41:55 -0000 2.23
--- tree-ssa-loop-niter.c 22 Apr 2005 21:23:04 -0000
*************** number_of_iterations_exit (struct loop *
*** 881,889 ****
&& !POINTER_TYPE_P (type))
return false;
! if (!simple_iv (loop, stmt, op0, &base0, &step0))
return false;
! if (!simple_iv (loop, stmt, op1, &base1, &step1))
return false;
niter->niter = NULL_TREE;
--- 881,889 ----
&& !POINTER_TYPE_P (type))
return false;
! if (!simple_iv (loop, stmt, op0, &base0, &step0, false))
return false;
! if (!simple_iv (loop, stmt, op1, &base1, &step1, false))
return false;
niter->niter = NULL_TREE;
Index: tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.c,v
retrieving revision 1.474
diff -c -3 -p -r1.474 tree.c
*** tree.c 22 Apr 2005 10:56:57 -0000 1.474
--- tree.c 22 Apr 2005 21:23:04 -0000
*************** tree
*** 518,524 ****
build_int_cst_type (tree type, HOST_WIDE_INT low)
{
unsigned HOST_WIDE_INT val = (unsigned HOST_WIDE_INT) low;
! unsigned HOST_WIDE_INT hi;
unsigned bits;
bool signed_p;
bool negative;
--- 518,524 ----
build_int_cst_type (tree type, HOST_WIDE_INT low)
{
unsigned HOST_WIDE_INT val = (unsigned HOST_WIDE_INT) low;
! unsigned HOST_WIDE_INT hi, mask;
unsigned bits;
bool signed_p;
bool negative;
*************** build_int_cst_type (tree type, HOST_WIDE
*** 538,547 ****
negative = ((val >> (bits - 1)) & 1) != 0;
/* Mask out the bits outside of the precision of the constant. */
if (signed_p && negative)
! val = val | ((~(unsigned HOST_WIDE_INT) 0) << bits);
else
! val = val & ~((~(unsigned HOST_WIDE_INT) 0) << bits);
}
/* Determine the high bits. */
--- 538,549 ----
negative = ((val >> (bits - 1)) & 1) != 0;
/* Mask out the bits outside of the precision of the constant. */
+ mask = (((unsigned HOST_WIDE_INT) 2) << (bits - 1)) - 1;
+
if (signed_p && negative)
! val |= ~mask;
else
! val &= mask;
}
/* Determine the high bits. */
*************** build_int_cst_type (tree type, HOST_WIDE
*** 556,562 ****
else
{
bits -= HOST_BITS_PER_WIDE_INT;
! hi = hi & ~((~(unsigned HOST_WIDE_INT) 0) << bits);
}
}
--- 558,565 ----
else
{
bits -= HOST_BITS_PER_WIDE_INT;
! mask = (((unsigned HOST_WIDE_INT) 2) << (bits - 1)) - 1;
! hi &= mask;
}
}