- From: Steven Bosscher <stevenb@suse.de>
- To: gcc-patches@gcc.gnu.org
- Cc: law@redhat.com, jakub@redhat.com
- Date: Sun, 4 Sep 2005 01:34:12 +0200
- Subject: [patch][rfc] PR tree-optimization/21883
Hi,
Bug 21883 is a case where tree-ssa-dom's jump threading decides to
thread across a very large basic block in a way that requires
duplicating that large block. Jakub has a nice test case for it:
================================================
extern int foo1 (void), foo2 (int);
int bar (int x)
{
int varvar1, varvar2;
int z;
if (x > 913)
varvar1 = foo1 ();
else
varvar2 = foo2 (6);
#define A foo1 (); foo1 (); foo1 (); foo1 (); foo1 ();
#define B A A A A A A A A A A
#define C B B B B B B B B B B
C C C C C C C C C C C
if (x > 913)
z = varvar1 + foo1 ();
else
z = varvar2 + foo2 (6);
return z;
}
================================================
Mainline will duplicate all the foo1 calls just to avoid the second
conditional jump, giving something equivalent to:
================================================
extern int foo1 (void), foo2 (int);
#define A foo1 (); foo1 (); foo1 (); foo1 (); foo1 ();
#define B A A A A A A A A A A
#define C B B B B B B B B B B
int bar (int x)
{
int varvar1, varvar2;
int z;
if (x > 913)
{
varvar1 = foo1 ();
C C C C C C C C C C C
z = varvar1 + foo1 ();
}
else
{
varvar2 = foo2 (6);
C C C C C C C C C C C
z = varvar2 + foo2 (6);
}
return z;
}
================================================
This transformation is done without any regard for the increase in
code size that it causes (we do this transformation even at -Os).
The patch below changes this. With the patch, DOM assumes that if
the jump threading succeeds, all statements used in the computation
of the controlling expression of a conditional jump are going to be
folded (so they do not cause any code size increase), and that all
other statements have to be duplicated when the jump threading is
performed. It then estimates the number of insns required for those
statements (using estimate_num_insns) and does not apply the
transformation if this estimate is higher than some limit, in this
case a number defined in params.def.
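Once the patch is in, this limit can be tuned on the command line
like any other param, using the name introduced in params.def below,
e.g. (picking an arbitrary value):

  gcc -Os --param max-jump-thread-duplication-insns=25 test.c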
Now, this patch bootstrapped and passed testing just fine on x86_64,
but otherwise I haven't tested it very thoroughly yet. I've got a
few questions for the experts first ;-)
1) Does this patch look reasonable for GCC 4.1 (modulo reviewer's
comments of course...)? I think the patch is not too intrusive
but it is quite large for a stage3 patch.
2) What would be a reasonable number for this new param I introduce?
We discussed this a bit on IRC. It was suggested that it should
probably depend on BRANCH_COST, but in that case perhaps a param
is the wrong choice. Maybe the loop depth should be taken into
account (for icache) and perhaps for -Os the duplication should be
disabled completely if there are any statements that need to be
duplicated. I've looked at bb-reorder's copy_bb_p, but that did
not help very much either. A rough sketch of the BRANCH_COST idea
follows below.
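For illustration only, a BRANCH_COST-scaled variant of the limit
could look like the sketch below. This is hypothetical and not part
of the patch; it assumes the parameterless BRANCH_COST target macro
and the param defined in this patch, and the helper name is made up:
================================================
/* Hypothetical sketch, not part of the patch: scale the duplication
   limit by BRANCH_COST, so that targets with expensive conditional
   jumps tolerate more duplication to get rid of one.  */
static unsigned
max_thread_duplication_insns (void)
{
  return ((unsigned) PARAM_VALUE (PARAM_MAX_JUMP_THREAD_DUPLICATION_INSNS)
	  * BRANCH_COST);
}
================================================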
Ideas? Suggestions? Just brute-force testing is obviously also
required, but I'd like to hear some ideas for heuristics if the
answer to (1) above is positive ;-)
Gr.
Steven
* params.def (PARAM_MAX_JUMP_THREAD_DUPLICATION_INSNS): New param.
* tree-ssa-dom.c: Include tree-inline.h, pointer-set.h, and params.h.
(fold_stmts_feeding_cond): New function, partially split out from
thread_across_edge.
(ok_to_duplicate_for_threading): New function.
(thread_across_edge): Overhaul. Do not thread across edges that
require too much code duplication.
Index: params.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/params.def,v
retrieving revision 1.67
diff -u -3 -p -r1.67 params.def
--- params.def 24 Aug 2005 20:28:05 -0000 1.67
+++ params.def 3 Sep 2005 21:16:49 -0000
@@ -493,6 +493,28 @@ DEFPARAM (PARAM_VIRTUAL_MAPPINGS_TO_SYMS
"Ratio between virtual mappings and virtual symbols to do full virtual renames",
3, 0, 0)
+/* Jump threading duplicates basic blocks, which may lead to excessive
+ code size increases and compile time increases without significant
+ benefits for the resulting code. For example, consider the following
+ chunk of a CFG on the left hand side, and assume C is threaded through
+ to give the CFG on the right:
+
+ A1 B1 A1 B1
+ \ / | |
+ C --> C C (duplicated)
+ / \ | |
+ A2 B2 A2 B2
+
+ If C is large, then duplicating C just to lose one conditional jump is
+ not always profitable. PARAM_MAX_JUMP_THREAD_DUPLICATION_INSNS is the
+ maximum number of instructions that can be in C such that C is still
+ considered a candidate for duplication for jump threading. */
+DEFPARAM (PARAM_MAX_JUMP_THREAD_DUPLICATION_INSNS,
+ "max-jump-thread-duplication-insns",
+ "Maximum number of instructions for basic blocks that can be duplicated for jump threading",
+ 100, 0, 0)
+
+
DEFPARAM (PARAM_SSP_BUFFER_SIZE,
"ssp-buffer-size",
"The lower bound for a buffer to be considered for stack smashing protection",
Index: tree-ssa-dom.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-dom.c,v
retrieving revision 2.127
diff -u -3 -p -r2.127 tree-ssa-dom.c
--- tree-ssa-dom.c 2 Aug 2005 00:12:39 -0000 2.127
+++ tree-ssa-dom.c 3 Sep 2005 21:16:50 -0000
@@ -37,11 +37,14 @@ Boston, MA 02110-1301, USA. */
#include "timevar.h"
#include "tree-dump.h"
#include "tree-flow.h"
+#include "tree-inline.h" /* for estimate_num_insns */
+#include "pointer-set.h"
#include "domwalk.h"
#include "real.h"
#include "tree-pass.h"
#include "tree-ssa-propagate.h"
#include "langhooks.h"
+#include "params.h"
/* This file implements optimizations on the dominator tree. */
@@ -583,6 +586,155 @@ struct tree_opt_pass pass_dominator =
};
+/* Recursively fold statements feeding STMT. Only consider statements
+ contained in basic block BB.
+ Mark visited statements in VISITED_STMTS. */
+
+static void
+fold_stmts_feeding_cond (basic_block bb, tree stmt,
+ struct pointer_set_t *visited_stmts)
+{
+ ssa_op_iter iter;
+ use_operand_p use_p;
+ tree cached_lhs;
+
+ /* Mark this statement as visited. */
+ pointer_set_insert (visited_stmts, stmt);
+
+ /* Do a recursive DFS on everything feeding the real USE operands of
+ stmt, to fold things in reverse topological order. If we are able
+ to simplify a statement into the form,
+
+ SSA_NAME = (SSA_NAME | gimple invariant)
+
+ then we can record a context sensitive equivalency which may help
+ us simplify later statements in BB. */
+ FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
+ {
+ tree use, def_stmt;
+
+ use = USE_FROM_PTR (use_p);
+ if (TREE_CODE (use) != SSA_NAME)
+ continue;
+
+ def_stmt = SSA_NAME_DEF_STMT (use);
+
+ /* Ignore empty statements, PHI nodes and stmts not inside BB. */
+ if (IS_EMPTY_STMT (def_stmt)
+ || TREE_CODE (def_stmt) == PHI_NODE
+ || bb_for_stmt (def_stmt) != bb)
+ continue;
+
+ fold_stmts_feeding_cond (bb, def_stmt, visited_stmts);
+ }
+
+ /* If this is not a MODIFY_EXPR which sets an SSA_NAME to a new
+ value, then do not try to simplify this statement as it will
+ not simplify in any way that is helpful for jump threading. */
+ if (TREE_CODE (stmt) != MODIFY_EXPR
+ || TREE_CODE (TREE_OPERAND (stmt, 0)) != SSA_NAME)
+ return;
+
+ /* At this point we have a statement which assigns an RHS to an
+ SSA_VAR on the LHS. We want to try and simplify this statement
+ to expose more context sensitive equivalences which in turn may
+ allow us to simplify the condition at the end of the block. */
+ if (TREE_CODE (TREE_OPERAND (stmt, 1)) == SSA_NAME)
+ cached_lhs = TREE_OPERAND (stmt, 1);
+ else
+ {
+ /* Copy the operands. */
+ tree *copy;
+ unsigned int num, i = 0;
+
+ num = NUM_SSA_OPERANDS (stmt, (SSA_OP_USE | SSA_OP_VUSE));
+ copy = xcalloc (num, sizeof (tree));
+
+ /* Make a copy of the uses & vuses into USES_COPY, then cprop into
+ the operands. */
+ FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE | SSA_OP_VUSE)
+ {
+ tree tmp = NULL;
+ tree use = USE_FROM_PTR (use_p);
+
+ copy[i++] = use;
+ if (TREE_CODE (use) == SSA_NAME)
+ tmp = SSA_NAME_VALUE (use);
+ if (tmp && TREE_CODE (tmp) != VALUE_HANDLE)
+ SET_USE (use_p, tmp);
+ }
+
+ /* Try to fold/lookup the new expression. Inserting the
+ expression into the hash table is unlikely to help
+ simplify anything later, so just query the hashtable. */
+ cached_lhs = fold (TREE_OPERAND (stmt, 1));
+ if (TREE_CODE (cached_lhs) != SSA_NAME
+ && !is_gimple_min_invariant (cached_lhs))
+ cached_lhs = lookup_avail_expr (stmt, false);
+
+ /* Restore the statement's original uses/defs. */
+ i = 0;
+ FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE | SSA_OP_VUSE)
+ SET_USE (use_p, copy[i++]);
+
+ free (copy);
+ }
+
+ /* Record the context sensitive equivalence if we were able
+ to simplify this statement. */
+ if (cached_lhs
+ && (TREE_CODE (cached_lhs) == SSA_NAME
+ || is_gimple_min_invariant (cached_lhs)))
+ record_const_or_copy (TREE_OPERAND (stmt, 0), cached_lhs);
+}
+
+
+/* Return true if duplication of BB probably does not lead to extreme code
+ size increases. Statements in VISITED_STMTS are ignored because they
+ contribute to the condition for the jump at the end of BB. Such
+ statements may sometimes also need to be duplicated, but we know at
+ this point that the condition folds, so most likely the statements
+ feeding the condition also fold, which is why we assume duplicating
+ them is not going to lead to significant code growth. */
+
+static bool
+ok_to_duplicate_for_threading (basic_block bb,
+ struct pointer_set_t *visited_stmts)
+{
+ block_stmt_iterator bsi;
+ unsigned estimated_insns_to_duplicate = 0;
+
+ for (bsi = bsi_start (bb); ! bsi_end_p (bsi); bsi_next (&bsi))
+ {
+ tree stmt = bsi_stmt (bsi);
+
+ /* Ignore empty statements and labels. */
+ if (IS_EMPTY_STMT (stmt) || TREE_CODE (stmt) == LABEL_EXPR)
+ continue;
+
+ /* If the statement has volatile operands, then we assume we
+ can not thread through this block. This is overly
+ conservative in some ways. */
+ if (TREE_CODE (stmt) == ASM_EXPR && ASM_VOLATILE_P (stmt))
+ return false;
+
+ /* If we have already seen this statement while trying to fold
+ the condition at the end of E->dest, ignore it here. */
+ if (pointer_set_contains (visited_stmts, stmt))
+ continue;
+
+ /* Otherwise this statement will have to be duplicated. Estimate
+ how many insns we will need for it, and accumulate. If it looks
+ like E->dest needs too much duplication, bail out. */
+ estimated_insns_to_duplicate += estimate_num_insns (stmt);
+ if (estimated_insns_to_duplicate
+ > (unsigned) PARAM_VALUE (PARAM_MAX_JUMP_THREAD_DUPLICATION_INSNS))
+ return false;
+ }
+
+ return true;
+}
+
/* We are exiting E->src, see if E->dest ends with a conditional
jump which has a known value when reached via E.
@@ -601,20 +753,40 @@ struct tree_opt_pass pass_dominator =
static void
thread_across_edge (struct dom_walk_data *walk_data, edge e)
{
+ tree stmt;
+ tree phi, cond, cached_lhs;
+ struct pointer_set_t *visited_stmts;
block_stmt_iterator bsi;
- tree stmt = NULL;
- tree phi;
/* If E->dest does not end with a conditional, then there is
nothing to do. */
- bsi = bsi_last (e->dest);
- if (bsi_end_p (bsi)
- || ! bsi_stmt (bsi)
- || (TREE_CODE (bsi_stmt (bsi)) != COND_EXPR
- && TREE_CODE (bsi_stmt (bsi)) != GOTO_EXPR
- && TREE_CODE (bsi_stmt (bsi)) != SWITCH_EXPR))
+ stmt = last_stmt (e->dest);
+ if (! stmt
+ || (TREE_CODE (stmt) != COND_EXPR
+ && TREE_CODE (stmt) != GOTO_EXPR
+ && TREE_CODE (stmt) != SWITCH_EXPR))
return;
+ /* Safely handle threading across loop backedges. This is overly
+ conservative, but still allows us to capture the majority of
+ the cases where we can thread across a loop backedge. */
+ if ((e->flags & EDGE_DFS_BACK) != 0)
+ for (bsi = bsi_start (e->dest); ! bsi_end_p (bsi); bsi_next (&bsi))
+ {
+ tree stmt2 = bsi_stmt (bsi);
+
+ /* Ignore empty statements and labels. */
+ if (IS_EMPTY_STMT (stmt2) || TREE_CODE (stmt2) == LABEL_EXPR)
+ continue;
+
+ /* If there is any code here other than a conditional jump,
+ threading is unsafe (we may have recorded equivalences in
+ blocks dominating E->dest). */
+ if (TREE_CODE (stmt2) != COND_EXPR
+ && TREE_CODE (stmt2) != SWITCH_EXPR)
+ return;
+ }
+
/* The basic idea here is to use whatever knowledge we have
from our dominator walk to simplify statements in E->dest,
with the ultimate goal being to simplify the conditional
@@ -645,218 +817,118 @@ thread_across_edge (struct dom_walk_data
record_const_or_copy (dst, src);
}
- /* Try to simplify each statement in E->dest, ultimately leading to
- a simplification of the COND_EXPR at the end of E->dest.
-
- We might consider marking just those statements which ultimately
- feed the COND_EXPR. It's not clear if the overhead of bookkeeping
- would be recovered by trying to simplify fewer statements.
-
- If we are able to simplify a statement into the form
- SSA_NAME = (SSA_NAME | gimple invariant), then we can record
- a context sensitive equivalency which may help us simplify
- later statements in E->dest.
-
- Failure to simplify into the form above merely means that the
- statement provides no equivalences to help simplify later
- statements. This does not prevent threading through E->dest. */
- for (bsi = bsi_start (e->dest); ! bsi_end_p (bsi); bsi_next (&bsi))
- {
- tree cached_lhs;
-
- stmt = bsi_stmt (bsi);
-
- /* Ignore empty statements and labels. */
- if (IS_EMPTY_STMT (stmt) || TREE_CODE (stmt) == LABEL_EXPR)
- continue;
-
- /* Safely handle threading across loop backedges. This is
- over conservative, but still allows us to capture the
- majority of the cases where we can thread across a loop
- backedge. */
- if ((e->flags & EDGE_DFS_BACK) != 0
- && TREE_CODE (stmt) != COND_EXPR
- && TREE_CODE (stmt) != SWITCH_EXPR)
- return;
+ /* Try to simplify each statement in E->dest that ultimately feeds
+ the jump condition. Record the statements we visit in the process. */
+ visited_stmts = pointer_set_create ();
+ fold_stmts_feeding_cond (e->dest, stmt, visited_stmts);
- /* If the statement has volatile operands, then we assume we
- can not thread through this block. This is overly
- conservative in some ways. */
- if (TREE_CODE (stmt) == ASM_EXPR && ASM_VOLATILE_P (stmt))
- return;
-
- /* If this is not a MODIFY_EXPR which sets an SSA_NAME to a new
- value, then do not try to simplify this statement as it will
- not simplify in any way that is helpful for jump threading. */
- if (TREE_CODE (stmt) != MODIFY_EXPR
- || TREE_CODE (TREE_OPERAND (stmt, 0)) != SSA_NAME)
- continue;
-
- /* At this point we have a statement which assigns an RHS to an
- SSA_VAR on the LHS. We want to try and simplify this statement
- to expose more context sensitive equivalences which in turn may
- allow us to simplify the condition at the end of the loop. */
- if (TREE_CODE (TREE_OPERAND (stmt, 1)) == SSA_NAME)
- cached_lhs = TREE_OPERAND (stmt, 1);
- else
- {
- /* Copy the operands. */
- tree *copy;
- ssa_op_iter iter;
- use_operand_p use_p;
- unsigned int num, i = 0;
-
- num = NUM_SSA_OPERANDS (stmt, (SSA_OP_USE | SSA_OP_VUSE));
- copy = xcalloc (num, sizeof (tree));
-
- /* Make a copy of the uses & vuses into USES_COPY, then cprop into
- the operands. */
- FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE | SSA_OP_VUSE)
- {
- tree tmp = NULL;
- tree use = USE_FROM_PTR (use_p);
-
- copy[i++] = use;
- if (TREE_CODE (use) == SSA_NAME)
- tmp = SSA_NAME_VALUE (use);
- if (tmp && TREE_CODE (tmp) != VALUE_HANDLE)
- SET_USE (use_p, tmp);
- }
+ /* See if we now know where the jump at the end of E->dest is going. */
+ if (TREE_CODE (stmt) == COND_EXPR)
+ cond = COND_EXPR_COND (stmt);
+ else if (TREE_CODE (stmt) == GOTO_EXPR)
+ cond = GOTO_DESTINATION (stmt);
+ else
+ cond = SWITCH_COND (stmt);
- /* Try to fold/lookup the new expression. Inserting the
- expression into the hash table is unlikely to help
- simplify anything later, so just query the hashtable. */
- cached_lhs = fold (TREE_OPERAND (stmt, 1));
- if (TREE_CODE (cached_lhs) != SSA_NAME
- && !is_gimple_min_invariant (cached_lhs))
- cached_lhs = lookup_avail_expr (stmt, false);
-
-
- /* Restore the statement's original uses/defs. */
- i = 0;
- FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE | SSA_OP_VUSE)
- SET_USE (use_p, copy[i++]);
-
- free (copy);
- }
-
- /* Record the context sensitive equivalence if we were able
- to simplify this statement. */
- if (cached_lhs
- && (TREE_CODE (cached_lhs) == SSA_NAME
- || is_gimple_min_invariant (cached_lhs)))
- record_const_or_copy (TREE_OPERAND (stmt, 0), cached_lhs);
- }
-
- /* If we stopped at a COND_EXPR or SWITCH_EXPR, see if we know which arm
- will be taken. */
- if (stmt
- && (TREE_CODE (stmt) == COND_EXPR
- || TREE_CODE (stmt) == GOTO_EXPR
- || TREE_CODE (stmt) == SWITCH_EXPR))
+ /* Temporarily cprop the operands and try to find the resulting
+ expression in the hash tables. */
+ if (COMPARISON_CLASS_P (cond))
{
- tree cond, cached_lhs;
+ tree dummy_cond, op0, op1;
+ enum tree_code cond_code;
- /* Now temporarily cprop the operands and try to find the resulting
- expression in the hash tables. */
- if (TREE_CODE (stmt) == COND_EXPR)
- cond = COND_EXPR_COND (stmt);
- else if (TREE_CODE (stmt) == GOTO_EXPR)
- cond = GOTO_DESTINATION (stmt);
- else
- cond = SWITCH_COND (stmt);
+ op0 = TREE_OPERAND (cond, 0);
+ op1 = TREE_OPERAND (cond, 1);
+ cond_code = TREE_CODE (cond);
- if (COMPARISON_CLASS_P (cond))
+ /* Get the current value of both operands. */
+ if (TREE_CODE (op0) == SSA_NAME)
{
- tree dummy_cond, op0, op1;
- enum tree_code cond_code;
-
- op0 = TREE_OPERAND (cond, 0);
- op1 = TREE_OPERAND (cond, 1);
- cond_code = TREE_CODE (cond);
-
- /* Get the current value of both operands. */
- if (TREE_CODE (op0) == SSA_NAME)
- {
- tree tmp = SSA_NAME_VALUE (op0);
- if (tmp && TREE_CODE (tmp) != VALUE_HANDLE)
- op0 = tmp;
- }
-
- if (TREE_CODE (op1) == SSA_NAME)
- {
- tree tmp = SSA_NAME_VALUE (op1);
- if (tmp && TREE_CODE (tmp) != VALUE_HANDLE)
- op1 = tmp;
- }
-
- /* Stuff the operator and operands into our dummy conditional
- expression, creating the dummy conditional if necessary. */
- dummy_cond = walk_data->global_data;
- if (! dummy_cond)
- {
- dummy_cond = build (cond_code, boolean_type_node, op0, op1);
- dummy_cond = build (COND_EXPR, void_type_node,
- dummy_cond, NULL, NULL);
- walk_data->global_data = dummy_cond;
- }
- else
- {
- TREE_SET_CODE (COND_EXPR_COND (dummy_cond), cond_code);
- TREE_OPERAND (COND_EXPR_COND (dummy_cond), 0) = op0;
- TREE_OPERAND (COND_EXPR_COND (dummy_cond), 1) = op1;
- }
+ tree tmp = SSA_NAME_VALUE (op0);
+ if (tmp && TREE_CODE (tmp) != VALUE_HANDLE)
+ op0 = tmp;
+ }
- /* If the conditional folds to an invariant, then we are done,
- otherwise look it up in the hash tables. */
- cached_lhs = local_fold (COND_EXPR_COND (dummy_cond));
- if (! is_gimple_min_invariant (cached_lhs))
- {
- cached_lhs = lookup_avail_expr (dummy_cond, false);
- if (!cached_lhs || ! is_gimple_min_invariant (cached_lhs))
- cached_lhs = simplify_cond_and_lookup_avail_expr (dummy_cond,
- NULL,
- false);
- }
+ if (TREE_CODE (op1) == SSA_NAME)
+ {
+ tree tmp = SSA_NAME_VALUE (op1);
+ if (tmp && TREE_CODE (tmp) != VALUE_HANDLE)
+ op1 = tmp;
}
- /* We can have conditionals which just test the state of a
- variable rather than use a relational operator. These are
- simpler to handle. */
- else if (TREE_CODE (cond) == SSA_NAME)
- {
- cached_lhs = cond;
- cached_lhs = SSA_NAME_VALUE (cached_lhs);
- if (cached_lhs && ! is_gimple_min_invariant (cached_lhs))
- cached_lhs = NULL;
+
+ /* Stuff the operator and operands into our dummy conditional
+ expression, creating the dummy conditional if necessary. */
+ dummy_cond = walk_data->global_data;
+ if (! dummy_cond)
+ {
+ dummy_cond = build (cond_code, boolean_type_node, op0, op1);
+ dummy_cond = build (COND_EXPR, void_type_node,
+ dummy_cond, NULL, NULL);
+ walk_data->global_data = dummy_cond;
}
else
- cached_lhs = lookup_avail_expr (stmt, false);
-
- if (cached_lhs)
{
- edge taken_edge = find_taken_edge (e->dest, cached_lhs);
- basic_block dest = (taken_edge ? taken_edge->dest : NULL);
+ TREE_SET_CODE (COND_EXPR_COND (dummy_cond), cond_code);
+ TREE_OPERAND (COND_EXPR_COND (dummy_cond), 0) = op0;
+ TREE_OPERAND (COND_EXPR_COND (dummy_cond), 1) = op1;
+ }
+
+ /* If the conditional folds to an invariant, then we are done,
+ otherwise look it up in the hash tables. */
+ cached_lhs = local_fold (COND_EXPR_COND (dummy_cond));
+ if (! is_gimple_min_invariant (cached_lhs))
+ {
+ cached_lhs = lookup_avail_expr (dummy_cond, false);
+ if (!cached_lhs || ! is_gimple_min_invariant (cached_lhs))
+ cached_lhs = simplify_cond_and_lookup_avail_expr (dummy_cond,
+ NULL,
+ false);
+ }
+ }
+ /* We can have conditionals which just test the state of a
+ variable rather than use a relational operator. These are
+ simpler to handle. */
+ else if (TREE_CODE (cond) == SSA_NAME)
+ {
+ cached_lhs = cond;
+ cached_lhs = SSA_NAME_VALUE (cached_lhs);
+ if (cached_lhs && ! is_gimple_min_invariant (cached_lhs))
+ cached_lhs = NULL;
+ }
+ else
+ cached_lhs = lookup_avail_expr (stmt, false);
- if (dest == e->dest)
- return;
+ if (cached_lhs)
+ {
+ edge taken_edge = find_taken_edge (e->dest, cached_lhs);
+ basic_block dest = (taken_edge ? taken_edge->dest : NULL);
- /* If we have a known destination for the conditional, then
- we can perform this optimization, which saves at least one
- conditional jump each time it applies since we get to
- bypass the conditional at our original destination. */
- if (dest)
- {
- struct edge_info *edge_info;
+ /* If we have a known destination for the conditional, then
+ we can perform this optimization, which saves at least one
+ conditional jump each time it applies since we get to
+ bypass the conditional at our original destination.
+
+ To limit code growth, we only consider duplication of blocks
+ for which we estimate that the number of insns required for
+ statements not contributing to the condition at the end of
+ E->dest (which means we need to duplicate them) is not larger
+ than PARAM_MAX_JUMP_THREAD_DUPLICATION_INSNS. */
+ if (dest
+ && dest != e->dest
+ && ok_to_duplicate_for_threading (e->dest, visited_stmts))
+ {
+ struct edge_info *edge_info;
- if (e->aux)
- edge_info = e->aux;
- else
- edge_info = allocate_edge_info (e);
- edge_info->redirection_target = taken_edge;
- bitmap_set_bit (threaded_blocks, e->dest->index);
- }
+ if (e->aux)
+ edge_info = e->aux;
+ else
+ edge_info = allocate_edge_info (e);
+ edge_info->redirection_target = taken_edge;
+ bitmap_set_bit (threaded_blocks, e->dest->index);
}
}
+
+ pointer_set_destroy (visited_stmts);
}