This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch] Improving jump-thread pass for PR 54742
- From: Sebastian Pop <sebpop at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, James Greenhalgh <james dot greenhalgh at arm dot com>, Steve Ellcey <sellcey at mips dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 11 Nov 2014 01:14:04 +0000
- Subject: Re: [Patch] Improving jump-thread pass for PR 54742
- Authentication-results: sourceware.org; auth=none
- References: <1408480796 dot 31355 dot 7 dot camel at ubuntu-sellcey> <CAFiYyc2D2MJmPneUNWaBk_3thxUOmB49h=uddOqMeBH8dnD43Q at mail dot gmail dot com> <20140821094109 dot GA20536 at arm dot com> <CAFiYyc1bdsYrgf3=7Qttdek486YYrO1D+-NJq+nH27s7MX4o-w at mail dot gmail dot com> <53FB73C0 dot 7090507 at redhat dot com> <20140926201451 dot GA16704 at f1 dot c dot bardezibar dot internal> <20141026213433 dot GA22140 at f1 dot c dot bardezibar dot internal>
Hi Jeff,
I have adapted the code generation part from James' patch to current trunk, and
the resulting patch gets the 30% speedup on coremark and passes bootstrap of GCC.
Ok for trunk?
Thanks,
Sebastian
Sebastian Pop wrote:
> Sebastian Pop wrote:
> > Jeff Law wrote:
> > > On 08/21/14 04:30, Richard Biener wrote:
> > > >>It turns Jeff's jump-threading code in to a strange franken-pass of bits and
> > > >>pieces of detection and optimisation, and would need some substantial
> > > >>reworking to fit in with Jeff's changes last Autumn, but if it is more
> > > >>likely to be acceptable for trunk then perhaps we could look to revive it.
> > > >>It would be nice to reuse the path copy code Jeff added last year, but I
> > > >>don't have much intuition as to how feasible that is.
> > > >>
> > > >>Was this the sort of thing that you were imagining?
> > > >
> > > >Yeah, didn't look too closely though.
> > > It'd be pretty ugly I suspect. But it's probably worth pondering
> > > since that approach would eliminate the concerns about the cost of
> > > detection (which is problematical for the jump threader) by using
> > > Steve's code for that.
> > >
> > > On the update side, I suspect most, if not all of the framework is
> > > in place to handle this kind of update if the right threading paths
> > > were passed to the updater. I can probably cobble together that
> > > by-hand and see what the tree-ssa-threadupdate does with it. But
> > > it'll be a week or so before I could look at it.
> >
> > I adapted the patch James has sent last year to use the new update paths
>
> Attached an updated version of the patch.
>
> > mechanism. I verified that the attached patch does register all the paths that
> > need to be threaded. Thread updater seems to have some problems handling the
> > attached testcase (a simplified version of the testcase attached to the bug.)
> >
> > Jeff, could you please have a look at why the jump-thread updater is crashing?
>
> I have tried to understand why the code generation part ICEs on coremark: the
> first problem that I have seen is that tree-ssa-threadupdate.c does not handle
> more than a joiner block per path to be threaded, so we would not be able to
> jump thread accross the joiners of the if condition and the joiner of the switch
> condition: i.e., these paths
>
> patch: Registering jump thread: (7, 10) incoming edge; (10, 25) joiner; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 14) nocopy;
> patch: Registering jump thread: (28, 10) incoming edge; (10, 25) joiner; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 11) nocopy;
> patch: Registering jump thread: (8, 10) incoming edge; (10, 25) joiner; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 17) nocopy;
> patch: Registering jump thread: (9, 10) incoming edge; (10, 25) joiner; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
>
> Another problem is that we attach the path to be threaded to the ->aux field of
> the first edge in the path, such that we would have to cancel some of the paths
> because we cannot keep track of all the paths to be threaded.
>
> For coremark, we would discover some jump-thread paths from one of the switch
> cases over the loop exit condition, either to bb_27 outside the loop, or to bb_4
> staying inside the loop. Then with the "patch:" we would discover jump threads
> that would thread switch cases to switch cases, and because these paths start
> with the same edges for which we have already assigned a path to e->aux, we
> would have to cancel the interesting threads added by the patch:
>
> Registering jump thread: (12, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (13, 25) incoming edge; (25, 26) joiner; (26, 27) nocopy;
> Registering jump thread: (29, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (31, 25) incoming edge; (25, 26) joiner; (26, 27) nocopy;
> Registering jump thread: (16, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (15, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (32, 25) incoming edge; (25, 26) joiner; (26, 27) nocopy;
> Registering jump thread: (19, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (18, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (22, 25) incoming edge; (25, 26) joiner; (26, 27) nocopy;
> Registering jump thread: (21, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (34, 25) incoming edge; (25, 26) joiner; (26, 27) nocopy;
> Registering jump thread: (33, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> Registering jump thread: (35, 25) incoming edge; (25, 26) joiner; (26, 27) nocopy;
> Registering jump thread: (24, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy;
> patch: Registering jump thread: (12, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 17) nocopy;
> patch: Registering jump thread: (16, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 14) nocopy;
> patch: Registering jump thread: (19, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 17) nocopy;
> patch: Registering jump thread: (22, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> patch: Registering jump thread: (34, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> patch: Registering jump thread: (35, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> patch: Registering jump thread: (29, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 14) nocopy;
> patch: Registering jump thread: (13, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> patch: Registering jump thread: (15, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 17) nocopy;
> patch: Registering jump thread: (31, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> patch: Registering jump thread: (18, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 20) nocopy;
> patch: Registering jump thread: (32, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> patch: Registering jump thread: (21, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 23) nocopy;
> patch: Registering jump thread: (33, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 24) nocopy;
> patch: Registering jump thread: (24, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 24) nocopy;
> Registering jump thread: (6, 36) incoming edge; (36, 7) normal;
> Cancelling jump thread: (12, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 17) nocopy;
> Cancelling jump thread: (16, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 14) nocopy;
> Cancelling jump thread: (19, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 17) nocopy;
> Cancelling jump thread: (22, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> Cancelling jump thread: (34, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> Cancelling jump thread: (35, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> Cancelling jump thread: (29, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 14) nocopy;
> Cancelling jump thread: (13, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> Cancelling jump thread: (15, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 17) nocopy;
> Cancelling jump thread: (31, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> Cancelling jump thread: (18, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 20) nocopy;
> Cancelling jump thread: (32, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 25) nocopy;
> Cancelling jump thread: (21, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 23) nocopy;
> Cancelling jump thread: (33, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 24) nocopy;
> Cancelling jump thread: (24, 25) incoming edge; (25, 26) joiner; (26, 4) nocopy; (4, 37) nocopy; (37, 36) nocopy; (36, 24) nocopy;
>
> Here is the structure of the CFG with the loops:
>
> (gdb) p debug_loops (2)
> loop_0 (header = 0, latch = 1, niter = )
> {
> bb_2 (preds = {bb_0 }, succs = {bb_3 bb_27 })
> bb_3 (preds = {bb_2 }, succs = {bb_5 bb_6 })
> bb_5 (preds = {bb_4 bb_3 }, succs = {bb_27 })
> bb_6 (preds = {bb_3 }, succs = {bb_36 })
> bb_27 (preds = {bb_5 bb_25 bb_26 bb_2 }, succs = {bb_1 })
> loop_1 (header = 36, latch = 37, niter = )
> {
> bb_4 (preds = {bb_26 }, succs = {bb_5 bb_37 })
> bb_37 (preds = {bb_4 }, succs = {bb_36 })
> bb_36 (preds = {bb_6 bb_37 }, succs = {bb_25 bb_7 bb_11 bb_20 bb_14 bb_17 bb_23 bb_24 })
> bb_7 (preds = {bb_36 }, succs = {bb_10 bb_28 })
> bb_8 (preds = {bb_28 }, succs = {bb_10 bb_9 })
> bb_9 (preds = {bb_8 }, succs = {bb_10 })
> bb_10 (preds = {bb_7 bb_28 bb_8 bb_9 }, succs = {bb_25 })
> bb_11 (preds = {bb_36 }, succs = {bb_29 bb_30 })
> bb_12 (preds = {bb_30 }, succs = {bb_25 })
> bb_13 (preds = {bb_30 }, succs = {bb_25 })
> bb_14 (preds = {bb_36 }, succs = {bb_15 bb_16 })
> bb_15 (preds = {bb_14 }, succs = {bb_25 })
> bb_16 (preds = {bb_14 }, succs = {bb_25 bb_31 })
> bb_17 (preds = {bb_36 }, succs = {bb_18 bb_19 })
> bb_18 (preds = {bb_17 }, succs = {bb_25 })
> bb_19 (preds = {bb_17 }, succs = {bb_25 bb_32 })
> bb_20 (preds = {bb_36 }, succs = {bb_21 bb_22 })
> bb_21 (preds = {bb_20 }, succs = {bb_25 })
> bb_22 (preds = {bb_20 }, succs = {bb_25 })
> bb_23 (preds = {bb_36 }, succs = {bb_33 bb_34 })
> bb_24 (preds = {bb_36 }, succs = {bb_25 bb_35 })
> bb_25 (preds = {bb_10 bb_12 bb_16 bb_19 bb_22 bb_34 bb_35 bb_36 bb_29 bb_13 bb_15 bb_31 bb_18 bb_32 bb_21 bb_33 bb_24 }, succs = {bb_26 bb_27 })
> bb_26 (preds = {bb_25 }, succs = {bb_4 bb_27 })
> bb_28 (preds = {bb_7 }, succs = {bb_10 bb_8 })
> bb_29 (preds = {bb_11 }, succs = {bb_25 })
> bb_30 (preds = {bb_11 }, succs = {bb_12 bb_13 })
> bb_31 (preds = {bb_16 }, succs = {bb_25 })
> bb_32 (preds = {bb_19 }, succs = {bb_25 })
> bb_33 (preds = {bb_23 }, succs = {bb_25 })
> bb_34 (preds = {bb_23 }, succs = {bb_25 })
> bb_35 (preds = {bb_24 }, succs = {bb_25 })
> }
> }
>
> What about removing the use of e->aux in threadupdate.c, to be able to jump
> thread across all the recorded paths?
>
> Thanks,
> Sebastian
> From bac0f2a390048652910f77503b21b3e208daeae1 Mon Sep 17 00:00:00 2001
> From: Sebastian Pop <s.pop@samsung.com>
> Date: Fri, 26 Sep 2014 14:54:20 -0500
> Subject: [PATCH] jump thread for PR 54742
>
> Adapted from a patch from James Greenhalgh.
>
> * tree-ssa-threadedge.c (simplify_control_stmt_condition): Restore the
> original value of cond when simplification fails.
> (find_thread_path): New.
> (find_control_statement_thread_paths): New.
> (thread_through_normal_block): Call find_control_statement_thread_paths.
>
> * testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c: New.
> ---
> gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c | 32 ++++
> gcc/tree-ssa-threadedge.c | 180 ++++++++++++++++++++++-
> gcc/tree-ssa-threadupdate.c | 4 +
> 3 files changed, 215 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
> new file mode 100644
> index 0000000..f3ef725
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
> @@ -0,0 +1,32 @@
> +int sum0, sum1, sum2, sum3;
> +int foo(char * s, char** ret)
> +{
> + int state=0;
> + char c;
> +
> + for (; *s && state != 4; s++)
> + {
> + c = *s;
> + if (c == '*')
> + {
> + s++;
> + break;
> + }
> + switch (state) {
> + case 0:
> + if (c == '+') state = 1;
> + else if (c != '-') sum0+=c;
> + break;
> + case 1:
> + if (c == '+') state = 2;
> + else if (c == '-') state = 0;
> + else sum1+=c;
> + break;
> + default:
> + break;
> + }
> +
> + }
> + *ret = s;
> + return state;
> +}
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index 3dee5ba..7b9e5b6 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -628,6 +628,7 @@ simplify_control_stmt_condition (edge e,
> rather than use a relational operator. These are simpler to handle. */
> if (TREE_CODE (cond) == SSA_NAME)
> {
> + tree original_lhs = cond;
> cached_lhs = cond;
>
> /* Get the variable's current value from the equivalence chains.
> @@ -656,6 +657,12 @@ simplify_control_stmt_condition (edge e,
> pass specific callback to try and simplify it further. */
> if (cached_lhs && ! is_gimple_min_invariant (cached_lhs))
> cached_lhs = (*simplify) (stmt, stmt);
> +
> + /* We couldn't find an invariant. But, callers of this
> + function may be able to do something useful with the
> + unmodified destination. */
> + if (!cached_lhs)
> + cached_lhs = original_lhs;
> }
> else
> cached_lhs = NULL;
> @@ -915,6 +922,155 @@ thread_around_empty_blocks (edge taken_edge,
> return false;
> }
>
> +/* Return true if there is at least one path from START_BB to END_BB.
> + VISITED_BBS is used to make sure we don't fall into an infinite loop. */
> +
> +static bool
> +find_thread_path (basic_block start_bb, basic_block end_bb,
> + vec<basic_block, va_gc> *&path,
> + hash_set<basic_block> *visited_bbs)
> +{
> + if (start_bb == end_bb)
> + {
> + vec_safe_push (path, start_bb);
> + return true;
> + }
> +
> + if (!visited_bbs->add(start_bb))
> + {
> + edge e;
> + edge_iterator ei;
> + FOR_EACH_EDGE (e, ei, start_bb->succs)
> + if (find_thread_path (e->dest, end_bb, path, visited_bbs))
> + {
> + vec_safe_push (path, start_bb);
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> +/* We trace the value of the variable EXPR back through any phi nodes looking
> + for places where it gets a constant value and save the path. */
> +
> +static void
> +find_control_statement_thread_paths (tree expr,
> + hash_set<gimple> *visited_phis,
> + vec<basic_block, va_gc> *&path)
> +{
> + tree var = SSA_NAME_VAR (expr);
> + gimple def_stmt = SSA_NAME_DEF_STMT (expr);
> + basic_block var_bb = gimple_bb (def_stmt);
> +
> + if (var == NULL || var_bb == NULL)
> + return;
> +
> + vec<basic_block, va_gc> *next_path;
> + vec_alloc (next_path, n_basic_blocks_for_fn (cfun));
> +
> + basic_block last_bb_in_path = path->last ();
> +
> + /* Put the path from var_bb to last_bb_in_path into next_path. */
> + if (var_bb != last_bb_in_path)
> + {
> + edge e;
> + int e_count = 0;
> + edge_iterator ei;
> +
> + FOR_EACH_EDGE (e, ei, last_bb_in_path->preds)
> + {
> + hash_set<basic_block> *visited_bbs = new hash_set<basic_block>;
> +
> + if (find_thread_path (var_bb, e->src, next_path, visited_bbs))
> + e_count = e_count + 1;
> +
> + delete visited_bbs;
> +
> + /* If there is more than one path, stop. */
> + if (e_count > 1)
> + {
> + vec_free (next_path);
> + return;
> + }
> + }
> + }
> +
> + /* Visit PHI nodes once. */
> + if (gimple_code (def_stmt) != GIMPLE_PHI
> + || visited_phis->add(def_stmt)) {
> + vec_free (next_path);
> + return;
> + }
> +
> + /* Append all the nodes from next_path to path. */
> + vec_safe_splice (path, next_path);
> + gcc_assert (path->last () == var_bb);
> +
> + /* Iterate over the arguments of PHI. */
> + unsigned int i;
> + for (i = 0; i < gimple_phi_num_args (def_stmt); i++)
> + {
> + tree arg = gimple_phi_arg_def (def_stmt, i);
> + basic_block bbi = gimple_phi_arg_edge (def_stmt, i)->src;
> +
> + /* Skip edges pointing outside the current loop. */
> + if (!arg || var_bb->loop_father != bbi->loop_father)
> + continue;
> +
> + /* Add BBI to the path. */
> + vec_safe_push (path, bbi);
> +
> + if (TREE_CODE (arg) == INTEGER_CST)
> + {
> + int j, n = path->length ();
> + vec<jump_thread_edge *> *jump_thread_path
> + = new vec<jump_thread_edge *> ();
> + int joiners = 0;
> +
> + for (j = 0; j < n - 1; j++)
> + {
> + edge e = find_edge ((*path)[n - j - 1],
> + (*path)[n - j - 2]);
> + gcc_assert (e);
> + enum jump_thread_edge_type kind;
> +
> + if (j == 0)
> + kind = EDGE_START_JUMP_THREAD;
> + else if (single_pred_p (e->src))
> + kind = EDGE_NO_COPY_SRC_BLOCK;
> + else {
> + kind = EDGE_COPY_SRC_JOINER_BLOCK;
> + ++joiners;
> + }
> +
> + jump_thread_edge *x = new jump_thread_edge (e, kind);
> + jump_thread_path->safe_push (x);
> + }
> +
> + /* Add the edge taken when the control variable has value ARG. */
> + edge taken_edge = find_taken_edge ((*path)[0], arg);
> + jump_thread_edge *x
> + = new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
> + jump_thread_path->safe_push (x);
> +
> + /* Thread-update does not handle more than two joiners. A path with
> + less than 3 nodes should not be jump-threaded. */
> + if (joiners < 2 && n > 2)
> + register_jump_thread (jump_thread_path);
> + }
> + else if (TREE_CODE (arg) == SSA_NAME)
> + find_control_statement_thread_paths (arg, visited_phis, path);
> +
> + /* Remove BBI from the path. */
> + path->pop ();
> + }
> +
> + /* Remove all the nodes that we added from next_path. */
> + vec_safe_truncate (path, (path->length () - next_path->length ()));
> + vec_free (next_path);
> +}
> +
> /* We are exiting E->src, see if E->dest ends with a conditional
> jump which has a known value when reached via E.
>
> @@ -1000,7 +1156,10 @@ thread_through_normal_block (edge e,
> cond = simplify_control_stmt_condition (e, stmt, dummy_cond, simplify,
> handle_dominating_asserts);
>
> - if (cond && is_gimple_min_invariant (cond))
> + if (!cond)
> + return 0;
> +
> + if (is_gimple_min_invariant (cond))
> {
> edge taken_edge = find_taken_edge (e->dest, cond);
> basic_block dest = (taken_edge ? taken_edge->dest : NULL);
> @@ -1046,7 +1205,25 @@ thread_through_normal_block (edge e,
> backedge_seen_p);
> return 1;
> }
> +
> + if (TREE_CODE (cond) != SSA_NAME
> + || e->dest->loop_father != e->src->loop_father)
> + return 0;
> +
> + /* When COND cannot be simplified, try to find paths from a control
> + statement back through the PHI nodes which would affect that control
> + statement. */
> + vec<basic_block, va_gc> *bb_path;
> + vec_alloc (bb_path, n_basic_blocks_for_fn (cfun));
> + vec_safe_push (bb_path, e->dest);
> + hash_set<gimple> *visited_phis = new hash_set<gimple>;
> +
> + find_control_statement_thread_paths (cond, visited_phis, bb_path);
> +
> + delete visited_phis;
> + vec_free (bb_path);
> +
> + return -1;
> }
> return 0;
> }
> --
> 2.1.0.243.g30d45f7
>
>From 23b1ac8fa92e9e4cd05edb2967aba564126f75a1 Mon Sep 17 00:00:00 2001
From: Sebastian Pop <s.pop@samsung.com>
Date: Fri, 26 Sep 2014 14:54:20 -0500
Subject: [PATCH] jump thread for PR 54742
Adapted from a patch from James Greenhalgh.
* params.def (max-thread-path-insns, max-thread-length,
max-thread-paths): New.
* testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c: New.
* tree-cfg.c (gimple_duplicate_sese_region): Save and restore loop
header and latch.
* tree-ssa-threadedge.c (simplify_control_stmt_condition): Restore the
original value of cond when simplification fails.
(fsm_find_thread_path): New.
(fsm_find_control_statement_thread_paths): New.
(fsm_thread_through_normal_block): Call find_control_statement_thread_paths.
* tree-ssa-threadupdate.c (dump_jump_thread_path): Pretty print
EDGE_START_FSM_THREAD.
(thread_through_all_blocks): Generate code for EDGE_START_FSM_THREAD edges
calling gimple_duplicate_sese_region.
* tree-ssa-threadupdate.h (jump_thread_edge_type): Add EDGE_START_FSM_THREAD.
---
gcc/params.def | 19 +++
gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c | 32 ++++
gcc/tree-cfg.c | 21 ++-
gcc/tree-ssa-threadedge.c | 200 ++++++++++++++++++++++-
gcc/tree-ssa-threadupdate.c | 52 +++++-
gcc/tree-ssa-threadupdate.h | 1 +
6 files changed, 320 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
diff --git a/gcc/params.def b/gcc/params.def
index d2d2add..749f962 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -123,6 +123,25 @@ DEFPARAM (PARAM_PARTIAL_INLINING_ENTRY_PROBABILITY,
"Maximum probability of the entry BB of split region (in percent relative to entry BB of the function) to make partial inlining happen",
70, 0, 0)
+/* Maximum number of instructions to copy when duplicating blocks
+ on a jump thread path. */
+DEFPARAM (PARAM_MAX_THREAD_PATH_INSNS,
+ "max-thread-path-insns",
+ "Maximum number of instructions to copy when duplicating blocks on a jump thread path",
+ 100, 1, 999999)
+
+/* Maximum length of a jump thread path. */
+DEFPARAM (PARAM_MAX_THREAD_LENGTH,
+ "max-thread-length",
+ "Maximum number of basic blocks on a jump thread path",
+ 10, 1, 999999)
+
+/* Maximum number of jump thread paths to duplicate. */
+DEFPARAM (PARAM_MAX_THREAD_PATHS,
+ "max-thread-paths",
+ "Maximum number of new jump thread paths to create",
+ 50, 1, 999999)
+
/* Limit the number of expansions created by the variable expansion
optimization to avoid register pressure. */
DEFPARAM (PARAM_MAX_VARIABLE_EXPANSIONS,
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
new file mode 100644
index 0000000..f3ef725
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -0,0 +1,32 @@
+int sum0, sum1, sum2, sum3;
+int foo(char * s, char** ret)
+{
+ int state=0;
+ char c;
+
+ for (; *s && state != 4; s++)
+ {
+ c = *s;
+ if (c == '*')
+ {
+ s++;
+ break;
+ }
+ switch (state) {
+ case 0:
+ if (c == '+') state = 1;
+ else if (c != '-') sum0+=c;
+ break;
+ case 1:
+ if (c == '+') state = 2;
+ else if (c == '-') state = 0;
+ else sum1+=c;
+ break;
+ default:
+ break;
+ }
+
+ }
+ *ret = s;
+ return state;
+}
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index ee10bc6..565cfe3 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -5949,10 +5949,12 @@ gimple_duplicate_sese_region (edge entry, edge exit,
{
unsigned i;
bool free_region_copy = false, copying_header = false;
+ bool save_loop_details = false;
struct loop *loop = entry->dest->loop_father;
edge exit_copy;
vec<basic_block> doms;
edge redirected;
+ int memo_loop_header_no = 0, memo_loop_latch_no = 0;
int total_freq = 0, entry_freq = 0;
gcov_type total_count = 0, entry_count = 0;
@@ -5970,9 +5972,15 @@ gimple_duplicate_sese_region (edge entry, edge exit,
if (region[i]->loop_father != loop)
return false;
- if (region[i] != entry->dest
- && region[i] == loop->header)
- return false;
+ /* If we are copying an abnormally shaped region, keep track of
+ which block will become our loop header. */
+ if ((region[i] != entry->dest && region[i] == loop->header)
+ || (region[i] != entry->src && region[i] == loop->latch))
+ {
+ save_loop_details = true;
+ memo_loop_latch_no = i;
+ memo_loop_header_no = i;
+ }
}
/* In case the function is used for loop header copying (which is the primary
@@ -6055,6 +6063,13 @@ gimple_duplicate_sese_region (edge entry, edge exit,
loop->latch = exit->src;
}
+ /* Restore loop details if we were asked to save them. */
+ if (save_loop_details)
+ {
+ loop->header = region_copy[memo_loop_header_no];
+ loop->latch = region_copy[memo_loop_latch_no];
+ }
+
/* Redirect the entry and add the phi node arguments. */
redirected = redirect_edge_and_branch (entry, get_bb_copy (entry->dest));
gcc_assert (redirected != NULL);
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index d5b9696..de9b3fe 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -56,6 +56,8 @@ along with GCC; see the file COPYING3. If not see
#include "params.h"
#include "tree-ssa-threadedge.h"
#include "builtins.h"
+#include "cfg.h"
+#include "cfganal.h"
/* To avoid code explosion due to jump threading, we limit the
number of statements we are going to copy. This variable
@@ -660,6 +662,7 @@ simplify_control_stmt_condition (edge e,
rather than use a relational operator. These are simpler to handle. */
if (TREE_CODE (cond) == SSA_NAME)
{
+ tree original_lhs = cond;
cached_lhs = cond;
/* Get the variable's current value from the equivalence chains.
@@ -688,6 +691,12 @@ simplify_control_stmt_condition (edge e,
pass specific callback to try and simplify it further. */
if (cached_lhs && ! is_gimple_min_invariant (cached_lhs))
cached_lhs = (*simplify) (stmt, stmt);
+
+ /* We couldn't find an invariant. But, callers of this
+ function may be able to do something useful with the
+ unmodified destination. */
+ if (!cached_lhs)
+ cached_lhs = original_lhs;
}
else
cached_lhs = NULL;
@@ -947,6 +956,172 @@ thread_around_empty_blocks (edge taken_edge,
return false;
}
+/* Return true if there is at least one path from START_BB to END_BB.
+ VISITED_BBS is used to make sure we don't fall into an infinite loop. */
+
+static bool
+fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
+ vec<basic_block, va_gc> *&path,
+ hash_set<basic_block> *visited_bbs, int n_insns)
+{
+ if (start_bb == end_bb)
+ {
+ vec_safe_push (path, start_bb);
+ return true;
+ }
+
+ if (!visited_bbs->add(start_bb))
+ {
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_EDGE (e, ei, start_bb->succs)
+ if (fsm_find_thread_path (e->dest, end_bb, path, visited_bbs, n_insns))
+ {
+ vec_safe_push (path, start_bb);
+ return true;
+ }
+ }
+
+ return false;
+}
+
+static int max_threaded_paths;
+
+/* We trace the value of the variable EXPR back through any phi nodes looking
+ for places where it gets a constant value and save the path. Stop after
+ having recorded MAX_PATHS jump threading paths. */
+
+static void
+fsm_find_control_statement_thread_paths (tree expr,
+ hash_set<gimple> *visited_phis,
+ vec<basic_block, va_gc> *&path)
+{
+ tree var = SSA_NAME_VAR (expr);
+ gimple def_stmt = SSA_NAME_DEF_STMT (expr);
+ basic_block var_bb = gimple_bb (def_stmt);
+
+ if (var == NULL || var_bb == NULL)
+ return;
+
+ vec<basic_block, va_gc> *next_path;
+ vec_alloc (next_path, n_basic_blocks_for_fn (cfun));
+
+ basic_block last_bb_in_path = path->last ();
+
+ /* Put the path from var_bb to last_bb_in_path into next_path. */
+ if (var_bb != last_bb_in_path)
+ {
+ edge e;
+ int e_count = 0;
+ edge_iterator ei;
+
+ FOR_EACH_EDGE (e, ei, last_bb_in_path->preds)
+ {
+ hash_set<basic_block> *visited_bbs = new hash_set<basic_block>;
+
+ if (fsm_find_thread_path (var_bb, e->src, next_path, visited_bbs, 0))
+ ++e_count;
+
+ delete visited_bbs;
+
+ /* If there is more than one path, stop. */
+ if (e_count > 1)
+ {
+ vec_free (next_path);
+ return;
+ }
+ }
+ }
+
+ /* Visit PHI nodes once. */
+ if (gimple_code (def_stmt) != GIMPLE_PHI
+ || visited_phis->add(def_stmt))
+ {
+ vec_free (next_path);
+ return;
+ }
+
+ /* Append all the nodes from next_path to path. */
+ vec_safe_splice (path, next_path);
+ gcc_assert (path->last () == var_bb);
+
+ /* Iterate over the arguments of PHI. */
+ unsigned int i;
+ for (i = 0; i < gimple_phi_num_args (def_stmt); i++)
+ {
+ tree arg = gimple_phi_arg_def (def_stmt, i);
+ basic_block bbi = gimple_phi_arg_edge (def_stmt, i)->src;
+
+ /* Skip edges pointing outside the current loop. */
+ if (!arg || var_bb->loop_father != bbi->loop_father)
+ continue;
+
+ /* Add BBI to the path. */
+ vec_safe_push (path, bbi);
+
+ if (TREE_CODE (arg) == INTEGER_CST)
+ {
+ int j, n = path->length ();
+ vec<jump_thread_edge *> *jump_thread_path
+ = new vec<jump_thread_edge *> ();
+ int joiners = 0;
+
+ for (j = 0; j < n - 1; j++)
+ {
+ edge e = find_edge ((*path)[n - j - 1],
+ (*path)[n - j - 2]);
+ gcc_assert (e);
+ enum jump_thread_edge_type kind;
+
+ if (j == 0)
+ kind = EDGE_START_FSM_THREAD;
+ else if (single_pred_p (e->src))
+ kind = EDGE_NO_COPY_SRC_BLOCK;
+ else {
+ kind = EDGE_COPY_SRC_JOINER_BLOCK;
+ ++joiners;
+ }
+
+ jump_thread_edge *x = new jump_thread_edge (e, kind);
+ jump_thread_path->safe_push (x);
+ }
+
+ /* Add the edge taken when the control variable has value ARG. */
+ edge taken_edge = find_taken_edge ((*path)[0], arg);
+ jump_thread_edge *x
+ = new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
+ jump_thread_path->safe_push (x);
+
+ /* A path with less than 3 nodes should not be jump-threaded. */
+ if (n > 2 && n < PARAM_VALUE (PARAM_MAX_THREAD_LENGTH)
+ && max_threaded_paths > 0)
+ {
+ int n_insns = 0;
+ gimple_stmt_iterator gsi;
+
+ for (j = 1; j < n - 1; j++)
+ for (gsi = gsi_start_bb ((*path)[j]); !gsi_end_p (gsi); gsi_next (&gsi))
+ ++n_insns;
+
+ if (n_insns < PARAM_VALUE (PARAM_MAX_THREAD_PATH_INSNS))
+ {
+ register_jump_thread (jump_thread_path);
+ --max_threaded_paths;
+ }
+ }
+ }
+ else if (TREE_CODE (arg) == SSA_NAME)
+ fsm_find_control_statement_thread_paths (arg, visited_phis, path);
+
+ /* Remove BBI from the path. */
+ path->pop ();
+ }
+
+ /* Remove all the nodes that we added from next_path. */
+ vec_safe_truncate (path, (path->length () - next_path->length ()));
+ vec_free (next_path);
+}
+
/* We are exiting E->src, see if E->dest ends with a conditional
jump which has a known value when reached via E.
@@ -1032,7 +1207,10 @@ thread_through_normal_block (edge e,
cond = simplify_control_stmt_condition (e, stmt, dummy_cond, simplify,
handle_dominating_asserts);
- if (cond && is_gimple_min_invariant (cond))
+ if (!cond)
+ return 0;
+
+ if (is_gimple_min_invariant (cond))
{
edge taken_edge = find_taken_edge (e->dest, cond);
basic_block dest = (taken_edge ? taken_edge->dest : NULL);
@@ -1078,6 +1256,26 @@ thread_through_normal_block (edge e,
backedge_seen_p);
return 1;
}
+
+ if (TREE_CODE (cond) != SSA_NAME
+ || e->dest->loop_father != e->src->loop_father)
+ return 0;
+
+ /* When COND cannot be simplified, try to find paths from a control
+ statement back through the PHI nodes which would affect that control
+ statement. */
+ vec<basic_block, va_gc> *bb_path;
+ vec_alloc (bb_path, n_basic_blocks_for_fn (cfun));
+ vec_safe_push (bb_path, e->dest);
+ hash_set<gimple> *visited_phis = new hash_set<gimple>;
+
+ max_threaded_paths = PARAM_VALUE (PARAM_MAX_THREAD_PATHS);
+ fsm_find_control_statement_thread_paths (cond, visited_phis, bb_path);
+
+ delete visited_phis;
+ vec_free (bb_path);
+
+ return -1;
}
return 0;
}
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 151ed83..5847078 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -167,8 +167,9 @@ dump_jump_thread_path (FILE *dump_file, vec<jump_thread_edge *> path,
bool registering)
{
fprintf (dump_file,
- " %s jump thread: (%d, %d) incoming edge; ",
+ " %s%s jump thread: (%d, %d) incoming edge; ",
(registering ? "Registering" : "Cancelling"),
+ (path[0]->type == EDGE_START_FSM_THREAD ? " FSM": ""),
path[0]->e->src->index, path[0]->e->dest->index);
for (unsigned int i = 1; i < path.length (); i++)
@@ -2343,6 +2344,55 @@ thread_through_all_blocks (bool may_peel_loop_headers)
threaded_blocks = BITMAP_ALLOC (NULL);
memset (&thread_stats, 0, sizeof (thread_stats));
+ for (i = 0; i < paths.length (); )
+ {
+ vec<jump_thread_edge *> *path = paths[i];
+ edge entry = (*path)[0]->e;
+
+ if ((*path)[0]->type != EDGE_START_FSM_THREAD
+ /* Do not jump-thread twice from the same block. */
+ || bitmap_bit_p (threaded_blocks, entry->src->index)) {
+ i++;
+ continue;
+ }
+
+ unsigned len = path->length ();
+ edge exit = (*path)[len - 1]->e;
+ basic_block *region = XNEWVEC (basic_block, len - 1);
+
+ for (unsigned int j = 0; j < len - 1; j++)
+ region[j] = (*path)[j]->e->dest;
+
+ bool success = gimple_duplicate_sese_region (entry, exit, region,
+ len - 1, NULL, 0);
+ delete_jump_thread_path (path);
+ paths.unordered_remove (i);
+
+ if (success) {
+ /* We do not update dominance info. */
+ free_dominance_info (CDI_DOMINATORS);
+
+ bitmap_set_bit (threaded_blocks, entry->src->index);
+ }
+ }
+
+ for (i = 0; i < paths.length (); )
+ {
+ vec<jump_thread_edge *> *path = paths[i];
+ edge entry = (*path)[0]->e;
+
+ /* Do not jump-thread twice from the same block. */
+ if (bitmap_bit_p (threaded_blocks, entry->src->index))
+ {
+ delete_jump_thread_path (path);
+ paths.unordered_remove (i);
+ }
+ else
+ i++;
+ }
+
+ bitmap_clear (threaded_blocks);
+
mark_threaded_blocks (threaded_blocks);
initialize_original_copy_tables ();
diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
index 426aca5..42c3a9e 100644
--- a/gcc/tree-ssa-threadupdate.h
+++ b/gcc/tree-ssa-threadupdate.h
@@ -26,6 +26,7 @@ extern bool thread_through_all_blocks (bool);
enum jump_thread_edge_type
{
EDGE_START_JUMP_THREAD,
+ EDGE_START_FSM_THREAD,
EDGE_COPY_SRC_BLOCK,
EDGE_COPY_SRC_JOINER_BLOCK,
EDGE_NO_COPY_SRC_BLOCK
--
2.1.0.243.g30d45f7