This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Vinod Kathail <vinodk at xilinx dot com>, Shail Aditya Gupta <shailadi at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, Nagaraju Mekala <nmekala at xilinx dot com>
- Date: Thu, 3 Dec 2015 15:45:50 +0100
- Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
- Authentication-results: sourceware.org; auth=none
- References: <37378DC5BCD0EE48BA4B082E0B55DFAA41F3F56C at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA41F41150 at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <CAFiYyc0T+8CeeruKp-S2byhyBXW41ub7esqin+OuAYAy8q1WJA at mail dot gmail dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA41F42541 at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <CAFiYyc23ZRxFv4S=XKQUX2VgT113GDZVVjOYViSoYy4yP11cdg at mail dot gmail dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA4295155D at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA4295ADCB at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <55D4F921 dot 2020708 at redhat dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA4297704C at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <5643A732 dot 4040707 at redhat dot com> <CAFiYyc07fxKy=pxo4t81M8WVOx9PLnzmMthkbBm52wub93dErQ at mail dot gmail dot com> <5644C6CC dot 90203 at redhat dot com> <5644DB59 dot 9040809 at redhat dot com> <56450B62 dot 4090404 at redhat dot com> <CAFiYyc06=_w4+B0C1d8y0rgCmNCy7vvsFfdvCqstnh6t5zsqbA at mail dot gmail dot com> <56460F19 dot 5010009 at redhat dot com> <0B62FFB6-DF7A-4080-A655-3E51070E1DEE at gmail dot com> <564646AA dot 5030300 at redhat dot com> <564673DA dot 3020403 at redhat dot com> <CAFiYyc2WXOS5kJFeKPa0aqaWbHV-CnDeWJ98zRa5X=FbAZ2THw at mail dot gmail dot com>
On Thu, Dec 3, 2015 at 3:38 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Sat, Nov 14, 2015 at 12:35 AM, Jeff Law <law@redhat.com> wrote:
>> On 11/13/2015 01:23 PM, Jeff Law wrote:
>>>
>>> On 11/13/2015 11:09 AM, Richard Biener wrote:
>>>
>>>>>
>>>>> BTW Do we have an API for indicating that new blocks have been added to
>>>>>
>>>>> a loop? If so, then we can likely drop the LOOPS_NEED_FIXUP.
>>>>
>>>>
>>>> Please. It's called add_to_loop or so.
>>>
>>> Haha, the block duplication code was handling this already. So in
>>> theory I can just drop the LOOPS_NEED_FIXUP completely. Testing now.
>>>
>>> jeff
>>>
>> Attached is the committed patch for path splitting. As noted above, we
>> didn't need the LOOPS_NEED_FIXUP in the final version, so that wart is gone
>> :-)
>>
>> I do find myself wondering if this can/should be generalized beyond just
>> paths heading to loop backedges. However to do so I think we'd need to be
>> able to undo this transformation reliably and we'd need some heuristics when
>> to duplicate to expose the redundancy vs rely on PRE techniques and jump
>> threading. I vaguely remember a paper which touched on these topics, but I
>> can't seem to find it.
>>
>> Anyway, bootstrapped and regression tested on x86_64-linux-gnu. Installed on
>> the trunk.
>
> This pass is now enabled by default with -Os but has no limits on the amount of
> stmts it copies. It also will make all loops with this shape have at least two
> exits (if the resulting loop will be disambiguated the inner loop will
> have two exits).
> Having more than one exit will disable almost all loop optimizations after it.
>
> The pass itself documents the transform it does but does zero to motivate it.
>
> What's the benefit of this pass (apart from disrupting further optimizations)?
>
> I can see a _single_ case where duplicating the latch will allow threading
> one of the paths through the loop header to eliminate the original exit. Then
> disambiguation may create a nice nested loop out of this. Of course that
> is only profitable again if you know the remaining single exit of the inner
> loop (exiting to the outer one) is executed infrequently (thus the inner loop
> actually loops).
>
> But no checks other than on the CFG shape exist (oh, it checks it will
> at _least_ copy two stmts!).
>
> Given the profitability constraints above (well, correct me if I am
> wrong on these)
> it looks like the whole transform should be done within the FSM threading
> code which might be able to compute whether there will be an inner loop
> with a single exit only.
>
> I'm inclined to request the pass to be removed again or at least disabled by
> default.
>
> What closed source benchmark was this transform invented for?
Ah, some EEMBC one.
Btw, the testcase that was added shows
if (xc < xm)
{
xk = (unsigned char) (xc < xy ? xc : xy);
}
else
{
xk = (unsigned char) (xm < xy ? xm : xy);
}
which might be better handled by phiopt transforming it into
xk = MIN (xc, MIN (xm, xy))
phiopt1 sees (hooray to GENERIC folding)
xc_26 = ~xr_21;
xm_27 = ~xg_23;
xy_28 = ~xb_25;
if (xr_21 > xg_23)
goto <bb 5>;
else
goto <bb 6>;
<bb 5>:
xk_29 = MIN_EXPR <xc_26, xy_28>;
goto <bb 7>;
<bb 6>:
xk_30 = MIN_EXPR <xm_27, xy_28>;
<bb 7>:
# xk_4 = PHI <xk_29(5), xk_30(6)>
btw, see PR67438 for a similar testcase and the above pattern.
Richard.
> Richard.
>
>>
>>
>>
>> commit c1891376e5dcc99ad8be2d22f9551c03f9bb2729
>> Author: Jeff Law <law@redhat.com>
>> Date: Fri Nov 13 16:29:34 2015 -0700
>>
>> [Patch,tree-optimization]: Add new path Splitting pass on tree ssa
>> representation
>>
>> * Makefile.in (OBJS): Add gimple-ssa-split-paths.o
>> * common.opt (-fsplit-paths): New flag controlling path splitting.
>> * doc/invoke.texi (fsplit-paths): Document.
>> * opts.c (default_options_table): Add -fsplit-paths to -O2.
>> * passes.def: Add split_paths pass.
>> * timevar.def (TV_SPLIT_PATHS): New timevar.
>> * tracer.c: Include "tracer.h"
>> (ignore_bb_p): No longer static.
>> (transform_duplicate): New function, broken out of tail_duplicate.
>> (tail_duplicate): Use transform_duplicate.
>> * tracer.h (ignore_bb_p): Declare
>> (transform_duplicate): Likewise.
>> * tree-pass.h (make_pass_split_paths): Declare.
>> * gimple-ssa-split-paths.c: New file.
>>
>> * gcc.dg/tree-ssa/split-path-1.c: New test.
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index dde2695..a7abe37 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,21 @@
>> +2015-11-13 Ajit Agarwal <ajitkum@xilinx.com>
>> + Jeff Law <law@redhat.com>
>> +
>> + * Makefile.in (OBJS): Add gimple-ssa-split-paths.o
>> + * common.opt (-fsplit-paths): New flag controlling path splitting.
>> + * doc/invoke.texi (fsplit-paths): Document.
>> + * opts.c (default_options_table): Add -fsplit-paths to -O2.
>> + * passes.def: Add split_paths pass.
>> + * timevar.def (TV_SPLIT_PATHS): New timevar.
>> + * tracer.c: Include "tracer.h"
>> + (ignore_bb_p): No longer static.
>> + (transform_duplicate): New function, broken out of tail_duplicate.
>> + (tail_duplicate): Use transform_duplicate.
>> + * tracer.h (ignore_bb_p): Declare
>> + (transform_duplicate): Likewise.
>> + * tree-pass.h (make_pass_split_paths): Declare.
>> + * gimple-ssa-split-paths.c: New file.
>> +
>> 2015-11-13 Kai Tietz <ktietz70@googlemail.com>
>> Marek Polacek <polacek@redhat.com>
>> Jason Merrill <jason@redhat.com>
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>> index d3fd5e9..5c294df 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -1277,6 +1277,7 @@ OBJS = \
>> gimple-pretty-print.o \
>> gimple-ssa-backprop.o \
>> gimple-ssa-isolate-paths.o \
>> + gimple-ssa-split-paths.o \
>> gimple-ssa-strength-reduction.o \
>> gimple-streamer-in.o \
>> gimple-streamer-out.o \
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 757ce85..3eb520e 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -2403,6 +2403,10 @@ ftree-vrp
>> Common Report Var(flag_tree_vrp) Init(0) Optimization
>> Perform Value Range Propagation on trees.
>>
>> +fsplit-paths
>> +Common Report Var(flag_split_paths) Init(0) Optimization
>> +Split paths leading to loop backedges.
>> +
>> funit-at-a-time
>> Common Report Var(flag_unit_at_a_time) Init(1)
>> Compile whole compilation unit at a time.
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index c18df98..eeb79e6 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -354,6 +354,7 @@ Objective-C and Objective-C++ Dialects}.
>> -fdump-tree-fre@r{[}-@var{n}@r{]} @gol
>> -fdump-tree-vtable-verify @gol
>> -fdump-tree-vrp@r{[}-@var{n}@r{]} @gol
>> +-fdump-tree-split-paths@r{[}-@var{n}@r{]} @gol
>> -fdump-tree-storeccp@r{[}-@var{n}@r{]} @gol
>> -fdump-final-insns=@var{file} @gol
>> -fcompare-debug@r{[}=@var{opts}@r{]} -fcompare-debug-second @gol
>> @@ -448,6 +449,7 @@ Objective-C and Objective-C++ Dialects}.
>> -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops @gol
>> -fsemantic-interposition -fshrink-wrap -fsignaling-nans @gol
>> -fsingle-precision-constant -fsplit-ivs-in-unroller @gol
>> +-fsplit-paths @gol
>> -fsplit-wide-types -fssa-backprop -fssa-phiopt @gol
>> -fstack-protector -fstack-protector-all -fstack-protector-strong @gol
>> -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol
>> @@ -7171,6 +7173,11 @@ output on to @file{stderr}. If two conflicting dump
>> filenames are
>> given for the same pass, then the latter option overrides the earlier
>> one.
>>
>> +@item split-paths
>> +@opindex fdump-tree-split-paths
>> +Dump each function after splitting paths to loop backedges. The file
>> +name is made by appending @file{.split-paths} to the source file name.
>> +
>> @item all
>> Turn on all options, except @option{raw}, @option{slim}, @option{verbose}
>> and @option{lineno}.
>> @@ -7808,6 +7815,7 @@ also turns on the following optimization flags:
>> -frerun-cse-after-loop @gol
>> -fsched-interblock -fsched-spec @gol
>> -fschedule-insns -fschedule-insns2 @gol
>> +-fsplit-paths @gol
>> -fstrict-aliasing -fstrict-overflow @gol
>> -ftree-builtin-call-dce @gol
>> -ftree-switch-conversion -ftree-tail-merge @gol
>> @@ -8821,7 +8829,7 @@ currently enabled, but may be enabled by @option{-O2}
>> in the future.
>>
>> @item -ftree-sink
>> @opindex ftree-sink
>> -Perform forward store motion on trees. This flag is
>> +Perform forward store motion on trees. This flag is
>> enabled by default at @option{-O} and higher.
>>
>> @item -ftree-bit-ccp
>> @@ -9127,6 +9135,12 @@ enabled by default at @option{-O2} and higher. Null
>> pointer check
>> elimination is only done if @option{-fdelete-null-pointer-checks} is
>> enabled.
>>
>> +@item -fsplit-paths
>> +@opindex fsplit-paths
>> +Split paths leading to loop backedges. This can improve dead code
>> +elimination and common subexpression elimination. This is enabled by
>> +default at @option{-O2} and above.
>> +
>> @item -fsplit-ivs-in-unroller
>> @opindex fsplit-ivs-in-unroller
>> Enables expression of values of induction variables in later iterations
>> diff --git a/gcc/gimple-ssa-split-paths.c b/gcc/gimple-ssa-split-paths.c
>> new file mode 100644
>> index 0000000..602e916
>> --- /dev/null
>> +++ b/gcc/gimple-ssa-split-paths.c
>> @@ -0,0 +1,270 @@
>> +/* Support routines for Splitting Paths to loop backedges
>> + Copyright (C) 2015 Free Software Foundation, Inc.
>> + Contributed by Ajit Kumar Agarwal <ajitkum@xilinx.com>.
>> +
>> + This file is part of GCC.
>> +
>> + GCC is free software; you can redistribute it and/or modify
>> + it under the terms of the GNU General Public License as published by
>> + the Free Software Foundation; either version 3, or (at your option)
>> + any later version.
>> +
>> +GCC is distributed in the hope that it will be useful,
>> +but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> +GNU General Public License for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with GCC; see the file COPYING3. If not see
>> +<http://www.gnu.org/licenses/>. */
>> +
>> +#include "config.h"
>> +#include "system.h"
>> +#include "coretypes.h"
>> +#include "backend.h"
>> +#include "tree.h"
>> +#include "gimple.h"
>> +#include "tree-pass.h"
>> +#include "cfganal.h"
>> +#include "cfgloop.h"
>> +#include "gimple-iterator.h"
>> +#include "tracer.h"
>> +
>> +/* Given LATCH, the latch block in a loop, see if the shape of the
>> + path reaching LATCH is suitable for being split by duplication.
>> + If so, return the block that will be duplicated into its predecessor
>> + paths. Else return NULL. */
>> +
>> +static basic_block
>> +find_block_to_duplicate_for_splitting_paths (basic_block latch)
>> +{
>> + /* We should have simple latches at this point. So the latch should
>> + have a single successor. This implies the predecessor of the latch
>> + likely has the loop exit. And it's that predecessor we're most
>> + interested in. To keep things simple, we're going to require that
>> + the latch have a single predecessor too. */
>> + if (single_succ_p (latch) && single_pred_p (latch))
>> + {
>> + basic_block bb = get_immediate_dominator (CDI_DOMINATORS, latch);
>> + gcc_assert (single_pred_edge (latch)->src == bb);
>> +
>> + /* If BB has been marked as not to be duplicated, then honor that
>> + request. */
>> + if (ignore_bb_p (bb))
>> + return NULL;
>> +
>> + gimple *last = gsi_stmt (gsi_last_nondebug_bb (bb));
>> + /* The immediate dominator of the latch must end in a conditional.
>> */
>> + if (!last || gimple_code (last) != GIMPLE_COND)
>> + return NULL;
>> +
>> + /* We're hoping that BB is a join point for an IF-THEN-ELSE diamond
>> + region. Verify that it is.
>> +
>> + First, verify that BB has two predecessors (each arm of the
>> + IF-THEN-ELSE) and two successors (the latch and exit). */
>> + if (EDGE_COUNT (bb->preds) == 2 && EDGE_COUNT (bb->succs) == 2)
>> + {
>> + /* Now verify that BB's immediate dominator ends in a
>> + conditional as well. */
>> + basic_block bb_idom = get_immediate_dominator (CDI_DOMINATORS,
>> bb);
>> + gimple *last = gsi_stmt (gsi_last_nondebug_bb (bb_idom));
>> + if (!last || gimple_code (last) != GIMPLE_COND)
>> + return NULL;
>> +
>> + /* And that BB's immediate dominator's successors are the
>> + the predecessors of BB. */
>> + if (!find_edge (bb_idom, EDGE_PRED (bb, 0)->src)
>> + || !find_edge (bb_idom, EDGE_PRED (bb, 1)->src))
>> + return NULL;
>> +
>> + /* So at this point we have a simple diamond for an IF-THEN-ELSE
>> + construct starting at BB_IDOM, with a join point at BB. BB
>> + pass control outside the loop or to the loop latch.
>> +
>> + We're going to want to create two duplicates of BB, one for
>> + each successor of BB_IDOM. */
>> + return bb;
>> + }
>> + }
>> + return NULL;
>> +}
>> +
>> +/* Return TRUE if BB is a reasonable block to duplicate by examining
>> + its size, false otherwise. BB will always be a loop latch block.
>> +
>> + Should this use the same tests as we do for jump threading? */
>> +
>> +static bool
>> +is_feasible_trace (basic_block bb)
>> +{
>> + int num_stmt = 0;
>> + gimple_stmt_iterator gsi;
>> +
>> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> + {
>> + gimple *stmt = gsi_stmt (gsi);
>> + if (!is_gimple_debug (stmt))
>> + num_stmt++;
>> + }
>> +
>> + /* We may want to limit how many statements we copy. */
>> + if (num_stmt > 1)
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> +/* If the immediate dominator of the latch of the loop is
>> + block with conditional branch, then the loop latch is
>> + duplicated to its predecessors path preserving the SSA
>> + semantics.
>> +
>> + CFG before transformation.
>> +
>> + 2
>> + |
>> + |
>> + +---->3
>> + | / \
>> + | / \
>> + | 4 5
>> + | \ /
>> + | \ /
>> + | 6
>> + | / \
>> + | / \
>> + | 8 7
>> + | | |
>> + ---+ E
>> +
>> +
>> +
>> + Block 8 is the latch. We're going to make copies of block 6 (9 & 10)
>> + and wire things up so they look like this:
>> +
>> + 2
>> + |
>> + |
>> + +---->3
>> + | / \
>> + | / \
>> + | 4 5
>> + | | |
>> + | | |
>> + | 9 10
>> + | |\ /|
>> + | | \ / |
>> + | | 7 |
>> + | | | |
>> + | | E |
>> + | | |
>> + | \ /
>> + | \ /
>> + +-----8
>> +
>> +
>> + Blocks 9 and 10 will get merged into blocks 4 & 5 respectively which
>> + enables CSE, DCE and other optimizations to occur on a larger block
>> + of code. */
>> +
>> +static bool
>> +split_paths ()
>> +{
>> + bool changed = false;
>> + loop_p loop;
>> +
>> + loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
>> + initialize_original_copy_tables ();
>> + calculate_dominance_info (CDI_DOMINATORS);
>> +
>> + FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
>> + {
>> + /* See if there is a block that we can duplicate to split the
>> + path to the loop latch. */
>> + basic_block bb = find_block_to_duplicate_for_splitting_paths
>> (loop->latch);
>> +
>> + /* BB is the merge point for an IF-THEN-ELSE we want to transform.
>> +
>> + Essentially we want to create two duplicates of BB and append
>> + a duplicate to the THEN and ELSE clauses. This will split the
>> + path leading to the latch. BB will be unreachable and removed. */
>> + if (bb && is_feasible_trace (bb))
>> + {
>> + if (dump_file && (dump_flags & TDF_DETAILS))
>> + fprintf (dump_file,
>> + "Duplicating join block %d into predecessor paths\n",
>> + bb->index);
>> + basic_block pred0 = EDGE_PRED (bb, 0)->src;
>> + basic_block pred1 = EDGE_PRED (bb, 1)->src;
>> + transform_duplicate (pred0, bb);
>> + transform_duplicate (pred1, bb);
>> + changed = true;
>> + }
>> + }
>> +
>> + loop_optimizer_finalize ();
>> + free_original_copy_tables ();
>> + return changed;
>> +}
>> +
>> +/* Main entry point for splitting paths. Returns TODO_cleanup_cfg if any
>> + paths where split, otherwise return zero. */
>> +
>> +static unsigned int
>> +execute_split_paths ()
>> +{
>> + /* If we don't have at least 2 real blocks and backedges in the
>> + CFG, then there's no point in trying to perform path splitting. */
>> + if (n_basic_blocks_for_fn (cfun) <= NUM_FIXED_BLOCKS + 1
>> + || !mark_dfs_back_edges ())
>> + return 0;
>> +
>> + bool changed = split_paths();
>> + if (changed)
>> + free_dominance_info (CDI_DOMINATORS);
>> +
>> + return changed ? TODO_cleanup_cfg : 0;
>> +}
>> +
>> +static bool
>> +gate_split_paths ()
>> +{
>> + return flag_split_paths;
>> +}
>> +
>> +namespace {
>> +
>> +const pass_data pass_data_split_paths =
>> +{
>> + GIMPLE_PASS, /* type */
>> + "split-paths", /* name */
>> + OPTGROUP_NONE, /* optinfo_flags */
>> + TV_SPLIT_PATHS, /* tv_id */
>> + PROP_ssa, /* properties_required */
>> + 0, /* properties_provided */
>> + 0, /* properties_destroyed */
>> + 0, /* todo_flags_start */
>> + TODO_update_ssa, /* todo_flags_finish */
>> +};
>> +
>> +class pass_split_paths : public gimple_opt_pass
>> +{
>> + public:
>> + pass_split_paths (gcc::context *ctxt)
>> + : gimple_opt_pass (pass_data_split_paths, ctxt)
>> + {}
>> + /* opt_pass methods: */
>> + opt_pass * clone () { return new pass_split_paths (m_ctxt); }
>> + virtual bool gate (function *) { return gate_split_paths (); }
>> + virtual unsigned int execute (function *) { return execute_split_paths
>> (); }
>> +
>> +}; // class pass_split_paths
>> +
>> +} // anon namespace
>> +
>> +gimple_opt_pass *
>> +make_pass_split_paths (gcc::context *ctxt)
>> +{
>> + return new pass_split_paths (ctxt);
>> +}
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index 930ae43..be04cf5 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -523,6 +523,7 @@ static const struct default_options
>> default_options_table[] =
>> { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1
>> },
>> { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 },
>> { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 },
>> + { OPT_LEVELS_2_PLUS, OPT_fsplit_paths, NULL, 1 },
>>
>> /* -O3 optimizations. */
>> { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
>> diff --git a/gcc/passes.def b/gcc/passes.def
>> index c0ab6b9..db822d3 100644
>> --- a/gcc/passes.def
>> +++ b/gcc/passes.def
>> @@ -274,6 +274,7 @@ along with GCC; see the file COPYING3. If not see
>> POP_INSERT_PASSES ()
>> NEXT_PASS (pass_simduid_cleanup);
>> NEXT_PASS (pass_lower_vector_ssa);
>> + NEXT_PASS (pass_split_paths);
>> NEXT_PASS (pass_cse_reciprocals);
>> NEXT_PASS (pass_reassoc);
>> NEXT_PASS (pass_strength_reduction);
>> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>> index 3301130..ee92aaf 100644
>> --- a/gcc/testsuite/ChangeLog
>> +++ b/gcc/testsuite/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2015-11-13 Ajit Agarwal <ajitkum@xilinx.com>
>> + Jeff Law <law@redhat.com>
>> +
>> + * gcc.dg/tree-ssa/split-path-1.c: New test.
>> +
>> 2015-11-13 Nathan Sidwell <nathan@codesourcery.com>
>>
>> * c-c++-common/goacc/loop-auto-1.c: New.
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
>> new file mode 100644
>> index 0000000..1239892
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
>> @@ -0,0 +1,67 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -fdump-tree-split-paths-details " } */
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +
>> +#define RGBMAX 255
>> +
>> +int
>> +test()
>> +{
>> + int i, Pels;
>> + unsigned char sum = 0;
>> + unsigned char xr, xg, xb;
>> + unsigned char xc, xm, xy, xk;
>> + unsigned char *ReadPtr, *EritePtr;
>> +
>> + ReadPtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> +
>> + for (i = 0; i < 100;i++)
>> + {
>> + ReadPtr[i] = 100 - i;
>> + }
>> +
>> + for (i = 0; i < 100; i++)
>> + {
>> + xr = *ReadPtr++;
>> + xg = *ReadPtr++;
>> + xb = *ReadPtr++;
>> +
>> + xc = (unsigned char) (RGBMAX - xr);
>> + xm = (unsigned char) (RGBMAX - xg);
>> + xy = (unsigned char) (RGBMAX - xb);
>> +
>> + if (xc < xm)
>> + {
>> + xk = (unsigned char) (xc < xy ? xc : xy);
>> + }
>> + else
>> + {
>> + xk = (unsigned char) (xm < xy ? xm : xy);
>> + }
>> +
>> + xc = (unsigned char) (xc - xk);
>> + xm = (unsigned char) (xm - xk);
>> + xy = (unsigned char) (xy - xk);
>> +
>> + *EritePtr++ = xc;
>> + *EritePtr++ = xm;
>> + *EritePtr++ = xy;
>> + *EritePtr++ = xk;
>> + sum += *EritePtr;
>> + }
>> + return sum;
>> +}
>> +
>> +int
>> +main()
>> +{
>> + if (test() != 33)
>> + abort();
>> +
>> + return 0;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "Duplicating join block" "split-paths" } }
>> */
>> diff --git a/gcc/timevar.def b/gcc/timevar.def
>> index b429faf..45e3b70 100644
>> --- a/gcc/timevar.def
>> +++ b/gcc/timevar.def
>> @@ -252,6 +252,7 @@ DEFTIMEVAR (TV_GCSE_AFTER_RELOAD , "load CSE after
>> reload")
>> DEFTIMEVAR (TV_REE , "ree")
>> DEFTIMEVAR (TV_THREAD_PROLOGUE_AND_EPILOGUE, "thread pro- & epilogue")
>> DEFTIMEVAR (TV_IFCVT2 , "if-conversion 2")
>> +DEFTIMEVAR (TV_SPLIT_PATHS , "split paths")
>> DEFTIMEVAR (TV_COMBINE_STACK_ADJUST , "combine stack adjustments")
>> DEFTIMEVAR (TV_PEEPHOLE2 , "peephole 2")
>> DEFTIMEVAR (TV_RENAME_REGISTERS , "rename registers")
>> diff --git a/gcc/tracer.c b/gcc/tracer.c
>> index 941dc20..c2dba4c 100644
>> --- a/gcc/tracer.c
>> +++ b/gcc/tracer.c
>> @@ -51,9 +51,9 @@
>> #include "tree-inline.h"
>> #include "cfgloop.h"
>> #include "fibonacci_heap.h"
>> +#include "tracer.h"
>>
>> static int count_insns (basic_block);
>> -static bool ignore_bb_p (const_basic_block);
>> static bool better_p (const_edge, const_edge);
>> static edge find_best_successor (basic_block);
>> static edge find_best_predecessor (basic_block);
>> @@ -85,7 +85,7 @@ bb_seen_p (basic_block bb)
>> }
>>
>> /* Return true if we should ignore the basic block for purposes of tracing.
>> */
>> -static bool
>> +bool
>> ignore_bb_p (const_basic_block bb)
>> {
>> if (bb->index < NUM_FIXED_BLOCKS)
>> @@ -226,6 +226,24 @@ find_trace (basic_block bb, basic_block *trace)
>> return i;
>> }
>>
>> +/* Duplicate block BB2, placing it after BB in the CFG. Return the
>> + newly created block. */
>> +basic_block
>> +transform_duplicate (basic_block bb, basic_block bb2)
>> +{
>> + edge e;
>> + basic_block copy;
>> +
>> + e = find_edge (bb, bb2);
>> +
>> + copy = duplicate_block (bb2, e, bb);
>> + flush_pending_stmts (e);
>> +
>> + add_phi_args_after_copy (©, 1, NULL);
>> +
>> + return (copy);
>> +}
>> +
>> /* Look for basic blocks in frequency order, construct traces and tail
>> duplicate
>> if profitable. */
>>
>> @@ -321,17 +339,8 @@ tail_duplicate (void)
>> entries or at least rotate the loop. */
>> && bb2->loop_father->header != bb2)
>> {
>> - edge e;
>> - basic_block copy;
>> -
>> nduplicated += counts [bb2->index];
>> -
>> - e = find_edge (bb, bb2);
>> -
>> - copy = duplicate_block (bb2, e, bb);
>> - flush_pending_stmts (e);
>> -
>> - add_phi_args_after_copy (©, 1, NULL);
>> + basic_block copy = transform_duplicate (bb, bb2);
>>
>> /* Reconsider the original copy of block we've duplicated.
>> Removing the most common predecessor may make it to be
>> diff --git a/gcc/tracer.h b/gcc/tracer.h
>> new file mode 100644
>> index 0000000..cd1792a
>> --- /dev/null
>> +++ b/gcc/tracer.h
>> @@ -0,0 +1,26 @@
>> +/* Header file for Tracer.
>> + Copyright (C) 2015 Free Software Foundation, Inc.
>> +
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it and/or modify it under
>> +the terms of the GNU General Public License as published by the Free
>> +Software Foundation; either version 3, or (at your option) any later
>> +version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
>> + for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with GCC; see the file COPYING3. If not see
>> +<http://www.gnu.org/licenses/>. */
>> +
>> +#ifndef GCC_TRACER_H
>> +#define GCC_TRACER_H
>> +
>> +extern basic_block transform_duplicate (basic_block bb, basic_block bb2);
>> +extern bool ignore_bb_p (const_basic_block bb);
>> +
>> +#endif /* GCC_TRACER_H */
>> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
>> index 49e22a9..da67761 100644
>> --- a/gcc/tree-pass.h
>> +++ b/gcc/tree-pass.h
>> @@ -390,6 +390,7 @@ extern gimple_opt_pass *make_pass_tree_loop_done
>> (gcc::context *ctxt);
>> extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt);
>> extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt);
>> extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
>> +extern gimple_opt_pass *make_pass_split_paths (gcc::context *ctxt);
>> extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context *ctxt);
>> extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);
>> extern gimple_opt_pass *make_pass_build_alias (gcc::context *ctxt);
>>