[Patch] Improving jump-thread pass for PR 54742
Sebastian Pop
sebpop@gmail.com
Thu Dec 4 08:38:00 GMT 2014
Jeff Law wrote:
> >+@item max-fsm-thread-path-insns
> >+Maximum number of instructions to copy when duplicating blocks on a
> >+finite state automaton jump thread path. The default is 100.
> >+
> >+@item max-fsm-thread-length
> >+Maximum number of basic blocks on a finite state automaton jump thread
> >+path. The default is 10.
> >+
> >+@item max-fsm-thread-paths
> >+Maximum number of new jump thread paths to create for a finite state
> >+automaton. The default is 50.
> Has there been any tuning on these defaults. I don't have any
> strong opinions about what they ought to be, this is more to get any
> such information recorded on the lists for historical purposes.
I have not tuned any of these defaults other than making sure that coremark is
still jump-threaded. gcc.dg/tree-ssa/ssa-dom-thread-7.c is a test-case that
will check that we always optimize coremark.
> I think it's worth a note in the debug dump anytime you abort
> threading when you hit a limit.
Done.
> I'm a bit worried about compile-time impacts of the all the
> recursion, but I'm willing to wait and see if it turns out to be a
> problem in practice.
Done, as Richi suggested, checking the flag_expensive_optimizations.
> >@@ -689,6 +692,12 @@ simplify_control_stmt_condition (edge e,
> > pass specific callback to try and simplify it further. */
> > if (cached_lhs && ! is_gimple_min_invariant (cached_lhs))
> > cached_lhs = (*simplify) (stmt, stmt);
> >+
> >+ /* We couldn't find an invariant. But, callers of this
> >+ function may be able to do something useful with the
> >+ unmodified destination. */
> >+ if (!cached_lhs)
> >+ cached_lhs = original_lhs;
> > }
> > else
> > cached_lhs = NULL;
> Can't you just use COND rather than stuffing its value away into
> ORIGINAL_LHS? CACHED_LHS may be better in some cases if it's an
> SSA_NAME (and it should be), but I doubt it matters in practice.
>
> Or is it the case that you have to have the original condition --
> without any context sensitive equivalences used to "simplify" the
> condition.
I think we need to start the search for FSM jump-threads with the original non
simplified condition.
> >+/* Return true if there is at least one path from START_BB to END_BB.
> >+ VISITED_BBS is used to make sure we don't fall into an infinite loop. */
> >+
> >+static bool
> >+fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
> >+ vec<basic_block, va_gc> *&path,
> >+ hash_set<basic_block> *visited_bbs, int n_insns)
> Update comment to indicate how PATH is used to return a path from
> START_BB to END_BB.
Done.
> >+ for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> >+ gsi_next (&gsi))
> >+ ++n_insns;
> Probably don't want to count labels and GIMPLE_NOPs. Probably do
> want to count non-virtual PHIs since those may end up as a copies or
> constant initializations.
Done.
>
> >+ if (j == 0)
> >+ kind = EDGE_START_FSM_THREAD;
> >+ else if (single_pred_p (e->src))
> >+ kind = EDGE_NO_COPY_SRC_BLOCK;
> >+ else {
> >+ kind = EDGE_COPY_SRC_JOINER_BLOCK;
> >+ ++joiners;
> >+ }
> Presumably the mis-formatting was added when you tracked the #
> joiners. AFAICT that is a write-only variable and ought to be
> removed. Along with the braces on the final ELSE which should
> restore proper formatting.
Done.
>
>
> >@@ -2343,6 +2493,55 @@ thread_through_all_blocks (bool may_peel_loop_headers)
> > threaded_blocks = BITMAP_ALLOC (NULL);
> > memset (&thread_stats, 0, sizeof (thread_stats));
> >
> >+ for (i = 0; i < paths.length ();)
> Comment before this loop. I can see what you're doing, but I'm
> already very familiar with this code. Basically what are you
> looking for in this loop and what do you do?
Done.
> Overall I think this is very very close and I really like the
> overall direction. There's a few minor issues noted above and with
> those addressed, I think we should be ready to go.
Thanks for your careful review. Please let me know if there still are things I
can improve in the attached patch.
The path passes bootstrap on x86_64-linux and powerpc64-linux, and regtest except
a fail I have not seen in the past:
FAIL: gcc.c-torture/compile/pr27571.c -Os (internal compiler error)
I am still investigating why this fails: as far as I can see for now this is
because in copying the FSM path we create an internal loop that is then
discovered by the loop verifier as a natural loop and is not yet in the existing
loop sturctures. I will try to fix this in duplicate_seme by invalidating the
loop structure after we code generated all the FSM paths. I will submit an
updated patch when it passes regtest.
> Removing most of tree-ssa-threadupdate.c and using SEME duplication
> would be a huge step forward for making this code more
> understandable. I look forward to any work you do in this space in
> the future.
I will clean up the patch and I will submit it for review (for stage 1.)
Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-extend-jump-thread-for-finite-state-automata-PR-5474.patch
Type: text/x-diff
Size: 25214 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20141204/8c904372/attachment.bin>
More information about the Gcc-patches
mailing list