[Patch] Improving jump-thread pass for PR 54742

Sebastian Pop sebpop@gmail.com
Thu Dec 4 08:38:00 GMT 2014


Jeff Law wrote:
> >+@item max-fsm-thread-path-insns
> >+Maximum number of instructions to copy when duplicating blocks on a
> >+finite state automaton jump thread path.  The default is 100.
> >+
> >+@item max-fsm-thread-length
> >+Maximum number of basic blocks on a finite state automaton jump thread
> >+path.  The default is 10.
> >+
> >+@item max-fsm-thread-paths
> >+Maximum number of new jump thread paths to create for a finite state
> >+automaton.  The default is 50.
> Has there been any tuning on these defaults.  I don't have any
> strong opinions about what they ought to be, this is more to get any
> such information recorded on the lists for historical purposes.

I have not tuned any of these defaults other than making sure that coremark is
still jump-threaded.  gcc.dg/tree-ssa/ssa-dom-thread-7.c is a test-case that
will check that we always optimize coremark.

> I think it's worth a note in the debug dump anytime you abort
> threading when you hit a limit.

Done.

> I'm a bit worried about compile-time impacts of the all the
> recursion, but I'm willing to wait and see if it turns out to be a
> problem in practice.

Done, as Richi suggested, checking the flag_expensive_optimizations.

> >@@ -689,6 +692,12 @@ simplify_control_stmt_condition (edge e,
> >  	 pass specific callback to try and simplify it further.  */
> >        if (cached_lhs && ! is_gimple_min_invariant (cached_lhs))
> >          cached_lhs = (*simplify) (stmt, stmt);
> >+
> >+      /* We couldn't find an invariant.  But, callers of this
> >+	 function may be able to do something useful with the
> >+	 unmodified destination.  */
> >+      if (!cached_lhs)
> >+	cached_lhs = original_lhs;
> >      }
> >    else
> >      cached_lhs = NULL;
> Can't you just use COND rather than stuffing its value away into
> ORIGINAL_LHS?    CACHED_LHS may be better in some cases if it's an
> SSA_NAME (and it should be), but I doubt it matters in practice.
> 
> Or is it the case that you have to have the original condition --
> without any context sensitive equivalences used to "simplify" the
> condition.

I think we need to start the search for FSM jump-threads with the original non
simplified condition.

> >+/* Return true if there is at least one path from START_BB to END_BB.
> >+   VISITED_BBS is used to make sure we don't fall into an infinite loop.  */
> >+
> >+static bool
> >+fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
> >+		      vec<basic_block, va_gc> *&path,
> >+		      hash_set<basic_block> *visited_bbs, int n_insns)
> Update comment to indicate how PATH is used to return a path from
> START_BB to END_BB.

Done.

> >+		  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> >+		       gsi_next (&gsi))
> >+		    ++n_insns;
> Probably don't want to count labels and GIMPLE_NOPs.  Probably do
> want to count non-virtual PHIs since those may end up as a copies or
> constant initializations.

Done.

> 
> >+		      if (j == 0)
> >+			kind = EDGE_START_FSM_THREAD;
> >+		      else if (single_pred_p (e->src))
> >+			kind = EDGE_NO_COPY_SRC_BLOCK;
> >+		      else {
> >+			kind = EDGE_COPY_SRC_JOINER_BLOCK;
> >+			++joiners;
> >+		      }
> Presumably the mis-formatting was added when you tracked the #
> joiners.  AFAICT that is a write-only variable and ought to be
> removed.  Along with the braces on the final ELSE which should
> restore proper formatting.

Done.

> 
> 
> >@@ -2343,6 +2493,55 @@ thread_through_all_blocks (bool may_peel_loop_headers)
> >    threaded_blocks = BITMAP_ALLOC (NULL);
> >    memset (&thread_stats, 0, sizeof (thread_stats));
> >
> >+  for (i = 0; i < paths.length ();)
> Comment before this loop.  I can see what you're doing, but I'm
> already very familiar with this code.  Basically what are you
> looking for in this loop and what do you do?

Done.

> Overall I think this is very very close and I really like the
> overall direction.  There's a few minor issues noted above and with
> those addressed, I think we should be ready to go.

Thanks for your careful review.  Please let me know if there still are things I
can improve in the attached patch.
The path passes bootstrap on x86_64-linux and powerpc64-linux, and regtest except 
a fail I have not seen in the past:

FAIL: gcc.c-torture/compile/pr27571.c   -Os  (internal compiler error)

I am still investigating why this fails: as far as I can see for now this is
because in copying the FSM path we create an internal loop that is then
discovered by the loop verifier as a natural loop and is not yet in the existing
loop sturctures.  I will try to fix this in duplicate_seme by invalidating the
loop structure after we code generated all the FSM paths.  I will submit an
updated patch when it passes regtest.

> Removing most of tree-ssa-threadupdate.c and using SEME duplication
> would be a huge step forward for making this code more
> understandable. I look forward to any work you do in this space in
> the future.

I will clean up the patch and I will submit it for review (for stage 1.)

Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-extend-jump-thread-for-finite-state-automata-PR-5474.patch
Type: text/x-diff
Size: 25214 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20141204/8c904372/attachment.bin>


More information about the Gcc-patches mailing list