This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] Catch 11.8% more jump threading opportunities at tree level.


On Tue, 2005-02-08 at 16:46 -0500, Kazu Hirata wrote:
> Hi,
> 
> Attached is an experimental patch to catch 11.8% more jump threading
> opportunities at tree level.
> 
> Consider:
> 
> <L0>:;
>   if (b_4 != 2) goto <L1>; else goto <L2>;
> 
> <L1>:;
>   D.1120_7 = 10;
>   goto <bb 6> (<L6>);
> 
> <L2>:;
>   if (b_4 == 1) goto <L4>; else goto <L5>;
> 
> Note that the edge from <L0> to <L2> can be threaded through <L2> to <L5>.
> However, currently, DOM does not record "b_4 == 2" when threading edge
> from <L0> to <L2> due to what seems to be an artificial restriction in
> dom_opt_finalize_block and fails to thread the edge.  The patch below
> removes this restriction and allows DOM to take the above jump
> threading opportunity.
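
The branch shape in the dump above can be illustrated at the C source level. This is only a minimal sketch and is not taken from the patch: the function names `select_unthreaded` and `select_threaded` and the return values 20 and 30 are hypothetical, chosen so that the first function mirrors the dump. On the path where `b != 2` is false, `b` is known to be 2, so the test `b == 1` cannot succeed and that path can go straight to the <L5> arm.

```c
/* Same branch shape as the dump: the inner test runs only when b == 2. */
int select_unthreaded(int b)
{
    if (b != 2)
        return 10;      /* <L1>: D.1120_7 = 10 */
    if (b == 1)         /* <L2>: always false here, because b == 2 */
        return 20;      /* <L4>: hypothetical value */
    return 30;          /* <L5>: hypothetical value */
}

/* Hand-threaded version: the edge <L0> -> <L2> goes directly to <L5>. */
int select_threaded(int b)
{
    if (b != 2)
        return 10;
    return 30;
}
```

Both functions return the same value for every input; the second simply never evaluates the redundant comparison.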
> 
> The story does not end here.  Once we remove the restriction, we know
> "too much" about variables and end up missing some jump threading
> opportunities.  Consider:
> 
> <bb 0>
>   D.1120_2 = *p_1;
>   if (D.1120_2 != 0) goto <L0>; else goto <L1>;
> 
> <L0>:;
>   bar ();
> 
> <L1>:;
>   D.1120_3 = *p_1;
>   if (D.1120_3 != 0) goto <L2>; else goto <L3>;
> 
> When we consider threading the edge from <bb 0> to <L1>, we know that
> D.1120_2 == *p_1 and that D.1120_2 == 0.  As soon as
> thread_across_edge sees "D.1120_3 = *p_1;", it is simplified to
> "D.1120_3 = 0;".  The rhs is no longer an SSA_NAME.  Since
> thread_across_edge requires that every statement leading up to
> COND_EXPR or SWITCH_EXPR must be a nop like a copy between two
> SSA_NAMEs with the same underlying variable, we drop this jump
> threading opportunity.
> 
> To work around this problem, I created a variant of lookup_avail_expr
> without SSA_NAME_VALUE chasing.  This new function allows
> thread_across_edge to know that "D.1120_2 == D.1120_3;", making it
> possible to take the jump threading opportunity.
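
The second example can likewise be sketched in C. This is an illustration under stated assumptions, not code from the patch: `bar_calls`, `check_unthreaded`, `check_threaded`, and the return values 1 and 0 standing in for the <L2> and <L3> arms are all hypothetical. On the edge that skips `bar ()`, nothing can have modified `*p` between the two loads, so the second load equals the first and the second test is already decided.

```c
/* Counts calls to bar() so both versions can be compared (hypothetical). */
int bar_calls = 0;

void bar(void)
{
    bar_calls++;
}

/* Same shape as the dump: load, test, optional call, reload, test again. */
int check_unthreaded(const int *p)
{
    int first = *p;                 /* D.1120_2 = *p_1 */
    if (first != 0)
        bar();                      /* <L0> */
    int second = *p;                /* <L1>: D.1120_3 = *p_1 */
    return second != 0 ? 1 : 0;     /* <L2> gives 1, <L3> gives 0 */
}

/* Hand-threaded version: the edge <bb 0> -> <L1> goes directly to <L3>. */
int check_threaded(const int *p)
{
    int first = *p;
    if (first == 0)
        return 0;                   /* second load would equal first, i.e. 0 */
    bar();
    return *p != 0 ? 1 : 0;         /* after bar(), *p must be reloaded */
}
```

Only the path that bypasses the call is threaded; the path through `bar ()` still reloads `*p`, since the call may have changed it.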
> 
> This small hack seems to bring a substantial improvement.
> 
> Here are the numbers of jump threading opportunities picked up at
> various points while compiling cc1-i files.
> 
>       original patched    diff%
> -------------------------------
> tree     19431   21729 +11.826% <- wow!
> RTL       2962    2224 -24.915%
> -------------------------------
> total    22393   23953  +6.966%
> 
> Jeff seems to have his own plan to improve the jump threading
> selection code, but I just wanted to let him and everybody else know
> that there are a lot of low-hanging jump threading opportunities
> waiting there regardless of whether or not we want to go with my
> approach.  Note that we take a lot of work away from the RTL-level
> jump bypass code, which is a good thing.  I never feel comfortable
> when RTL optimizers are deciphering conditional jumps that are
> expressed in so many different ways from architecture to architecture.
> 
> Bootstrapped on i686-pc-linux-gnu.  Any comments?
> 
> p.s.
> Let's not worry about lack of comment, code duplication, etc, for
> now. :-)
> 
> Kazu Hirata
> 
> 2005-02-08  Kazu Hirata  <kazu@cs.umass.edu>
> 
> 	PR tree-optimization/19516, PR tree-optimization/19804
> 	* tree-ssa-dom.c (thread_across_edge): If cached_lhs is
> 	useless, call lookup_avail_expr_1 to see if we can detect a
> 	nop.
> 	(lookup_avail_expr_1): New.

This is not safe in the presence of loops in the CFG.  Please do not
install.

I have changes queued for 4.1 which revamp the thread selection code
and which are far more effective than what you have posted.

jeff


