This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Catch 11.8% more jump threading opportunities at treelevel.
- From: Jeffrey A Law <law at redhat dot com>
- To: Kazu Hirata <kazu at cs dot umass dot edu>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 08 Feb 2005 15:34:15 -0700
- Subject: Re: [RFC] Catch 11.8% more jump threading opportunities at treelevel.
- Organization: Red Hat, Inc
- References: <20050208.164642.74751967.kazu@cs.umass.edu>
- Reply-to: law at redhat dot com
On Tue, 2005-02-08 at 16:46 -0500, Kazu Hirata wrote:
> Hi,
>
> Attached is an experimental patch to catch 11.8% more jump threading
> opportunities at tree level.
>
> Consider:
>
> <L0>:;
> if (b_4 != 2) goto <L1>; else goto <L2>;
>
> <L1>:;
> D.1120_7 = 10;
> goto <bb 6> (<L6>);
>
> <L2>:;
> if (b_4 == 1) goto <L4>; else goto <L5>;
>
> Note that the edge from <L0> to <L2> can be threaded through <L2> to
> <L5>. However, DOM currently does not record "b_4 == 2" when threading
> the edge from <L0> to <L2>, due to what seems to be an artificial
> restriction in dom_opt_finalize_block, and so it fails to thread the
> edge. The patch below
> removes this restriction and allows DOM to take the above jump
> threading opportunity.
>
> The story does not end here. Once we remove the restriction, we know
> "too much" about variables and end up missing some jump threading
> opportunities. Consider:
>
> <bb 0>
> D.1120_2 = *p_1;
> if (D.1120_2 != 0) goto <L0>; else goto <L1>;
>
> <L0>:;
> bar ();
>
> <L1>:;
> D.1120_3 = *p_1;
> if (D.1120_3 != 0) goto <L2>; else goto <L3>;
>
> When we consider threading the edge from <bb 0> to <L1>, we know that
> D.1120_2 == *p_1 and that D.1120_2 == 0. As soon as
> thread_across_edge sees "D.1120_3 = *p_1;", it is simplified to
> "D.1120_3 = 0;". The rhs is no longer an SSA_NAME. Since
> thread_across_edge requires every statement leading up to a
> COND_EXPR or SWITCH_EXPR to be a nop, such as a copy between two
> SSA_NAMEs with the same underlying variable, we drop this jump
> threading opportunity.
>
> To work around this problem, I created a variant of lookup_avail_expr
> without SSA_NAME_VALUE chasing. This new function allows
> thread_across_edge to know that "D.1120_2 == D.1120_3;", making it
> possible to take the jump threading opportunity.
>
> This small hack seems to bring a substantial improvement.
>
> Here are the numbers of jump threading opportunities picked up at
> various points while compiling cc1-i files.
>
>          original   patched      diff%
> ----------------------------------------
> tree        19431     21729   +11.826%  <- wow!
> RTL          2962      2224   -24.915%
> ----------------------------------------
> total       22393     23953    +6.966%
>
> Jeff seems to have his own plan to improve the jump threading
> selection code, but I just wanted to let him and everybody else know
> that there are a lot of low-hanging jump threading opportunities
> waiting there regardless of whether or not we want to go with my
> approach. Note that we take a lot of work away from the RTL-level
> jump bypass code, which is a good thing. I never feel comfortable
> when RTL optimizers are deciphering conditional jumps that are
> expressed in so many different ways from architecture to architecture.
>
> Bootstrapped on i686-pc-linux-gnu. Any comments?
>
> p.s.
> Let's not worry about lack of comment, code duplication, etc, for
> now. :-)
>
> Kazu Hirata
>
> 2005-02-08 Kazu Hirata <kazu@cs.umass.edu>
>
> PR tree-optimization/19516, PR tree-optimization/19804,
> * tree-ssa-dom.c (thread_across_edge): If cached_lhs is
> useless, call lookup_avail_expr_1 to see if we can detect a
> nop.
> (lookup_avail_expr_1): New.
This is not safe in the presence of loops in the CFG. Please do not
install.
I have changes queued for 4.1 which revamp the thread selection code
and which are far more effective than what you have posted.
jeff
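
For readers following the first quoted dump, here is a minimal sketch in C
of what threading the edge from <L0> to <L2> through to <L5> amounts to at
the source level. Everything in it is an assumption made for illustration:
the function names unthreaded, threaded and check_equivalence are
hypothetical, and the return values 20 and 30 stand in for whatever <L4>
and <L5> compute, which the dump does not show. It is not code from the
patch or from tree-ssa-dom.c, and it does not touch the loop-safety concern
raised in the reply.

```c
#include <assert.h>  /* only so callers can assert on check_equivalence */

/* Hypothetical source shape that could lower to the first quoted dump. */
int unthreaded(int b)
{
    if (b != 2)
        return 10;   /* <L1>: D.1120_7 = 10 */

    /* <L2>: reached only over the "else" edge, so b == 2 holds here. */
    if (b == 1)
        return 20;   /* <L4>: placeholder value, never reached from <L0> */
    return 30;       /* <L5>: placeholder value */
}

/* The same function after the <L0> -> <L2> edge is threaded to <L5>:
   the second test is known false on that edge, so it disappears. */
int threaded(int b)
{
    if (b != 2)
        return 10;
    return 30;
}

/* Returns 1 if both versions agree for every b in [lo, hi], else 0. */
int check_equivalence(int lo, int hi)
{
    for (int b = lo; b <= hi; b++)
        if (unthreaded(b) != threaded(b))
            return 0;
    return 1;
}
```

Calling check_equivalence(-4, 4) from a small driver is one way to confirm
that the two versions agree on a range of inputs, which is the property a
valid jump thread has to preserve.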