This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] A jump threading opportunity for condition branch
Richard Biener <rguenther@suse.de> writes:
> On Thu, 23 May 2019, Jiufu Guo wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>>
>> > On Thu, 23 May 2019, Jiufu Guo wrote:
>> >
>> >> Hi,
>> >>
>> >> Richard Biener <rguenther@suse.de> writes:
>> >>
>> >> > On Tue, 21 May 2019, Jiufu Guo wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> This patch implements a new opportunity of jump threading for PR77820.
>> >> >> In this optimization, conditional jumps are merged with unconditional jump.
>> >> >> And then moving CMP result to GPR is eliminated.
>> >> >>
>> >> >> It looks like below:
>> >> >>
>> >> >> <P0>
>> >> >> p0 = a CMP b
>> >> >> goto <X>;
>> >> >>
>> >> >> <P1>
>> >> >> p1 = c CMP d
>> >> >> goto <X>;
>> >> >>
>> >> >> <X>
>> >> >> # phi = PHI <p0 (P0), p1 (P1)>
>> >> >> if (phi != 0) goto <Y>; else goto <Z>;
>> >> >>
>> >> >> Could be transformed to:
>> >> >>
>> >> >> <P0>
>> >> >> p0 = a CMP b
>> >> >> if (p0 != 0) goto <Y>; else goto <Z>;
>> >> >>
>> >> >> <P1>
>> >> >> p1 = c CMP d
>> >> >> if (p1 != 0) goto <Y>; else goto <Z>;
>> >> >>
>> >> >>
>> >> >> This optimization eliminates:
>> >> >> 1. saving CMP result: p0 = a CMP b.
>> >> >> 2. additional CMP on branch: if (phi != 0).
>> >> >> 3. converting CMP result if there is phi = (INT_CONV) p0 if there is.
>> >> >>
>> >> >> Bootstrapped and tested on powerpc64le with no regressions(one case is improved)
>> >> >> and new testcases are added. Is this ok for trunk?
>> >> >>
>> >> >> Thanks!
>> >> >> Jiufu Guo
>> >> >>
>> >> ...
>> >> >> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>> >> >> index c3ea2d6..23000f6 100644
>> >> >> --- a/gcc/tree-ssa-threadedge.c
>> >> >> +++ b/gcc/tree-ssa-threadedge.c
>> >> >> @@ -1157,6 +1157,90 @@ thread_through_normal_block (edge e,
>> >> >> return 0;
>> >> >> }
>> >> >>
>> >> >> +/* Return true if PHI's INDEX-th incoming value is a CMP, and the CMP is
>> >> >> + defined in the incoming basic block. Otherwise return false. */
>> >> >> +static bool
>> >> >> +cmp_from_unconditional_block (gphi *phi, int index)
>> >> >> +{
>> >> >> + tree value = gimple_phi_arg_def (phi, index);
>> >> >> + if (!(TREE_CODE (value) == SSA_NAME && has_single_use (value)))
>> >> >> + return false;
>> >> >
>> >> > Not sure why we should reject a constant here but I guess we
>> >> > expect it to find a simplified condition anyways ;)
>> >> >
>> >> Const could be accepted here, like "# t_9 = PHI <5(3), t_17(4)>". I
>> >> found this case is already handled by other jump-threading code, like
>> >> 'ethread' pass.
>> >>
>> >> >> +
>> >> >> + gimple *def = SSA_NAME_DEF_STMT (value);
>> >> >> +
>> >> >> + if (!is_gimple_assign (def))
>> >> >> + return false;
>> >> >> +
>> >> >> + if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)))
>> >> >> + {
>> >> >> + value = gimple_assign_rhs1 (def);
>> >> >> + if (!(TREE_CODE (value) == SSA_NAME && has_single_use (value)))
>> >> >> + return false;
>> >> >> +
>> >> >> + def = SSA_NAME_DEF_STMT (value);
>> >> >> +
>> >> >> + if (!is_gimple_assign (def))
>> >> >> + return false;
>> >> >
>> >> > too much vertial space.
>> >> >
>> >> Thanks, I will refine it.
>> >> >> + }
>> >> >> +
>> >> >> + if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison)
>> >> >> + return false;
>> >> >> +
>> >> >> + /* Check if phi's incoming value is defined in the incoming basic_block. */
>> >> >> + edge e = gimple_phi_arg_edge (phi, index);
>> >> >> + if (def->bb != e->src)
>> >> >> + return false;
>> >> >
>> >> > why does this matter?
>> >> >
>> >> Through preparing pathes and duplicating block, this transform can also
>> >> help to combine a cmp in previous block and a gcond in current block.
>> >> "if (def->bb != e->src)" make sure the cmp is define in the incoming
>> >> block of the current; and then combining "cmp with gcond" is safe. If
>> >> the cmp is defined far from the incoming block, it would be hard to
>> >> achieve the combining, and the transform may not needed.
>> >
>> > We're in SSA form so the "combining" doesn't really care where the
>> > definition comes from.
>> >
>> >> >> +
>> >> >> + if (!single_succ_p (def->bb))
>> >> >> + return false;
>> >> >
>> >> > Or this? The actual threading will ensure this will hold true.
>> >> >
>> >> Yes, other thread code check this and ensure it to be true, like
>> >> function thread_through_normal_block. Since this new function is invoked
>> >> outside thread_through_normal_block, so, checking single_succ_p is also
>> >> needed for this case.
>> >
>> > I mean threading will isolate the path making this trivially true.
>> > It's also no requirement for combining, in fact due to the single-use
>> > check the definition can be sinked across the edge already (if
>> > the edges dest didn't have multiple predecessors which this threading
>> > will fix as well).
>> >
>> I would relax these check and have a test.
>>
>> And I refactor the code a little as below. Thanks for any comments!
>>
>> bool
>> edge_forwards_cmp_to_conditional_jump_through_empty_bb_p (edge e)
>> {
>> basic_block bb = e->dest;
>>
>> /* See if there is only one stmt which is gcond. */
>> gimple *gs = last_and_only_stmt (bb);
>> if (gs == NULL || gimple_code (gs) != GIMPLE_COND)
>> return false;
>>
>> /* See if gcond's condition is "(phi !=/== 0/1)". */
>> tree cond = gimple_cond_lhs (gs);
>> if (TREE_CODE (cond) != SSA_NAME
>> || gimple_code (SSA_NAME_DEF_STMT (cond)) != GIMPLE_PHI
>> || gimple_bb (SSA_NAME_DEF_STMT (cond)) != bb)
>> return false;
>> enum tree_code code = gimple_cond_code (gs);
>> tree rhs = gimple_cond_rhs (gs);
>> if (!(code == NE_EXPR || code == EQ_EXPR || integer_onep (rhs)
>> || integer_zerop (rhs)))
>
> GCCs coding standard says that if a condition doesn't fit on
> a single line you should split after each || or &&
Get it.
>
>> return false;
>>
>> gphi *phi = as_a<gphi *> (SSA_NAME_DEF_STMT (cond));
>
> If you had used dyn_cast <gphi *> () above the GIMPLE_PHI
> check would have been for phi != NULL and you'd save a line
> of code.
>
>> edge_iterator ei;
>> edge in_e;
>> FOR_EACH_EDGE (in_e, ei, bb->preds)
>> {
>
> As said in my first review I'd just check whether for the
> edge we want to thread through the definition comes from a CMP.
> Suppose you have
>
> # val_1 = PHI <a_2, b_3, c_4>
> if (val_1 != 0)
>
> and only one edge has a b_3 = d_5 != 0 condition it's still
> worth tail-duplicating the if block.
Right.
>
> otherwise it looks ok to me.
>
> Thanks,
> Richard.
>
I would update accordingly and have tests. If pass, would I send refined
patch and ask for approval to deliver code.
Thanks!
Jiufu Guo.
>> /* Check if phi's incoming value is CMP */
>> gimple *def;
>> tree value = PHI_ARG_DEF_FROM_EDGE (phi, in_e);
>> if (TREE_CODE (value) == SSA_NAME && has_single_use (value)
>> && is_gimple_assign (SSA_NAME_DEF_STMT (value)))
>> def = SSA_NAME_DEF_STMT (value);
>> else
>> return false;
>>
>> /* Or if it is (INTCONV) (a CMP b). */
>> if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)))
>> {
>> value = gimple_assign_rhs1 (def);
>> if (TREE_CODE (value) == SSA_NAME && has_single_use (value)
>> && is_gimple_assign (SSA_NAME_DEF_STMT (value)))
>> def = SSA_NAME_DEF_STMT (value);
>> else
>> return false;
>> }
>> if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison)
>> return false;
>> }
>>
>> return true;
>> }
>>
>> Thanks,
>> Jiufu Guo
>> >> >> + return true;
>> >> >> +}
>> >> >> +
>> >> >> +/* There are basic blocks look like:
>> >> >> + <P0>
>> >> >> + p0 = a CMP b ; or p0 = (INT)( a CMP b)
>> >> >> + goto <X>;
>> >> >> +
>> >> >> + <P1>
>> >> >> + p1 = c CMP d
>> >> >> + goto <X>;
>> >> >> +
>> >> >> + <X>
>> >> >> + # phi = PHI <p0 (P0), p1 (P1)>
>> >> >> + if (phi != 0) goto <Y>; else goto <Z>;
>> >> >> +
>> >> >> + Then, <X>: a trivial join block.
>> >> >> +
>> >> >> + Check if BB is <X> in like above. */
>> >> >> +
>> >> >> +bool
>> >> >> +is_trivial_join_block (basic_block bb)
>> >> >
>> >> > I'd make this work on a specific edge.
>> >> >
>> >> > edge_forwards_conditional_to_conditional_jump_through_empty_bb_p (edge e)
>> >> > {
>> >> > basic_block b = e->dest;
>> >> >
>> >> > maybe too elaborate name ;)
>> >> >
>> >> Thanks for help to name the function! It is very valuable for me ;)
>> >> >> +{
>> >> >> + gimple *gs = last_and_only_stmt (bb);
>> >> >> + if (gs == NULL)
>> >> >> + return false;
>> >> >> +
>> >> >> + if (gimple_code (gs) != GIMPLE_COND)
>> >> >> + return false;
>> >> >> +
>> >> >> + tree cond = gimple_cond_lhs (gs);
>> >> >> +
>> >> >> + if (TREE_CODE (cond) != SSA_NAME)
>> >> >> + return false;
>> >> >
>> >> > space after if( too much vertical space in this function
>> >> > for my taste btw.
>> >> Will update this.
>> >> >
>> >> > For the forwarding to work we want a NE_EXPR or EQ_EXPR
>> >> > as gimple_cond_code and integer_one_p or integer_zero_p
>> >> > gimple_cond_rhs.
>> >> Right, checking those would be more safe. Since no issue found, during
>> >> bootstrap and regression tests, so I did not add these checking. I will
>> >> add this checking.
>> >> >
>> >> >> +
>> >> >> + if (gimple_code (SSA_NAME_DEF_STMT (cond)) != GIMPLE_PHI)
>> >> >> + return false;
>> >> >> +
>> >> >> + gphi *phi = as_a<gphi *> (SSA_NAME_DEF_STMT (cond));
>> >> >
>> >> > I think to match your pattern you want to check that
>> >> > gimple_bb (phi) == bb as well here.
>> >> Right, it should be checked. I will update.
>> >> >
>> >> >> + for (unsigned int i = 0; i < phi->nargs; i++)
>> >> >> + if (!cmp_from_unconditional_block (phi, i))
>> >> >
>> >> > Just process the incoming edge argument and inline the
>> >> > helper. You can use PHI_ARG_DEF_FROM_EDGE here.
>> >> I will refine code, and try to use it.
>> >> >
>> >> > Thanks for integrating this into jump-threading - it does look
>> >> > like a good fit.
>> >> >
>> >> > How often does this trigger during bootstrap?
>> >> Thanks for your sugguestion, this could help to evaluate patch. During
>> >> bootstrap(stage 2 or 3), in gcc source code, 1300-1500 basic blocks are
>> >> fullfile this tranform.
>> >
>> > Thanks,
>> > Richard.
>>
>>