This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH V4] Loop split upon semi-invariant condition (PR tree-optimization/89134)


Hi, Richard

   This is a new patch to support more generalized semi-invariant condition, which uses
control dependence analysis.

Thanks,
Feng

________________________________________
From: Feng Xue OS <fxue@os.amperecomputing.com>
Sent: Friday, October 25, 2019 11:43 AM
To: Richard Biener
Cc: Michael Matz; Philipp Tomsich; gcc-patches@gcc.gnu.org; Christoph Müllner; erick.ochoa@theobroma-systems.com
Subject: Re: [PATCH V3] Loop split upon semi-invariant condition (PR tree-optimization/89134)

Richard,

    Thanks for your comments.

>+      /* For PHI node that is not in loop header, its source operands should
>+        be defined inside the loop, which are seen as loop variant.  */
>+      if (def_bb != loop->header || !skip_head)
>+       return false;

> so if we have
>
> for (;;)
>  {
>     if (x)
>       a = ..;
>     else
>       a = ...;
>     if (cond-to-split-on dependent on a)
> ...
>  }
>
> the above is too restrictive in case 'x' is semi-invariant as well, correct?
In above case, cond-on-a will not be identified as semi-invariant, in that
a is defined by PHI with real multi-sources. To handle it,  besides each
source value, we should add extra check on each source's control
dependence node (x in the case), which might have not a little code expansion.
Anyway, I'll have a try.


>+         /* A new value comes from outside of loop.  */
>+         if (!bb || !flow_bb_inside_loop_p (loop, bb))
>+           return false;

> but that means starting from the second iteration the value is invariant.
No. Traversal direction is reverse to loop execution. In the following,
start from "x_1 = ", extract latch value x_3, and get x_3 definition, and
finally reach "x_1 =".

Loop:
      x_1 = PHI (x_0, x_3)
      ...
      x_3 =
      ...
      goto Loop;


>+                 /* Don't consider redefinitions in excluded basic blocks.  */
>+                 if (!dominated_by_p (CDI_DOMINATORS, e->src, skip_head))
>+                   {
>+                     /* There are more than one source operands that can
>+                        provide value to the SSA name, it is variant.  */
>+                     if (from)
>+                       return false;
>
> they might be the same though, for PHIs with > 2 arguments.
OK. Will add value equivalence check.


> In the cycle handling you are not recursing via stmt_semi_invariant_p
> but only handle SSA name copies - any particular reason for that?
The cycle handling is specified for ssa that crosses iteration. It is
semi-invariant if it remains unchanged after certain iteration, which
means its value in previous iteration (coming from latch edge) is just
a copy of its self,  nothing else. So, recursion via stmt_semi_invariant_p
is unnecessary.

Loop:
      x_1 = PHI (x_0, x_3);
      x_2 = PHI(x_1, value defined in excluded branch);
      x_3 = x_2;
      goto Loop;


>+static bool
>+branch_removable_p (basic_block branch_bb)
>+{
>+  if (single_pred_p (branch_bb))
>+    return true;
>
> I'm not sure what this function tests - at least the single_pred_p check
> looks odd to me given the dominator checks later.  The single predecessor
> could simply be a forwarder.  I wonder if you are looking for branches forming
> an irreducible loop?  I think you can then check EDGE_IRREDUCIBLE_LOOP
> or BB_IRREDUCIBLE_LOOP on the condition block (btw, I don't see
> testcases covering the appearant special-cases in the patch - refering to
> existing ones via a comment often helps understanding the code).

Upon condition evaluation, if a branch is not selected,
This function test a branch is reachable from other place other than its
conditional statement. This ensure that when the branch is not selected
upon condition evaluation, trace path led by the branch will never
be executed so that it can be excluded  during semi-invariantness analysis.

If single_pred_p, only condition statement can reach the branch.

If not, consider a half diamond condition control graph, with a back-edge to
true branch.

            condition
               |  \
               |   \
               |  false branch
   .--->----.  |   /
   |        |  |  /
 other    true branch
   |        |
   '---<----'

If there is an edge from false branch, true branch can not be excluded even it
is not selected.  And back edge from "other" (dominated by true branch) does
not have any impact.


>+
>+  return EDGE_SUCC (cond_bb, (unsigned) invar[1]);
>+}
>
> magic ensures that invar[1] is always the invariant edge?  Oh, it's a bool.
> Ick.  I wonder if logic with int invariant_edge = -1; and the loop setting
> it to either 0 or 1 would be easier to follow...
OK.


> Note your stmt_semi_invariant_p check is exponential for a condition
> like
>
>   _1 = 1;
>   _2 = _1 + _1;
>   _3 = _2 + _2;
>   if (_3 != param_4(D))
>
> because you don't track ops you already proved semi-invariant.  We've
> run into such situation repeatedly in SCEV analysis so I doubt it can be
> disregarded as irrelevant in practice.  A worklist approach could then
> also get rid of the recursion.  You are already computing the stmts
> forming the condition in compute_added_num_insns so another option
> is to re-use that.
OK.


> Btw, I wonder if we can simply re-use PARAM_MAX_PEELED_INSNS
> instead of adding yet another param (it also happens to have the same
> size).  Because we are "peeling" the loop.
I'll check that.

>+  edge invar_branch = get_cond_invariant_branch (loop, cond);
>+
>+  if (!invar_branch)
>+    return NULL;
>
> extra vertical space is unwanted in such cases.
OK.

>+  if (dump_file && (dump_flags & TDF_DETAILS))
>+   {
>+     fprintf (dump_file, "In %s(), split loop %d at branch<%s>, BB %d\n",
>+             current_function_name (), loop1->num,
>+             true_invar ? "T" : "F", cond_bb->index);
>+     print_gimple_stmt (dump_file, cond, 0, TDF_SLIM | TDF_VOPS);
>+   }
>
> can you please use sth like
>
>  if (dump_enabled_p ())
>    dump_printf_loc (MSG_OPTIMIZED_LOCATIONS,
>                             cond, "loop split on semi-invariant condition");
>
> so -fopt-info-loop will show it?
OK.


>+  /* Generate a bool type temporary to hold result of the condition.  */
>+  tree tmp = make_ssa_name (boolean_type_node);
>+  gimple_stmt_iterator gsi = gsi_last_bb (cond_bb);
>+  gimple *stmt = gimple_build_assign (tmp,
>+                                     gimple_cond_code (cond),
>+                                     gimple_cond_lhs (cond),
>+                                     gimple_cond_rhs (cond));
>
> shorter is
>
>   gimple_seq stmts = NULL;
>   tree tmp = gimple_build (&stmts, gimple_cond_code (cond),
>                                      boolean_type_node,
>  gimple_cond_lhs (cond), gimple_cond_rhs (cond));
>   gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
OK.


>+  gsi_insert_before (&gsi, stmt, GSI_NEW_STMT);
>+  gimple_cond_set_condition (cond, EQ_EXPR, tmp, boolean_true_node);
>
> but I wonder what's the point here to move the condition computation to
> a temporary?  Why not just build the original condition again for break_cond?
OK.


> in split_loop_on_cond you'll find the first semi-invariant condition
> to split on,
> but we'll not visit the split loop again (also for original splitting I guess).
> Don't we eventually want to recurse on that?
Currently, we only do a round of loop-split. It is a TODO to enable more than
one loop-splits on a loop.

Attachment: 0001-Loop-split-on-semi-invariant-conditional-statement.patch
Description: 0001-Loop-split-on-semi-invariant-conditional-statement.patch


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]