Add a pass to back-propagate use information

Richard Biener richard.guenther@gmail.com
Mon Oct 19 14:35:00 GMT 2015


On Mon, Oct 19, 2015 at 2:38 PM, Richard Sandiford
<richard.sandiford@arm.com> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Thu, Oct 15, 2015 at 3:17 PM, Richard Sandiford
>> <richard.sandiford@arm.com> wrote:
>>> This patch adds a pass that collects information that is common to all
>>> uses of an SSA name X and back-propagates that information up the statements
>>> that generate X.  The general idea is to use the information to simplify
>>> instructions (rather than a pure DCE) so I've simply called it
>>> tree-ssa-backprop.c, to go with tree-ssa-forwprop.c.
>>>
>>> At the moment the only use of the pass is to remove unnecessry sign
>>> operations, so that it's effectively a global version of
>>> fold_strip_sign_ops.  I'm hoping it could be extended in future to
>>> record which bits of an integer are significant.  There are probably
>>> other potential uses too.
>>>
>>> A later patch gets rid of fold_strip_sign_ops.
>>>
>>> Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
>>> OK to install?
>>>
>>> Thanks,
>>> Richard
>>>
>>>
>>> gcc/
>>>         * doc/invoke.texi (-fdump-tree-backprop, -ftree-backprop): Document.
>>>         * Makefile.in (OBJS): Add tree-ssa-backprop.o.
>>>         * common.opt (ftree-backprop): New option.
>>>         * fold-const.h (negate_mathfn_p): Declare.
>>>         * fold-const.c (negate_mathfn_p): Make public.
>>>         * timevar.def (TV_TREE_BACKPROP): New.
>>>         * tree-passes.h (make_pass_backprop): Declare.
>>>         * passes.def (pass_backprop): Add.
>>>         * tree-ssa-backprop.c: New file.
>>>
>>> gcc/testsuite/
>>>         * gcc.dg/tree-ssa/backprop-1.c, gcc.dg/tree-ssa/backprop-2.c,
>>>         gcc.dg/tree-ssa/backprop-3.c, gcc.dg/tree-ssa/backprop-4.c,
>>>         gcc.dg/tree-ssa/backprop-5.c, gcc.dg/tree-ssa/backprop-6.c: New tests.
>>>
>>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>>> index 783e4c9..69e669d 100644
>>> --- a/gcc/Makefile.in
>>> +++ b/gcc/Makefile.in
>>> @@ -1445,6 +1445,7 @@ OBJS = \
>>>         tree-switch-conversion.o \
>>>         tree-ssa-address.o \
>>>         tree-ssa-alias.o \
>>> +       tree-ssa-backprop.o \
>>>         tree-ssa-ccp.o \
>>>         tree-ssa-coalesce.o \
>>>         tree-ssa-copy.o \
>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>> index 5060208..5aef625 100644
>>> --- a/gcc/common.opt
>>> +++ b/gcc/common.opt
>>> @@ -2364,6 +2364,10 @@ ftree-pta
>>>  Common Report Var(flag_tree_pta) Optimization
>>>  Perform function-local points-to analysis on trees.
>>>
>>> +ftree-backprop
>>> +Common Report Var(flag_tree_backprop) Init(1) Optimization
>>> +Enable backward propagation of use properties at the tree level.
>>
>> Don't add new -ftree-* "tree" doesn't add any info for our users.  I'd
>> also refer to SSA level rather than "tree" level.  Not sure if -fbackprop
>> is good, but let's go for it.
>
> OK.
>
>>> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
>>> index de45a2c..7f00e72 100644
>>> --- a/gcc/fold-const.c
>>> +++ b/gcc/fold-const.c
>>> @@ -319,7 +318,7 @@ fold_overflow_warning (const char* gmsgid, enum
>> warn_strict_overflow_code wc)
>>>  /* Return true if the built-in mathematical function specified by CODE
>>>     is odd, i.e. -f(x) == f(-x).  */
>>>
>>> -static bool
>>> +bool
>>>  negate_mathfn_p (enum built_in_function code)
>>>  {
>>>    switch (code)
>>
>> Belongs more to builtins.[ch] if exported.
>
> The long-term plan is to abstract away whether it's a built-in function
> or an internal function, in which case I hope to have a single predicate
> that handles both.  I'm not sure where the code should go after that change.
> Maybe a new file?

Hmm, we'll see.  So just leave it in fold-const.c for now.

>>> diff --git a/gcc/fold-const.h b/gcc/fold-const.h
>>> index ee74dc8..4d5b24b 100644
>>> --- a/gcc/fold-const.h
>>> +++ b/gcc/fold-const.h
>>> @@ -173,6 +173,7 @@ extern tree sign_bit_p (tree, const_tree);
>>>  extern tree exact_inverse (tree, tree);
>>>  extern tree const_unop (enum tree_code, tree, tree);
>>>  extern tree const_binop (enum tree_code, tree, tree, tree);
>>> +extern bool negate_mathfn_p (enum built_in_function);
>>>
>>>  /* Return OFF converted to a pointer offset type suitable as offset for
>>>     POINTER_PLUS_EXPR.  Use location LOC for this conversion.  */
>>> diff --git a/gcc/passes.def b/gcc/passes.def
>>> index dc3f44c..36d2b3b 100644
>>> --- a/gcc/passes.def
>>> +++ b/gcc/passes.def
>>> @@ -159,6 +159,7 @@ along with GCC; see the file COPYING3.  If not see
>>>        /* After CCP we rewrite no longer addressed locals into SSA
>>>          form if possible.  */
>>>        NEXT_PASS (pass_complete_unrolli);
>>> +      NEXT_PASS (pass_backprop);
>>>        NEXT_PASS (pass_phiprop);
>>>        NEXT_PASS (pass_forwprop);
>>
>> Any reason to not put this later?  I was thinking before reassoc.
>
> I think we're relying on FRE to notice the redundancy in the
> builtins-*.c tests, once this pass has converted the version
> with redundant sign ops to make it look like the version without.
> reassoc is likely to be too late.

There is PRE after reassoc (run as FRE at -O1).

> I also thought it should go before rather than after some instance
> of forwprop because the pass might expose more forward folding
> opportunities.  E.g. if the sign of A = -B * B doesn't matter,
> we'll end up with A = B * B, which might be foldable with uses of A.
> It seems less likely that forwprop would expose more backprop
> opportunities.

Indeed.  I was asking because backprop runs after inlining but
with nearly no effective scalar cleanup after it to cleanup after
inlining.

In principle it only depends on some kind of DCE (DSE?) to avoid
false uses, right?

It's probably ok where you put it, I just wanted to get an idea of your
reasoning.

>
>>> diff --git a/gcc/tree-ssa-backprop.c b/gcc/tree-ssa-backprop.c
>>> new file mode 100644
>>> index 0000000..bf2dcd9
>>> --- /dev/null
>>> +++ b/gcc/tree-ssa-backprop.c
>>
>> gimple-ssa-backprop.c
>
> OK.
>
>>> +/* Return usage information for general operand OP, or null if none.  */
>>> +
>>> +const usage_info *
>>> +backprop::lookup_operand (tree op)
>>> +{
>>> +  if (op && TREE_CODE (op) == SSA_NAME)
>>> +    if (usage_info **slot = m_info_map.get (op))
>>> +      return *slot;
>>
>> Is this usually fully populated at the end?  If so a direct mapping using
>> SSA_NAME_VERSION would be cheaper.
>
> No, in practice it should be very sparse.
>
>>> +/* Make INFO describe all uses of RHS in ASSIGN.  */
>>> +
>>> +void
>>> +backprop::process_assign_use (gassign *assign, tree rhs, usage_info *info)
>>> +{
>>> +  tree lhs = gimple_assign_lhs (assign);
>>> +  switch (gimple_assign_rhs_code (assign))
>>> +    {
>>> +    case ABS_EXPR:
>>> +      /* The sign of the input doesn't matter.  */
>>> +      info->flags.ignore_sign = true;
>>> +      break;
>>> +
>>> +    case COND_EXPR:
>>> +      /* For A = B ? C : D, propagate information about all uses of A
>>> +        to B and C.  */
>>> +      if (rhs != gimple_assign_rhs1 (assign))
>>> +       if (const usage_info *lhs_info = lookup_operand (lhs))
>>
>> Use && instead of nested if
>
> That means introducing an extra level of braces just for something
> that that isn't needed by the first statement, i.e.:
>
>     {
>       const usage_info *lhs_info;
>       if (rhs != gimple_assign_rhs1 (assign)
>           && (lhs_info = lookup_operand (lhs)))
>         *info = *lhs_info;
>       break;
>     }
>
> There also used to be a strong preference for not embedding assignments
> in && and || conditions.
>
> If there had been some other set-up for the lookup_operand call, we
> would have had:
>
>       if (rhs != gimple_assign_rhs1 (assign))
>         {
>           ...
>           if (const usage_info *lhs_info = lookup_operand (lhs))
>             ..
>         }
>
> and presumably that would have been OK.  So if the original really isn't,
> acceptable, I'd rather write it as:
>
>       if (rhs != gimple_assign_rhs1 (assign))
>         {
>           const usage_info *lhs_info = lookup_operand (lhs);
>           if (lhs_info)
>             ..
>         }
>
> I thought we'd embraced the idea of declarations in if conditions though.

I missed the init in if ... yeah, there doesn't seem to be a good way
though I prefer {} around the inner if at least which means the last
form would be my reference.

>>> +/* Process all uses of VAR and record or update the result in
>>> +   M_INFO_MAP and M_VARS.  */
>>> +
>>> +void
>>> +backprop::process_var (tree var)
>>> +{
>>> +  if (has_zero_uses (var))
>>> +    return;
>>> +
>>> +  usage_info info;
>>> +  intersect_uses (var, &info);
>>
>> It feels somewhat backward that we do a backward walk over all stmts
>> (thus processing all use-stmts before defs) but then "forward" process
>> uses of defs.  Wouldn't it be easier(?) to "intersect" incrementally
>> by processing only uses for all stmts we walk?  Or is this about
>> avoiding to allocate a 'info' for DEFs that end up with no interesting
>> info?
>
> Yeah, that was the idea.  In practice very few SSA names end up
> with any useful information.

Ok, let's add a comment somewhere noting that and the choice of
"backward forward propagation".

>>> +/* Process all statements and phis in BB, during the first post-order
>> walk.  */
>>> +
>>> +void
>>> +backprop::process_block (basic_block bb)
>>> +{
>>> +  for (gimple_stmt_iterator gsi = gsi_last_bb (bb); !gsi_end_p (gsi);
>>> +       gsi_prev (&gsi))
>>> +    {
>>> +      tree lhs = gimple_get_lhs (gsi_stmt (gsi));
>>> +      if (lhs && TREE_CODE (lhs) == SSA_NAME)
>>> +       process_var (lhs);
>>
>> I think it would be cleaner to walk over al SSA defs (that includes asms which
>> you seem to miss, no idea if that's important)
>
> The idea was the same here.  The point of the pass is to back-propagate
> information from uses to definitions, so there's no point recording
> information about SSA names whose definitions could never benefit.
>
>>> +    }
>>> +  for (gphi_iterator gpi = gsi_start_phis (bb); !gsi_end_p (gpi);
>>> +       gsi_next (&gpi))
>>> +    process_var (gimple_phi_result (gpi.phi ()));
>>> +}
>>> +
>>> +/* Delete the definition of VAR, which has no uses.  */
>>> +
>>> +static void
>>> +remove_unused_var (tree var)
>>> +{
>>> +  gimple *stmt = SSA_NAME_DEF_STMT (var);
>>> +  if (dump_file && (dump_flags & TDF_DETAILS))
>>> +    {
>>> +      fprintf (dump_file, "Deleting ");
>>> +      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
>>> +    }
>>> +  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
>>> +  gsi_remove (&gsi, true);
>>> +  release_ssa_name (var);
>>
>> What if var wasn't the only def?  Please use release_defs ()
>> so in case this is a call with a virtual definition it is released.
>
> I didn't think the calls involved here could have those, but maybe
> they could because of errno?  Ugh.  Are there any other valid reasons
> for having VDEFs on these calls?

No, only errno.

>>> +/* Strip all sign operations from the rvalue at *RHS_PTR in STMT.
>>> +   Return true if something changed.  The caller is responsible
>>> +   for the necessary bookkeeping.  */
>>> +
>>> +static bool
>>> +strip_sign_op (gimple *stmt, tree *rhs_ptr)
>>> +{
>>> +  if (tree new_rhs = strip_sign_op (*rhs_ptr))
>>> +    {
>>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>>> +       note_replacement (stmt, *rhs_ptr, new_rhs);
>>> +      *rhs_ptr = new_rhs;
>>
>> So it looks you are only changing stmts when the stmt result produces
>> the same value.  Just double-checking, as otherwise you'd need to care
>> about debug stmts ...
>
> No, it can change values, like the case you saw later for phis.
> This applies to all the cases where the optimisation depends on the
> propagated info for the lhs, rather than being inherent to the operation.
> So e.g. we can change the value of A in A = B * C, if all uses of A
> don't care about the sign.
>
> At the moment the only change we can make is that the result could be
> the negative of its original value.

Ok, so then there is the debug issue.  Consider

  x = ...;

which you change the sign for.  The user in gdb when printing 'x' needs to
see the original value or "optimized out", not the negated value.  This means
you have to replace the LHS of the stmt with a new SSA name which is
best done (with proper debug effects) by removing the original stmt and
replacing all uses of the LHS with the new stmt lhs.  Note that the
same is true for all derived values, so even if you only change

 _1 = ...;

(no user visible value) then a derived value


 x_2 = _1 + 2;

might change sign.

Well, you have to think about it at least ;)

>>> +/* STMT has been changed.  Give the fold machinery a chance to simplify
>>> +   and canonicalize it (e.g. by ensuring that commutative operands have
>>> +   the right order), then record the updates.  */
>>> +
>>> +void
>>> +backprop::complete_change (gimple *stmt)
>>> +{
>>> +  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
>>> +  if (fold_stmt (&gsi))
>>> +    {
>>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>>> +       {
>>> +         fprintf (dump_file, "  which folds to: ");
>>> +         print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, TDF_SLIM);
>>> +       }
>>
>> I suppose it can't (yet) happen that the result no longer throws.
>
> Yeah, think so.
>
>>> +/* Optimize PHI on the basis that INFO describes all uses of the result.  */
>>> +
>>> +void
>>> +backprop::optimize_phi (gphi *phi, const usage_info *info)
>>> +{
>>> +  /* If the sign of the result doesn't matter, strip sign operations
>>> +     from all arguments.  */
>>> +  if (info->flags.ignore_sign)
>>> +    {
>>> +      use_operand_p use;
>>> +      ssa_op_iter oi;
>>> +      FOR_EACH_PHI_ARG (use, phi, oi, SSA_OP_USE)
>>> +       if (tree new_arg = strip_sign_op (USE_FROM_PTR (use)))
>>> +         {
>>> +           if (dump_file && (dump_flags & TDF_DETAILS))
>>> +             note_replacement (phi, USE_FROM_PTR (use), new_arg);
>>> +           replace_exp (use, new_arg);
>>
>> So this seems to be a case where the value _does_ change?  Why bother
>> to substitute in PHIs btw.?
>
> For:
>
>      double
>      f (double *a, int n, double start)
>      {
>        double x = fabs (start);
>        for (int i = 0; i < n; ++i)
>          x *= a[i];
>        return __builtin_cos (x);
>      }
>
> it's the loop phi for x that has the redundant sign op as argument.
> We want to rewrite it to "start" even if we need to keep "fabs (start)"
> around for some other uses (assuming an extended example).

Ah, I see.

So apart from the debug issue the patch looks fine then.

Thanks,
Richard.

> Thanks,
> Richard
>



More information about the Gcc-patches mailing list