This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Minimize downward code motion during reassociation


Ping.

- Easwaran

On Wed, Oct 31, 2012 at 12:16 PM, Easwaran Raman <eraman@google.com> wrote:
> On Wed, Oct 31, 2012 at 4:36 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Oct 24, 2012 at 4:02 AM, Easwaran Raman <eraman@google.com> wrote:
>>> On Tue, Oct 23, 2012 at 2:52 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Mon, Oct 22, 2012 at 8:31 PM, Easwaran Raman <eraman@google.com> wrote:
>>>>> On Mon, Oct 22, 2012 at 12:59 AM, Richard Biener
>>>>> <richard.guenther@gmail.com> wrote:
>>>>>> On Fri, Oct 19, 2012 at 12:36 AM, Easwaran Raman <eraman@google.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> During expression reassociation, statements are conservatively moved
>>>>>>> downwards to ensure that dependences are correctly satisfied after
>>>>>>> reassociation. This could lead to lengthening of live ranges. This
>>>>>>> patch moves statements only to the extent necessary. Bootstraps and no
>>>>>>> test regression on x86_64/linux. OK for trunk?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Easwaran
>>>>>>>
>>>>>>> 2012-10-18  Easwaran Raman  <eraman@google.com>
>>>>>>>
>>>>>>> 	* tree-ssa-reassoc.c (assign_uids): New function.
>>>>>>> 	(assign_uids_in_relevant_bbs): Likewise.
>>>>>>> 	(ensure_ops_are_available): Likewise.
>>>>>>> 	(rewrite_expr_tree): Do not move statements beyond what is
>>>>>>> 	necessary. Remove call to swap_ops_for_binary_stmt...
>>>>>>> 	(reassociate_bb): ... and move it here.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Index: gcc/tree-ssa-reassoc.c
>>>>>>> ===================================================================
>>>>>>> --- gcc/tree-ssa-reassoc.c (revision 192487)
>>>>>>> +++ gcc/tree-ssa-reassoc.c (working copy)
>>>>>>> @@ -2250,6 +2250,128 @@ swap_ops_for_binary_stmt (VEC(operand_entry_t, hea
>>>>>>>      }
>>>>>>>  }
>>>>>>>
>>>>>>> +/* Assign UIDs to statements in basic block BB.  */
>>>>>>> +
>>>>>>> +static void
>>>>>>> +assign_uids (basic_block bb)
>>>>>>> +{
>>>>>>> +  unsigned uid = 0;
>>>>>>> +  gimple_stmt_iterator gsi;
>>>>>>> +  /* First assign uids to phis.  */
>>>>>>> +  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>>>>>> +    {
>>>>>>> +      gimple stmt = gsi_stmt (gsi);
>>>>>>> +      gimple_set_uid (stmt, uid++);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +  /* Then assign uids to stmts.  */
>>>>>>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>>>>>> +    {
>>>>>>> +      gimple stmt = gsi_stmt (gsi);
>>>>>>> +      gimple_set_uid (stmt, uid++);
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +/* For each operand in OPS, find the basic block that contains the statement
>>>>>>> +   which defines the operand. For all such basic blocks, assign UIDs.  */
>>>>>>> +
>>>>>>> +static void
>>>>>>> +assign_uids_in_relevant_bbs (VEC(operand_entry_t, heap) * ops)
>>>>>>> +{
>>>>>>> +  operand_entry_t oe;
>>>>>>> +  int i;
>>>>>>> +  struct pointer_set_t *seen_bbs = pointer_set_create ();
>>>>>>> +
>>>>>>> +  for (i = 0; VEC_iterate (operand_entry_t, ops, i, oe); i++)
>>>>>>> +    {
>>>>>>> +      gimple def_stmt;
>>>>>>> +      basic_block bb;
>>>>>>> +      if (TREE_CODE (oe->op) != SSA_NAME)
>>>>>>> +        continue;
>>>>>>> +      def_stmt = SSA_NAME_DEF_STMT (oe->op);
>>>>>>> +      bb = gimple_bb (def_stmt);
>>>>>>> +      if (!pointer_set_contains (seen_bbs, bb))
>>>>>>> +        {
>>>>>>> +          assign_uids (bb);
>>>>>>> +          pointer_set_insert (seen_bbs, bb);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +  pointer_set_destroy (seen_bbs);
>>>>>>> +}
>>>>>>
>>>>>> Please assign UIDs once using the existing renumber_gimple_stmt_uids ().
>>>>>> You seem to call the above multiple times and thus do work bigger than
>>>>>> O(number of basic blocks).
>>>>>
>>>>> The reason I call the above multiple times is that gsi_move_before
>>>>> might get called between two calls to the above. For instance, after
>>>>> rewrite_expr_tree is called once, the following sequence of calls
>>>>> could happen:
>>>>> reassociate_bb -> linearize_expr_tree -> linearize_expr ->
>>>>> gsi_move_before.  So it is not sufficient to call
>>>>> renumber_gimple_stmt_uids once per do_reassoc. Or do you want me to
>>>>> use renumber_gimple_stmt_uids_in_blocks instead of
>>>>> assign_uids_in_relevant_bbs?
>>>>
>>>> It's sufficient to call it once if you conservatively update UIDs at stmt
>>>> move / insert time (assign the same UID as the stmt before/after).
>>>
>>> I was worried that that would make the code brittle, but looking at
>>> the places where a new statement is added or moved, I think it is
>>> fine.  I have made the change in the new patch.
>>
>> Thanks.
>>
>>>>>
>>>>>
>>>>>>> +/* Ensure that operands in the OPS vector starting from OPINDEXth entry
>>>>>>> +   are live at STMT.  This is accomplished by moving STMT if needed.  */
>>>>>>> +
>>>>>>> +static void
>>>>>>> +ensure_ops_are_available (gimple stmt, VEC(operand_entry_t, heap) *ops,
>>>>>>> +                          int opindex)
>>>>>>> +{
>>>>>>> +  int i;
>>>>>>> +  int len = VEC_length (operand_entry_t, ops);
>>>>>>> +  gimple insert_stmt = stmt;
>>>>>>> +  basic_block insert_bb = gimple_bb (stmt);
>>>>>>> +  gimple_stmt_iterator gsi_insert, gsistmt;
>>>>>>> +  for (i = opindex; i < len; i++)
>>>>>>> +    {
>>>>>>
>>>>>> Likewise you call this for each call to rewrite_expr_tree, so it seems to me
>>>>>> this is quadratic in the number of ops in the op vector.
>>>>>
>>>>> The call to ensure_ops_are_available inside rewrite_expr_tree is
>>>>> guarded by if (!moved) and I am setting moved = true there to ensure
>>>>> that ensure_ops_are_available is called only once per reassociation
>>>>> of an expression tree.
>>>>
>>>> It would be a lot easier to understand this function if you call it once
>>>> after rewrite_expr_tree finished.
>>> I think you meant "before rewrite_expr_tree". One thing to note here
>>> is this function is called only when we first see a statement whose
>>> operand changes after reassociation. That's the reason the moving of
>>> statements happen inside rewrite_expr_tree.
>>
>> I meant after because rewrite_expr_tree ends up swapping things.  But
>> I note you changed that in your revised patch - good ;)
>>
>> So now that rewrite_expr_tree does no longer change order of the ops
>> vector we can move stmts indeed before it, but backwards.  Thus,
>> instead of walking forward and
>>
>> +              /* Ensure that STMT1 is moved to a place where all operands in
>> +                 OPS[opindex + i...] are available.  */
>> +              ensure_ops_are_available (stmt1, ops, opindex + i);
>>
>> we can walk backward and only need to consider the immediate next operand
>> because we know all further operands are available at the place the immediate
>> next operand is defined.
>>
>> Which should remove the need for the loop in ensure_ops_are_available
>>
>> +  for (i = opindex; i < len; i++)
>> +    {
>>
>> Correct?  (We may need to special-case the last (two?) stmts).
>
> Sorry, I am not able to understand your proposal here. The confusion
> relates to the use of walking forward vs backward. In the current
> patch, I start from the last statement (root of the expression tree)
> and follow the chain of defs using SSA_NAME_DEF_STMT (which I consider
> as walking backward). Do you want to start from the leaf of the
> expression and follow the chain in the opposite direction till the
> stmt defining the root of the tree is reached? That would indeed avoid
> comparing each statement against all statements that define the ops
> array as it is sufficient to compare against the statements that
> define the RHS operands. If that is indeed what you are proposing, I
> tried that approach and ran into issues. Doing this would mean that
> the IR is in an inconsistent state during intermediate stages. For
> instance, transforming
>
> t1 = a + b
> t2 = t1 + c
> d = ...
> t3 = t2 + d
>
> into
>
> t1 = a + d
> t2 = t1 + c
> d = ...
> t3 = t2 + b
>
> would result in a state where t2 = t1 + c appears before t1 = a + b.
> I don't know if that is the root cause, but I noticed that doing so
> resulted in ssa_verify complaining about DEBUG statements whose
> operands appeared before their defs. If this is not what you mean by
> walking backwards, could you please explain with a simple example?
>
> Thanks,
> Easwaran
>
>
>>>>
>>>>>>
>>>>>> Why make this all so complicated?  It seems to me that we should
>>>>>> fixup stmt order only after the whole ops vector has been materialized.
>>>>>
>>>>>
>>>>>>
>>>>>>> +      operand_entry_t oe = VEC_index (operand_entry_t, ops, i);
>>>>>>> +      gimple def_stmt;
>>>>>>> +      basic_block def_bb;
>>>>>>> +      /* Ignore constants and operands with default definitions.  */
>>>>>>> +      if (TREE_CODE (oe->op) != SSA_NAME
>>>>>>> +          || SSA_NAME_IS_DEFAULT_DEF (oe->op))
>>>>>>> +        continue;
>>>>>>> +      def_stmt = SSA_NAME_DEF_STMT (oe->op);
>>>>>>> +      def_bb = gimple_bb (def_stmt);
>>>>>>> +      if (def_bb != insert_bb
>>>>>>> +          && !dominated_by_p (CDI_DOMINATORS, insert_bb, def_bb))
>>>>>>> +        {
>>>>>>> +          insert_bb = def_bb;
>>>>>>> +          insert_stmt = def_stmt;
>>>>>>> +        }
>>>>>>> +      else if (def_bb == insert_bb
>>>>>>> +               && gimple_uid (insert_stmt) < gimple_uid (def_stmt))
>>>>>>> +        insert_stmt = def_stmt;
>>>>>>> +    }
>>>>>>> +  if (insert_stmt == stmt)
>>>>>>> +    return;
>>>>>>> +  gsistmt = gsi_for_stmt (stmt);
>>>>>>> +  /* If INSERT_STMT is a phi node, then do not insert just after that statement.
>>>>>>> +     Instead, find the first non-label gimple statement in BB and insert before
>>>>>>> +     that.  */
>>>>>>> +  if (gimple_code (insert_stmt) == GIMPLE_PHI)
>>>>>>> +    {
>>>>>>> +      gsi_insert = gsi_after_labels (insert_bb);
>>>>>>> +      gsi_move_before (&gsistmt, &gsi_insert);
>>>>>>> +    }
>>>>>>> +  /* Statements marked for throw cannot be in the middle of a basic block,
>>>>>>> +     so we cannot insert a statement (not marked for throw) immediately
>>>>>>> +     after.  */
>>>>>>> +  else if (lookup_stmt_eh_lp (insert_stmt) > 0
>>>>>>
>>>>>> that's already performed by stmt_can_throw_internal
>>>>> Ok. I was hit by the "statement marked for throw in middle of block"
>>>>> error in verify_gimple_in_cfg and simply copied the sequence of
>>>>> checks that led to that.
>>>>>
>>>>>>
>>>>>>> +           && stmt_can_throw_internal (insert_stmt))
>>>>>>
>>>>>> But all this should be a non-issue as re-assoc should never assign an ops
>>>>>> vector entry for such stmts (but it could have leafs defined by such stmts).
>>>>>> If you only ever move definitions down, like the present code does caring
>>>>>> about leafs should not be necessary.
>>>>>
>>>>> I added this as I got an ICE in verify_gimple_in_cfg. So it seems
>>>>> that such statements do get assigned into the ops vector.
>>>>
>>>> That would be a bug.  See how linearize_expr_tree checks whether
>>>> the defining stmt of an op may throw.  That is, it _does_ add the
>>>> SSA def to the vector but does not recurse on its defining stmt operands.
>>>
>>> But as long as the SSA def of a throwing statement is added to the ops
>>> vector, I need to add the check, even if it doesn't recurse on its
>>> operands. Consider the following
>>>
>>> <bb1>:
>>> t1 = a + b
>>> c = ... // c might throw
>>>
>>> <bb2>:
>>> t2 = t1 + c
>>>
>>> The expression could be rewritten as (a+c) + b.
>>>
>>> Since c's definition follows t1's, t1 = a+b has to be moved after c =
>>> ... That's why I need the check.
>>
>> Hmm, indeed.
>>
>> +      /* There should be exactly one normal edge out of the basic block.  */
>> +      FOR_EACH_EDGE (e, ei, insert_bb->succs)
>> +        {
>> +          if (!(e->flags & EDGE_COMPLEX))
>> +            {
>> +              gcc_assert (succ_edge == NULL);
>> +              succ_edge = e;
>> +            }
>> +        }
>>
>> I think we have sth like find_fallthru_edge for this.
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Easwaran
>>>
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> Easwaran
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> +    {
>>>>>>> +      edge e, succ_edge = NULL;
>>>>>>> +      edge_iterator ei;
>>>>>>> +
>>>>>>> +      /* There should be exactly one normal edge out of the basic block.  */
>>>>>>> +      FOR_EACH_EDGE (e, ei, insert_bb->succs)
>>>>>>> +        {
>>>>>>> +          if (!(e->flags & EDGE_COMPLEX))
>>>>>>> +            {
>>>>>>> +              gcc_assert (succ_edge == NULL);
>>>>>>> +              succ_edge = e;
>>>>>>> +            }
>>>>>>> +        }
>>>>>>> +      /* Insert STMT at the beginning of the successor basic block.  */
>>>>>>> +      insert_bb = succ_edge->dest;
>>>>>>> +      gsi_insert = gsi_after_labels (insert_bb);
>>>>>>> +      gsi_move_before (&gsistmt, &gsi_insert);
>>>>>>> +    }
>>>>>>> +  else
>>>>>>> +    {
>>>>>>> +      gsi_insert = gsi_for_stmt (insert_stmt);
>>>>>>> +      gsi_move_after (&gsistmt, &gsi_insert);
>>>>>>> +    }
>>>>>>> +
>>>>>>> +}
>>>>>>> +
>>>>>>>  /* Recursively rewrite our linearized statements so that the operators
>>>>>>>     match those in OPS[OPINDEX], putting the computation in rank
>>>>>>>     order.  */
>>>>>>> @@ -2262,11 +2384,6 @@ rewrite_expr_tree (gimple stmt, unsigned int opind
>>>>>>>    tree rhs2 = gimple_assign_rhs2 (stmt);
>>>>>>>    operand_entry_t oe;
>>>>>>>
>>>>>>> -  /* If we have three operands left, then we want to make sure the ones
>>>>>>> -     that get the double binary op are chosen wisely.  */
>>>>>>> -  if (opindex + 3 == VEC_length (operand_entry_t, ops))
>>>>>>> -    swap_ops_for_binary_stmt (ops, opindex, stmt);
>>>>>>> -
>>>>>>>    /* The final recursion case for this function is that you have
>>>>>>>       exactly two operations left.
>>>>>>>       If we had one exactly one op in the entire list to start with, we
>>>>>>> @@ -2312,19 +2429,17 @@ rewrite_expr_tree (gimple stmt, unsigned int opind
>>>>>>>      {
>>>>>>>        if (!moved)
>>>>>>>   {
>>>>>>> -  gimple_stmt_iterator gsinow, gsirhs1;
>>>>>>> -  gimple stmt1 = stmt, stmt2;
>>>>>>> -  unsigned int count;
>>>>>>> +  gimple stmt1 = stmt;
>>>>>>> +  unsigned int count, i = 1;
>>>>>>>
>>>>>>> -  gsinow = gsi_for_stmt (stmt);
>>>>>>>    count = VEC_length (operand_entry_t, ops) - opindex - 2;
>>>>>>> -  while (count-- != 0)
>>>>>>> +  while (i <= count)
>>>>>>>      {
>>>>>>> -      stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt1));
>>>>>>> -      gsirhs1 = gsi_for_stmt (stmt2);
>>>>>>> -      gsi_move_before (&gsirhs1, &gsinow);
>>>>>>> -      gsi_prev (&gsinow);
>>>>>>> -      stmt1 = stmt2;
>>>>>>> +      stmt1 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt1));
>>>>>>> +      /* Ensure that STMT1 is moved to a place where all operands in
>>>>>>> +         OPS[opindex + i...] are available.  */
>>>>>>> +      ensure_ops_are_available (stmt1, ops, opindex + i);
>>>>>>> +      i++;
>>>>>>>      }
>>>>>>>    moved = true;
>>>>>>>   }
>>>>>>> @@ -3542,8 +3657,18 @@ reassociate_bb (basic_block bb)
>>>>>>>        && VEC_length (operand_entry_t, ops) > 3)
>>>>>>>      rewrite_expr_tree_parallel (stmt, width, ops);
>>>>>>>    else
>>>>>>> -    rewrite_expr_tree (stmt, 0, ops, false);
>>>>>>> +                    {
>>>>>>> +                      /* When there are three operands left, we want
>>>>>>> +                         to make sure the ones that get the double
>>>>>>> +                         binary op are chosen wisely.  */
>>>>>>> +                      int len = VEC_length (operand_entry_t, ops);
>>>>>>> +                      if (len >= 3)
>>>>>>> +                        swap_ops_for_binary_stmt (ops, len - 3, stmt);
>>>>>>>
>>>>>>> +                      assign_uids_in_relevant_bbs (ops);
>>>>>>> +                      rewrite_expr_tree (stmt, 0, ops, false);
>>>>>>> +                    }
>>>>>>> +
>>>>>>>    /* If we combined some repeated factors into a
>>>>>>>       __builtin_powi call, multiply that result by the
>>>>>>>       reassociated operands.  */

