This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Move statements upwards after reassociation


On Fri, Oct 12, 2012 at 1:45 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Oct 12, 2012 at 3:09 AM, Easwaran Raman <eraman@google.com> wrote:
>> Thanks for the comments. As David wrote, the intent of the patch is
>> not to do a general purpose scheduling, but to compensate for the
>> possible live range lengthening introduced by reassociation.
>>
>>
>> On Thu, Oct 11, 2012 at 6:16 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Thu, Oct 11, 2012 at 3:52 AM, Easwaran Raman <eraman@google.com> wrote:
>>>>
>>>> +/* Move STMT up within its BB until it cannot be moved any further.  */
>>>> +
>>>> +static void move_stmt_upwards (gimple stmt)
>>>> +{
>>>> +  gimple_stmt_iterator gsi, gsistmt;
>>>> +  tree rhs1, rhs2;
>>>> +  gimple rhs1def = NULL, rhs2def = NULL;
>>>> +  rhs1 = gimple_assign_rhs1 (stmt);
>>>> +  rhs2 = gimple_assign_rhs2 (stmt);
>>>> +  gcc_assert (rhs1);
>>>
>>> Please no such senseless asserts.  The following line will segfault anyway
>>> if rhs1 is NULL and with a debugger an assert doesn't add any useful
>>> information.
>> Ok.
>>>
>>>> +  if (TREE_CODE (rhs1) == SSA_NAME)
>>>> +    rhs1def = SSA_NAME_DEF_STMT (rhs1);
>>>> +  else if (TREE_CODE (rhs1) != REAL_CST
>>>> +           && TREE_CODE (rhs1) != INTEGER_CST)
>>>> +    return;
>>>> +  if (rhs2)
>>>
>>> You may not access gimple_assign_rhs2 if it is not there.  So you have
>>> to check whether you have an unary, binary or ternary (yes) operation.
>>
>> gimple_assign_rhs2 returns NULL_TREE if the RHS of an assignment
>> has fewer than two operands.  Regarding the check for ternary
>> operation, I believe it is not necessary. A statement is considered
>> for reassociation only if the RHS class is GIMPLE_BINARY_RHS.
>> Subsequently, for rhs1 and rhs2, it checks if the def statements also
>> have the same code and so it seems to me that a statement with a
>> ternary operator in the RHS will never be considered in
>> rewrite_expr_tree.
>>
>>
>>>
>>>> +    {
>>>> +      if (TREE_CODE (rhs2) == SSA_NAME)
>>>> +        rhs2def = SSA_NAME_DEF_STMT (rhs2);
>>>> +      else if (TREE_CODE (rhs2) != REAL_CST
>>>> +               && TREE_CODE (rhs2) != INTEGER_CST)
>>>> +        return;
>>>> +    }
>>>> +  gsi = gsi_for_stmt (stmt);
>>>> +  gsistmt = gsi;
>>>> +  gsi_prev (&gsi);
>>>> +  for (; !gsi_end_p (gsi); gsi_prev (&gsi))
>>>
>>> ????
>>>
>>> This doesn't make much sense.  You can move a stmt to the nearest
>>> common post-dominator.  Assuming you only want to handle placing
>>> it after rhs1def or after rhs2def(?) you don't need any loop, just
>>> two dominator queries and an insertion after one of the definition
>>> stmts.
>>
>> Within a BB isn't that still O(size of BB)?
>
> Please document the fact that both stmts are in the same BB.
Ok.
> And no, it isn't, it is O (size of BB ^ 2).  You don't need a loop.
> operand rank should reflect the dominance relation inside the BB.

The rank of phi definitions would mess this up.

> If that doesn't work simply assign UIDs to the stmts first.
Ok.

>
>>> But this code should consider BBs.
>> For reassociation to look across BBs, the code should look something like this:
>>
>> L1 :
>>    a_2 = a_1 + 10
>>    jc L3
>> L2:
>>   a_3 = a_2 + 20
>>
>> - L1 should dominate L2 (otherwise there will be a phi node at L2 and
>> the reassociation of a_3 will not consider the definition of a_2)
>> - There are no other uses of a_2 other than the one in L2.
>>
>> After reassociation, the stmt defining a_2 would be moved to L2.  In
>> that case, the downward code motion of a_2 = a_1 + 10 to L2 is
>> beneficial (one less instruction if the branch is taken). It is not
>> obvious to me that moving it to L1 (or wherever a_1 is defined) is
>> beneficial.
>
> In this case it doesn't matter whether a1 lives through a3 or if a2 does.
> But moving the stmt is not necessary, so why not simply avoid it.
I used that example to show that the current downward motion has a
useful side effect and this patch preserves it. Yes, in this example
the downward motion can be avoided but in general it may not be
possible. I agree with you that there is unnecessary code motion in
many cases.

> You cannot undo it with your patch anyway.
>
>>>  And I don't see why more optimal
>>> placement cannot be done during rewrite_expr_tree itself.
>>
>> I started with that idea, but my current approach is simpler.
>
> Simpler, but it's a hack.
>
> So, the only place we "move" stmts in rewrite_expr_tree is here:
>
>       if (!moved)
>         {
>           gimple_stmt_iterator gsinow, gsirhs1;
>           gimple stmt1 = stmt, stmt2;
>           unsigned int count;
>
>           gsinow = gsi_for_stmt (stmt);
>           count = VEC_length (operand_entry_t, ops) - opindex - 2;
>           while (count-- != 0)
>             {
>               stmt2 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (stmt1));
>               gsirhs1 = gsi_for_stmt (stmt2);
>               gsi_move_before (&gsirhs1, &gsinow);
>
> that moving is done unconditionally, even if stmt2 already dominates stmt1.
> Why not improve that instead?

stmt2 does dominate stmt1. The issue is that it unconditionally moves
stmt2 downwards even if the statement defining ops[opindex + 1] (the
new RHS of stmt2 after reassociation) already dominates stmt2, or it
moves stmt2 further than necessary. One complication to fixing it
there is that a call to swap_ops_for_binary_stmt could alter the
contents of the ops vector.

- Easwaran

>
> Richard.
>
>
>>
>> Thanks,
>> Easwaran
>>>
>>> NB, the whole reassoc code needs a re-write to avoid the excessive
>>> stmt modifications when it does nothing.  So I'd very much rather avoid
>>> adding anything to reassoc until that rewrite happened.
>>>
>>> Richard.
>>>
>>>> +    {
>>>> +      gimple curr_stmt = gsi_stmt (gsi);
>>>> +      if (curr_stmt == rhs1def || curr_stmt == rhs2def)
>>>> +        {
>>>> +          gsi_move_after (&gsistmt, &gsi);
>>>> +          return;
>>>> +        }
>>>> +    }
>>>> +
>>>> +}
>>>> +
>>>>  /* Recursively rewrite our linearized statements so that the operators
>>>>     match those in OPS[OPINDEX], putting the computation in rank
>>>>     order.  */
>>>>
>>>>  static void
>>>>  rewrite_expr_tree (gimple stmt, unsigned int opindex,
>>>> -   VEC(operand_entry_t, heap) * ops, bool moved)
>>>> +   VEC(operand_entry_t, heap) * ops, bool moved,
>>>> +                   VEC(gimple, heap) **stmts_to_move)
>>>>  {
>>>>    tree rhs1 = gimple_assign_rhs1 (stmt);
>>>>    tree rhs2 = gimple_assign_rhs2 (stmt);
>>>> @@ -2299,6 +2337,7 @@ rewrite_expr_tree (gimple stmt, unsigned int opind
>>>>        print_gimple_stmt (dump_file, stmt, 0, 0);
>>>>      }
>>>>   }
>>>> +      VEC_safe_push (gimple, heap, *stmts_to_move, stmt);
>>>>        return;
>>>>      }
>>>>
>>>> @@ -2346,7 +2385,9 @@ rewrite_expr_tree (gimple stmt, unsigned int opind
>>>>      }
>>>>    /* Recurse on the LHS of the binary operator, which is guaranteed to
>>>>       be the non-leaf side.  */
>>>> -  rewrite_expr_tree (SSA_NAME_DEF_STMT (rhs1), opindex + 1, ops, moved);
>>>> +  rewrite_expr_tree (SSA_NAME_DEF_STMT (rhs1), opindex + 1, ops, moved,
>>>> +                     stmts_to_move);
>>>> +  VEC_safe_push (gimple, heap, *stmts_to_move, stmt);
>>>>  }
>>>>
>>>>  /* Find out how many cycles we need to compute statements chain.
>>>> @@ -3427,6 +3468,9 @@ reassociate_bb (basic_block bb)
>>>>  {
>>>>    gimple_stmt_iterator gsi;
>>>>    basic_block son;
>>>> +  VEC(gimple, heap) *stmts_to_move = NULL;
>>>> +  gimple stmt;
>>>> +  int i;
>>>>
>>>>    for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
>>>>      {
>>>> @@ -3542,7 +3586,7 @@ reassociate_bb (basic_block bb)
>>>>        && VEC_length (operand_entry_t, ops) > 3)
>>>>      rewrite_expr_tree_parallel (stmt, width, ops);
>>>>    else
>>>> -    rewrite_expr_tree (stmt, 0, ops, false);
>>>> +    rewrite_expr_tree (stmt, 0, ops, false, &stmts_to_move);
>>>>
>>>>    /* If we combined some repeated factors into a
>>>>       __builtin_powi call, multiply that result by the
>>>> @@ -3560,6 +3604,7 @@ reassociate_bb (basic_block bb)
>>>>         target_ssa);
>>>>        gimple_set_location (mul_stmt, gimple_location (stmt));
>>>>        gsi_insert_after (&gsi, mul_stmt, GSI_NEW_STMT);
>>>> +                      VEC_safe_push (gimple, heap, stmts_to_move, mul_stmt);
>>>>      }
>>>>   }
>>>>
>>>> @@ -3567,6 +3612,11 @@ reassociate_bb (basic_block bb)
>>>>      }
>>>>   }
>>>>      }
>>>> +
>>>> +  FOR_EACH_VEC_ELT (gimple, stmts_to_move, i, stmt)
>>>> +    move_stmt_upwards (stmt);
>>>> +  VEC_free (gimple, heap, stmts_to_move);
>>>> +
>>>>    for (son = first_dom_son (CDI_POST_DOMINATORS, bb);
>>>>         son;
>>>>         son = next_dom_son (CDI_POST_DOMINATORS, son))

