This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution


On Wed, Jun 14, 2017 at 9:53 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Jun 13, 2017 at 11:59 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>> Hi,
>>> During the work I ran into a latent bug for distributing.  For the moment we sort statements
>>> in dominance order, but that's not enough because basic blocks may be sorted in reverse order
>>> of execution flow.  This results in wrong data dependence direction later.  This patch fixes
>>> the issue by sorting in topological order.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>
>> I suppose you are fixing
>>
>> static int
>> pg_add_dependence_edges (struct graph *rdg, vec<loop_p> loops, int dir,
>>                          vec<data_reference_p> drs1,
>>                          vec<data_reference_p> drs2)
>> {
>> ...
>>         /* Re-shuffle data-refs to be in dominator order.  */
>>         if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
>>             > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>>           {
>>             std::swap (dr1, dr2);
>>             this_dir = -this_dir;
>>           }
>>
>> but then for stmts that are not "ordered" by RPO or DOM like
>>
>>    if (flag)
>>      ... = dr1;
>>    else
>>      ... = dr2;
>>
>> this doesn't avoid spurious swaps?  Also the code was basically
> No, this is mainly for below case:
>   if (flag)
>     {
>       partition1: arr[i] = x;
>     }
>   partition2: arr[i] = y;
>
> function pg_add_dependence_edges is like:
>     /* Re-shuffle data-refs to be in dominator order. */
>     if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
>         > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>       {
>         std::swap (dr1, dr2);
>         this_dir = -this_dir;
>       }
>     //...
>     else if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
>       {
>          if (DDR_REVERSED_P (ddr))
>            {
>              std::swap (dr1, dr2);
>              this_dir = -this_dir;
>            }
>          /* Known dependences can still be unordered througout the
>             iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
>          if (DDR_NUM_DIST_VECTS (ddr) != 1)
>            this_dir = 2;
>          /* If the overlap is exact preserve stmt order.  */
>          else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1))
>            ;
>          else
>            {
>              /* Else as the distance vector is lexicographic positive
>                 swap the dependence direction.  */
>              this_dir = -this_dir;
>            }
>       }
> For non-ZERO distance vector dependence, the second part always
> computes src->dest dependence info correctly, as well as edge
> direction of PG.  For ZERO distance vector dependence, we rely on the
> swap part (thus topological order) to get correct dependence
> direction.  For mentioned case, the two data references are unordered
> in dominance relation, but ordered in RPO.  This is why DOM is not
> enough.  For if-then-else case, the order actually doesn't matter, and
> the references are unordered in either dominance relation or RPO.
> Specifically, src->dest is always computed correctly for non-ZERO
> distance vector cases, no matter <dr1, dr2> or <dr2, dr1> is passed to
> data dependence analyzer.  As for ZERO distance vector (exact
> overlap), the order doesn't matter either because they control
> dependent on the same condition.  We can simply assume an arbitrary
> order for it.

Ok, if it only is an issue for zero-distance then yes, I agree.

>> copied from tree-data-refs.c:find_data_references_in_loop which
>> does iterate over get_loop_body_in_dom_order as well.  So isn't the
>> issue latent there as well?
> In theory maybe.  In practice, this is not a problem at all since loop
> distribution is the only one handles control dependence so far.

You mean the only one running into the bogus BB ordering case.
I don't see how handling control dependences factor in here.

>>
>> That said, what about those "unordered" stmts?  I suppose
>> dependence analysis still happily computes a dependence
>> distance but in reality we'd have to consider both execution
>> orders?
> As explained, there is no need to consider both orders.  GCC doesn't
> really support control dependence parallelization?

I think autopar supports an arbitrary CFG inside the loops but as it
will never split them it won't change stmt ordering for zero-distance.

That said, if dependence info can be incorrect if applied to a loop
we should fixup tree-data-refs.c as well.  It might be useful
to make get_loop_body_in_rpo_order available then (and eventually
all _in_dom_order users can work with rpo order as well thus we
can replace it entirely as a second step).

Thanks,
Richard.

> Thanks,
> bin
>>
>> Thanks,
>> Richard.
>>
>>
>>>
>>> Thanks,
>>> bin
>>> 2017-06-07  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * tree-loop-distribution.c (bb_top_order_index): New.
>>>         (bb_top_order_index_size, bb_top_order_cmp): New.
>>>         (stmts_from_loop): Use topological order.
>>>         (pass_loop_distribution::execute): Compute topological order for.
>>>         basic blocks.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]