This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH GCC][5/7]Extend loop distribution for two-level innermost loop nest
On Wed, Oct 11, 2017 at 2:05 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Mon, Oct 9, 2017 at 2:48 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Thu, Oct 5, 2017 at 3:17 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>> Hi,
>>> For now the distribution pass only handles innermost loops. This patch extends the pass
>>> to cover two-level innermost loop nests. It also refactors the code in
>>> pass_loop_distribution::execute for readability. Note that I restrict it to 2-level loop
>>> nests on purpose because of the high cost of data dependence computation. Compile-time
>>> optimizations such as reusing the data reference finding and the data dependence
>>> computation would require a rewrite of this pass, like the proposed loop interchange
>>> implementation. But that's another task.
>>>
>>> This patch introduces a temporary TODO for loop nest builtin partitions, which is covered
>>> by the next two patches.
>>>
>>> With this patch, the kernel loop in bwaves can now be distributed and is thus exposed to
>>> further interchange. This patch adds a new test for matrix multiplication and adjusts
>>> the test strings of existing tests.
>>> Bootstrapped and tested as part of the patch set on x86_64 and AArch64. Is it OK?
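For readers unfamiliar with the transformation, the following is a minimal hand-written sketch of what distributing a two-level nest means at the source level. It is not taken from the patch or its testsuite; the function names (`nest_fused`, `nest_distributed`, `results_match`) and array sizes are made up for illustration, and the sketch assumes the three arrays do not alias.

```c
#include <string.h>

#define ROWS 4
#define COLS 6

/* One two-level nest whose body holds two independent statements.  */
void
nest_fused (int a[ROWS][COLS], int b[ROWS][COLS], int c[ROWS][COLS])
{
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      {
        a[i][j] = 0;
        b[i][j] = b[i][j] + c[i][j];
      }
}

/* The same work after distribution: each statement gets its own copy
   of the whole nest.  This is only valid because the statements do
   not depend on each other (assumed: a, b and c do not overlap).  */
void
nest_distributed (int a[ROWS][COLS], int b[ROWS][COLS], int c[ROWS][COLS])
{
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      a[i][j] = 0;

  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      b[i][j] = b[i][j] + c[i][j];
}

/* Run both versions on identical inputs; return 1 if outputs agree.  */
int
results_match (void)
{
  int a1[ROWS][COLS], b1[ROWS][COLS], a2[ROWS][COLS], b2[ROWS][COLS];
  int c[ROWS][COLS];

  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      {
        a1[i][j] = a2[i][j] = i + j + 1;
        b1[i][j] = b2[i][j] = i * COLS + j;
        c[i][j] = 2 * i - j;
      }

  nest_fused (a1, b1, c);
  nest_distributed (a2, b2, c);

  return memcmp (a1, a2, sizeof a1) == 0
         && memcmp (b1, b2, sizeof b1) == 0;
}
```

Once split like this, each resulting nest contains a single statement, which is the kind of shape later passes such as interchange can work on.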
>>
>> @@ -714,9 +719,11 @@ ssa_name_has_uses_outside_loop_p (tree def, loop_p loop)
>>
>> FOR_EACH_IMM_USE_FAST (use_p, imm_iter, def)
>> {
>> - gimple *use_stmt = USE_STMT (use_p);
>> - if (!is_gimple_debug (use_stmt)
>> - && loop != loop_containing_stmt (use_stmt))
>> + if (is_gimple_debug (USE_STMT (use_p)))
>> + continue;
>> +
>> + basic_block use_bb = gimple_bb (USE_STMT (use_p));
>> + if (use_bb == NULL || !flow_bb_inside_loop_p (loop, use_bb))
>> return true;
>>
>> use_bb should never be NULL.
> Done.
>>
>> + /* Don't support loop nest distribution under runtime alias check
>> + since it's not likely to enable many vectorization opportunities. */
>> + if (loop->inner)
>> + {
>> + merge_dep_scc_partitions (rdg, &partitions, false);
>> + }
>>
>> extra {}
> Done.
>>
>> + /* Support loop nest distribution enclosing current innermost loop.
>> + For the moment, we only support the innermost two-level loop nest. */
>> + if (flag_tree_loop_distribution
>> + && outer->num > 0 && outer->inner == loop && loop->next == NULL
>>
>> The canonical check for is-this-non-root is loop_outer (outer) instead
>> of outer->num > 0.
> Done.
>>
>> + && single_exit (outer)
>>
>> Not sure how exits are counted, but if the inner loop also exits the
>> outer loop, do we correctly handle/reject this case?
> I tend to believe this can be handled if it's not rejected by the
> niters/exit condition, but I am not very sure about this.
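To make the case under discussion concrete, here is a hypothetical C nest (not from the patch) in which the inner loop can leave the outer loop directly. The function name `find_first` and the sizes are invented. In a nest like this the outer loop has two ways out, the early return and the normal `i < ROWS` test, so one would expect a `single_exit (outer)` check to reject it; that expectation is an assumption about how exits are recorded, not something confirmed in this thread.

```c
#define ROWS 3
#define COLS 4

/* Return the row-major index of the first element equal to KEY,
   or -1 if KEY does not occur.  */
int
find_first (int m[ROWS][COLS], int key)
{
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      if (m[i][j] == key)
        return i * COLS + j;   /* leaves the inner and the outer loop at once */

  return -1;                   /* reached through the normal outer-loop exit */
}
```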
>>
>> - if (nb_generated_loops + nb_generated_calls > 0)
>> - {
>> - changed = true;
>> - dump_printf_loc (MSG_OPTIMIZED_LOCATIONS,
>> - loc, "Loop %d distributed: split to %d loops "
>> - "and %d library calls.\n",
>> - num, nb_generated_loops, nb_generated_calls);
>> + if (nb_generated_loops + nb_generated_calls > 0)
>> + {
>> + changed = true;
>> + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS,
>> + loc, "Loop%s %d distributed: split to %d loops "
>> + "and %d library calls.\n",
>> + loop_nest_p ? " nest" : "", loop->num,
>> + nb_generated_loops, nb_generated_call
>> ...
>>
>> can you adjust the printfs to say "loop nest distributed" in case we distributed
>> a nest?
> Done.
>>
>> Can you rewrite the iteration over the nest so it would theoretically support
>> arbitrarily deep perfect nests? Thus simply initialize loop_nest_p less
>> cleverly...
> Done. I factored it out as a function "prepare_perfect_loop_nest". I
> also tested the updated patch by enabling full loop nest distribution;
> there is no failure in bootstrap, regression tests, or SPEC benchmarks.
> Of course, the final patch still only supports 2-level innermost loop nests.
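The outward walk discussed here can be sketched on a toy loop tree. This is an illustrative model only, not the code of prepare_perfect_loop_nest: `struct toy_loop`, `toy_prepare_nest` and the `max_depth` parameter are hypothetical, and the sketch checks only the structural conditions quoted in the review (the parent is not the root, the current loop is the parent's only child). It does not model the single-exit or iteration-count conditions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a loop tree node; only the links matter.  */
struct toy_loop
{
  struct toy_loop *outer;   /* enclosing loop, NULL for the root */
  struct toy_loop *inner;   /* first child loop */
  struct toy_loop *next;    /* next sibling loop */
};

/* Starting from an innermost LOOP, climb to enclosing loops while each
   parent is a real loop (not the root) and LOOP is its only child.
   Stop once the nest is MAX_DEPTH levels deep.  Return the outermost
   loop of the nest found.  */
struct toy_loop *
toy_prepare_nest (struct toy_loop *loop, int max_depth)
{
  int depth = 1;
  struct toy_loop *outer = loop->outer;

  while (depth < max_depth
         && outer != NULL
         && outer->outer != NULL    /* outer is not the root */
         && outer->inner == loop    /* loop is outer's first child ... */
         && loop->next == NULL)     /* ... and has no sibling */
    {
      loop = outer;
      outer = loop->outer;
      depth++;
    }

  return loop;
}
```

With `max_depth` set to 2 this mirrors the restriction to two-level nests; a larger value corresponds to the "full loop nest distribution" experiment mentioned above.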
>
> Is this OK?
Ok.
Thanks,
Richard.
> Thanks,
> bin
> 2017-10-04 Bin Cheng <bin.cheng@arm.com>
>
> * tree-loop-distribution.c: Adjust the general comment.
> (NUM_PARTITION_THRESHOLD): New macro.
> (ssa_name_has_uses_outside_loop_p): Support loop nest distribution.
> (classify_partition): Skip builtin pattern of loop nest's inner loop.
> (merge_dep_scc_partitions): New parameter ignore_alias_p and use it
> in call to build_partition_graph.
> (finalize_partitions): New parameter. Make loop distribution more
> conservative by fusing more partitions.
> (distribute_loop): Don't do runtime alias check in case of loop nest
> distribution.
> (find_seed_stmts_for_distribution): New function.
> (prepare_perfect_loop_nest): New function.
> (pass_loop_distribution::execute): Refactor code finding seed stmts
> and loop nest into above functions. Support loop nest distribution.
> Adjust dump information accordingly.
>
> gcc/testsuite/ChangeLog
> 2017-10-04 Bin Cheng <bin.cheng@arm.com>
>
> * gcc.dg/tree-ssa/ldist-7.c: Adjust test string.
> * gcc.dg/tree-ssa/ldist-16.c: Ditto.
> * gcc.dg/tree-ssa/ldist-25.c: Ditto.
> * gcc.dg/tree-ssa/ldist-33.c: New test.