[PATCH, PR66846] Mark inner loop for fixup in parloops
Richard Biener
richard.guenther@gmail.com
Thu Jul 16 08:46:00 GMT 2015
On Wed, Jul 15, 2015 at 9:36 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> Hi,
>
> I.
>
> In openmp expansion of loops, we do some effort to try to create matching
> loops in the loop state of the child function, f.i.in
> expand_omp_for_generic:
> ...
> struct loop *outer_loop;
> if (seq_loop)
> outer_loop = l0_bb->loop_father;
> else
> {
> outer_loop = alloc_loop ();
> outer_loop->header = l0_bb;
> outer_loop->latch = l2_bb;
> add_loop (outer_loop, l0_bb->loop_father);
> }
>
> if (!gimple_omp_for_combined_p (fd->for_stmt))
> {
> struct loop *loop = alloc_loop ();
> loop->header = l1_bb;
> /* The loop may have multiple latches. */
> add_loop (loop, outer_loop);
> }
> ...
>
> And if that doesn't work out, we try to mark the loop state for fixup, in
> expand_omp_taskreg and expand_omp_target:
> ...
> /* When the OMP expansion process cannot guarantee an up-to-date
> loop tree arrange for the child function to fixup loops. */
> if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
> child_cfun->x_current_loops->state |= LOOPS_NEED_FIXUP;
> ...
>
> and expand_omp_for:
> ...
> else
> /* If there isn't a continue then this is a degerate case where
> the introduction of abnormal edges during lowering will prevent
> original loops from being detected. Fix that up. */
> loops_state_set (LOOPS_NEED_FIXUP);
> ...
>
> However, loops are fixed up anyway, because the first pass we execute with
> the new child function is pass_fixup_cfg.
>
> The new child function contains a function call to
> __builtin_omp_get_num_threads, which is marked with ECF_CONST, so
> execute_fixup_cfg marks the function for TODO_cleanup_cfg, and subsequently
> the loops with LOOPS_NEED_FIXUP.
>
>
> II.
>
> This patch adds a verification that at the end of the omp-expand processing
> of the child function, either the loop structure is ok, or marked for fixup.
>
> This verfication triggered a failure in parloops. When an outer loop is
> being parallelized, both the outer and inner loop are cancelled. Then during
> omp-expansion, we create a loop in the loop state for the outer loop (the
> one that is transformed), but not for the inner, which causes the
> verification failure:
> ...
> outer-1.c:11:3: error: loop with header 5 not in loop tree
> ...
>
> [ I ran into this verification failure with an openacc kernels testcase on
> the gomp-4_0-branch, where parloops is called additionally from a different
> location, and pass_fixup_cfg is not the first pass that the child function
> is processed by. ]
>
> The patch contains a bit that makes sure that the loop state of the child
> function is marked for fixup in parloops. The bit is non-trival since it
> create a loop state and sets the fixup flag on the loop state, but postpones
> the init_loops_structure call till move_sese_region_to_fn, where it can
> succeed.
>
>
> III.
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for trunk?
+ struct loops *loops;
+ int loop_state_flags = 0;
+ if (dest_cfun->x_current_loops == NULL)
+ {
+ /* Initialize an empty loop tree. */
+ loops = ggc_cleared_alloc<struct loops> ();
+ set_loops_for_fn (dest_cfun, loops);
+ }
+ else
+ {
+ loops = dest_cfun->x_current_loops;
+ loop_state_flags = loops->state;
+ }
+
+ if (loops->tree_root == NULL)
+ {
+ init_loops_structure (dest_cfun, loops, 1);
+ loops->state |= loop_state_flags;
+ }
+
+ loops->state |= LOOPS_MAY_HAVE_MULTIPLE_LATCHES;
this looks twisted just because you do a half-way init of the loop tree here:
+ if (loop->inner)
+ {
+ struct function *child_cfun = DECL_STRUCT_FUNCTION (child_fn);
+ struct loops *loops = ggc_cleared_alloc<struct loops> ();
+ set_loops_for_fn (child_cfun, loops);
+ child_cfun->x_current_loops->state = LOOPS_NEED_FIXUP;
+ }
why not unconditionally initialize the loop tree properly in autopar?
> Thanks,
> - Tom
More information about the Gcc-patches
mailing list