Clear basic block flags before using BB_VISITED for OpenACC loops processing

Richard Biener richard.guenther@gmail.com
Wed Oct 19 10:07:00 GMT 2016


On Tue, Oct 18, 2016 at 9:52 PM, Thomas Schwinge
<thomas@codesourcery.com> wrote:
> Hi!
>
> On Mon, 17 Oct 2016 15:38:50 +0200, I wrote:
>> On Mon, 17 Oct 2016 14:08:44 +0200, Richard Biener <richard.guenther@gmail.com> wrote:
>> > On Mon, Oct 17, 2016 at 1:47 PM, Thomas Schwinge
>> > <thomas@codesourcery.com> wrote:
>> > > On Mon, 17 Oct 2016 13:22:17 +0200, Richard Biener <richard.guenther@gmail.com> wrote:
>> > >> On Mon, Oct 17, 2016 at 11:38 AM, Thomas Schwinge
>> > >> <thomas@codesourcery.com> wrote:
>> > >> > On Fri, 14 Oct 2016 13:06:59 +0200, Richard Biener <richard.guenther@gmail.com> wrote:
>> > >> >> On Fri, Oct 14, 2016 at 1:00 PM, Nathan Sidwell <nathan@acm.org> wrote:
>> > >> >> > On 10/14/16 05:28, Richard Biener wrote:
>> > >> >> >
>> > >> >> >> The BB_VISITED flag has indetermined state at the beginning of a pass.
>> > >> >> >> You have to ensure it is cleared yourself.
>> > >> >> >
>> > >> >> >
>> > >> >> > In that case the openacc (&nvptx?) passes should be modified to clear the
>> > >> >> > flags at their start, rather than at their end.
>
> This already is a "conceptual acknowledgement" of my patch, so...
>
>> > >> > OK to commit the following?  Is such a test case appropriate (which would
>> > >> > have caught this issue right away), in particular the dg-final
>> > >> > scan-tree-dump line?
>> > >>
>> > >> Ugh.  Not worse to what we do in various dwarf scanning I guess.
>> > >
>> > > ;-|
>> > >
>> > >> Doesn't failure lead to a miscompile eventually?  So you could formulate
>> > >> this as a dg-do run test with a check for the desired outcome?
>> > >
>> > > No, unfortunately.  In this case the error is "benign" such that the
>> > > OpenACC loop processing machinery will decide to not parallelize loops
>> > > that ought to be parallelized.
>> >
>> > So you can scan for "loop parallelized" instead?
>>
>> The dump would still contain the outer loop's "Loop 0(0)" marker, so I'd
>> have to scan for "Head"/"Tail"/"UNIQUE" or similar instead -- but that
>> seems likewise fragile (for false negatives), and less useful than
>> scanning for the complete pattern.
>>
>> > I fear your pattern
>> > is quite fragile
>> > to maintain over time.
>>
>> Agreed -- but then, that's intentional: my idea for this new test case
>> has been to have it actually verify the expected OpenACC loop processing,
>> so it's clear that this pattern will need to be adjusted if changing the
>> OpenACC loop processing.
>>
>> > >  This won't generally cause any problem
>> > > (apart from performance regression, obviously); it just caused problems
>> > > in a few libgomp test cases that actually at run time test for
>> > > parallelized execution -- which will/did trigger only with nvptx
>> > > offloading enabled, which not too many people are testing.  The test case
>> > > I propose below will trigger also for non-offloading configurations.
>>
>> On IRC, Segher suggested to 'use {} instead of "" to avoid [all those
>> backslashes]' -- thanks, done.
>
> If you don't like the test case as-is (do we need multi-line tree dump
> scanning, just like we recently got for compiler diagnostics?), can I at
> least commit the OpenACC loops processing fix?  Here is the latest
> version, simplified after your r241296 IRA vs. BB_VISITED fixes:

Sure, I considered that approved already (it's even obvious).

Richard.

> commit 766cf9959b15a17e17e89a50e905b4c546893823
> Author: Thomas Schwinge <thomas@codesourcery.com>
> Date:   Mon Oct 17 15:33:09 2016 +0200
>
>     Clear basic block flags before using BB_VISITED for OpenACC loops processing
>
>         gcc/
>         * omp-low.c (oacc_loop_discovery): Call clear_bb_flags before, and
>         don't clear BB_VISITED after processing.
>
>         gcc/testsuite/
>         * gcc.dg/goacc/loop-processing-1.c: New file.
> ---
>  gcc/omp-low.c                                  |  8 +++-----
>  gcc/testsuite/gcc.dg/goacc/loop-processing-1.c | 18 ++++++++++++++++++
>  2 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git gcc/omp-low.c gcc/omp-low.c
> index 77f89d5..3ef796f 100644
> --- gcc/omp-low.c
> +++ gcc/omp-low.c
> @@ -19236,7 +19236,9 @@ oacc_loop_sibling_nreverse (oacc_loop *loop)
>  static oacc_loop *
>  oacc_loop_discovery ()
>  {
> -  basic_block bb;
> +  /* Clear basic block flags, in particular BB_VISITED which we're going to use
> +     in the following.  */
> +  clear_bb_flags ();
>
>    oacc_loop *top = new_oacc_loop_outer (current_function_decl);
>    oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun));
> @@ -19245,10 +19247,6 @@ oacc_loop_discovery ()
>       that diagnostics come out in an unsurprising order.  */
>    top = oacc_loop_sibling_nreverse (top);
>
> -  /* Reset the visited flags.  */
> -  FOR_ALL_BB_FN (bb, cfun)
> -    bb->flags &= ~BB_VISITED;
> -
>    return top;
>  }
>
> diff --git gcc/testsuite/gcc.dg/goacc/loop-processing-1.c gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
> new file mode 100644
> index 0000000..619576a
> --- /dev/null
> +++ gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
> @@ -0,0 +1,18 @@
> +/* Make sure that OpenACC loop processing happens.  */
> +/* { dg-additional-options "-O2 -fdump-tree-oaccdevlow" } */
> +
> +extern int place ();
> +
> +void vector_1 (int *ary, int size)
> +{
> +#pragma acc parallel num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)
> +  {
> +#pragma acc loop gang
> +    for (int jx = 0; jx < 1; jx++)
> +#pragma acc loop auto
> +      for (int ix = 0; ix < size; ix++)
> +       ary[ix] = place ();
> +  }
> +}
> +
> +/* { dg-final { scan-tree-dump {OpenACC loops.*Loop 0\(0\).*Loop 14\(1\).*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 20\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 20\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 0\);.*Loop 6\(4\).*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 6\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);} "oaccdevlow" } } */
>
>
> Grüße
>  Thomas



More information about the Gcc-patches mailing list