This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Clear basic block flags before using BB_VISITED for OpenACC loops processing


On Mon, Oct 17, 2016 at 1:47 PM, Thomas Schwinge
<thomas@codesourcery.com> wrote:
> Hi!
>
> On Mon, 17 Oct 2016 13:22:17 +0200, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Mon, Oct 17, 2016 at 11:38 AM, Thomas Schwinge
>> <thomas@codesourcery.com> wrote:
>> > On Fri, 14 Oct 2016 13:06:59 +0200, Richard Biener <richard.guenther@gmail.com> wrote:
>> >> On Fri, Oct 14, 2016 at 1:00 PM, Nathan Sidwell <nathan@acm.org> wrote:
>> >> > On 10/14/16 05:28, Richard Biener wrote:
>> >> >
>> >> >> The BB_VISITED flag has indetermined state at the beginning of a pass.
>> >> >> You have to ensure it is cleared yourself.
>> >> >
>> >> >
>> >> > In that case the openacc (&nvptx?) passes should be modified to clear the
>> >> > flags at their start, rather than at their end.
>> >
>> > The gcc/config/nvptx/nvptx.c handling seems fine -- it explicitly clears
>> > BB_VISITED for all basic block it works on.
>> >
>> >> Yes.  But as I said, I ran into IRA ICEs (somewhere in the testsuite) when not
>> >> cleaning up after tree-ssa-propagate.c.  So somebody has to fix IRA first.
>> >
>> > Is there a GCC PR for that, or where are you tracking such issues?
>>
>> No, just tracking in my head.
>
> Tsk, tsk...  ;-)

bb-reorder.c has the same issue.

Index: gcc/bb-reorder.c
===================================================================
--- gcc/bb-reorder.c    (revision 241228)
+++ gcc/bb-reorder.c    (working copy)
@@ -2355,7 +2355,10 @@ reorder_basic_blocks_simple (void)
      To start with, everything points to itself, nothing is assigned yet.  */

   FOR_ALL_BB_FN (bb, cfun)
-    bb->aux = bb;
+    {
+      bb->aux = bb;
+      bb->flags &= ~BB_VISITED;
+    }

   EXIT_BLOCK_PTR_FOR_FN (cfun)->aux = 0;


note that I didn't really understand the IRA issue (the code looks
fine from a quick look - it initializes
to BB_VISITED).  Still removing the tree-ssa-propagate.c BB_VISITED
clearing resulted in a bootstrap failure on x86_64.

Richard.

>
>> > OK to commit the following?  Is such a test case appropriate (which would
>> > have caught this issue right away), in particular the dg-final
>> > scan-tree-dump line?
>>
>> Ugh.  Not worse to what we do in various dwarf scanning I guess.
>
> ;-|
>
>> Doesn't failure lead to a miscompile eventually?  So you could formulate
>> this as a dg-do run test with a check for the desired outcome?
>
> No, unfortunately.  In this case the error is "benign" such that the
> OpenACC loop processing machinery will decide to not parallelize loops
> that ought to be parallelized.

So you can scan for "loop parallelized" instead?  I fear your pattern
is quite fragile
to maintain over time.

Richard.

>  This won't generally cause any problem
> (apart from performance regression, obviously); it just caused problems
> in a few libgomp test cases that actually at run time test for
> parallelized execution -- which will/did trigger only with nvptx
> offloading enabled, which not too many people are testing.  The test case
> I propose below will trigger also for non-offloading configurations.
>
>> >     Clear basic block flags before using BB_VISITED for OpenACC loops processing
>> >
>> >         gcc/
>> >         * omp-low.c (oacc_loop_discovery): Call clear_bb_flags.
>> >
>> >         gcc/testsuite/
>> >         * gcc.dg/goacc/loop-processing-1.c: New file.
>> > ---
>> >  gcc/omp-low.c                                  |  9 +++++----
>> >  gcc/testsuite/gcc.dg/goacc/loop-processing-1.c | 18 ++++++++++++++++++
>> >  2 files changed, 23 insertions(+), 4 deletions(-)
>> >
>> > --- gcc/omp-low.c
>> > +++ gcc/omp-low.c
>> > @@ -19340,7 +19340,9 @@ oacc_loop_sibling_nreverse (oacc_loop *loop)
>> >  static oacc_loop *
>> >  oacc_loop_discovery ()
>> >  {
>> > -  basic_block bb;
>> > +  /* Clear basic block flags, in particular BB_VISITED which we're going to use
>> > +     in the following.  */
>> > +  clear_bb_flags ();
>> >
>> >    oacc_loop *top = new_oacc_loop_outer (current_function_decl);
>> >    oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun));
>> > @@ -19349,9 +19351,8 @@ oacc_loop_discovery ()
>> >       that diagnostics come out in an unsurprising order.  */
>> >    top = oacc_loop_sibling_nreverse (top);
>> >
>> > -  /* Reset the visited flags.  */
>> > -  FOR_ALL_BB_FN (bb, cfun)
>> > -    bb->flags &= ~BB_VISITED;
>> > +  /* Clear basic block flags again, as otherwise IRA will explode later on.  */
>> > +  clear_bb_flags ();
>> >
>> >    return top;
>> >  }
>> > --- /dev/null
>> > +++ gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
>> > @@ -0,0 +1,18 @@
>> > +/* Make sure that OpenACC loop processing happens.  */
>> > +/* { dg-additional-options "-O2 -fdump-tree-oaccdevlow" } */
>> > +
>> > +extern int place ();
>> > +
>> > +void vector_1 (int *ary, int size)
>> > +{
>> > +#pragma acc parallel num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)
>> > +  {
>> > +#pragma acc loop gang
>> > +    for (int jx = 0; jx < 1; jx++)
>> > +#pragma acc loop auto
>> > +      for (int ix = 0; ix < size; ix++)
>> > +       ary[ix] = place ();
>> > +  }
>> > +}
>> > +
>> > +/* { dg-final { scan-tree-dump "OpenACC loops.*Loop 0\\\(0\\\).*Loop 14\\\(1\\\).*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_HEAD_MARK, 0, 1, 20\\\);.*Head-0:.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_HEAD_MARK, 0, 1, 20\\\);.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_FORK, \\\.data_dep\\\.\[0-9_\]+, 0\\\);.*Tail-0:.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_TAIL_MARK, \\\.data_dep\\\.\[0-9_\]+, 1\\\);.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_JOIN, \\\.data_dep\\\.\[0-9_\]+, 0\\\);.*Loop 6\\\(4\\\).*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_HEAD_MARK, 0, 1, 6\\\);.*Head-0:.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_HEAD_MARK, 0, 1, 6\\\);.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_FORK, \\\.data_dep\\\.\[0-9_\]+, 2\\\);.*Tail-0:.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_TAIL_MARK, \\\.data_dep\\\.\[0-9_\]+, 1\\\);.*\\\.data_dep\\\.\[0-9_\]+ = UNIQUE \\\(OACC_JOIN, \\\.data_dep\\\.\[0-9_\]+, 2\\\);" "oaccdevlow" } } */
>
>
> Grüße
>  Thomas


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]