This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Default compute dimensions (runtime)


Hi,

Continuing the thread from here:

https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00198.html

On Wed, 3 Feb 2016 19:52:09 +0300
Alexander Monakov <amonakov@ispras.ru> wrote:

> On Wed, 3 Feb 2016, Nathan Sidwell wrote:
> > You can only override at runtime those dimensions that you said
> > you'd override at runtime when you compiled your program.  
> 
> Ah, I see.  That's not obvious to me, so perhaps added documentation
> can be expanded to explain that?  (I now see that the plugin silently
> drops user-provided dimensions where a value recorded at compile time
> is present; not sure if that'd be worth a runtime diagnostic, could
> be very noisy) 

This version of the patch has slightly-expanded documentation.

> > > I don't see why you say that because cuDeviceGetAttribute provides
> > > CU_DEVICE_ATTRIBUTE_WARP_SIZE,
> > > CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
> > > CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X (which is not too useful for
> > > this case) and cuFuncGetAttribute that allows to get a
> > > per-function thread limit. There's a patch on gomp-nvptx branch
> > > that adds querying some of those to the plugin.  
> > 
> > thanks.  There doesn't appear to be one for number of physical CTAs
> > though, right?  
> 
> Sorry, I don't understand the question: CTA is a logical entity.  One
> could derive limit of possible concurrent CTAs from number of SMs
> (CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT) multiplied by how many
> CTAs fit on one multiprocessor.  The latter figure can be taken as a
> rough worst-case value, or semi-intelligent per-kernel estimate based
> on register limits (there's code on gomp-nvptx branch that does
> this), or one can use the cuOcc* API to ask the driver for a precise
> per-kernel figure.

While the runtime part of the patch already appears to have been
committed as part of the following patch:

https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01589.html

The compile-time part of the patch has not made it upstream yet. Thus,
this rebased and retested patch consists of the parsing changes (for
-fopenacc-dim=X:Y:Z, allowing '-') and warning changes (for strange
partitioning choices), plus associated testsuite adjustments.

Tested with offloading to NVPTX and bootstrapped.

OK for trunk?

Thanks,

Julian

20xx-xx-xx  Nathan Sidwell  <nathan@acm.org>
	    Tom de Vries  <tdevries@suse.de>
	    Thomas Schwinge  <thomas@codesourcery.com>
	    Julian Brown  <julian@codesourcery.com>

	gcc/
	* doc/invoke.texi (fopenacc-dim): Update.
	* omp-offload.c (oacc_parse_default_dims): Update.
	(oacc_validate_dims): Emit warnings about strange partitioning choices.

	gcc/testsuite/
	* c-c++-common/goacc/acc-icf.c: Update.
	* c-c++-common/goacc/parallel-dims-1.c: Likewise.
	* c-c++-common/goacc/parallel-reduction.c: Likewise.
	* c-c++-common/goacc/pr70688.c: Likewise.
	* c-c++-common/goacc/routine-1.c: Likewise.
	* c-c++-common/goacc/uninit-dim-clause.c: Likewise.
	* gfortran.dg/goacc/parallel-tree.f95: Likewise.
	* gfortran.dg/goacc/routine-4.f90: Likewise.
	* gfortran.dg/goacc/routine-level-of-parallelism-1.f90: Likewise.
	* gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Add -w.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c: New.
	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/mode-transitions.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/private-variables.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
	* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Likewise.
	* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/pr84028.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: New.
commit a918a8739ae7652250c978b0ececa181a587b0c0
Author: Julian Brown <julian@codesourcery.com>
Date:   Fri Oct 5 11:11:47 2018 -0700

    OpenACC default compute dimensions
    
    20xx-xx-xx  Nathan Sidwell  <nathan@acm.org>
    	    Tom de Vries  <tdevries@suse.de>
    	    Thomas Schwinge  <thomas@codesourcery.com>
    	    Julian Brown  <julian@codesourcery.com>
    
    	gcc/
    	* doc/invoke.texi (fopenacc-dim): Update.
    	* omp-offload.c (oacc_parse_default_dims): Update.
    	(oacc_validate_dims): Emit warnings about strange partitioning choices.
    
    	gcc/testsuite/
    	* c-c++-common/goacc/acc-icf.c: Update.
    	* c-c++-common/goacc/parallel-dims-1.c: Likewise.
    	* c-c++-common/goacc/parallel-reduction.c: Likewise.
    	* c-c++-common/goacc/pr70688.c: Likewise.
    	* c-c++-common/goacc/routine-1.c: Likewise.
    	* c-c++-common/goacc/uninit-dim-clause.c: Likewise.
    	* gfortran.dg/goacc/parallel-tree.f95: Likewise.
    	* gfortran.dg/goacc/routine-4.f90: Likewise.
    	* gfortran.dg/goacc/routine-level-of-parallelism-1.f90: Likewise.
    	* gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.
    
    	libgomp/
    	* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Add -w.
    	* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c: New.
    	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update.
    	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/mode-transitions.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/private-variables.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
    	* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Likewise.
    	* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
    	* testsuite/libgomp.oacc-fortran/pr84028.f90: Likewise.
    	* testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
    	* testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: New.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 167eef5..21ec028 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2167,8 +2167,12 @@ have support for @option{-pthread}.
 @cindex OpenACC accelerator programming
 Specify default compute dimensions for parallel offload regions that do
 not explicitly specify.  The @var{geom} value is a triple of
-':'-separated sizes, in order 'gang', 'worker' and, 'vector'.  A size
-can be omitted, to use a target-specific default value.
+':'-separated sizes, in order 'gang', 'worker' and, 'vector'.  If a size
+is to be deferred until execution '-' can be used, alternatively a size
+can be omitted to use a target-specific default value.  When deferring
+to runtime, the environment variable @var{GOMP_OPENACC_DIM} can be set.
+It has the same format as the option value, except that '-' is not
+permitted.
 
 @item -fopenmp
 @opindex fopenmp
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 0abf028..48d1a42 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -574,8 +574,9 @@ static int oacc_default_dims[GOMP_DIM_MAX];
 static int oacc_min_dims[GOMP_DIM_MAX];
 
 /* Parse the default dimension parameter.  This is a set of
-   :-separated optional compute dimensions.  Each specified dimension
-   is a positive integer.  When device type support is added, it is
+   :-separated optional compute dimensions.  Each dimension is either
+   a positive integer, or '-' for a dynamic value computed at
+   runtime.  When device type support is added, it is
    planned to be a comma separated list of such compute dimensions,
    with all but the first prefixed by the colon-terminated device
    type.  */
@@ -610,14 +611,20 @@ oacc_parse_default_dims (const char *dims)
 
 	  if (*pos != ':')
 	    {
-	      long val;
-	      const char *eptr;
+	      long val = 0;
 
-	      errno = 0;
-	      val = strtol (pos, CONST_CAST (char **, &eptr), 10);
-	      if (errno || val <= 0 || (int) val != val)
-		goto malformed;
-	      pos = eptr;
+	      if (*pos == '-')
+		pos++;
+	      else
+		{
+		  const char *eptr;
+
+		  errno = 0;
+		  val = strtol (pos, CONST_CAST (char **, &eptr), 10);
+		  if (errno || val <= 0 || (int) val != val)
+		    goto malformed;
+		  pos = eptr;
+		}
 	      oacc_default_dims[ix] = (int) val;
 	    }
 	}
@@ -659,6 +666,34 @@ oacc_validate_dims (tree fn, tree attrs, int *dims, int level, unsigned used)
       pos = TREE_CHAIN (pos);
     }
 
+  bool check = true;
+#ifdef ACCEL_COMPILER
+  check = false;
+#endif
+  if (check
+      && !lookup_attribute ("oacc kernels", DECL_ATTRIBUTES (fn)))
+    {
+      static char const *const axes[] =
+      /* Must be kept in sync with GOMP_DIM enumeration.  */
+	{"gang", "worker", "vector" };
+      for (ix = level >= 0 ? level : 0; ix != GOMP_DIM_MAX; ix++)
+	if (dims[ix] < 0)
+	  ; /* Defaulting axis.  */
+	else if ((used & GOMP_DIM_MASK (ix)) && dims[ix] == 1)
+	  /* There is partitioned execution, but the user requested a
+	     dimension size of 1.  They're probably confused.  */
+	  warning_at (DECL_SOURCE_LOCATION (fn), 0,
+		      "region contains %s partitoned code but"
+		      " is not %s partitioned", axes[ix], axes[ix]);
+	else if (!(used & GOMP_DIM_MASK (ix)) && dims[ix] != 1)
+	  /* The dimension is explicitly partitioned to non-unity, but
+	     no use is made within the region.  */
+	  warning_at (DECL_SOURCE_LOCATION (fn), 0,
+		      "region is %s partitioned but"
+		      " does not contain %s partitioned code",
+		      axes[ix], axes[ix]);
+    }
+
   bool changed = targetm.goacc.validate_dims (fn, dims, level);
 
   /* Default anything left to 1 or a partitioned default.  */
diff --git a/gcc/testsuite/c-c++-common/goacc/acc-icf.c b/gcc/testsuite/c-c++-common/goacc/acc-icf.c
index ecfe3f2..fb2c791 100644
--- a/gcc/testsuite/c-c++-common/goacc/acc-icf.c
+++ b/gcc/testsuite/c-c++-common/goacc/acc-icf.c
@@ -4,7 +4,7 @@
 
 #pragma acc routine gang
 int
-routine1 (int n)
+routine1 (int n) /* { dg-bogus "region is worker partitioned but does not contain worker partitioned code" "" { xfail *-*-* } } */
 {
   int i;
 
@@ -17,7 +17,7 @@ routine1 (int n)
 
 #pragma acc routine gang
 int
-routine2 (int n)
+routine2 (int n) /* { dg-bogus "region is worker partitioned but does not contain worker partitioned code" "" { xfail *-*-* } } */
 {
   int i;
 
diff --git a/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c b/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c
index 57f682f..6cdbebe 100644
--- a/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c
@@ -3,9 +3,11 @@
 
 void f(int i)
 {
-#pragma acc kernels num_gangs(i) num_workers(i) vector_length(i)
+#pragma acc kernels \
+  num_gangs(i) num_workers(i) vector_length(i)
   ;
 
-#pragma acc parallel num_gangs(i) num_workers(i) vector_length(i)
+#pragma acc parallel /* { dg-bogus "region is (gang|worker|vector) partitioned" "" { xfail *-*-* } } */ \
+  num_gangs(i) num_workers(i) vector_length(i)
   ;
 }
diff --git a/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c b/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c
index d7cc947..9a142c4 100644
--- a/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c
+++ b/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c
@@ -6,7 +6,7 @@ main ()
 
 #pragma acc data copy (dummy)
   {
-#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum)
+#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum) /* { dg-warning "gang partitioned" } */
     {
       int v = 5;
       sum += 10 + v;
diff --git a/gcc/testsuite/c-c++-common/goacc/pr70688.c b/gcc/testsuite/c-c++-common/goacc/pr70688.c
index 5a23665..3f5584a 100644
--- a/gcc/testsuite/c-c++-common/goacc/pr70688.c
+++ b/gcc/testsuite/c-c++-common/goacc/pr70688.c
@@ -21,7 +21,7 @@ parallel_reduction ()
 
 #pragma acc data copy (dummy)
   {
-#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum)
+#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum) /* { dg-warning "region is gang partitioned" } */
     {
       int v = 5;
       sum += 10 + v;
@@ -36,11 +36,11 @@ main ()
 {
   int i, s = 0;
 
-#pragma acc parallel num_gangs (10) copy (s) reduction (+:s)
+#pragma acc parallel num_gangs (10) copy (s) reduction (+:s) /* { dg-warning "region is gang partitioned" } */
   for (i = 0; i < n; i++)
     s += i+1;
 
-#pragma acc parallel num_gangs (10) reduction (+:s) copy (s)
+#pragma acc parallel num_gangs (10) reduction (+:s) copy (s) /* { dg-warning "region is gang partitioned" } */
   for (i = 0; i < n; i++)
     s += i+1;
 
diff --git a/gcc/testsuite/c-c++-common/goacc/routine-1.c b/gcc/testsuite/c-c++-common/goacc/routine-1.c
index a756922..b90e2c1 100644
--- a/gcc/testsuite/c-c++-common/goacc/routine-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/routine-1.c
@@ -1,16 +1,16 @@
 
 #pragma acc routine gang
-void gang (void)
+void gang (void) /* { dg-warning "partitioned" 3 } */
 {
 }
 
 #pragma acc routine worker
-void worker (void)
+void worker (void) /* { dg-warning "partitioned" 2 } */
 {
 }
 
 #pragma acc routine vector
-void vector (void)
+void vector (void) /* { dg-warning "partitioned" 1 } */
 {
 }
 
diff --git a/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c b/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c
index 9f11196..72aacd7 100644
--- a/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c
+++ b/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c
@@ -4,14 +4,17 @@ void acc_parallel()
 {
   int i, j, k;
 
-  #pragma acc parallel num_gangs(i) /* { dg-warning "is used uninitialized in this function" } */
-  ;
+  #pragma acc parallel loop gang num_gangs(i) /* { dg-warning "is used uninitialized in this function" } */
+  for (i = 0; i < 1; i++)
+    ;
 
-  #pragma acc parallel num_workers(j) /* { dg-warning "is used uninitialized in this function" } */
-  ;
+  #pragma acc parallel loop worker num_workers(j) /* { dg-warning "is used uninitialized in this function" } */
+  for (j = 0; j < 1; j++)
+    ;
 
-  #pragma acc parallel vector_length(k) /* { dg-warning "is used uninitialized in this function" } */
-  ;
+  #pragma acc parallel loop vector vector_length(k) /* { dg-warning "is used uninitialized in this function" } */
+  for (k = 0; k < 1; k++)
+    ;
 }
 
 void acc_kernels()
diff --git a/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 b/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95
index 2697bb7..aaa1bfd 100644
--- a/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95
@@ -11,6 +11,9 @@ program test
   !$acc reduction(max:q), copy(i), copyin(j), copyout(k), create(m) &
   !$acc present(o), pcopy(p), pcopyin(r), pcopyout(s), pcreate(t) &
   !$acc deviceptr(u), private(v), firstprivate(w)
+  ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" "" { target *-*-* } .-1 }
+  ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-2 }
+  ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-3 }
   !$acc end parallel
 
 end program test
diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-4.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-4.f90
index 6714c7b..3fb60e7 100644
--- a/gcc/testsuite/gfortran.dg/goacc/routine-4.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/routine-4.f90
@@ -123,6 +123,7 @@ contains
     integer, intent (inout) :: a(N)
     integer :: i
 
+    !$acc loop gang worker vector
     do i = 1, N
        a(i) = a(i) - a(i)
     end do
@@ -133,6 +134,7 @@ contains
     integer, intent (inout) :: a(N)
     integer :: i
 
+    !$acc loop worker vector
     do i = 1, N
        a(i) = a(i) - a(i)
     end do
@@ -143,6 +145,7 @@ contains
     integer, intent (inout) :: a(N)
     integer :: i
 
+    !$acc loop vector
     do i = 1, N
        a(i) = a(i) - a(i)
     end do
@@ -153,6 +156,7 @@ contains
     integer, intent (inout) :: a(N)
     integer :: i
 
+    !$acc loop seq
     do i = 1, N
        a(i) = a(i) - a(i)
     end do
diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90
index 75dd1b0..1b41a68 100644
--- a/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90
@@ -2,8 +2,10 @@
 ! parallelism with the OpenACC routine directive.  The Fortran counterpart is
 ! c-c++-common/goacc/routine-level-of-parallelism-2.c
 
-subroutine g_1
+subroutine g_1 ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" }
   !$acc routine gang
+! { dg-bogus "region is worker partitioned but does not contain worker partitioned code" "worker partitioned" { xfail *-*-* } .-2 }
+! { dg-bogus "region is vector partitioned but does not contain vector partitioned code" "worker partitioned" { xfail *-*-* } .-3 }
 end subroutine g_1
 
 subroutine s_1_2a
diff --git a/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95 b/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95
index 5dea42b..8551140 100644
--- a/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95
@@ -4,14 +4,20 @@ subroutine acc_parallel
   implicit none
   integer :: i, j, k
 
-  !$acc parallel num_gangs(i) ! { dg-warning "is used uninitialized in this function" }
-  !$acc end parallel
-
-  !$acc parallel num_workers(j) ! { dg-warning "is used uninitialized in this function" }
-  !$acc end parallel
-
-  !$acc parallel vector_length(k) ! { dg-warning "is used uninitialized in this function" }
-  !$acc end parallel
+  !$acc parallel loop gang num_gangs(i) ! { dg-warning "is used uninitialized in this function" }
+  do i = 0, 1
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel loop worker num_workers(j) ! { dg-warning "is used uninitialized in this function" }
+  do j = 0, 1
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel loop vector vector_length(k) ! { dg-warning "is used uninitialized in this function" }
+  do k = 0, 1
+  end do
+  !$acc end parallel loop
 end subroutine acc_parallel
 
 subroutine acc_kernels
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
index 689a443..14bc3af 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
@@ -117,6 +117,8 @@ void t4 ()
     arr[i] = 3;
 
 #pragma acc parallel firstprivate(x) copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 119 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 119 } */
   {
 #pragma acc loop gang
     for (i = 0; i < 32; i++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c
index 34bc57e..8e2c1c9 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c
@@ -147,7 +147,7 @@ int gang_1 (int *ary, int size)
 {
   clear (ary, size);
   
-#pragma acc parallel num_gangs (32) num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)
+#pragma acc parallel num_gangs (32) num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } } */
   {
 #pragma acc loop auto
     for (int jx = 0; jx <  size  / 64; jx++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c
new file mode 100644
index 0000000..6c479e4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c
@@ -0,0 +1,13 @@
+/* { dg-additional-options "-fopenacc-dim=16:16" } */
+/* This code uses nvptx inline assembly guarded with acc_on_device, which is
+   not optimized away at -O0, and then confuses the target assembler.
+   { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
+/* { dg-set-target-env-var "GOMP_OPENACC_DIM" "8:8" } */
+
+#include "loop-default.h"
+
+int main ()
+{
+  /* Environment should be ignored.  */
+  return test_1 (16, 16, 32);
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
index 98f02e9..5831327 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "-w" } */
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
index 4152a4e..82e8aae 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "-w" } */
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
index 7107502..2f3a44f 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "-w" } */
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
index 6bbd04f..a1bb845 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "-w" } */
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c
index c63a5d4..ae43bb4 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "-w" } */
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
index fa6fb91..10b80f1 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
@@ -14,6 +14,7 @@ int main ()
     ary[ix] = -1;
   
 #pragma acc parallel num_workers(32) vector_length(32) copy(ary) copy(ondev)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 16 } */
   {
 #pragma acc loop worker
     for (unsigned ix = 0; ix < N; ix++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c
new file mode 100644
index 0000000..20a022f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c
@@ -0,0 +1,37 @@
+
+/* Check warnings about suboptimal partitioning choices.  */
+
+int main ()
+{
+  int ary[10];
+
+#pragma acc parallel copy(ary) num_gangs (1) /* { dg-warning "is not gang partitioned" } */
+  {
+    #pragma acc loop gang
+    for (int  i = 0; i < 10; i++)
+      ary[i] = i;
+  }
+
+#pragma acc parallel copy(ary) num_workers (1) /* { dg-warning "is not worker partitioned" } */
+  {
+    #pragma acc loop worker
+    for (int  i = 0; i < 10; i++)
+      ary[i] = i;
+  }
+
+#pragma acc parallel copy(ary) num_gangs (8) /* { dg-warning "is gang partitioned" } */
+  {
+    #pragma acc loop worker
+    for (int  i = 0; i < 10; i++)
+      ary[i] = i;
+  }
+
+#pragma acc parallel copy(ary) num_workers (8) /* { dg-warning "is worker partitioned" } */
+  {
+    #pragma acc loop gang
+    for (int  i = 0; i < 10; i++)
+      ary[i] = i;
+  }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c
index 4474c12..f62daf0 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c
@@ -287,6 +287,7 @@ void t7()
   int n = 0;
   #pragma acc parallel copy(n) \
 		       num_gangs(1) num_workers(1) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 288 } */
   {
     n++;
   }
@@ -310,6 +311,7 @@ void t8()
 
       #pragma acc parallel copy(arr) \
 			   num_gangs(gangs) num_workers(1) vector_length(32)
+      /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 312 } */
       {
 	int j;
 	#pragma acc loop gang
@@ -339,6 +341,7 @@ void t9()
 
       #pragma acc parallel copy(arr) \
 			   num_gangs(gangs) num_workers(1) vector_length(32)
+      /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 342 } */
       {
 	int j;
 	#pragma acc loop gang
@@ -371,6 +374,7 @@ void t10()
 
       #pragma acc parallel copy(arr) \
 			   num_gangs(gangs) num_workers(1) vector_length(32)
+      /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 375 } */
       {
 	int j;
 	#pragma acc loop gang
@@ -404,6 +408,7 @@ void t11()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(1024) num_workers(1) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 409 } */
   {
     int j;
 
@@ -442,6 +447,7 @@ void t12()
 
   #pragma acc parallel copyout(fizz, buzz, fizzbuzz) \
 		       num_gangs(NUM_GANGS) num_workers(1) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 448 } */
   {
     int j;
     
@@ -488,6 +494,7 @@ void t13()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(8) num_workers(8) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 495 } */
   {
     int j;
     #pragma acc loop gang
@@ -613,6 +620,7 @@ void t16()
 
   #pragma acc parallel copy(n, arr) \
 		       num_gangs(8) num_workers(16) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 621 } */
   {
     int j;
     #pragma acc loop gang
@@ -665,6 +673,7 @@ void t17()
 
 	#pragma acc parallel copyin(arr_a) copyout(arr_b) \
 			     num_gangs(num_gangs) num_workers(num_workers) vector_length(32)
+	/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 674 } */
 	{
 	  int j;
 	  #pragma acc loop gang
@@ -882,6 +891,8 @@ void t21()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(8) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 892 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 892 } */
   {
     int j;
     #pragma acc loop gang
@@ -905,6 +916,8 @@ void t22()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(8) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 917 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 917 } */
   {
     int j;
     #pragma acc loop gang
@@ -931,6 +944,8 @@ void t23()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(8) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 945 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 945 } */
   {
     int j;
     #pragma acc loop gang
@@ -957,6 +972,8 @@ void t24()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(8) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 973 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 973 } */
   {
     int j;
     #pragma acc loop gang
@@ -988,6 +1005,7 @@ void t25()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(8) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 1006 } */
   {
     int j;
     #pragma acc loop gang
@@ -1020,6 +1038,7 @@ void t26()
 
   #pragma acc parallel copy(arr) \
 		       num_gangs(8) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 1039 } */
   {
     int j;
     #pragma acc loop gang
@@ -1070,6 +1089,8 @@ void t27()
 
   #pragma acc parallel copy(n, arr) copyout(ondev) \
 	  num_gangs(ACTUAL_GANGS) num_workers(8) vector_length(32)
+  /* { dg-warning "region is gang partitioned but does not contain gang partitioned code" "gang" { target *-*-* } 1090 } */
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 1090 } */
   {
     int j;
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
index 7781b32..ebcb760 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
@@ -96,7 +96,7 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual) \
+#pragma acc parallel copy (gangs_actual) /* { dg-warning "region contains gang partitoned code but is not gang partitioned" } */ \
   num_gangs (GANGS) /* { dg-warning "'num_gangs' value must be positive" "" { target c++ } } */
     {
       /* We're actually executing with num_gangs (1).  */
@@ -125,7 +125,7 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (workers_actual) \
+#pragma acc parallel copy (workers_actual) /* { dg-warning "region contains worker partitoned code but is not worker partitioned" } */ \
   num_workers (WORKERS) /* { dg-warning "'num_workers' value must be positive" "" { target c++ } } */
     {
       /* We're actually executing with num_workers (1).  */
@@ -154,7 +154,8 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (vectors_actual) /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc parallel copy (vectors_actual) /* { dg-warning "region contains vector partitoned code but is not vector partitioned" } */ \
+  /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } 157 } */ \
   vector_length (VECTORS) /* { dg-warning "'vector_length' value must be positive" "" { target c++ } } */
     {
       /* We're actually executing with vector_length (1), just the GCC nvptx
@@ -198,7 +199,7 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual) \
+#pragma acc parallel copy (gangs_actual) /* { dg-warning "region is gang partitioned but does not contain gang partitioned code" } */ \
   reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) \
   num_gangs (gangs)
     {
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
index 53f03d1..f0c3447 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
@@ -22,6 +22,8 @@ void local_g_1()
     arr[i] = 3;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 24 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 24 } */
   {
     int x;
 
@@ -295,6 +297,8 @@ void loop_g_1()
     arr[i] = i;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 299 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 299 } */
   {
     #pragma acc loop gang private(x)
     for (i = 0; i < 32; i++)
@@ -320,6 +324,7 @@ void loop_g_2()
     arr[i] = i;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 326 } */
   {
     #pragma acc loop gang private(x)
     for (i = 0; i < 32; i++)
@@ -348,6 +353,7 @@ void loop_g_3()
     arr[i] = i;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 355 } */
   {
     #pragma acc loop gang private(x)
     for (i = 0; i < 32; i++)
@@ -376,6 +382,7 @@ void loop_g_4()
     arr[i] = i;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 384 } */
   {
     #pragma acc loop gang private(x)
     for (i = 0; i < 32; i++)
@@ -408,6 +415,7 @@ void loop_g_5()
     arr[i] = i;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 417 } */
   {
     #pragma acc loop gang private(x)
     for (i = 0; i < 32; i++)
@@ -438,6 +446,7 @@ void loop_g_6()
     arr[i] = i;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 448 } */
   {
     #pragma acc loop gang private(pt)
     for (i = 0; i < 32; i++)
@@ -559,6 +568,7 @@ void loop_w_1()
     arr[i] = i;
 
   #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 570 } */
   {
     int j;
 
@@ -875,6 +885,8 @@ void parallel_g_1()
     arr[i] = 3;
 
   #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 887 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 887 } */
   {
     #pragma acc loop gang(static:1)
     for (i = 0; i < 32; i++)
@@ -904,6 +916,7 @@ void parallel_g_2()
     arr[i] = i;
 
   #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(2) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 918 } */
   {
     #pragma acc loop gang
     for (i = 0; i < 32; i++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
index c4940b8..68ae919 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
@@ -14,6 +14,8 @@ void g_np_1()
     arr[i] = i;
 
   #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 16 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 16 } */
   {
     #pragma acc loop gang reduction(+:res)
     for (i = 0; i < 1024; i++)
@@ -28,6 +30,8 @@ void g_np_1()
   res = hres = 1;
 
   #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 32 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 32 } */
   {
     #pragma acc loop gang reduction(*:res)
     for (i = 0; i < 12; i++)
@@ -52,6 +56,7 @@ void gv_np_1()
     arr[i] = i;
 
   #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 58 } */
   {
     #pragma acc loop gang vector reduction(+:res)
     for (i = 0; i < 1024; i++)
@@ -76,6 +81,7 @@ void gw_np_1()
     arr[i] = i;
 
   #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 83 } */
   {
     #pragma acc loop gang worker reduction(+:res)
     for (i = 0; i < 1024; i++)
@@ -239,6 +245,7 @@ void v_p_1()
 
   #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 		       private(res) copyout(out)
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 246 } */
   {
     #pragma acc loop gang
     for (j = 0; j < 32; j++)
@@ -315,6 +322,7 @@ void w_p_1()
 
   #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 		       private(res) copyout(out)
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 323 } */
   {
     #pragma acc loop gang
     for (j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
index a164f57..8c3b938 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
@@ -6,6 +6,8 @@
 
 #pragma acc routine gang
 void __attribute__ ((noinline)) gang (int ary[N])
+/* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 8 } */
+/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 8 } */
 {
 #pragma acc loop gang
     for (unsigned ix = 0; ix < N; ix++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
index 81f1e03..e14947c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
@@ -6,6 +6,7 @@
 
 #pragma acc routine worker
 void __attribute__ ((noinline)) worker (int ary[N])
+/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 8 } */
 {
 #pragma acc loop worker
   for (unsigned ix = 0; ix < N; ix++)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f
index aa1bb63..ff31116 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f
+++ b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f
@@ -14,7 +14,7 @@
       RES2 = 0
 
 !$ACC PARALLEL NUM_GANGS(256) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
       res1 = res1 + 5
 
 !$ACC ATOMIC
@@ -36,7 +36,7 @@
       RES2 = 1
 
 !$ACC PARALLEL NUM_GANGS(8) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
       res1 = res1 * 5
 
 !$ACC ATOMIC
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f
index 5694de1..47c5ff3 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f
+++ b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f
@@ -14,7 +14,7 @@
       RES2 = 0
 
 !$ACC PARALLEL NUM_GANGS(256) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
       res1 = res1 + 5
 
 !$ACC ATOMIC
@@ -36,7 +36,7 @@
       RES2 = 1
 
 !$ACC PARALLEL NUM_GANGS(8) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
       res1 = res1 * 5
 
 !$ACC ATOMIC
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90
index 2b36122..8cb76a9 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90
@@ -5,7 +5,7 @@ program foo
 
   a = 1
 
-  !$acc parallel num_gangs(1) num_workers(2)
+  !$acc parallel num_gangs(1) num_workers(2) ! { dg-warning "region is worker partitioned" }
 
   if (any(a(1:3,1:3,1:3).ne.1)) STOP 1
 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
index 472a6a1..fbff5cc 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
@@ -13,6 +13,8 @@ subroutine t1()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 15 }
+  ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 15 }
   !$acc loop gang private(x)
   do i = 1, 32
      x = i * 2;
@@ -37,6 +39,7 @@ subroutine t2()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 41 }
   !$acc loop gang private(x)
   do i = 0, 31
      x = i * 2;
@@ -65,6 +68,7 @@ subroutine t3()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 70 }
   !$acc loop gang private(x)
   do i = 0, 31
      x = i * 2;
@@ -98,6 +102,7 @@ subroutine t4()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 104 }
   !$acc loop gang private(pt)
   do i = 0, 31
      pt%x = i
@@ -208,6 +213,7 @@ subroutine t7()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+  ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 215 }
   !$acc loop gang private(x)
   do i = 0, 31
      !$acc loop worker private(x)
@@ -507,6 +513,8 @@ subroutine t14()
   end do
 
   !$acc parallel private(x) copy(arr) num_gangs(n) num_workers(8) vector_length(32)
+  ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 515 }
+  ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 515 }
     !$acc loop gang(static:1)
     do i = 1, n
       x = i * 2;
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
index f58a95f..a83e92a 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
@@ -1,4 +1,3 @@
-
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
 
@@ -100,7 +99,7 @@ subroutine gang (a)
   integer, intent (inout) :: a(N)
   integer :: i
 
-  !$acc loop gang
+  !$acc loop gang worker vector
   do i = 1, N
     a(i) = a(i) - i 
   end do

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]