This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [OpenACC 11/11] execution tests
- From: Cesar Philippidis <cesar_philippidis at mentor dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Nathan Sidwell <nathan at acm dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Bernd Schmidt <bschmidt at redhat dot com>, Jason Merrill <jason at redhat dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>
- Date: Fri, 23 Oct 2015 19:39:05 -0700
- Subject: Re: [OpenACC 11/11] execution tests
- Authentication-results: sourceware.org; auth=none
- References: <5627DD78 dot 9040302 at acm dot org> <5627ED2D dot 7000000 at acm dot org> <20151022093758 dot GM478 at tucnak dot redhat dot com> <5628EA6A dot 3080400 at acm dot org> <20151022140529 dot GS478 at tucnak dot redhat dot com> <5628F151 dot 2020805 at acm dot org> <5628F6E5 dot 3080806 at codesourcery dot com> <20151022150004 dot GU478 at tucnak dot redhat dot com> <562A98A4 dot 6030003 at codesourcery dot com>
On 10/23/2015 01:29 PM, Cesar Philippidis wrote:
> On 10/22/2015 08:00 AM, Jakub Jelinek wrote:
>> On Thu, Oct 22, 2015 at 07:47:01AM -0700, Cesar Philippidis wrote:
>>>> But it is unclear from the parsing which of these is allowed:
>>>
>>> int v, w;
>>> ...
>>> gang(26) // equivalent to gang(num:26)
>>> gang(v) // gang(num:v)
>>> vector(length: 16) // vector(length: 16)
>>> vector(length: v) // vector(length: v)
>>> vector(16) // vector(length: 16)
>>> vector(v) // vector(length: v)
>>> worker(num: 16) // worker(num: 16)
>>> worker(num: v) // worker(num: v)
>>> worker(16) // worker(num: 16)
>>> worker(v) // worker(num: v)
>>> gang(16, 24) // technically gang(num:16, num:24) is acceptable but it
>>> // should be an error
>>> gang(v, w) // likewise
>>> gang(static: 16, num: 5) // gang(static: 16, num: 5)
>>> gang(static: v, num: w) // gang(static: v, num: w)
>>> gang(num: 5, static: 4) // gang(num: 5, static: 4)
>>> gang(num: v, static: w) // gang(num: v, static: w)
>>>
>>> Also note that the static argument can accept '*'.
>>>
>>>> and if the length: or num: part is really optional, then
>>>> int length, num;
>>>> vector(length)
>>>> worker(num)
>>>> gang(num, static: 6)
>>>> gang(static: 5, num)
>>>> should also be accepted (or a subset thereof?).
>>>
>>> Interesting question. The spec is unclear. It defines gang, worker and
>>> vector as follows in section 2.7 in the OpenACC 2.0a spec:
>>>
>>> gang [( gang-arg-list )]
>>> worker [( [num:] int-expr )]
>>> vector [( [length:] int-expr )]
>>>
>>> where gang-arg is one of:
>>>
>>> [num:] int-expr
>>> static: size-expr
>>>
>>> and gang-arg-list may have at most one num and one static argument,
>>> and where size-expr is one of:
>>>
>>> *
>>> int-expr
>>>
>>> So I've interpreted that as a requirement that length and num must be
>>> followed by an int-expr, whatever that is.
>>
>> My reading of the above is that
>> vector(length)
>> is equivalent to
>> vector(length: length)
>> and
>> worker(num)
>> is equivalent to
>> worker(num: num)
>> etc. Basically, neither length nor num is a reserved identifier,
>> so you can use them for variable names, and if
>> vector(v) is equivalent to vector(length: v), then
>> vector(length) should be equivalent to vector(length:length)
>> or
>> vector(length + 1) should be equivalent to vector(length: length+1)
>> static is a keyword that can't start an integral expression, so I guess
>> it is fine if you issue an "expected ':'" diagnostic after it.
>>
>> In any case, please add a testcase (both C and C++) which covers all these
>> allowed variants (ideally one testcase) and rejected variants (another
>> testcase with dg-error).
>>
>> This is still an easy case, as even the C FE has 2 tokens of lookahead.
>> E.g. for OpenMP map clause where
>> map (always, tofrom: x)
>> means one thing and
>> map (always, tofrom, y)
>> another one (map (tofrom: always, tofrom, y))
>> I had to do quite ugly things to get around this.
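To make the lookahead point concrete, here is a rough sketch of the two-token decision described above. It is not the GCC parser: the token kinds, struct tok and classify_arg are hypothetical names invented for this illustration, and the keyword static is modelled as a plain name for simplicity. Whether static: is permitted for a given clause is left to the caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token model; GCC's real c_token is different.  */
enum tok_kind { TOK_NAME, TOK_COLON, TOK_OTHER };

struct tok { enum tok_kind kind; const char *text; };

enum arg_form { ARG_KEYWORD, ARG_STATIC, ARG_EXPR, ARG_ERROR };

/* Classify a clause argument from its first two tokens.  KEYWORD is the
   optional contextual keyword for the clause ("num" or "length").  */
static enum arg_form
classify_arg (const char *keyword, struct tok t0, struct tok t1)
{
  assert (keyword != NULL);

  /* "static" cannot start an integral expression, so it must be
     followed by ':'; anything else is diagnosed.  */
  if (t0.kind == TOK_NAME && strcmp (t0.text, "static") == 0)
    return t1.kind == TOK_COLON ? ARG_STATIC : ARG_ERROR;

  /* "num" and "length" are ordinary identifiers unless the next
     token is ':'.  */
  if (t0.kind == TOK_NAME
      && strcmp (t0.text, keyword) == 0
      && t1.kind == TOK_COLON)
    return ARG_KEYWORD;

  /* Everything else, including a variable called "num" or "length",
     is parsed as an int-expr.  */
  return ARG_EXPR;
}
```

Under this sketch, `num` followed by `:` takes the keyword form, `length` followed by `)` or `+` is treated as an expression naming a variable, and `static` without a colon is rejected.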
>
> Here are the updated test cases. Besides adding a new test to
> exercise the loop shape parsing, I also removed the assembly file that
> Ilya noticed in the original patch.
>
> Is this OK for trunk?
This patch is mostly the same as the one I posted earlier, except that it
excludes the loop-shape parser test. That test was included with the C parser
changes.
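For anyone reading the tests without a device at hand, the host-side expectations can be reproduced in plain C. This is only a sketch, assuming the same N and the 32 gangs/workers/vector lanes the tests request; the helper names pack_ids, gang_chunked, gang_static1 and expected_gwv are made up for illustration and do not appear in the patch.

```c
#include <assert.h>

#define N (32 * 32 * 32 + 17)

/* Pack gang, worker and vector ids the way the tests store them.  */
static int
pack_ids (int g, int w, int v)
{
  return (g << 16) | (w << 8) | v;
}

/* Gang expected by loop-g-1.c: contiguous chunks of ceil (N / gangs).  */
static int
gang_chunked (int ix, int gangs)
{
  assert (gangs > 0);
  int chunk = (N + gangs - 1) / gangs;
  return ix / chunk;
}

/* Gang expected by loop-g-2.c: gang (static:1) deals iterations out
   round-robin.  */
static int
gang_static1 (int ix, int gangs)
{
  assert (gangs > 0);
  return ix % gangs;
}

/* Packed value expected by loop-gwv-1.c for iteration IX.  */
static int
expected_gwv (int ix)
{
  int chunk_size = (N + 32 * 32 * 32 - 1) / (32 * 32 * 32);
  int g = ix / (chunk_size * 32 * 32);
  int w = ix / 32 % 32;
  int v = ix % 32;
  return pack_ids (g, w, v);
}
```

With N = 32785 and 32 gangs, the chunked form uses chunks of 1025 iterations, so iteration 1025 is the first one expected on gang 1, while the static:1 form expects iteration 33 on gang 1.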
Is this OK for trunk?
Cesar
2015-10-23 Nathan Sidwell <nathan@codesourcery.com>
* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: New.
* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: New.
* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: New.
* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: New.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: New.
* testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: New.
diff --git a/libgomp/testsuite/libgomp.c++/member-2.C b/libgomp/testsuite/libgomp.c++/member-2.C
index bb348d8..bbe2bdf4 100644
--- a/libgomp/testsuite/libgomp.c++/member-2.C
+++ b/libgomp/testsuite/libgomp.c++/member-2.C
@@ -154,7 +154,7 @@ A<Q>::m1 ()
{
f = false;
#pragma omp single
- #pragma omp taskloop lastprivate (a, T<Q>::t, b, n)
+ #pragma omp taskloop lastprivate (a, T<Q>::t, b, n) private (R::r)
for (int i = 0; i < 30; i++)
{
int q = omp_get_thread_num ();
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
new file mode 100644
index 0000000..58545d0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O2" } */
+
+#include <stdio.h>
+
+#define N (32*32*32+17)
+int main ()
+{
+ int ary[N];
+ int ix;
+ int exit = 0;
+ int ondev = 0;
+
+ for (ix = 0; ix < N; ix++)
+ ary[ix] = -1;
+
+#pragma acc parallel num_gangs(32) vector_length(32) copy(ary) copy(ondev)
+ {
+#pragma acc loop gang
+ for (unsigned ix = 0; ix < N; ix++)
+ {
+ if (__builtin_acc_on_device (5))
+ {
+ int g = 0, w = 0, v = 0;
+
+ __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
+ __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
+ __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
+ ary[ix] = (g << 16) | (w << 8) | v;
+ ondev = 1;
+ }
+ else
+ ary[ix] = ix;
+ }
+ }
+
+ for (ix = 0; ix < N; ix++)
+ {
+ int expected = ix;
+ if (ondev)
+ {
+ int g = ix / ((N + 31) / 32);
+ int w = 0;
+ int v = 0;
+
+ expected = (g << 16) | (w << 8) | v;
+ }
+
+ if (ary[ix] != expected)
+ {
+ exit = 1;
+ printf ("ary[%d]=%x expected %x\n", ix, ary[ix], expected);
+ }
+ }
+
+ return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
new file mode 100644
index 0000000..c01c6fa
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O2" } */
+
+#include <stdio.h>
+
+#define N (32*32*32+17)
+int main ()
+{
+ int ary[N];
+ int ix;
+ int exit = 0;
+ int ondev = 0;
+
+ for (ix = 0; ix < N; ix++)
+ ary[ix] = -1;
+
+#pragma acc parallel num_gangs(32) vector_length(32) copy(ary) copy(ondev)
+ {
+#pragma acc loop gang (static:1)
+ for (unsigned ix = 0; ix < N; ix++)
+ {
+ if (__builtin_acc_on_device (5))
+ {
+ int g = 0, w = 0, v = 0;
+
+ __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
+ __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
+ __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
+ ary[ix] = (g << 16) | (w << 8) | v;
+ ondev = 1;
+ }
+ else
+ ary[ix] = ix;
+ }
+ }
+
+ for (ix = 0; ix < N; ix++)
+ {
+ int expected = ix;
+ if (ondev)
+ {
+ int g = ix % 32;
+ int w = 0;
+ int v = 0;
+
+ expected = (g << 16) | (w << 8) | v;
+ }
+
+ if (ary[ix] != expected)
+ {
+ exit = 1;
+ printf ("ary[%d]=%x expected %x\n", ix, ary[ix], expected);
+ }
+ }
+
+ return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c
new file mode 100644
index 0000000..f23e2f3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c
@@ -0,0 +1,59 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O2" } */
+
+#include <stdio.h>
+
+#define N (32*32*32+17)
+int main ()
+{
+ int ary[N];
+ int ix;
+ int exit = 0;
+ int ondev = 0;
+
+ for (ix = 0; ix < N; ix++)
+ ary[ix] = -1;
+
+#pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) copy(ary) copy(ondev)
+ {
+#pragma acc loop gang worker vector
+ for (unsigned ix = 0; ix < N; ix++)
+ {
+ if (__builtin_acc_on_device (5))
+ {
+ int g = 0, w = 0, v = 0;
+
+ __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
+ __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
+ __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
+ ary[ix] = (g << 16) | (w << 8) | v;
+ ondev = 1;
+ }
+ else
+ ary[ix] = ix;
+ }
+ }
+
+ for (ix = 0; ix < N; ix++)
+ {
+ int expected = ix;
+ if (ondev)
+ {
+ int chunk_size = (N + 32*32*32 - 1) / (32*32*32);
+
+ int g = ix / (chunk_size * 32 * 32);
+ int w = ix / 32 % 32;
+ int v = ix % 32;
+
+ expected = (g << 16) | (w << 8) | v;
+ }
+
+ if (ary[ix] != expected)
+ {
+ exit = 1;
+ printf ("ary[%d]=%x expected %x\n", ix, ary[ix], expected);
+ }
+ }
+
+ return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-v-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-v-1.c
new file mode 100644
index 0000000..70c6292
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-v-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O2" } */
+
+#include <stdio.h>
+
+#define N (32*32*32+17)
+int main ()
+{
+ int ary[N];
+ int ix;
+ int exit = 0;
+ int ondev = 0;
+
+ for (ix = 0; ix < N; ix++)
+ ary[ix] = -1;
+
+#pragma acc parallel vector_length(32) copy(ary) copy(ondev)
+ {
+#pragma acc loop vector
+ for (unsigned ix = 0; ix < N; ix++)
+ {
+ if (__builtin_acc_on_device (5))
+ {
+ int g = 0, w = 0, v = 0;
+
+ __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
+ __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
+ __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
+ ary[ix] = (g << 16) | (w << 8) | v;
+ ondev = 1;
+ }
+ else
+ ary[ix] = ix;
+ }
+ }
+
+ for (ix = 0; ix < N; ix++)
+ {
+ int expected = ix;
+ if (ondev)
+ {
+ int g = 0;
+ int w = 0;
+ int v = ix % 32;
+
+ expected = (g << 16) | (w << 8) | v;
+ }
+
+ if (ary[ix] != expected)
+ {
+ exit = 1;
+ printf ("ary[%d]=%x expected %x\n", ix, ary[ix], expected);
+ }
+ }
+
+ return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
new file mode 100644
index 0000000..5473c2d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O2" } */
+
+#include <stdio.h>
+
+#define N (32*32*32+17)
+int main ()
+{
+ int ary[N];
+ int ix;
+ int exit = 0;
+ int ondev = 0;
+
+ for (ix = 0; ix < N; ix++)
+ ary[ix] = -1;
+
+#pragma acc parallel num_workers(32) vector_length(32) copy(ary) copy(ondev)
+ {
+#pragma acc loop worker
+ for (unsigned ix = 0; ix < N; ix++)
+ {
+ if (__builtin_acc_on_device (5))
+ {
+ int g = 0, w = 0, v = 0;
+
+ __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
+ __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
+ __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
+ ary[ix] = (g << 16) | (w << 8) | v;
+ ondev = 1;
+ }
+ else
+ ary[ix] = ix;
+ }
+ }
+
+ for (ix = 0; ix < N; ix++)
+ {
+ int expected = ix;
+ if (ondev)
+ {
+ int g = 0;
+ int w = ix % 32;
+ int v = 0;
+
+ expected = (g << 16) | (w << 8) | v;
+ }
+
+ if (ary[ix] != expected)
+ {
+ exit = 1;
+ printf ("ary[%d]=%x expected %x\n", ix, ary[ix], expected);
+ }
+ }
+
+ return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c
new file mode 100644
index 0000000..85e4476
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O2" } */
+
+#include <stdio.h>
+
+#define N (32*32*32+17)
+int main ()
+{
+ int ary[N];
+ int ix;
+ int exit = 0;
+ int ondev = 0;
+
+ for (ix = 0; ix < N; ix++)
+ ary[ix] = -1;
+
+#pragma acc parallel num_workers(32) vector_length(32) copy(ary) copy(ondev)
+ {
+#pragma acc loop worker vector
+ for (unsigned ix = 0; ix < N; ix++)
+ {
+ if (__builtin_acc_on_device (5))
+ {
+ int g = 0, w = 0, v = 0;
+
+ __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
+ __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
+ __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
+ ary[ix] = (g << 16) | (w << 8) | v;
+ ondev = 1;
+ }
+ else
+ ary[ix] = ix;
+ }
+ }
+
+ for (ix = 0; ix < N; ix++)
+ {
+ int expected = ix;
+ if (ondev)
+ {
+ int g = 0;
+ int w = (ix / 32) % 32;
+ int v = ix % 32;
+
+ expected = (g << 16) | (w << 8) | v;
+ }
+
+ if (ary[ix] != expected)
+ {
+ exit = 1;
+ printf ("ary[%d]=%x expected %x\n", ix, ary[ix], expected);
+ }
+ }
+
+ return exit;
+}