This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [gomp4.1] Initial support for some OpenMP 4.1 construct parsing
- From: Ilya Verbin <iverbin at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Kirill Yukhin <kirill dot yukhin at gmail dot com>
- Date: Thu, 25 Jun 2015 22:45:29 +0300
- Subject: Re: [gomp4.1] Initial support for some OpenMP 4.1 construct parsing
- References: <20150429111406 dot GE1751 at tucnak dot redhat dot com> <874mnzrw1z dot fsf at schwinge dot name> <20150429120644 dot GG1751 at tucnak dot redhat dot com> <20150609183608 dot GA47936 at msticlxl57 dot ims dot intel dot com> <20150609202426 dot GG10247 at tucnak dot redhat dot com>
On Tue, Jun 09, 2015 at 22:24:26 +0200, Jakub Jelinek wrote:
> On Tue, Jun 09, 2015 at 09:36:08PM +0300, Ilya Verbin wrote:
> > I don't quite understand from "If a depend clause is present, then it is treated
> > as if it had appeared on the implicit task construct that encloses the target
> > construct", is
> >
> > #pragma omp target depend(inout: x)
> >
> > equivalent to
> >
> > #pragma omp task depend(inout: x)
> > #pragma omp target
> >
> > or not?
> >
> > In other words, can't we just generate GOMP_task (...) with GOMP_target (...)
> > inside, without any new arguments?
>
> No, that would be an explicit task. Furthermore, the implicit task isn't
> on the host side, but on the offloading device side. The implicit task is
> what is executed when you enter the #pragma omp target. Ignoring the teams
> construct which is there mainly for NVidia GPGPUs, when you enter the
> #pragma omp target construct, there is an implicit parallel with
> num_threads(1) (like there is an implicit parallel with num_threads(1)
> when you enter main () of a host program), and that implicit parallel has
> a single implicit task, which executes the statements inside of #pragma
> omp target body, until you encounter #pragma omp teams or #pragma omp
> parallel. And the above statement simply says that no statements from
> the #pragma omp target body are executed until the depend dependency is
> satisfied. Whether these dependencies are host addresses, or offloading
> device addresses, is something that really needs to be figured out, I admit
> I haven't read the whole async offloading text carefully yet, nor
> participated in the telecons about it.
So, as I understand it, three tasks will be generated almost simultaneously in
foo1: one on the host and two on the target.
Target task 1 will be executed immediately.
The host task will wait for task 1 to complete on the target.
(Or is it not possible to mix "omp target" and "omp task" dependencies?)
And task 2 will wait on the target for task 1.
void foo1 ()
{
  int x;
  #pragma omp parallel
  #pragma omp single
  {
    #pragma omp target nowait depend(out: x)
      fprintf (stderr, "target task 1\n");
    #pragma omp task depend(in: x)
      fprintf (stderr, "host task\n");
    #pragma omp target depend(in: x)
      fprintf (stderr, "target task 2\n");
  }
}
I just can't understand why we need target tasks at all, i.e. why can't a host
task with a target region inside simply wait for another host task, as in foo2?
void foo2 ()
{
  int x;
  #pragma omp parallel
  #pragma omp single
  {
    #pragma omp task depend(out: x)
    #pragma omp target
      fprintf (stderr, "host task with tgt 1\n");
    #pragma omp task depend(in: x)
      fprintf (stderr, "host task\n");
    #pragma omp task depend(in: x)
    #pragma omp target
      fprintf (stderr, "host task with tgt 2\n");
  }
}
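To make the foo2 ordering observable, here is a minimal sketch of the same
pattern that passes a value through x instead of printing. The function name
run_foo2_pattern, the map clauses and the value 42 are my own additions for
illustration, not part of the patch. I assume that a build without offloading
support falls back to running the target regions on the host, and that a build
without -fopenmp ignores the pragmas and runs everything sequentially.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: the foo2 pattern, with a value flowing through x.
   The producer task writes x inside a target region; two consumer tasks
   read it, one on the host and one inside a target region.  */
void run_foo2_pattern (int *seen_by_host, int *seen_by_target)
{
  int x = 0;
  int a = -1, b = -1;

  #pragma omp parallel
  #pragma omp single
  {
    /* Host task wrapping a target region: produces x.  */
    #pragma omp task depend(out: x) shared(x)
    {
      #pragma omp target map(tofrom: x)
        x = 42;
    }

    /* Plain host task: must not start before the producer finishes.  */
    #pragma omp task depend(in: x) shared(x, a)
      a = x;

    /* Host task wrapping a second target region: also waits for x.  */
    #pragma omp task depend(in: x) shared(x, b)
    {
      int tmp = 0;
      #pragma omp target map(to: x) map(from: tmp)
        tmp = x;
      b = tmp;
    }

    /* Wait for all three child tasks before leaving the single region.  */
    #pragma omp taskwait
  }

  /* Both consumers should have observed the producer's value.  */
  assert (a == 42 && b == 42);
  *seen_by_host = a;
  *seen_by_target = b;
}
```

Here the host thread that runs each task blocks until its target region
finishes, which is the behaviour the question above is about.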
Thanks,
-- Ilya