This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [gomp4] Redesign oacc_parallel launch API
- From: Cesar Philippidis <cesar_philippidis at mentor dot com>
- To: Nathan Sidwell <nathan at acm dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: Jakub Jelinek <jakub at redhat dot com>, Thomas Schwinge <THOMAS_SCHWINGE at mentor dot com>
- Date: Thu, 6 Aug 2015 09:33:33 -0700
- Subject: Re: [gomp4] Redesign oacc_parallel launch API
- References: <55B7B332 dot 1010603 at acm dot org>
On 07/28/2015 09:52 AM, Nathan Sidwell wrote:
> I've committed this patch to the gomp4 branch to redo the launch API.
> I'll post a version for trunk once the versioning patch gets approved &
> committed.
>
> This changes the API in a number of ways, allowing device-specific
> knowledge to be moved into the device compiler and out of the host
> compiler.
>
> Firstly, we attach a tuple of launch dimensions as an attribute to the
> offloaded function's 'oacc function' attribute. These are the constant
> launch dimensions. Dynamic dimensions get a zero for their slot in this
> list. Further this list can be extended in the future to an alist keyed
> by device_type.
>
> Dynamic dimensions are computed on the host; however, they are passed
> via variadic args to the GOACC_parallel function (which is renamed). The
> variadic args are passed using key/value representation, and 3 keys are
> currently defined:
> END - end of the variadic list
> DIM - set of runtime-computed dimensions. Only the dynamic ones are
> passed.
> ASYNC_WAIT - an async and a set of waits (possibly zero).
>
> I have arranged for the key to have a slot that can later be filled by
> device_type, and hence support multiple device types.
>
> The constant dimensions can be used in expansion of the GOACC_nid
> function in the device compiler. The device compiler could also process
> that list to select the device_type slot that is appropriate.
>
> For PTX the backend is augmented to emit the launch dimensions into the
> target data, from whence the ptx plugin can pick them up and overwrite
> with any dynamic ones passed in from the launch function.
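
For readers following along, here is a rough sketch of how the scheme quoted above could work: constant dimensions with a zero in each dynamic slot, overridden by values from a key/value variadic list. This is not the code from the patch. The key names, the struct, the function decode_launch, the three-slot layout, and the exact ordering of values after each key are all assumptions made for illustration.

```c
#include <stdarg.h>

/* Hypothetical keys; the real encoding in the patch may differ.  */
enum launch_key { KEY_END = 0, KEY_DIM = 1, KEY_ASYNC_WAIT = 2 };

#define NUM_DIMS 3  /* Assumed number of launch dimensions.  */

/* Hypothetical result of decoding one launch request.  */
struct launch_info
{
  unsigned dims[NUM_DIMS];
  int async;
  int num_waits;
};

/* Start from the constant dimensions (0 marks a dynamic slot) and
   overwrite the dynamic slots from the key/value variadic list.
   Assumed layout: KEY_DIM is followed by one unsigned per dynamic
   slot, in slot order; KEY_ASYNC_WAIT is followed by an async id,
   a wait count, and that many wait ids.  */
static struct launch_info
decode_launch (const unsigned *const_dims, ...)
{
  struct launch_info info = { { 0 }, -1, 0 };
  va_list ap;
  int key;

  for (int i = 0; i < NUM_DIMS; i++)
    info.dims[i] = const_dims[i];

  va_start (ap, const_dims);
  while ((key = va_arg (ap, int)) != KEY_END)
    switch (key)
      {
      case KEY_DIM:
        /* Only the dynamic dimensions are passed.  */
        for (int i = 0; i < NUM_DIMS; i++)
          if (const_dims[i] == 0)
            info.dims[i] = va_arg (ap, unsigned);
        break;

      case KEY_ASYNC_WAIT:
        info.async = va_arg (ap, int);
        info.num_waits = va_arg (ap, int);
        /* The wait ids are consumed but not stored in this sketch.  */
        for (int i = 0; i < info.num_waits; i++)
          (void) va_arg (ap, int);
        break;

      default:
        /* Unknown key: the number of following values is unknown,
           so stop decoding here.  */
        va_end (ap);
        return info;
      }
  va_end (ap);
  return info;
}
```

With const_dims set to { 0, 0, 32 }, a call such as decode_launch (const_dims, KEY_DIM, 4u, 8u, KEY_END) would fill the two dynamic slots with 4 and 8 and leave the constant third slot at 32.
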
Looking at set_oacc_fn_attrib, it appears that const values are also
considered dynamic. See the attached test case for more info. Is that
the expected behavior? If not, I could take a look at this after I
finish my reduction patch.
Cesar
#include <stdio.h>

const int vl = 32;

int
main ()
{
  unsigned int red = 0;

#pragma acc parallel loop vector_length (vl) vector reduction (+:red) copy (red)
  for (int i = 0; i < 100; i++)
    red++;

  printf ("red = %u\n", red);
  return 0;
}