This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/85590] [nvptx, libgomp, openacc] Use cuda runtime fns to determine launch configuration in nvptx plugin
- From: "vries at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 13 Aug 2018 12:04:56 +0000
- Subject: [Bug target/85590] [nvptx, libgomp, openacc] Use cuda runtime fns to determine launch configuration in nvptx plugin
- Auto-submitted: auto-generated
- References: <bug-85590-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85590
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Author: vries
Date: Mon Aug 13 12:04:24 2018
New Revision: 263505
URL: https://gcc.gnu.org/viewcvs?rev=263505&root=gcc&view=rev
Log:
[nvptx] Use CUDA driver API to select default runtime launch geometry
Starting with version 6.5, the CUDA driver API offers a set of runtime
functions to calculate several occupancy-related measures, as a replacement
for the occupancy calculator spreadsheet.
This patch adds a heuristic for default runtime launch geometry, based on the
new runtime function cuOccupancyMaxPotentialBlockSize.
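The general shape of such a heuristic can be sketched as follows. This is only an illustration under stated assumptions, not the code committed to plugin-nvptx.c: occupancy_query_fn is a hypothetical, simplified stand-in for cuOccupancyMaxPotentialBlockSize (the real driver function also takes a CUfunction, a CUoccupancyB2DSize callback and a dynamic shared-memory size), and struct launch_dims, default_launch_dims, fake_query, the fallback constants and the mapping from grid and block size to gangs and workers are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the driver's occupancy query.
   Returns 0 on success, nonzero on error.  */
typedef int (*occupancy_query_fn) (int *min_grid_size, int *block_size,
                                   int block_size_limit);

/* Hypothetical container for the three OpenACC launch dimensions.  */
struct launch_dims
{
  int num_gangs;
  int num_workers;
  int vector_length;
};

/* Illustrative fallback values, not the plugin's actual defaults.  */
enum { FALLBACK_GANGS = 32, FALLBACK_WORKERS = 8 };

/* Pick default gang and worker counts.  QUERY may be NULL when the
   driver is too old to export the occupancy function.  */
static struct launch_dims
default_launch_dims (occupancy_query_fn query, int vector_length,
                     int block_size_limit)
{
  struct launch_dims dims
    = { FALLBACK_GANGS, FALLBACK_WORKERS, vector_length };
  int grid_size = 0, block_size = 0;

  assert (vector_length > 0);

  /* No occupancy support in the driver: keep the fallback geometry.  */
  if (query == NULL)
    return dims;

  /* A failed query also keeps the fallback geometry.  */
  if (query (&grid_size, &block_size, block_size_limit) != 0)
    return dims;

  /* Assumed mapping: one gang per suggested block, and as many workers
     as there are vectors in the suggested block size.  */
  if (grid_size > 0 && block_size >= vector_length)
    {
      dims.num_gangs = grid_size;
      dims.num_workers = block_size / vector_length;
    }
  return dims;
}

/* Fake query used only to exercise the sketch without a GPU.  It
   suggests 40 blocks of up to 1024 threads, honoring a nonzero limit.  */
static int
fake_query (int *min_grid_size, int *block_size, int block_size_limit)
{
  *min_grid_size = 40;
  *block_size = (block_size_limit > 0 && block_size_limit < 1024
                 ? block_size_limit : 1024);
  return 0;
}
```

Calling default_launch_dims (NULL, 32, 0) models an older driver and returns the fallback geometry, while passing fake_query derives the geometry from the suggested grid and block size.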
Built on x86_64 with nvptx accelerator and ran the libgomp testsuite.
2018-08-13 Cesar Philippidis <cesar@codesourcery.com>
Tom de Vries <tdevries@suse.de>
PR target/85590
* plugin/cuda/cuda.h (CUoccupancyB2DSize): New typedef.
(cuOccupancyMaxPotentialBlockSize): Declare.
* plugin/cuda-lib.def (cuOccupancyMaxPotentialBlockSize): New
CUDA_ONE_CALL_MAYBE_NULL.
* plugin/plugin-nvptx.c (CUDA_VERSION < 6050): Define
CUoccupancyB2DSize and declare cuOccupancyMaxPotentialBlockSize.
(nvptx_exec): Use cuOccupancyMaxPotentialBlockSize to set the
default num_gangs and num_workers when the driver supports it.
Modified:
trunk/libgomp/ChangeLog
trunk/libgomp/plugin/cuda-lib.def
trunk/libgomp/plugin/cuda/cuda.h
trunk/libgomp/plugin/plugin-nvptx.c