\input texinfo @c -*-texinfo-*-

@c %**start of header
@setfilename libgomp.info
@settitle GNU libgomp
@c %**end of header


@copying
Copyright @copyright{} 2006-2022 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software. Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@ifinfo
@dircategory GNU Libraries
@direntry
* libgomp: (libgomp).          GNU Offloading and Multi Processing Runtime Library.
@end direntry

This manual documents libgomp, the GNU Offloading and Multi Processing
Runtime library. This is the GNU implementation of the OpenMP and
OpenACC APIs for parallel and accelerator programming in C/C++ and
Fortran.

Published by the Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301 USA

@insertcopying
@end ifinfo


@setchapternewpage odd

@titlepage
@title GNU Offloading and Multi Processing Runtime Library
@subtitle The GNU OpenMP and OpenACC Implementation
@page
@vskip 0pt plus 1filll
@comment For the @value{version-GCC} Version*
@sp 1
Published by the Free Software Foundation @*
51 Franklin Street, Fifth Floor@*
Boston, MA 02110-1301, USA@*
@sp 1
@insertcopying
@end titlepage

@summarycontents
@contents
@page

@node Top, Enabling OpenMP
@top Introduction
@cindex Introduction

This manual documents the usage of libgomp, the GNU Offloading and
Multi Processing Runtime Library. This includes the GNU
implementation of the @uref{https://www.openmp.org, OpenMP} Application
Programming Interface (API) for multi-platform shared-memory parallel
programming in C/C++ and Fortran, and the GNU implementation of the
@uref{https://www.openacc.org, OpenACC} Application Programming
Interface (API) for offloading of code to accelerator devices in C/C++
and Fortran.

Originally, libgomp implemented the GNU OpenMP Runtime Library. Support
for OpenACC and for offloading (both OpenACC and the @code{target}
construct of OpenMP 4) was added later on top of this, and the library
was renamed to GNU Offloading and Multi Processing Runtime Library.


@comment
@comment  When you add a new menu item, please keep the right hand
@comment  aligned to the same column. Do not use tabs. This provides
@comment  better formatting.
@comment
@menu
* Enabling OpenMP::            How to enable OpenMP for your applications.
* OpenMP Implementation Status:: List of implemented features by OpenMP version
* OpenMP Runtime Library Routines: Runtime Library Routines.
                               The OpenMP runtime application programming
                               interface.
* OpenMP Environment Variables: Environment Variables.
                               Influencing OpenMP runtime behavior with
                               environment variables.
* Enabling OpenACC::           How to enable OpenACC for your
                               applications.
* OpenACC Runtime Library Routines:: The OpenACC runtime application
                               programming interface.
* OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
                               environment variables.
* CUDA Streams Usage::         Notes on the implementation of
                               asynchronous operations.
* OpenACC Library Interoperability:: OpenACC library interoperability with the
                               NVIDIA CUBLAS library.
* OpenACC Profiling Interface::
* The libgomp ABI::            Notes on the external ABI presented by libgomp.
* Reporting Bugs::             How to report bugs in the GNU Offloading and
                               Multi Processing Runtime Library.
* Copying::                    GNU general public license says
                               how you can copy and share libgomp.
* GNU Free Documentation License::
                               How you can copy and share this manual.
* Funding::                    How to help assure continued work for free
                               software.
* Library Index::              Index of this documentation.
@end menu


@c ---------------------------------------------------------------------
@c Enabling OpenMP
@c ---------------------------------------------------------------------

@node Enabling OpenMP
@chapter Enabling OpenMP

To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
flag @command{-fopenmp} must be specified. For C/C++, this enables the
@code{#pragma omp} directive. For Fortran, it enables the @code{!$omp}
directive in free form; the @code{c$omp}, @code{*$omp} and @code{!$omp}
directives in fixed form; the @code{!$} conditional compilation sentinel
in free form; and the @code{c$}, @code{*$} and @code{!$} sentinels in
fixed form. The flag also arranges for automatic linking of the OpenMP
runtime library (@ref{Runtime Library Routines}).

A complete description of all OpenMP directives may be found in the
@uref{https://www.openmp.org, OpenMP Application Program Interface} manuals.
See also @ref{OpenMP Implementation Status}.
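
As a minimal sketch of what the flag changes, consider a C source file
that is built once with @command{-fopenmp} and once without it. The
function name @code{sum_to} is hypothetical and only serves as an
illustration. With the flag, the loop iterations are divided among the
threads of a team; without it, GCC ignores the pragma and the loop runs
serially, so both builds are expected to return the same result.

```c
#include <assert.h>

/* Sum the integers 1..n.  With -fopenmp the iterations are shared among
   the threads of a team and the per-thread partial sums are combined by
   the reduction clause; without the flag the pragma is ignored and the
   loop runs serially.  */
long sum_to(int n)
{
  assert(n >= 0);
  long total = 0;
#pragma omp parallel for reduction(+ : total)
  for (int i = 1; i <= n; i++)
    total += i;
  return total;
}
```

Keeping the code valid in both builds, as here, makes it easy to compare
the serial and the parallel behavior of the same source.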


@c ---------------------------------------------------------------------
@c OpenMP Implementation Status
@c ---------------------------------------------------------------------

@node OpenMP Implementation Status
@chapter OpenMP Implementation Status

@menu
* OpenMP 4.5:: Feature completion status to 4.5 specification
* OpenMP 5.0:: Feature completion status to 5.0 specification
* OpenMP 5.1:: Feature completion status to 5.1 specification
@end menu

The @code{_OPENMP} preprocessor macro and Fortran's @code{openmp_version}
parameter, provided by @code{omp_lib.h} and the @code{omp_lib} module, have
the value @code{201511} (i.e. OpenMP 4.5).
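
The macro can be tested at compile time. The following sketch uses the
hypothetical helper names @code{openmp_macro_value} and
@code{has_openmp_45}; it assumes only that @code{_OPENMP} is undefined
when OpenMP is not enabled, and that its value is a date of the form
yyyymm.

```c
#include <assert.h>

/* Value of the _OPENMP macro, or 0 when OpenMP is not enabled.
   201511 is the value that corresponds to OpenMP 4.5.  */
long openmp_macro_value(void)
{
#ifdef _OPENMP
  return _OPENMP;
#else
  return 0L;
#endif
}

/* Nonzero if the compiler announces at least OpenMP 4.5.  */
int has_openmp_45(void)
{
  long v = openmp_macro_value();
  assert(v >= 0);   /* when defined, the macro is a positive yyyymm date */
  return v >= 201511L;
}
```

Because the macro reflects only the announced version, individual
features from later specifications still have to be checked against the
tables in this chapter.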

@node OpenMP 4.5
@section OpenMP 4.5

The OpenMP 4.5 specification is fully supported.

@node OpenMP 5.0
@section OpenMP 5.0

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
@c This list is sorted as in OpenMP 5.1's B.3 not as in OpenMP 5.0's B.2

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Array shaping @tab N @tab
@item Array sections with non-unit strides in C and C++ @tab N @tab
@item Iterators @tab Y @tab
@item @code{metadirective} directive @tab N @tab
@item @code{declare variant} directive
      @tab P @tab simd traits not handled correctly
@item @emph{target-offload-var} ICV and @code{OMP_TARGET_OFFLOAD}
      env variable @tab Y @tab
@item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
@item @code{requires} directive @tab P
      @tab The only fulfillable requirements are @code{atomic_default_mem_order}
      and @code{dynamic_allocators}
@item @code{teams} construct outside an enclosing target region @tab Y @tab
@item Non-rectangular loop nests @tab Y @tab
@item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
@item @code{nonmonotonic} as default loop schedule modifier for worksharing-loop
      constructs @tab Y @tab
@item Collapse of associated loops that are imperfectly nested loops @tab N @tab
@item Clauses @code{if}, @code{nontemporal} and @code{order(concurrent)} in
      @code{simd} construct @tab Y @tab
@item @code{atomic} constructs in @code{simd} @tab Y @tab
@item @code{loop} construct @tab Y @tab
@item @code{order(concurrent)} clause @tab Y @tab
@item @code{scan} directive and @code{in_scan} modifier for the
      @code{reduction} clause @tab Y @tab
@item @code{in_reduction} clause on @code{task} constructs @tab Y @tab
@item @code{in_reduction} clause on @code{target} constructs @tab P
      @tab @code{nowait} is only a stub
@item @code{task_reduction} clause with @code{taskgroup} @tab Y @tab
@item @code{task} modifier to @code{reduction} clause @tab Y @tab
@item @code{affinity} clause to @code{task} construct @tab Y @tab Stub only
@item @code{detach} clause to @code{task} construct @tab Y @tab
@item @code{omp_fulfill_event} runtime routine @tab Y @tab
@item @code{reduction} and @code{in_reduction} clauses on @code{taskloop}
      and @code{taskloop simd} constructs @tab Y @tab
@item @code{taskloop} construct cancelable by @code{cancel} construct
      @tab Y @tab
@item @code{mutexinoutset} @emph{dependence-type} for @code{depend} clause
      @tab Y @tab
@item Predefined memory spaces, memory allocators, allocator traits
      @tab Y @tab Some are only stubs
@item Memory management routines @tab Y @tab
@item @code{allocate} directive @tab N @tab
@item @code{allocate} clause @tab P @tab Initial support
@item @code{use_device_addr} clause on @code{target data} @tab Y @tab
@item @code{ancestor} modifier on @code{device} clause
      @tab P @tab Reverse offload unsupported
@item Implicit declare target directive @tab Y @tab
@item Discontiguous array section with @code{target update} construct
      @tab N @tab
@item C/C++'s lvalue expressions in @code{to}, @code{from}
      and @code{map} clauses @tab N @tab
@item C/C++'s lvalue expressions in @code{depend} clauses @tab Y @tab
@item Nested @code{declare target} directive @tab Y @tab
@item Combined @code{master} constructs @tab Y @tab
@item @code{depend} clause on @code{taskwait} @tab Y @tab
@item Weak memory ordering clauses on @code{atomic} and @code{flush} constructs
      @tab Y @tab
@item @code{hint} clause on the @code{atomic} construct @tab Y @tab Stub only
@item @code{depobj} construct and depend objects @tab Y @tab
@item Lock hints were renamed to synchronization hints @tab Y @tab
@item @code{conditional} modifier to @code{lastprivate} clause @tab Y @tab
@item Map-order clarifications @tab P @tab
@item @code{close} @emph{map-type-modifier} @tab Y @tab
@item Mapping C/C++ pointer variables and assigning the address of
      device memory mapped by an array section @tab P @tab
@item Mapping of Fortran pointer and allocatable variables, including pointer
      and allocatable components of variables
      @tab P @tab Mapping of variables with allocatable components unsupported
@item @code{defaultmap} extensions @tab Y @tab
@item @code{declare mapper} directive @tab N @tab
@item @code{omp_get_supported_active_levels} routine @tab Y @tab
@item Runtime routines and environment variables to display runtime thread
      affinity information @tab Y @tab
@item @code{omp_pause_resource} and @code{omp_pause_resource_all} runtime
      routines @tab Y @tab
@item @code{omp_get_device_num} runtime routine @tab Y @tab
@item OMPT interface @tab N @tab
@item OMPD interface @tab N @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.0 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Supporting C++'s range-based for loop @tab Y @tab
@end multitable


@node OpenMP 5.1
@section OpenMP 5.1

@unnumberedsubsec New features listed in Appendix B of the OpenMP specification

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item OpenMP directives as C++ attribute specifiers @tab Y @tab
@item @code{omp_all_memory} reserved locator @tab N @tab
@item @emph{target_device trait} in OpenMP Context @tab N @tab
@item @code{target_device} selector set in context selectors @tab N @tab
@item C/C++'s @code{declare variant} directive: elision support of
      preprocessed code @tab N @tab
@item @code{declare variant}: new clauses @code{adjust_args} and
      @code{append_args} @tab N @tab
@item @code{dispatch} construct @tab N @tab
@item Device-specific ICV settings with environment variables @tab N @tab
@item @code{assume} directive @tab N @tab
@item @code{nothing} directive @tab Y @tab
@item @code{error} directive @tab Y @tab
@item @code{masked} construct @tab Y @tab
@item @code{scope} directive @tab Y @tab
@item Loop transformation constructs @tab N @tab
@item @code{strict} modifier in the @code{grainsize} and @code{num_tasks}
      clauses of the @code{taskloop} construct @tab Y @tab
@item @code{align} clause/modifier in @code{allocate} directive/clause
      and @code{allocator} directive @tab P @tab C/C++ on clause only
@item @code{thread_limit} clause to @code{target} construct @tab Y @tab
@item @code{has_device_addr} clause to @code{target} construct @tab Y @tab
@item Iterators in @code{target update} motion clauses and @code{map}
      clauses @tab N @tab
@item Indirect calls to the device version of a procedure or function in
      @code{target} regions @tab N @tab
@item @code{interop} directive @tab N @tab
@item @code{omp_interop_t} object support in runtime routines @tab N @tab
@item @code{nowait} clause in @code{taskwait} directive @tab N @tab
@item Extensions to the @code{atomic} directive @tab Y @tab
@item @code{seq_cst} clause on a @code{flush} construct @tab Y @tab
@item @code{inoutset} argument to the @code{depend} clause @tab N @tab
@item @code{private} and @code{firstprivate} argument to @code{default}
      clause in C and C++ @tab Y @tab
@item @code{present} argument to @code{defaultmap} clause @tab N @tab
@item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
      @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
      routines @tab Y @tab
@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
@item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
      runtime routines @tab N @tab
@item @code{omp_get_mapped_ptr} runtime routine @tab Y @tab
@item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and
      @code{omp_aligned_calloc} runtime routines @tab Y @tab
@item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added,
      @code{omp_atv_default} changed @tab Y @tab
@item @code{omp_display_env} runtime routine @tab Y
      @tab Not inside @code{target} regions
@item @code{ompt_scope_endpoint_t} enum: @code{ompt_scope_beginend} @tab N @tab
@item @code{ompt_sync_region_t} enum additions @tab N @tab
@item @code{ompt_state_t} enum: @code{ompt_state_wait_barrier_implementation}
      and @code{ompt_state_wait_barrier_teams} @tab N @tab
@item @code{ompt_callback_target_data_op_emi_t},
      @code{ompt_callback_target_emi_t}, @code{ompt_callback_target_map_emi_t}
      and @code{ompt_callback_target_submit_emi_t} @tab N @tab
@item @code{ompt_callback_error_t} type @tab N @tab
@item @code{OMP_PLACES} syntax extensions @tab Y @tab
@item @code{OMP_NUM_TEAMS} and @code{OMP_TEAMS_THREAD_LIMIT} environment
      variables @tab Y @tab
@end multitable

@unnumberedsubsec Other new OpenMP 5.1 features

@multitable @columnfractions .60 .10 .25
@headitem Description @tab Status @tab Comments
@item Support of strictly structured blocks in Fortran @tab Y @tab
@item Support of structured block sequences in C/C++ @tab Y @tab
@item @code{unconstrained} and @code{reproducible} modifiers on @code{order}
      clause @tab Y @tab
@end multitable


@c ---------------------------------------------------------------------
@c OpenMP Runtime Library Routines
@c ---------------------------------------------------------------------

@node Runtime Library Routines
@chapter OpenMP Runtime Library Routines

The runtime routines described here are defined by Section 3 of the OpenMP
specification in version 4.5. The routines are grouped into the following
parts:

@menu
Control threads, processors and the parallel environment. They have C
linkage, and do not throw exceptions.

* omp_get_active_level::        Number of active parallel regions
* omp_get_ancestor_thread_num:: Ancestor thread ID
* omp_get_cancellation::        Whether cancellation support is enabled
* omp_get_default_device::      Get the default device for target regions
* omp_get_device_num::          Get device that current thread is running on
* omp_get_dynamic::             Dynamic teams setting
* omp_get_initial_device::      Device number of host device
* omp_get_level::               Number of parallel regions
* omp_get_max_active_levels::   Current maximum number of active regions
* omp_get_max_task_priority::   Maximum task priority value that can be set
* omp_get_max_teams::           Maximum number of teams for teams region
* omp_get_max_threads::         Maximum number of threads of parallel region
* omp_get_nested::              Nested parallel regions
* omp_get_num_devices::         Number of target devices
* omp_get_num_procs::           Number of processors online
* omp_get_num_teams::           Number of teams
* omp_get_num_threads::         Size of the active team
* omp_get_proc_bind::           Whether threads may be moved between CPUs
* omp_get_schedule::            Obtain the runtime scheduling method
* omp_get_supported_active_levels:: Maximum number of active regions supported
* omp_get_team_num::            Get team number
* omp_get_team_size::           Number of threads in a team
* omp_get_teams_thread_limit::  Maximum number of threads imposed by teams
* omp_get_thread_limit::        Maximum number of threads
* omp_get_thread_num::          Current thread ID
* omp_in_parallel::             Whether a parallel region is active
* omp_in_final::                Whether in final or included task region
* omp_is_initial_device::       Whether executing on the host device
* omp_set_default_device::      Set the default device for target regions
* omp_set_dynamic::             Enable/disable dynamic teams
* omp_set_max_active_levels::   Limits the number of active parallel regions
* omp_set_nested::              Enable/disable nested parallel regions
* omp_set_num_teams::           Set upper teams limit for teams region
* omp_set_num_threads::         Set upper team size limit
* omp_set_schedule::            Set the runtime scheduling method
* omp_set_teams_thread_limit::  Set upper thread limit for teams construct

Initialize, set, test, unset and destroy simple and nested locks.

* omp_init_lock::            Initialize simple lock
* omp_set_lock::             Wait for and set simple lock
* omp_test_lock::            Test and set simple lock if available
* omp_unset_lock::           Unset simple lock
* omp_destroy_lock::         Destroy simple lock
* omp_init_nest_lock::       Initialize nested lock
* omp_set_nest_lock::        Wait for and set nested lock
* omp_test_nest_lock::       Test and set nested lock if available
* omp_unset_nest_lock::      Unset nested lock
* omp_destroy_nest_lock::    Destroy nested lock

Portable, thread-based, wall clock timer.

* omp_get_wtick::            Get timer precision.
* omp_get_wtime::            Elapsed wall clock time.

Support for event objects.

* omp_fulfill_event::        Fulfill and destroy an OpenMP event.
@end menu



@node omp_get_active_level
@section @code{omp_get_active_level} -- Number of active parallel regions
@table @asis
@item @emph{Description}:
This function returns the nesting level of the active parallel regions
that enclose the call.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
@end multitable

@item @emph{See also}:
@ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.20.
@end table
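
The difference between @code{omp_get_active_level} and
@code{omp_get_level} can be sketched with an inactive parallel region.
In the example below, the @code{if(0)} clause forces the region to run
with a single thread, so it is expected to count towards the nesting
level but not towards the active level. The type @code{struct levels},
the function name and the serial fallback definitions are assumptions
made for this illustration; the fallbacks only exist so that the file
also builds without @command{-fopenmp}.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: values seen outside any parallel region). */
static int omp_get_level(void) { return 0; }
static int omp_get_active_level(void) { return 0; }
#endif

struct levels { int level; int active_level; };

/* Query both nesting counters from inside an inactive parallel region.
   The if(0) clause makes the region execute with one thread only.  */
struct levels levels_in_inactive_region(void)
{
  struct levels r = { 0, 0 };
#pragma omp parallel if(0)
  {
    r.level = omp_get_level();
    r.active_level = omp_get_active_level();
  }
  /* Active regions are a subset of all enclosing regions. */
  assert(r.active_level <= r.level);
  return r;
}
```

When built with @command{-fopenmp}, one would expect @code{level} to be 1
and @code{active_level} to be 0 if the function is called from serial
code; without the flag both are 0.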



@node omp_get_ancestor_thread_num
@section @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
@table @asis
@item @emph{Description}:
This function returns the thread identification number for the given
nesting level of the current thread. If @var{level} is outside the range
zero to @code{omp_get_level}, -1 is returned; if @var{level} is
@code{omp_get_level}, the result is identical to @code{omp_get_thread_num}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
@item                   @tab @code{integer level}
@end multitable

@item @emph{See also}:
@ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.18.
@end table
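
The boundary cases described above can be sketched as follows. The
wrapper @code{ancestor_at} is a hypothetical name, and the serial
fallbacks are assumptions that mimic a program without any parallel
region, so that the example also builds without @command{-fopenmp}.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: behave like code outside any parallel region). */
static int omp_get_level(void) { return 0; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_ancestor_thread_num(int level) { return level == 0 ? 0 : -1; }
#endif

/* Thread number of the ancestor at LEVEL, as seen by the calling thread. */
int ancestor_at(int level)
{
  int id = omp_get_ancestor_thread_num(level);
  /* At the current nesting level the "ancestor" is the caller itself. */
  if (level == omp_get_level())
    assert(id == omp_get_thread_num());
  return id;
}
```

Called from serial code, where the nesting level is zero,
@code{ancestor_at(0)} yields the number of the initial thread, while any
other level is out of range and yields -1.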



@node omp_get_cancellation
@section @code{omp_get_cancellation} -- Whether cancellation support is enabled
@table @asis
@item @emph{Description}:
This function returns @code{true} if cancellation is activated, @code{false}
otherwise. Here, @code{true} and @code{false} represent their language-specific
counterparts. Cancellation is deactivated unless @env{OMP_CANCELLATION} is
set to true.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
@end multitable

@item @emph{See also}:
@ref{OMP_CANCELLATION}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.9.
@end table



@node omp_get_default_device
@section @code{omp_get_default_device} -- Get the default device for target regions
@table @asis
@item @emph{Description}:
Get the default device for target regions without a @code{device} clause.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
@end multitable

@item @emph{See also}:
@ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.30.
@end table



@node omp_get_device_num
@section @code{omp_get_device_num} -- Return device number of current device
@table @asis
@item @emph{Description}:
This function returns a device number that represents the device that the
current thread is executing on. For OpenMP 5.0, this must be equal to the
value returned by the @code{omp_get_initial_device} function when called
from the host.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_device_num(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_device_num()}
@end multitable

@item @emph{See also}:
@ref{omp_get_initial_device}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.37.
@end table



@node omp_get_dynamic
@section @code{omp_get_dynamic} -- Dynamic teams setting
@table @asis
@item @emph{Description}:
This function returns @code{true} if dynamic adjustment of the number of
threads is enabled, @code{false} otherwise. Here, @code{true} and
@code{false} represent their language-specific counterparts.

The dynamic team setting may be initialized at startup by the
@env{OMP_DYNAMIC} environment variable or at runtime using
@code{omp_set_dynamic}. If undefined, dynamic adjustment is
disabled by default.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
@end multitable

@item @emph{See also}:
@ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.8.
@end table
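
A minimal sketch of changing the setting at runtime and reading it back
is shown below. It relies on @code{omp_set_dynamic}, which is documented
in its own section; the helper name @code{dynamic_after_set} and the
serial fallbacks are assumptions made so that the example also builds
without @command{-fopenmp}.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: a plain flag that starts out disabled). */
static int fallback_dyn = 0;
static void omp_set_dynamic(int v) { fallback_dyn = (v != 0); }
static int omp_get_dynamic(void) { return fallback_dyn; }
#endif

/* Set the dynamic-adjustment flag, report what omp_get_dynamic then
   returns (normalized to 0 or 1), and restore the previous setting.  */
int dynamic_after_set(int enable)
{
  int saved = omp_get_dynamic();
  omp_set_dynamic(enable);
  int now = omp_get_dynamic() != 0;
  omp_set_dynamic(saved);
  assert((omp_get_dynamic() != 0) == (saved != 0));
  return now;
}
```

Normalizing the result with @code{!= 0} avoids depending on the exact
nonzero value that represents @code{true} in C.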



@node omp_get_initial_device
@section @code{omp_get_initial_device} -- Return device number of initial device
@table @asis
@item @emph{Description}:
This function returns a device number that represents the host device.
For OpenMP 5.1, this must be equal to the value returned by the
@code{omp_get_num_devices} function.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_initial_device(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_initial_device()}
@end multitable

@item @emph{See also}:
@ref{omp_get_num_devices}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.35.
@end table



@node omp_get_level
@section @code{omp_get_level} -- Obtain the current nesting level
@table @asis
@item @emph{Description}:
This function returns the nesting level of the parallel regions, whether
active or not, that enclose the call.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_level(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_level()}
@end multitable

@item @emph{See also}:
@ref{omp_get_active_level}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.17.
@end table



@node omp_get_max_active_levels
@section @code{omp_get_max_active_levels} -- Current maximum number of active regions
@table @asis
@item @emph{Description}:
This function obtains the maximum allowed number of nested, active parallel regions.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
@end multitable

@item @emph{See also}:
@ref{omp_set_max_active_levels}, @ref{omp_get_active_level}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.16.
@end table
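
The limit is normally changed with @code{omp_set_max_active_levels} and
read back with this routine. The sketch below assumes small positive
values that the implementation supports; the helper name
@code{max_active_levels_after_set} and the serial fallbacks are
hypothetical and exist only so that the example also builds without
@command{-fopenmp}.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: a plain variable holding the limit). */
static int fallback_levels = 1;
static void omp_set_max_active_levels(int n) { if (n >= 0) fallback_levels = n; }
static int omp_get_max_active_levels(void) { return fallback_levels; }
#endif

/* Set the limit on nested active parallel regions, report the value that
   is then in effect, and restore the previous limit.  */
int max_active_levels_after_set(int levels)
{
  assert(levels >= 0);
  int saved = omp_get_max_active_levels();
  omp_set_max_active_levels(levels);
  int now = omp_get_max_active_levels();
  omp_set_max_active_levels(saved);
  return now;
}
```

Reading the value back after setting it is the safer pattern, because an
implementation may clamp a request that exceeds the number of levels it
supports.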


@node omp_get_max_task_priority
@section @code{omp_get_max_task_priority} -- Maximum priority value that can be set for tasks
@table @asis
@item @emph{Description}:
This function obtains the maximum allowed priority number for tasks.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_max_task_priority(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_max_task_priority()}
@end multitable

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
@end table


@node omp_get_max_teams
@section @code{omp_get_max_teams} -- Maximum number of teams for teams region
@table @asis
@item @emph{Description}:
Return the maximum number of teams used for a teams region
that does not use the @code{num_teams} clause.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_max_teams(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_max_teams()}
@end multitable

@item @emph{See also}:
@ref{omp_set_num_teams}, @ref{omp_get_num_teams}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.4.
@end table



@node omp_get_max_threads
@section @code{omp_get_max_threads} -- Maximum number of threads of parallel region
@table @asis
@item @emph{Description}:
Return the maximum number of threads used for the current parallel region
that does not use the clause @code{num_threads}.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
@end multitable

@item @emph{Fortran}:
@multitable @columnfractions .20 .80
@item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
@end multitable

@item @emph{See also}:
@ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}

@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.3.
@end table
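
The relation between this upper bound and the actual team size can be
sketched as follows. The example requests a team size with
@code{omp_set_num_threads}, queries the resulting maximum, and then
compares it with the size of a parallel region that has no
@code{num_threads} clause. The type @code{struct team_report}, the
function name and the serial fallbacks are assumptions made for this
illustration.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: a single thread is always used). */
static void omp_set_num_threads(int n) { (void) n; }
static int omp_get_max_threads(void) { return 1; }
static int omp_get_num_threads(void) { return 1; }
#endif

struct team_report { int max_threads; int team_size; };

/* Request at most REQUESTED threads, then record the reported upper bound
   and the size of the team that a parallel region actually gets.  */
struct team_report run_with_at_most(int requested)
{
  assert(requested >= 1);
  struct team_report r = { 0, 1 };
  int saved = omp_get_max_threads();
  omp_set_num_threads(requested);
  r.max_threads = omp_get_max_threads();
#pragma omp parallel
  {
    /* One thread of the team stores the team size. */
#pragma omp single
    r.team_size = omp_get_num_threads();
  }
  omp_set_num_threads(saved);
  assert(r.team_size <= r.max_threads);
  return r;
}
```

The team may be smaller than the reported maximum, for example when
dynamic adjustment is enabled, which is why the sketch only checks an
upper bound.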
741
742
743
744@node omp_get_nested
745@section @code{omp_get_nested} -- Nested parallel regions
746@table @asis
747@item @emph{Description}:
748This function returns @code{true} if nested parallel regions are
83fd6c5b 749enabled, @code{false} otherwise. Here, @code{true} and @code{false}
3721b9e1
DF
750represent their language-specific counterparts.
751
6fae7eda
KCY
752The state of nested parallel regions at startup depends on several
753environment variables. If @env{OMP_MAX_ACTIVE_LEVELS} is defined
754and is set to greater than one, then nested parallel regions will be
755enabled. If not defined, then the value of the @env{OMP_NESTED}
756environment variable will be followed if defined. If neither is
757defined, then if either @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND}
758is defined with a list of more than one value, then nested parallel
759regions are enabled. If none of these are defined, then nested parallel
760regions are disabled by default.
761
762Nested parallel regions can be enabled or disabled at runtime using
763@code{omp_set_nested}, or by setting the maximum number of nested
764regions with @code{omp_set_max_active_levels} to one to disable, or
765above one to enable.
14734fc7 766
3721b9e1
DF
767@item @emph{C/C++}:
768@multitable @columnfractions .20 .80
6a2ba183 769@item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
3721b9e1
DF
770@end multitable
771
772@item @emph{Fortran}:
773@multitable @columnfractions .20 .80
87350d4a 774@item @emph{Interface}: @tab @code{logical function omp_get_nested()}
3721b9e1
DF
775@end multitable
776
777@item @emph{See also}:
6fae7eda
KCY
778@ref{omp_set_max_active_levels}, @ref{omp_set_nested},
779@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
3721b9e1
DF
780
781@item @emph{Reference}:
1a6d1d24 782@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.11.
83fd6c5b
TB
783@end table
784
785
786
787@node omp_get_num_devices
788@section @code{omp_get_num_devices} -- Number of target devices
789@table @asis
790@item @emph{Description}:
791Returns the number of target devices.
792
793@item @emph{C/C++}:
794@multitable @columnfractions .20 .80
795@item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
796@end multitable
797
798@item @emph{Fortran}:
799@multitable @columnfractions .20 .80
800@item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
801@end multitable
802
803@item @emph{Reference}:
1a6d1d24 804@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.31.
3721b9e1
DF
805@end table
806
807
808
809@node omp_get_num_procs
810@section @code{omp_get_num_procs} -- Number of processors online
811@table @asis
812@item @emph{Description}:
83fd6c5b 813Returns the number of processors online on the current device.
3721b9e1
DF
814
815@item @emph{C/C++}:
816@multitable @columnfractions .20 .80
6a2ba183 817@item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
3721b9e1
DF
818@end multitable
819
820@item @emph{Fortran}:
821@multitable @columnfractions .20 .80
822@item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
823@end multitable
824
825@item @emph{Reference}:
1a6d1d24 826@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.5.
83fd6c5b
TB
827@end table
828
829
830
831@node omp_get_num_teams
832@section @code{omp_get_num_teams} -- Number of teams
833@table @asis
834@item @emph{Description}:
835Returns the number of teams in the current teams region.
836
837@item @emph{C/C++}:
838@multitable @columnfractions .20 .80
839@item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
840@end multitable
841
842@item @emph{Fortran}:
843@multitable @columnfractions .20 .80
844@item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
845@end multitable
846
847@item @emph{Reference}:
1a6d1d24 848@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.32.
3721b9e1
DF
849@end table
850
851
852
853@node omp_get_num_threads
854@section @code{omp_get_num_threads} -- Size of the active team
855@table @asis
856@item @emph{Description}:
83fd6c5b 857Returns the number of threads in the current team. In a sequential section of
3721b9e1
DF
858the program @code{omp_get_num_threads} returns 1.
859
14734fc7 860The default team size may be initialized at startup by the
83fd6c5b 861@env{OMP_NUM_THREADS} environment variable. At runtime, the size
14734fc7 862of the current team may be set either by the @code{NUM_THREADS}
83fd6c5b
TB
863clause or by @code{omp_set_num_threads}. If none of the above were
864used to define a specific value and @env{OMP_DYNAMIC} is disabled,
14734fc7
DF
865one thread per CPU online is used.
866
3721b9e1
DF
867@item @emph{C/C++}:
868@multitable @columnfractions .20 .80
6a2ba183 869@item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
3721b9e1
DF
870@end multitable
871
872@item @emph{Fortran}:
873@multitable @columnfractions .20 .80
874@item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
875@end multitable
876
877@item @emph{See also}:
878@ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
879
880@item @emph{Reference}:
1a6d1d24 881@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.2.
83fd6c5b
TB
882@end table
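
The difference between the sequential and the parallel result of @code{omp_get_num_threads} can be sketched as follows. The helper name @code{observed_team_size} is hypothetical, and the serial fallbacks are an assumption added so that the code also builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch).  */
static int omp_get_num_threads(void) { return 1; }
static int omp_get_max_threads(void) { return 1; }
#endif

/* Hypothetical helper: team size as seen from inside a parallel region.  */
int observed_team_size(void)
{
  int size = 0;

#pragma omp parallel
  {
    /* Exactly one thread records the size; the implicit barrier at the
       end of "single" orders the write before the region ends.  */
#pragma omp single
    size = omp_get_num_threads();
  }
  return size;
}
```

Outside any parallel region @code{omp_get_num_threads} returns 1, while @code{observed_team_size} reports a value between 1 and @code{omp_get_max_threads}.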
883
884
885
886@node omp_get_proc_bind
887@section @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
888@table @asis
889@item @emph{Description}:
890This function returns the currently active thread affinity policy, which is
891set via @env{OMP_PROC_BIND}. Possible values are @code{omp_proc_bind_false},
432de084
TB
892@code{omp_proc_bind_true}, @code{omp_proc_bind_primary},
893@code{omp_proc_bind_master}, @code{omp_proc_bind_close} and @code{omp_proc_bind_spread},
894where @code{omp_proc_bind_master} is an alias for @code{omp_proc_bind_primary}.
83fd6c5b
TB
895
896@item @emph{C/C++}:
897@multitable @columnfractions .20 .80
898@item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
899@end multitable
900
901@item @emph{Fortran}:
902@multitable @columnfractions .20 .80
903@item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
904@end multitable
905
906@item @emph{See also}:
907@ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY},
908
909@item @emph{Reference}:
1a6d1d24 910@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.22.
5c6ed53a
TB
911@end table
912
913
914
915@node omp_get_schedule
916@section @code{omp_get_schedule} -- Obtain the runtime scheduling method
917@table @asis
918@item @emph{Description}:
83fd6c5b 919Obtain the runtime scheduling method. The @var{kind} argument will be
5c6ed53a 920set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
83fd6c5b 921@code{omp_sched_guided} or @code{omp_sched_auto}. The second argument,
d9a6bd32 922@var{chunk_size}, is set to the chunk size.
5c6ed53a
TB
923
924@item @emph{C/C++}:
925@multitable @columnfractions .20 .80
d9a6bd32 926@item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *chunk_size);}
5c6ed53a
TB
927@end multitable
928
929@item @emph{Fortran}:
930@multitable @columnfractions .20 .80
d9a6bd32 931@item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, chunk_size)}
5c6ed53a 932@item @tab @code{integer(kind=omp_sched_kind) kind}
d9a6bd32 933@item @tab @code{integer chunk_size}
5c6ed53a
TB
934@end multitable
935
936@item @emph{See also}:
937@ref{omp_set_schedule}, @ref{OMP_SCHEDULE}
938
939@item @emph{Reference}:
1a6d1d24 940@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.13.
83fd6c5b
TB
941@end table
942
943
8949b985
KCY
944@node omp_get_supported_active_levels
945@section @code{omp_get_supported_active_levels} -- Maximum number of active regions supported
946@table @asis
947@item @emph{Description}:
948This function returns the maximum number of nested, active parallel regions
949supported by this implementation.
950
951@item @emph{C/C++}:
952@multitable @columnfractions .20 .80
953@item @emph{Prototype}: @tab @code{int omp_get_supported_active_levels(void);}
954@end multitable
955
956@item @emph{Fortran}:
957@multitable @columnfractions .20 .80
958@item @emph{Interface}: @tab @code{integer function omp_get_supported_active_levels()}
959@end multitable
960
961@item @emph{See also}:
962@ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
963
964@item @emph{Reference}:
965@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.15.
966@end table
967
968
83fd6c5b
TB
969
970@node omp_get_team_num
971@section @code{omp_get_team_num} -- Get team number
972@table @asis
973@item @emph{Description}:
974Returns the team number of the calling thread.
975
976@item @emph{C/C++}:
977@multitable @columnfractions .20 .80
978@item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
979@end multitable
980
981@item @emph{Fortran}:
982@multitable @columnfractions .20 .80
983@item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
984@end multitable
985
986@item @emph{Reference}:
1a6d1d24 987@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.33.
5c6ed53a
TB
988@end table
989
990
991
992@node omp_get_team_size
993@section @code{omp_get_team_size} -- Number of threads in a team
994@table @asis
995@item @emph{Description}:
996This function returns the number of threads in a thread team to which
83fd6c5b 997either the current thread or its ancestor belongs. For values of @var{level}
6a2ba183
AH
998outside zero to @code{omp_get_level}, -1 is returned; if @var{level} is zero,
9991 is returned, and for @code{omp_get_level}, the result is identical
5c6ed53a
TB
1000to @code{omp_get_num_threads}.
1001
1002@item @emph{C/C++}:
1003@multitable @columnfractions .20 .80
6a2ba183 1004@item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
5c6ed53a
TB
1005@end multitable
1006
1007@item @emph{Fortran}:
1008@multitable @columnfractions .20 .80
1009@item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
1010@item @tab @code{integer level}
1011@end multitable
1012
1013@item @emph{See also}:
1014@ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}
1015
1016@item @emph{Reference}:
1a6d1d24 1017@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.19.
5c6ed53a
TB
1018@end table
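
The three cases described for @code{omp_get_team_size} (level zero, the current level, and an out-of-range level) can be checked with a short sketch. The helper name @code{team_size_matches} is hypothetical, and the serial fallbacks are an assumption added so that the code also builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch): only level 0 exists.  */
static int omp_get_level(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_team_size(int level) { return level == 0 ? 1 : -1; }
#endif

/* Hypothetical helper: returns 1 if every thread of a parallel region
   sees omp_get_team_size(omp_get_level()) == omp_get_num_threads().  */
int team_size_matches(void)
{
  int ok = 1;

#pragma omp parallel
  {
    if (omp_get_team_size(omp_get_level()) != omp_get_num_threads())
      {
        /* Several threads may store the same value concurrently.  */
#pragma omp atomic write
        ok = 0;
      }
  }
  return ok;
}
```

From sequential code, @code{omp_get_team_size(0)} is 1 and any level below zero or above @code{omp_get_level} gives -1.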
1019
1020
1021
4096bf82
JJ
1022@node omp_get_teams_thread_limit
1023@section @code{omp_get_teams_thread_limit} -- Maximum number of threads imposed by teams
1024@table @asis
1025@item @emph{Description}:
1026Return the maximum number of threads that will be able to participate in
1027each team created by a teams construct.
1028
1029@item @emph{C/C++}:
1030@multitable @columnfractions .20 .80
1031@item @emph{Prototype}: @tab @code{int omp_get_teams_thread_limit(void);}
1032@end multitable
1033
1034@item @emph{Fortran}:
1035@multitable @columnfractions .20 .80
1036@item @emph{Interface}: @tab @code{integer function omp_get_teams_thread_limit()}
1037@end multitable
1038
1039@item @emph{See also}:
1040@ref{omp_set_teams_thread_limit}, @ref{OMP_TEAMS_THREAD_LIMIT}
1041
1042@item @emph{Reference}:
1043@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.6.
1044@end table
1045
1046
1047
5c6ed53a 1048@node omp_get_thread_limit
6a2ba183 1049@section @code{omp_get_thread_limit} -- Maximum number of threads
5c6ed53a
TB
1050@table @asis
1051@item @emph{Description}:
6a2ba183 1052Return the maximum number of threads of the program.
5c6ed53a
TB
1053
1054@item @emph{C/C++}:
1055@multitable @columnfractions .20 .80
6a2ba183 1056@item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
5c6ed53a
TB
1057@end multitable
1058
1059@item @emph{Fortran}:
1060@multitable @columnfractions .20 .80
1061@item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
1062@end multitable
1063
1064@item @emph{See also}:
1065@ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
1066
1067@item @emph{Reference}:
1a6d1d24 1068@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.14.
3721b9e1
DF
1069@end table
1070
1071
1072
83fd6c5b 1073@node omp_get_thread_num
3721b9e1
DF
1074@section @code{omp_get_thread_num} -- Current thread ID
1075@table @asis
1076@item @emph{Description}:
6a2ba183 1077Returns a unique thread identification number within the current team.
5c6ed53a 1078In sequential parts of the program, @code{omp_get_thread_num}
83fd6c5b
TB
1079always returns 0. In parallel regions the return value varies
1080from 0 to @code{omp_get_num_threads}-1 inclusive. The return
432de084 1081value of the primary thread of a team is always 0.
3721b9e1
DF
1082
1083@item @emph{C/C++}:
1084@multitable @columnfractions .20 .80
6a2ba183 1085@item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
3721b9e1
DF
1086@end multitable
1087
1088@item @emph{Fortran}:
1089@multitable @columnfractions .20 .80
1090@item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
1091@end multitable
1092
1093@item @emph{See also}:
5c6ed53a 1094@ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}
3721b9e1
DF
1095
1096@item @emph{Reference}:
1a6d1d24 1097@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.4.
3721b9e1
DF
1098@end table
1099
1100
1101
1102@node omp_in_parallel
1103@section @code{omp_in_parallel} -- Whether a parallel region is active
1104@table @asis
1105@item @emph{Description}:
83fd6c5b
TB
1106This function returns @code{true} if currently running in parallel,
1107@code{false} otherwise. Here, @code{true} and @code{false} represent
3721b9e1
DF
1108their language-specific counterparts.
1109
1110@item @emph{C/C++}:
1111@multitable @columnfractions .20 .80
6a2ba183 1112@item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
3721b9e1
DF
1113@end multitable
1114
1115@item @emph{Fortran}:
1116@multitable @columnfractions .20 .80
1117@item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
1118@end multitable
1119
1120@item @emph{Reference}:
1a6d1d24 1121@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.6.
20906c66
JJ
1122@end table
1123
1124
1125@node omp_in_final
1126@section @code{omp_in_final} -- Whether in final or included task region
1127@table @asis
1128@item @emph{Description}:
1129This function returns @code{true} if currently running in a final
83fd6c5b 1130or included task region, @code{false} otherwise. Here, @code{true}
20906c66
JJ
1131and @code{false} represent their language-specific counterparts.
1132
1133@item @emph{C/C++}:
1134@multitable @columnfractions .20 .80
1135@item @emph{Prototype}: @tab @code{int omp_in_final(void);}
1136@end multitable
1137
1138@item @emph{Fortran}:
1139@multitable @columnfractions .20 .80
1140@item @emph{Interface}: @tab @code{logical function omp_in_final()}
1141@end multitable
1142
1143@item @emph{Reference}:
1a6d1d24 1144@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.21.
3721b9e1
DF
1145@end table
1146
1147
83fd6c5b
TB
1148
1149@node omp_is_initial_device
1150@section @code{omp_is_initial_device} -- Whether executing on the host device
1151@table @asis
1152@item @emph{Description}:
1153This function returns @code{true} if currently running on the host device,
1154@code{false} otherwise. Here, @code{true} and @code{false} represent
1155their language-specific counterparts.
1156
1157@item @emph{C/C++}:
1158@multitable @columnfractions .20 .80
1159@item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
1160@end multitable
1161
1162@item @emph{Fortran}:
1163@multitable @columnfractions .20 .80
1164@item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
1165@end multitable
1166
1167@item @emph{Reference}:
1a6d1d24 1168@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.34.
83fd6c5b
TB
1169@end table
1170
1171
1172
1173@node omp_set_default_device
1174@section @code{omp_set_default_device} -- Set the default device for target regions
1175@table @asis
1176@item @emph{Description}:
1177Set the default device for target regions without a @code{device} clause. The argument
1178shall be a nonnegative device number.
1179
1180@item @emph{C/C++}:
1181@multitable @columnfractions .20 .80
1182@item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
1183@end multitable
1184
1185@item @emph{Fortran}:
1186@multitable @columnfractions .20 .80
1187@item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
1188@item @tab @code{integer device_num}
1189@end multitable
1190
1191@item @emph{See also}:
1192@ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
1193
1194@item @emph{Reference}:
1a6d1d24 1195@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
83fd6c5b
TB
1196@end table
1197
1198
1199
3721b9e1
DF
1200@node omp_set_dynamic
1201@section @code{omp_set_dynamic} -- Enable/disable dynamic teams
1202@table @asis
1203@item @emph{Description}:
1204Enable or disable the dynamic adjustment of the number of threads
83fd6c5b 1205within a team. The function takes the language-specific equivalent
3721b9e1
DF
1206of @code{true} and @code{false}, where @code{true} enables dynamic
1207adjustment of team sizes and @code{false} disables it.
1208
1209@item @emph{C/C++}:
1210@multitable @columnfractions .20 .80
4fed6b25 1211@item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
3721b9e1
DF
1212@end multitable
1213
1214@item @emph{Fortran}:
1215@multitable @columnfractions .20 .80
4fed6b25
TB
1216@item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
1217@item @tab @code{logical, intent(in) :: dynamic_threads}
3721b9e1
DF
1218@end multitable
1219
1220@item @emph{See also}:
1221@ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
1222
1223@item @emph{Reference}:
1a6d1d24 1224@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.7.
5c6ed53a
TB
1225@end table
1226
1227
1228
1229@node omp_set_max_active_levels
1230@section @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
1231@table @asis
1232@item @emph{Description}:
6a2ba183 1233This function limits the maximum allowed number of nested, active
8949b985
KCY
1234parallel regions. @var{max_levels} must be less than or equal to
1235the value returned by @code{omp_get_supported_active_levels}.
5c6ed53a
TB
1236
1237@item @emph{C/C++}:
1238@multitable @columnfractions .20 .80
6a2ba183 1239@item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
5c6ed53a
TB
1240@end multitable
1241
1242@item @emph{Fortran}:
1243@multitable @columnfractions .20 .80
6a2ba183 1244@item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
5c6ed53a
TB
1245@item @tab @code{integer max_levels}
1246@end multitable
1247
1248@item @emph{See also}:
8949b985
KCY
1249@ref{omp_get_max_active_levels}, @ref{omp_get_active_level},
1250@ref{omp_get_supported_active_levels}
5c6ed53a
TB
1251
1252@item @emph{Reference}:
1a6d1d24 1253@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.15.
3721b9e1
DF
1254@end table
1255
1256
1257
1258@node omp_set_nested
1259@section @code{omp_set_nested} -- Enable/disable nested parallel regions
1260@table @asis
1261@item @emph{Description}:
f1b0882e 1262Enable or disable nested parallel regions, i.e., whether team members
83fd6c5b 1263are allowed to create new teams. The function takes the language-specific
3721b9e1
DF
1264equivalent of @code{true} and @code{false}, where @code{true} enables
1265nested parallel regions and @code{false} disables them.
1266
6fae7eda
KCY
1267Enabling nested parallel regions will also set the maximum number of
1268active nested regions to the maximum supported. Disabling nested parallel
1269regions will set the maximum number of active nested regions to one.
1270
3721b9e1
DF
1271@item @emph{C/C++}:
1272@multitable @columnfractions .20 .80
4fed6b25 1273@item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
3721b9e1
DF
1274@end multitable
1275
1276@item @emph{Fortran}:
1277@multitable @columnfractions .20 .80
4fed6b25
TB
1278@item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
1279@item @tab @code{logical, intent(in) :: nested}
3721b9e1
DF
1280@end multitable
1281
1282@item @emph{See also}:
6fae7eda
KCY
1283@ref{omp_get_nested}, @ref{omp_set_max_active_levels},
1284@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
3721b9e1
DF
1285
1286@item @emph{Reference}:
1a6d1d24 1287@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.10.
3721b9e1
DF
1288@end table
1289
1290
1291
4096bf82
JJ
1292@node omp_set_num_teams
1293@section @code{omp_set_num_teams} -- Set upper teams limit for teams construct
1294@table @asis
1295@item @emph{Description}:
1296Specifies the upper bound for the number of teams created by the teams construct
1297which does not specify a @code{num_teams} clause. The
1298argument of @code{omp_set_num_teams} shall be a positive integer.
1299
1300@item @emph{C/C++}:
1301@multitable @columnfractions .20 .80
1302@item @emph{Prototype}: @tab @code{void omp_set_num_teams(int num_teams);}
1303@end multitable
1304
1305@item @emph{Fortran}:
1306@multitable @columnfractions .20 .80
1307@item @emph{Interface}: @tab @code{subroutine omp_set_num_teams(num_teams)}
1308@item @tab @code{integer, intent(in) :: num_teams}
1309@end multitable
1310
1311@item @emph{See also}:
1312@ref{OMP_NUM_TEAMS}, @ref{omp_get_num_teams}, @ref{omp_get_max_teams}
1313
1314@item @emph{Reference}:
1315@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.3.
1316@end table
1317
1318
1319
3721b9e1
DF
1320@node omp_set_num_threads
1321@section @code{omp_set_num_threads} -- Set upper team size limit
1322@table @asis
1323@item @emph{Description}:
1324Specifies the number of threads used by default in subsequent parallel
83fd6c5b
TB
1325regions, if those do not specify a @code{num_threads} clause. The
1326argument of @code{omp_set_num_threads} shall be a positive integer.
3721b9e1 1327
3721b9e1
DF
1328@item @emph{C/C++}:
1329@multitable @columnfractions .20 .80
4fed6b25 1330@item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
3721b9e1
DF
1331@end multitable
1332
1333@item @emph{Fortran}:
1334@multitable @columnfractions .20 .80
4fed6b25
TB
1335@item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
1336@item @tab @code{integer, intent(in) :: num_threads}
3721b9e1
DF
1337@end multitable
1338
1339@item @emph{See also}:
1340@ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}
1341
1342@item @emph{Reference}:
1a6d1d24 1343@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.1.
5c6ed53a
TB
1344@end table
1345
1346
1347
1348@node omp_set_schedule
1349@section @code{omp_set_schedule} -- Set the runtime scheduling method
1350@table @asis
1351@item @emph{Description}:
83fd6c5b 1352Sets the runtime scheduling method. The @var{kind} argument can have the
5c6ed53a 1353value @code{omp_sched_static}, @code{omp_sched_dynamic},
83fd6c5b 1354@code{omp_sched_guided} or @code{omp_sched_auto}. Except for
5c6ed53a 1355@code{omp_sched_auto}, the chunk size is set to the value of
d9a6bd32
JJ
1356@var{chunk_size} if positive, or to the default value if zero or negative.
1357For @code{omp_sched_auto} the @var{chunk_size} argument is ignored.
5c6ed53a
TB
1358
1359@item @emph{C/C++}:
1360@multitable @columnfractions .20 .80
d9a6bd32 1361@item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int chunk_size);}
5c6ed53a
TB
1362@end multitable
1363
1364@item @emph{Fortran}:
1365@multitable @columnfractions .20 .80
d9a6bd32 1366@item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, chunk_size)}
5c6ed53a 1367@item @tab @code{integer(kind=omp_sched_kind) kind}
d9a6bd32 1368@item @tab @code{integer chunk_size}
5c6ed53a
TB
1369@end multitable
1370
1371@item @emph{See also}:
1372@ref{omp_get_schedule},
1373@ref{OMP_SCHEDULE}
1374
1375@item @emph{Reference}:
1a6d1d24 1376@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.12.
3721b9e1
DF
1377@end table
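
@code{omp_set_schedule} only affects loops that use @code{schedule(runtime)}, and it pairs naturally with @code{omp_get_schedule} for saving and restoring the previous setting. The following is a sketch under stated assumptions: the helper name @code{sum_with_dynamic_chunks} is hypothetical, and the serial fallbacks (including the stand-in @code{omp_sched_t}) exist only so that the code also builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch).  */
typedef enum { omp_sched_static = 1, omp_sched_dynamic = 2,
               omp_sched_guided = 3, omp_sched_auto = 4 } omp_sched_t;
static omp_sched_t stub_kind = omp_sched_static;
static int stub_chunk = 0;
static void omp_set_schedule(omp_sched_t kind, int chunk_size)
{ stub_kind = kind; stub_chunk = chunk_size; }
static void omp_get_schedule(omp_sched_t *kind, int *chunk_size)
{ *kind = stub_kind; *chunk_size = stub_chunk; }
#endif

/* Hypothetical helper: sum 0..n-1 in a schedule(runtime) loop that is
   run with dynamic scheduling and the given chunk size, then restore
   the schedule that was active before the call.  */
long sum_with_dynamic_chunks(int n, int chunk)
{
  omp_sched_t old_kind;
  int old_chunk;
  long total = 0;

  omp_get_schedule(&old_kind, &old_chunk);
  omp_set_schedule(omp_sched_dynamic, chunk);

#pragma omp parallel for schedule(runtime) reduction(+:total)
  for (int i = 0; i < n; i++)
    total += i;

  omp_set_schedule(old_kind, old_chunk);
  return total;
}
```

Restoring the saved values keeps the helper from changing the scheduling of unrelated @code{schedule(runtime)} loops in the caller.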
1378
1379
1380
4096bf82
JJ
1381@node omp_set_teams_thread_limit
1382@section @code{omp_set_teams_thread_limit} -- Set upper thread limit for teams construct
1383@table @asis
1384@item @emph{Description}:
1385Specifies the upper bound for the number of threads that will be available
1386for each team created by the teams construct which does not specify a
1387@code{thread_limit} clause. The argument of
1388@code{omp_set_teams_thread_limit} shall be a positive integer.
1389
1390@item @emph{C/C++}:
1391@multitable @columnfractions .20 .80
1392@item @emph{Prototype}: @tab @code{void omp_set_teams_thread_limit(int thread_limit);}
1393@end multitable
1394
1395@item @emph{Fortran}:
1396@multitable @columnfractions .20 .80
1397@item @emph{Interface}: @tab @code{subroutine omp_set_teams_thread_limit(thread_limit)}
1398@item @tab @code{integer, intent(in) :: thread_limit}
1399@end multitable
1400
1401@item @emph{See also}:
1402@ref{OMP_TEAMS_THREAD_LIMIT}, @ref{omp_get_teams_thread_limit}, @ref{omp_get_thread_limit}
1403
1404@item @emph{Reference}:
1405@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.5.
1406@end table
1407
1408
1409
3721b9e1
DF
1410@node omp_init_lock
1411@section @code{omp_init_lock} -- Initialize simple lock
1412@table @asis
1413@item @emph{Description}:
83fd6c5b 1414Initialize a simple lock. After initialization, the lock is in
3721b9e1
DF
1415an unlocked state.
1416
1417@item @emph{C/C++}:
1418@multitable @columnfractions .20 .80
1419@item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
1420@end multitable
1421
1422@item @emph{Fortran}:
1423@multitable @columnfractions .20 .80
4fed6b25
TB
1424@item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
1425@item @tab @code{integer(omp_lock_kind), intent(out) :: svar}
3721b9e1
DF
1426@end multitable
1427
1428@item @emph{See also}:
1429@ref{omp_destroy_lock}
1430
1431@item @emph{Reference}:
1a6d1d24 1432@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
3721b9e1
DF
1433@end table
1434
1435
1436
1437@node omp_set_lock
1438@section @code{omp_set_lock} -- Wait for and set simple lock
1439@table @asis
1440@item @emph{Description}:
1441Before setting a simple lock, the lock variable must be initialized by
83fd6c5b
TB
1442@code{omp_init_lock}. The calling thread is blocked until the lock
1443is available. If the lock is already held by the current thread,
3721b9e1
DF
1444a deadlock occurs.
1445
1446@item @emph{C/C++}:
1447@multitable @columnfractions .20 .80
1448@item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
1449@end multitable
1450
1451@item @emph{Fortran}:
1452@multitable @columnfractions .20 .80
4fed6b25
TB
1453@item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
1454@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1455@end multitable
1456
1457@item @emph{See also}:
1458@ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
1459
1460@item @emph{Reference}:
1a6d1d24 1461@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
3721b9e1
DF
1462@end table
1463
1464
1465
1466@node omp_test_lock
1467@section @code{omp_test_lock} -- Test and set simple lock if available
1468@table @asis
1469@item @emph{Description}:
1470Before setting a simple lock, the lock variable must be initialized by
83fd6c5b
TB
1471@code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock}
1472does not block if the lock is not available. This function returns
1473@code{true} upon success, @code{false} otherwise. Here, @code{true} and
3721b9e1
DF
1474@code{false} represent their language-specific counterparts.
1475
1476@item @emph{C/C++}:
1477@multitable @columnfractions .20 .80
1478@item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
1479@end multitable
1480
1481@item @emph{Fortran}:
1482@multitable @columnfractions .20 .80
4fed6b25
TB
1483@item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
1484@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1485@end multitable
1486
1487@item @emph{See also}:
1488@ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
1489
1490@item @emph{Reference}:
1a6d1d24 1491@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
3721b9e1
DF
1492@end table
1493
1494
1495
1496@node omp_unset_lock
1497@section @code{omp_unset_lock} -- Unset simple lock
1498@table @asis
1499@item @emph{Description}:
1500A simple lock about to be unset must have been locked by @code{omp_set_lock}
83fd6c5b
TB
1501or @code{omp_test_lock} before. In addition, the lock must be held by the
1502thread calling @code{omp_unset_lock}. Then, the lock becomes unlocked. If one
1503or more threads attempted to set the lock before, one of them is chosen to
20906c66 1504acquire the lock next.
3721b9e1
DF
1505
1506@item @emph{C/C++}:
1507@multitable @columnfractions .20 .80
1508@item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
1509@end multitable
1510
1511@item @emph{Fortran}:
1512@multitable @columnfractions .20 .80
4fed6b25
TB
1513@item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
1514@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1515@end multitable
1516
1517@item @emph{See also}:
1518@ref{omp_set_lock}, @ref{omp_test_lock}
1519
1520@item @emph{Reference}:
1a6d1d24 1521@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
3721b9e1
DF
1522@end table
1523
1524
1525
1526@node omp_destroy_lock
1527@section @code{omp_destroy_lock} -- Destroy simple lock
1528@table @asis
1529@item @emph{Description}:
83fd6c5b 1530Destroy a simple lock. In order to be destroyed, a simple lock must be
3721b9e1
DF
1531in the unlocked state.
1532
1533@item @emph{C/C++}:
1534@multitable @columnfractions .20 .80
6a2ba183 1535@item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
3721b9e1
DF
1536@end multitable
1537
1538@item @emph{Fortran}:
1539@multitable @columnfractions .20 .80
4fed6b25
TB
1540@item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
1541@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
3721b9e1
DF
1542@end multitable
1543
1544@item @emph{See also}:
1545@ref{omp_init_lock}
1546
1547@item @emph{Reference}:
1a6d1d24 1548@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
3721b9e1
DF
1549@end table
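
The simple-lock routines are normally used together: initialize once, set and unset around each critical update, and destroy when done. The following is a minimal sketch; the helper name @code{locked_count} is hypothetical, and the serial fallbacks are an assumption added so that the code also builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch).  */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *lock) { *lock = 0; }
static void omp_set_lock(omp_lock_t *lock) { *lock = 1; }
static void omp_unset_lock(omp_lock_t *lock) { *lock = 0; }
static void omp_destroy_lock(omp_lock_t *lock) { (void) lock; }
#endif

/* Hypothetical helper: increment a shared counter n times, with each
   update protected by a simple lock.  */
int locked_count(int n)
{
  omp_lock_t lock;
  int counter = 0;

  omp_init_lock(&lock);            /* lock starts out unlocked */

#pragma omp parallel for
  for (int i = 0; i < n; i++)
    {
      omp_set_lock(&lock);         /* blocks until the lock is free */
      counter++;
      omp_unset_lock(&lock);       /* released by the owning thread */
    }

  omp_destroy_lock(&lock);         /* lock is unlocked at this point */
  return counter;
}
```

For a plain counter an @code{atomic} or @code{reduction} construct would usually be simpler; the lock is used here only to show the call sequence.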
1550
1551
1552
1553@node omp_init_nest_lock
1554@section @code{omp_init_nest_lock} -- Initialize nested lock
1555@table @asis
1556@item @emph{Description}:
83fd6c5b 1557Initialize a nested lock. After initialization, the lock is in
3721b9e1
DF
1558an unlocked state and the nesting count is set to zero.
1559
1560@item @emph{C/C++}:
1561@multitable @columnfractions .20 .80
1562@item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
1563@end multitable
1564
1565@item @emph{Fortran}:
1566@multitable @columnfractions .20 .80
4fed6b25
TB
1567@item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
1568@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
3721b9e1
DF
1569@end multitable
1570
1571@item @emph{See also}:
1572@ref{omp_destroy_nest_lock}
1573
1574@item @emph{Reference}:
1a6d1d24 1575@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
3721b9e1
DF
1576@end table
1577
1578
1579@node omp_set_nest_lock
6a2ba183 1580@section @code{omp_set_nest_lock} -- Wait for and set nested lock
3721b9e1
DF
1581@table @asis
1582@item @emph{Description}:
1583Before setting a nested lock, the lock variable must be initialized by
83fd6c5b
TB
1584@code{omp_init_nest_lock}. The calling thread is blocked until the lock
1585is available. If the lock is already held by the current thread, the
20906c66 1586nesting count for the lock is incremented.
3721b9e1
DF
1587
1588@item @emph{C/C++}:
1589@multitable @columnfractions .20 .80
1590@item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
1591@end multitable
1592
1593@item @emph{Fortran}:
1594@multitable @columnfractions .20 .80
4fed6b25
TB
1595@item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
1596@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1597@end multitable
1598
1599@item @emph{See also}:
1600@ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
1601
1602@item @emph{Reference}:
1a6d1d24 1603@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
3721b9e1
DF
1604@end table
1605
1606
1607
1608@node omp_test_nest_lock
1609@section @code{omp_test_nest_lock} -- Test and set nested lock if available
1610@table @asis
1611@item @emph{Description}:
1612Before setting a nested lock, the lock variable must be initialized by
83fd6c5b 1613@code{omp_init_nest_lock}. Contrary to @code{omp_set_nest_lock},
3721b9e1
DF
1614@code{omp_test_nest_lock} does not block if the lock is not available.
1615If the lock is available or already held by the calling thread, it is set
83fd6c5b 1616and the new nesting count is returned. Otherwise, the return value is zero.
3721b9e1
DF
1617
1618@item @emph{C/C++}:
1619@multitable @columnfractions .20 .80
1620@item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
1621@end multitable
1622
1623@item @emph{Fortran}:
1624@multitable @columnfractions .20 .80
4fed6b25
TB
1625@item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(nvar)}
1626@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1627@end multitable
1628
1629
1630@item @emph{See also}:
1631@ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}
1632
1633@item @emph{Reference}:
1a6d1d24 1634@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
3721b9e1
DF
1635@end table
1636
1637
1638
1639@node omp_unset_nest_lock
1640@section @code{omp_unset_nest_lock} -- Unset nested lock
1641@table @asis
1642@item @emph{Description}:
1643A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
83fd6c5b
TB
1644or @code{omp_test_nest_lock} before. In addition, the lock must be held by the
1645thread calling @code{omp_unset_nest_lock}. If the nesting count drops to zero, the
1646lock becomes unlocked. If one or more threads attempted to set the lock before,
20906c66 1647one of them is chosen to acquire the lock.
3721b9e1
DF
1648
1649@item @emph{C/C++}:
1650@multitable @columnfractions .20 .80
1651@item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
1652@end multitable
1653
1654@item @emph{Fortran}:
1655@multitable @columnfractions .20 .80
4fed6b25
TB
1656@item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
1657@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1658@end multitable
1659
1660@item @emph{See also}:
1661@ref{omp_set_nest_lock}
1662
1663@item @emph{Reference}:
1a6d1d24 1664@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
3721b9e1
DF
1665@end table
1666
1667
1668
1669@node omp_destroy_nest_lock
1670@section @code{omp_destroy_nest_lock} -- Destroy nested lock
1671@table @asis
1672@item @emph{Description}:
83fd6c5b 1673Destroy a nested lock. In order to be destroyed, a nested lock must be
3721b9e1
DF
1674in the unlocked state and its nesting count must equal zero.
1675
1676@item @emph{C/C++}:
1677@multitable @columnfractions .20 .80
1678@item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *);}
1679@end multitable
1680
1681@item @emph{Fortran}:
1682@multitable @columnfractions .20 .80
4fed6b25
TB
1683@item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
1684@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
3721b9e1
DF
1685@end multitable
1686
1687@item @emph{See also}:
1688@ref{omp_init_nest_lock}
1689
1690@item @emph{Reference}:
1a6d1d24 1691@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
3721b9e1
DF
1692@end table
1693
1694
1695
1696@node omp_get_wtick
1697@section @code{omp_get_wtick} -- Get timer precision
1698@table @asis
1699@item @emph{Description}:
f1b0882e 1700Gets the timer precision, i.e., the number of seconds between two
3721b9e1
DF
1701successive clock ticks.
1702
1703@item @emph{C/C++}:
1704@multitable @columnfractions .20 .80
6a2ba183 1705@item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
3721b9e1
DF
1706@end multitable
1707
1708@item @emph{Fortran}:
1709@multitable @columnfractions .20 .80
1710@item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
1711@end multitable
1712
1713@item @emph{See also}:
1714@ref{omp_get_wtime}
1715
1716@item @emph{Reference}:
1a6d1d24 1717@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.2.
3721b9e1
DF
1718@end table
1719
1720
1721
1722@node omp_get_wtime
1723@section @code{omp_get_wtime} -- Elapsed wall clock time
1724@table @asis
1725@item @emph{Description}:
83fd6c5b 1726Elapsed wall clock time in seconds. The time is measured per thread; no
6a2ba183 1727guarantee can be made that two distinct threads measure the same time.
21e1e594
JJ
1728Time is measured from some ``time in the past'', which is an arbitrary time
1729guaranteed not to change during the execution of the program.
3721b9e1
DF
1730
1731@item @emph{C/C++}:
1732@multitable @columnfractions .20 .80
6a2ba183 1733@item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
3721b9e1
DF
1734@end multitable
1735
1736@item @emph{Fortran}:
1737@multitable @columnfractions .20 .80
1738@item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
1739@end multitable
1740
1741@item @emph{See also}:
1742@ref{omp_get_wtick}
1743
1744@item @emph{Reference}:
1a6d1d24 1745@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.1.
3721b9e1
DF
1746@end table
1747
1748
1749
0194e2f0
KCY
1750@node omp_fulfill_event
1751@section @code{omp_fulfill_event} -- Fulfill and destroy an OpenMP event
1752@table @asis
1753@item @emph{Description}:
1754Fulfill the event associated with the event handle argument. Currently, it
1755is only used to fulfill events generated by @code{detach} clauses on task
1756constructs; fulfilling the event allows the task to
1757complete.
1758
1759The result of calling @code{omp_fulfill_event} with an event handle other
1760than that generated by a detach clause is undefined. Calling it with an
1761event handle that has already been fulfilled is also undefined.
1762
1763@item @emph{C/C++}:
1764@multitable @columnfractions .20 .80
1765@item @emph{Prototype}: @tab @code{void omp_fulfill_event(omp_event_handle_t event);}
1766@end multitable
1767
1768@item @emph{Fortran}:
1769@multitable @columnfractions .20 .80
1770@item @emph{Interface}: @tab @code{subroutine omp_fulfill_event(event)}
1771@item @tab @code{integer (kind=omp_event_handle_kind) :: event}
1772@end multitable
1773
1774@item @emph{Reference}:
1775@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.5.1.
1776@end table
1777
1778
1779
3721b9e1 1780@c ---------------------------------------------------------------------
4102bda6 1781@c OpenMP Environment Variables
3721b9e1
DF
1782@c ---------------------------------------------------------------------
1783
1784@node Environment Variables
4102bda6 1785@chapter OpenMP Environment Variables
3721b9e1 1786
acf0174b 1787The environment variables beginning with @env{OMP_} are defined by
00b9bd52 1788section 4 of the OpenMP specification in version 4.5, while those
acf0174b 1789beginning with @env{GOMP_} are GNU extensions.
3721b9e1
DF
1790
1791@menu
06441dd5
SH
1792* OMP_CANCELLATION:: Set whether cancellation is activated
1793* OMP_DISPLAY_ENV:: Show OpenMP version and environment variables
1794* OMP_DEFAULT_DEVICE:: Set the device used in target regions
1795* OMP_DYNAMIC:: Dynamic adjustment of threads
1796* OMP_MAX_ACTIVE_LEVELS:: Set the maximum number of nested parallel regions
d9a6bd32 1797* OMP_MAX_TASK_PRIORITY:: Set the maximum task priority value
06441dd5 1798* OMP_NESTED:: Nested parallel regions
4096bf82 1799* OMP_NUM_TEAMS:: Specifies the number of teams to use in teams regions
06441dd5
SH
1800* OMP_NUM_THREADS:: Specifies the number of threads to use
1801* OMP_PROC_BIND:: Whether threads may be moved between CPUs
1802* OMP_PLACES:: Specifies on which CPUs the threads should be placed
1803* OMP_STACKSIZE:: Set default thread stack size
1804* OMP_SCHEDULE:: How threads are scheduled
1bfc07d1 1805* OMP_TARGET_OFFLOAD:: Controls offloading behaviour
4096bf82 1806* OMP_TEAMS_THREAD_LIMIT:: Set the maximum number of threads imposed by teams
06441dd5
SH
1807* OMP_THREAD_LIMIT:: Set the maximum number of threads
1808* OMP_WAIT_POLICY:: How waiting threads are handled
1809* GOMP_CPU_AFFINITY:: Bind threads to specific CPUs
1810* GOMP_DEBUG:: Enable debugging output
1811* GOMP_STACKSIZE:: Set default thread stack size
1812* GOMP_SPINCOUNT:: Set the busy-wait spin count
1813* GOMP_RTEMS_THREAD_POOLS:: Set the RTEMS specific thread pools
3721b9e1
DF
1814@end menu
1815
1816
83fd6c5b
TB
1817@node OMP_CANCELLATION
1818@section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
1819@cindex Environment Variable
1820@table @asis
1821@item @emph{Description}:
1822If set to @code{TRUE}, cancellation is activated. If set to @code{FALSE} or
1823if unset, cancellation is disabled and the @code{cancel} construct is ignored.
1824
1825@item @emph{See also}:
1826@ref{omp_get_cancellation}
1827
1828@item @emph{Reference}:
1a6d1d24 1829@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.11
83fd6c5b
TB
1830@end table
1831
1832
1833
1834@node OMP_DISPLAY_ENV
1835@section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
1836@cindex Environment Variable
1837@table @asis
1838@item @emph{Description}:
1839If set to @code{TRUE}, the OpenMP version number and the values
1840associated with the OpenMP environment variables are printed to @code{stderr}.
1841If set to @code{VERBOSE}, it additionally shows the value of the environment
1842variables which are GNU extensions. If undefined or set to @code{FALSE},
1843this information will not be shown.
1844
1845
1846@item @emph{Reference}:
1a6d1d24 1847@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.12
83fd6c5b
TB
1848@end table
1849
1850
1851
1852@node OMP_DEFAULT_DEVICE
1853@section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
1854@cindex Environment Variable
1855@table @asis
1856@item @emph{Description}:
1857Set to choose the device which is used in a @code{target} region, unless the
1858value is overridden by @code{omp_set_default_device} or by a @code{device}
1859clause. The value shall be the nonnegative device number. If no device with
1860the given device number exists, the code is executed on the host. If unset,
1861device number 0 will be used.
1862
1863
1864@item @emph{See also}:
1865@ref{omp_get_default_device}, @ref{omp_set_default_device}
1866
1867@item @emph{Reference}:
1a6d1d24 1868@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.13
83fd6c5b
TB
1869@end table
1870
1871
1872
3721b9e1
DF
1873@node OMP_DYNAMIC
1874@section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
1875@cindex Environment Variable
1876@table @asis
1877@item @emph{Description}:
1878Enable or disable the dynamic adjustment of the number of threads
83fd6c5b
TB
1879within a team. The value of this environment variable shall be
1880@code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
7c2b7f45 1881disabled by default.
3721b9e1
DF
1882
1883@item @emph{See also}:
1884@ref{omp_set_dynamic}
1885
1886@item @emph{Reference}:
1a6d1d24 1887@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.3
5c6ed53a
TB
1888@end table
1889
1890
1891
1892@node OMP_MAX_ACTIVE_LEVELS
6a2ba183 1893@section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
5c6ed53a
TB
1894@cindex Environment Variable
1895@table @asis
1896@item @emph{Description}:
6a2ba183 1897Specifies the initial value for the maximum number of nested parallel
83fd6c5b 1898regions. The value of this variable shall be a positive integer.
6fae7eda
KCY
1899If undefined, then if @env{OMP_NESTED} is defined and set to true, or
1900if @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined and set to
1901a list with more than one item, the maximum number of nested parallel
1902regions will be initialized to the largest number supported, otherwise
1903it will be set to one.
5c6ed53a
TB
1904
1905@item @emph{See also}:
6fae7eda 1906@ref{omp_set_max_active_levels}, @ref{OMP_NESTED}
5c6ed53a
TB
1907
1908@item @emph{Reference}:
1a6d1d24 1909@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.9
3721b9e1
DF
1910@end table
1911
1912
1913
d9a6bd32
JJ
1914@node OMP_MAX_TASK_PRIORITY
1915@section @env{OMP_MAX_TASK_PRIORITY} -- Set the maximum priority number that can be set for a task
1917@cindex Environment Variable
1918@table @asis
1919@item @emph{Description}:
1920Specifies the initial value for the maximum priority value that can be
1921set for a task. The value of this variable shall be a non-negative
1922integer. If undefined, the maximum priority defaults to
19230.
1924
1925@item @emph{See also}:
1926@ref{omp_get_max_task_priority}
1927
1928@item @emph{Reference}:
1a6d1d24 1929@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.14
d9a6bd32
JJ
1930@end table
1931
1932
1933
3721b9e1
DF
1934@node OMP_NESTED
1935@section @env{OMP_NESTED} -- Nested parallel regions
1936@cindex Environment Variable
14734fc7 1937@cindex Implementation specific setting
3721b9e1
DF
1938@table @asis
1939@item @emph{Description}:
f1b0882e 1940Enable or disable nested parallel regions, i.e., whether team members
83fd6c5b 1941are allowed to create new teams. The value of this environment variable
6fae7eda
KCY
1942shall be @code{TRUE} or @code{FALSE}. If set to @code{TRUE}, the maximum
1943number of active nested regions will by default be set to the
1944maximum supported; otherwise it will be set to one. If
1945@env{OMP_MAX_ACTIVE_LEVELS} is defined, its setting will override this
1946setting. If both are undefined, nested parallel regions are enabled if
1947@env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined to a list with
1948more than one item; otherwise they are disabled by default.
3721b9e1
DF
1949
1950@item @emph{See also}:
6fae7eda 1951@ref{omp_set_max_active_levels}, @ref{omp_set_nested}
3721b9e1
DF
1952
1953@item @emph{Reference}:
1a6d1d24 1954@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.6
3721b9e1
DF
1955@end table
1956
1957
1958
4096bf82
JJ
1959@node OMP_NUM_TEAMS
1960@section @env{OMP_NUM_TEAMS} -- Specifies the number of teams to use in teams regions
1961@cindex Environment Variable
1962@table @asis
1963@item @emph{Description}:
1964Specifies the upper bound for the number of teams to use in teams regions
1965without an explicit @code{num_teams} clause. The value of this variable shall
1966be a positive integer. If undefined, it defaults to 0, which means an
1967implementation-defined upper bound.
1968
1969@item @emph{See also}:
1970@ref{omp_set_num_teams}
1971
1972@item @emph{Reference}:
1973@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.23
1974@end table
1975
1976
1977
3721b9e1
DF
1978@node OMP_NUM_THREADS
1979@section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
1980@cindex Environment Variable
14734fc7 1981@cindex Implementation specific setting
3721b9e1
DF
1982@table @asis
1983@item @emph{Description}:
83fd6c5b 1984Specifies the default number of threads to use in parallel regions. The
20906c66 1985value of this variable shall be a comma-separated list of positive integers;
6fae7eda
KCY
1986the value specifies the number of threads to use for the corresponding nested
1987level. Specifying more than one item in the list will automatically enable
1988nesting by default. If undefined, one thread per CPU is used.
3721b9e1
DF
1989
1990@item @emph{See also}:
6fae7eda 1991@ref{omp_set_num_threads}, @ref{OMP_NESTED}
3721b9e1
DF
1992
1993@item @emph{Reference}:
1a6d1d24 1994@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.2
83fd6c5b
TB
1995@end table
1996
1997
1998
72832460
UB
1999@node OMP_PROC_BIND
2000@section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
2001@cindex Environment Variable
2002@table @asis
2003@item @emph{Description}:
2004Specifies whether threads may be moved between processors. If set to
2005@code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
2006they may be moved. Alternatively, a comma-separated list with the
432de084
TB
2007values @code{PRIMARY}, @code{MASTER}, @code{CLOSE} and @code{SPREAD} can
2008be used to specify the thread affinity policy for the corresponding nesting
2009level. With @code{PRIMARY} and @code{MASTER} the worker threads are in the
2010same place partition as the primary thread. With @code{CLOSE} those are
2011kept close to the primary thread in contiguous place partitions. With
2012@code{SPREAD} a sparse distribution
6fae7eda
KCY
2013across the place partitions is used. Specifying more than one item in the
2014list will automatically enable nesting by default.
72832460
UB
2015
2016When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
2017@env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
2018
2019@item @emph{See also}:
6fae7eda
KCY
2020@ref{omp_get_proc_bind}, @ref{GOMP_CPU_AFFINITY},
2021@ref{OMP_NESTED}, @ref{OMP_PLACES}
72832460
UB
2022
2023@item @emph{Reference}:
1a6d1d24 2024@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.4
72832460
UB
2025@end table
2026
2027
2028
83fd6c5b
TB
2029@node OMP_PLACES
2030@section @env{OMP_PLACES} -- Specifies on which CPUs the threads should be placed
2031@cindex Environment Variable
2032@table @asis
2033@item @emph{Description}:
2034The thread placement can be either specified using an abstract name or by an
b09af562
TB
2035explicit list of the places. The abstract names @code{threads}, @code{cores},
2036@code{sockets}, @code{ll_caches} and @code{numa_domains} can be optionally
2037followed by a positive number in parentheses, which denotes how many places
2038shall be created. With @code{threads} each place corresponds to a single
2039hardware thread; with @code{cores} to a single core with the corresponding number of
2040hardware threads; with @code{sockets} to a single
2041socket; with @code{ll_caches} to a set of cores that share the last level
2042cache on the device; and with @code{numa_domains} to a set of cores for which their
2043closest memory on the device is the same memory and at a similar distance from
2044the cores. The resulting placement can be shown by setting the
2045@env{OMP_DISPLAY_ENV} environment variable.
83fd6c5b
TB
2046
2047Alternatively, the placement can be specified explicitly as a comma-separated
2048list of places. A place is specified by a set of nonnegative numbers in curly
b09af562
TB
2049braces, denoting the hardware threads. The curly braces can be omitted
2050when only a single number has been specified. The hardware threads
83fd6c5b
TB
2051belonging to a place can either be specified as a comma-separated list of
2052nonnegative thread numbers or using an interval. Multiple places can also be
2053either specified by a comma-separated list of places or by an interval. To
b09af562 2054specify an interval, a colon followed by the count is placed after
83fd6c5b
TB
2055the hardware thread number or the place. Optionally, the count can be
2056followed by a colon and the stride number -- otherwise a unit stride is
b09af562
TB
2057assumed. Placing an exclamation mark (@code{!}) directly before a curly
2058brace or numbers inside the curly braces (excluding intervals) will
2059exclude those hardware threads.
2060
2061For instance, the following specifies the same places list:
83fd6c5b
TB
2062@code{"@{0,1,2@}, @{3,4,5@}, @{6,7,8@}, @{9,10,11@}"};
2063@code{"@{0:3@}, @{3:3@}, @{6:3@}, @{9:3@}"}; and @code{"@{0:3@}:4:3"}.
2064
2065If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
2066@env{OMP_PROC_BIND} is either unset or @code{FALSE}, threads may be moved
2067between CPUs following no placement policy.
2068
2069@item @emph{See also}:
2070@ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
2071@ref{OMP_DISPLAY_ENV}
2072
2073@item @emph{Reference}:
1a6d1d24 2074@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.5
83fd6c5b
TB
2075@end table
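
To make the interval notation concrete, the following C sketch expands a
place interval of the form @code{@{first:len@}:count:stride} into the
equivalent explicit list of places. The function
@code{expand_place_interval} is a hypothetical illustration and is not how
libgomp parses @env{OMP_PLACES}; it only handles a unit stride inside each
place.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of libgomp: write the explicit form of
   the place interval "{first:len}:count:stride" into buf.  Hardware
   threads inside each place use a unit stride.  Returns the number of
   characters written, or -1 if buf is too small.  */
int
expand_place_interval (int first, int len, int count, int stride,
                       char *buf, size_t size)
{
  size_t used = 0;

  if (size > 0)
    buf[0] = '\0';
  for (int p = 0; p < count; p++)
    for (int t = 0; t < len; t++)
      {
        /* Open a new place before its first thread, close it after
           its last one, and separate everything with commas.  */
        const char *open = (t == 0) ? (p == 0 ? "{" : ",{") : ",";
        const char *close = (t == len - 1) ? "}" : "";
        int n = snprintf (buf + used, size - used, "%s%d%s",
                          open, first + p * stride + t, close);
        if (n < 0 || (size_t) n >= size - used)
          return -1;
        used += (size_t) n;
      }
  return (int) used;
}
```

Calling it with @code{first = 0}, @code{len = 3}, @code{count = 4} and
@code{stride = 3} yields four places of three consecutive hardware threads
each, starting at threads 0, 3, 6 and 9.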
2076
2077
2078
72832460
UB
2079@node OMP_STACKSIZE
2080@section @env{OMP_STACKSIZE} -- Set default thread stack size
83fd6c5b
TB
2081@cindex Environment Variable
2082@table @asis
2083@item @emph{Description}:
72832460
UB
2084Set the default thread stack size in kilobytes, unless the number
2085is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
2086case the size is, respectively, in bytes, kilobytes, megabytes
2087or gigabytes. This is different from @code{pthread_attr_setstacksize}
2088which gets the number of bytes as an argument. If the stack size cannot
2089be set due to system constraints, an error is reported and the initial
2090stack size is left unchanged. If undefined, the stack size is system
2091dependent.
83fd6c5b 2092
72832460 2093@item @emph{Reference}:
1a6d1d24 2094@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.7
3721b9e1
DF
2095@end table
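
The unit handling described above can be sketched in C as follows. The
function @code{parse_stacksize} is hypothetical and is not libgomp's parser;
treating the suffix as case-insensitive and allowing white space around it
are assumptions of this sketch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not libgomp's parser: convert an
   OMP_STACKSIZE-style string to bytes.  A bare number is taken as
   kilobytes; the suffixes B, K, M and G select bytes, kilobytes,
   megabytes and gigabytes.  Returns 0 on success and -1 on malformed
   input or overflow.  */
int
parse_stacksize (const char *s, unsigned long long *bytes)
{
  char *end;
  unsigned long long value, scale = 1024ULL;  /* default unit: kilobytes */

  while (isspace ((unsigned char) *s))
    s++;
  if (!isdigit ((unsigned char) *s))
    return -1;
  errno = 0;
  value = strtoull (s, &end, 10);
  if (errno != 0)
    return -1;
  while (isspace ((unsigned char) *end))
    end++;
  switch (toupper ((unsigned char) *end))
    {
    case 'B': scale = 1ULL; end++; break;
    case 'K': scale = 1024ULL; end++; break;
    case 'M': scale = 1024ULL * 1024; end++; break;
    case 'G': scale = 1024ULL * 1024 * 1024; end++; break;
    default: break;
    }
  while (isspace ((unsigned char) *end))
    end++;
  if (*end != '\0' || value > ~0ULL / scale)
    return -1;
  *bytes = value * scale;
  return 0;
}
```

With this sketch, @code{"512"} and @code{"512K"} both give 524288 bytes,
which is the value one would pass to @code{pthread_attr_setstacksize}.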
2096
2097
2098
2099@node OMP_SCHEDULE
2100@section @env{OMP_SCHEDULE} -- How threads are scheduled
2101@cindex Environment Variable
14734fc7 2102@cindex Implementation specific setting
3721b9e1
DF
2103@table @asis
2104@item @emph{Description}:
2105Specifies the @code{schedule type} and @code{chunk size}.
2106The value of the variable shall have the form: @code{type[,chunk]} where
5c6ed53a 2107@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or @code{auto}.
83fd6c5b 2108The optional @code{chunk} size shall be a positive integer. If undefined,
7c2b7f45 2109dynamic scheduling and a chunk size of 1 are used.
3721b9e1 2110
5c6ed53a
TB
2111@item @emph{See also}:
2112@ref{omp_set_schedule}
2113
2114@item @emph{Reference}:
1a6d1d24 2115@uref{https://www.openmp.org, OpenMP specification v4.5}, Sections 2.7.1.1 and 4.1
5c6ed53a
TB
2116@end table
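
The @code{type[,chunk]} form can be illustrated with a small C sketch. The
names @code{sketch_kind} and @code{parse_schedule} are hypothetical and this
is not libgomp's parser; in particular it only accepts the lower-case
spellings listed above and no surrounding white space.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical enumeration; the order matches the names[] table below.  */
enum sketch_kind { KIND_STATIC, KIND_DYNAMIC, KIND_GUIDED, KIND_AUTO };

/* Hypothetical helper, not libgomp's parser: split a "type[,chunk]"
   string.  On success stores the kind and the chunk size (0 if none
   was given) and returns 0; returns -1 on malformed input.  */
int
parse_schedule (const char *s, enum sketch_kind *kind, long *chunk)
{
  static const char *const names[] = { "static", "dynamic", "guided", "auto" };
  const char *comma = strchr (s, ',');
  size_t len = comma ? (size_t) (comma - s) : strlen (s);
  int found = -1;

  for (int i = 0; i < 4; i++)
    if (strlen (names[i]) == len && strncmp (s, names[i], len) == 0)
      found = i;
  if (found < 0)
    return -1;

  *kind = (enum sketch_kind) found;
  *chunk = 0;
  if (comma)
    {
      char *end;
      long value = strtol (comma + 1, &end, 10);
      /* The chunk size must be a positive integer with nothing after it.  */
      if (end == comma + 1 || *end != '\0' || value <= 0)
        return -1;
      *chunk = value;
    }
  return 0;
}
```

For example, @code{"guided,4"} selects guided scheduling with a chunk size
of 4, while @code{"static"} leaves the chunk size unspecified.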
2117
2118
2119
1bfc07d1
KCY
2120@node OMP_TARGET_OFFLOAD
2121@section @env{OMP_TARGET_OFFLOAD} -- Controls offloading behaviour
2122@cindex Environment Variable
2123@cindex Implementation specific setting
2124@table @asis
2125@item @emph{Description}:
2126Specifies the behaviour with regard to offloading code to a device. This
2127variable can be set to one of three values: @code{MANDATORY}, @code{DISABLED}
2128or @code{DEFAULT}.
2129
2130If set to @code{MANDATORY}, the program will terminate with an error if
2131the offload device is not present or is not supported. If set to
2132@code{DISABLED}, then offloading is disabled and all code will run on the
2133host. If set to @code{DEFAULT}, the program will try offloading to the
2134device first, then fall back to running code on the host if it cannot.
2135
2136If undefined, then the program will behave as if @code{DEFAULT} was set.
2137
2138@item @emph{Reference}:
2139@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.17
2140@end table
2141
2142
2143
4096bf82
JJ
2144@node OMP_TEAMS_THREAD_LIMIT
2145@section @env{OMP_TEAMS_THREAD_LIMIT} -- Set the maximum number of threads imposed by teams
2146@cindex Environment Variable
2147@table @asis
2148@item @emph{Description}:
2149Specifies an upper bound for the number of threads used by each contention
2150group created by a teams construct without an explicit @code{thread_limit}
2151clause. The value of this variable shall be a positive integer. If undefined,
2152the value 0 is used, which stands for an implementation-defined upper
2153limit.
2154
2155@item @emph{See also}:
2156@ref{OMP_THREAD_LIMIT}, @ref{omp_set_teams_thread_limit}
2157
2158@item @emph{Reference}:
2159@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.24
2160@end table
2161
2162
2163
5c6ed53a 2164@node OMP_THREAD_LIMIT
6a2ba183 2165@section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
5c6ed53a
TB
2166@cindex Environment Variable
2167@table @asis
2168@item @emph{Description}:
83fd6c5b
TB
2169Specifies the maximum number of threads to use for the whole program. The
2170value of this variable shall be a positive integer. If undefined,
5c6ed53a
TB
2171the number of threads is not limited.
2172
2173@item @emph{See also}:
83fd6c5b 2174@ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}
5c6ed53a
TB
2175
2176@item @emph{Reference}:
1a6d1d24 2177@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.10
5c6ed53a
TB
2178@end table
2179
2180
2181
2182@node OMP_WAIT_POLICY
2183@section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
2184@cindex Environment Variable
2185@table @asis
2186@item @emph{Description}:
83fd6c5b 2187Specifies whether waiting threads should be active or passive. If
5c6ed53a
TB
2188the value is @code{PASSIVE}, waiting threads should not consume CPU
2189power while waiting; the value @code{ACTIVE} specifies that
83fd6c5b 2190they should. If undefined, threads wait actively for a short time
acf0174b
JJ
2191before waiting passively.
2192
2193@item @emph{See also}:
2194@ref{GOMP_SPINCOUNT}
5c6ed53a
TB
2195
2196@item @emph{Reference}:
1a6d1d24 2197@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.8
3721b9e1
DF
2198@end table
2199
2200
2201
2202@node GOMP_CPU_AFFINITY
2203@section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
2204@cindex Environment Variable
2205@table @asis
2206@item @emph{Description}:
83fd6c5b
TB
2207Binds threads to specific CPUs. The variable should contain a space-separated
2208or comma-separated list of CPUs. This list may contain different kinds of
06785a48 2209entries: either single CPU numbers in any order, a range of CPUs (M-N)
83fd6c5b 2210or a range with some stride (M-N:S). CPU numbers are zero based. For example,
06785a48
DF
2211@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
2212to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
2213CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
2214and 14 respectively and then start assigning back from the beginning of
6a2ba183 2215the list. @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
06785a48 2216
f1f3453e 2217There is no libgomp library routine to determine whether a CPU affinity
83fd6c5b 2218specification is in effect. As a workaround, language-specific library
06785a48
DF
2219functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
2220Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY}
83fd6c5b 2221environment variable. A defined CPU affinity on startup cannot be changed
06785a48
DF
2222or disabled during the runtime of the application.
2223
83fd6c5b
TB
2224If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
2225@env{OMP_PROC_BIND} has a higher precedence. If neither has been set,
2226or when @env{OMP_PROC_BIND} is set to
2227@code{FALSE}, the host system will handle the assignment of threads to CPUs.
20906c66
JJ
2228
2229@item @emph{See also}:
83fd6c5b 2230@ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
3721b9e1
DF
2231@end table
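
The binding order that results from a given list can be worked out with a
short C sketch. The function @code{expand_cpu_affinity} is hypothetical and
is not libgomp's parser; it expands the entries in the order written and
does not check that the CPUs exist.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper, not libgomp's parser: expand a
   GOMP_CPU_AFFINITY-style list into out[], in binding order.  Entries
   are N, M-N or M-N:S, separated by spaces or commas.  Returns the
   number of CPUs stored, or -1 on malformed input or if more than max
   CPUs would be produced.  */
int
expand_cpu_affinity (const char *s, int *out, int max)
{
  int n = 0;

  for (;;)
    {
      char *end;
      long first, last, stride = 1;

      while (*s == ' ' || *s == ',')
        s++;
      if (*s == '\0')
        return n;
      if (!isdigit ((unsigned char) *s))
        return -1;
      first = last = strtol (s, &end, 10);
      if (*end == '-')                       /* range M-N */
        {
          if (!isdigit ((unsigned char) end[1]))
            return -1;
          last = strtol (end + 1, &end, 10);
          if (*end == ':')                   /* optional stride S */
            {
              if (!isdigit ((unsigned char) end[1]))
                return -1;
              stride = strtol (end + 1, &end, 10);
            }
        }
      if (last < first || stride <= 0)
        return -1;
      for (long cpu = first; cpu <= last; cpu += stride)
        {
          if (n == max)
            return -1;
          out[n++] = (int) cpu;
        }
      s = end;
    }
}
```

If the function stores @code{n} entries, thread number @code{k} (counting
the initial thread as 0) is bound to @code{out[k % n]}, which models the
wrap-around to the beginning of the list.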
2232
2233
2234
41dbbb37
TS
2235@node GOMP_DEBUG
2236@section @env{GOMP_DEBUG} -- Enable debugging output
2237@cindex Environment Variable
2238@table @asis
2239@item @emph{Description}:
2240Enable debugging output. The variable should be set to @code{0}
2241(disabled, also the default if not set), or @code{1} (enabled).
2242
2243If enabled, some debugging output will be printed during execution.
2244This is currently not specified in more detail, and subject to change.
2245@end table
2246
2247
2248
3721b9e1
DF
2249@node GOMP_STACKSIZE
2250@section @env{GOMP_STACKSIZE} -- Set default thread stack size
2251@cindex Environment Variable
14734fc7 2252@cindex Implementation specific setting
3721b9e1
DF
2253@table @asis
2254@item @emph{Description}:
83fd6c5b 2255Set the default thread stack size in kilobytes. This is different from
5c6ed53a 2256@code{pthread_attr_setstacksize} which gets the number of bytes as an
83fd6c5b
TB
2257argument. If the stack size cannot be set due to system constraints, an
2258error is reported and the initial stack size is left unchanged. If undefined,
7c2b7f45 2259the stack size is system dependent.
3721b9e1 2260
5c6ed53a 2261@item @emph{See also}:
0024f1af 2262@ref{OMP_STACKSIZE}
5c6ed53a 2263
3721b9e1 2264@item @emph{Reference}:
c1030b5c 2265@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
3721b9e1 2266GCC Patches mailing list},
c1030b5c 2267@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
3721b9e1
DF
2268GCC Patches mailing list}
2269@end table
2270
2271
2272
acf0174b
JJ
2273@node GOMP_SPINCOUNT
2274@section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
2275@cindex Environment Variable
2276@cindex Implementation specific setting
2277@table @asis
2278@item @emph{Description}:
2279Determines how long a thread waits actively, consuming CPU power,
83fd6c5b 2280before waiting passively without consuming CPU power. The value may be
acf0174b 2281either @code{INFINITE} or @code{INFINITY} to always wait actively, or an
83fd6c5b 2282integer which gives the number of spins of the busy-wait loop. The
acf0174b
JJ
2283integer may optionally be followed by one of the following suffixes acting
2284as multiplication factors: @code{k} (kilo, thousand), @code{M} (mega,
2285million), @code{G} (giga, billion), or @code{T} (tera, trillion).
2286If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
2287300,000 is used when @env{OMP_WAIT_POLICY} is undefined and
228830 billion is used when @env{OMP_WAIT_POLICY} is @code{ACTIVE}.
2289If there are more OpenMP threads than available CPUs, 1000 and 100
2290spins are used for @env{OMP_WAIT_POLICY} being @code{ACTIVE} or
2291undefined, respectively; unless the @env{GOMP_SPINCOUNT} is lower
2292or @env{OMP_WAIT_POLICY} is @code{PASSIVE}.
2293
2294@item @emph{See also}:
2295@ref{OMP_WAIT_POLICY}
2296@end table
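
The accepted value forms can be illustrated with a C sketch. The function
@code{parse_spincount} is hypothetical and is not libgomp's parser;
representing an infinite spin count by the largest @code{unsigned long long}
value and saturating on overflow are assumptions of this sketch.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not libgomp's parser: convert a
   GOMP_SPINCOUNT-style string to a spin count.  INFINITE and INFINITY
   map to the largest representable value; otherwise an integer with
   an optional k, M, G or T multiplier is expected.  Returns 0 on
   success and -1 on malformed input.  */
int
parse_spincount (const char *s, unsigned long long *spins)
{
  char *end;
  unsigned long long value, mult = 1;

  if (strcmp (s, "INFINITE") == 0 || strcmp (s, "INFINITY") == 0)
    {
      *spins = ~0ULL;
      return 0;
    }
  if (*s < '0' || *s > '9')
    return -1;
  errno = 0;
  value = strtoull (s, &end, 10);
  if (errno != 0)
    return -1;
  switch (*end)
    {
    case 'k': mult = 1000ULL; end++; break;
    case 'M': mult = 1000ULL * 1000; end++; break;
    case 'G': mult = 1000ULL * 1000 * 1000; end++; break;
    case 'T': mult = 1000ULL * 1000 * 1000 * 1000; end++; break;
    default: break;
    }
  if (*end != '\0')
    return -1;
  /* Saturate instead of overflowing.  */
  *spins = value > ~0ULL / mult ? ~0ULL : value * mult;
  return 0;
}
```

In this notation the documented defaults correspond to @code{"0"},
@code{"300k"} and @code{"30G"}.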
2297
2298
2299
06441dd5
SH
2300@node GOMP_RTEMS_THREAD_POOLS
2301@section @env{GOMP_RTEMS_THREAD_POOLS} -- Set the RTEMS specific thread pools
2302@cindex Environment Variable
2303@cindex Implementation specific setting
2304@table @asis
2305@item @emph{Description}:
2306This environment variable is only used on the RTEMS real-time operating system.
2307It determines the scheduler instance specific thread pools. The format for
2308@env{GOMP_RTEMS_THREAD_POOLS} is a list of optional
2309@code{<thread-pool-count>[$<priority>]@@<scheduler-name>} configurations
2310separated by @code{:} where:
2311@itemize @bullet
2312@item @code{<thread-pool-count>} is the thread pool count for this scheduler
2313instance.
2314@item @code{$<priority>} is an optional priority for the worker threads of a
2315thread pool according to @code{pthread_setschedparam}. In case a priority
2316value is omitted, then a worker thread will inherit the priority of the OpenMP
432de084
TB
2317primary thread that created it. The priority of the worker thread is not
2318changed after creation, even if a new OpenMP primary thread using the worker has
06441dd5
SH
2319a different priority.
2320@item @code{@@<scheduler-name>} is the scheduler instance name according to the
2321RTEMS application configuration.
2322@end itemize
2323In case no thread pool configuration is specified for a scheduler instance,
432de084 2324then each OpenMP primary thread of this scheduler instance will use its own
06441dd5 2325dynamically allocated thread pool. To limit the worker thread count of the
432de084 2326thread pools, each OpenMP primary thread must call @code{omp_set_num_threads}.
06441dd5
SH
2327@item @emph{Example}:
2328Suppose we have three scheduler instances @code{IO}, @code{WRK0}, and
2329@code{WRK1} with @env{GOMP_RTEMS_THREAD_POOLS} set to
2330@code{"1@@WRK0:3$4@@WRK1"}. Then there are no thread pool restrictions for
2331scheduler instance @code{IO}. In the scheduler instance @code{WRK0} there is
2332one thread pool available. Since no priority is specified for this scheduler
432de084 2333instance, the worker thread inherits the priority of the OpenMP primary thread
06441dd5
SH
2334that created it. In the scheduler instance @code{WRK1} there are three thread
2335pools available and their worker threads run at priority four.
2336@end table
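
The configuration format described above can be illustrated with a small parser. This is only a sketch of how such a string might be split, not the code libgomp uses on RTEMS. The type pool_config and the functions parse_entry and parse_thread_pools are hypothetical names, and the choices to skip empty list entries and to limit scheduler names to 31 characters are assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

/* One parsed "<thread-pool-count>[$<priority>]@<scheduler-name>" entry.
   Hypothetical type, introduced for illustration only.  */
struct pool_config {
  unsigned long count;     /* thread pool count for the scheduler instance */
  int has_priority;        /* 1 if "$<priority>" was given, 0 otherwise */
  unsigned long priority;  /* worker thread priority, valid if has_priority */
  char scheduler[32];      /* scheduler instance name */
};

/* Parse a single entry of LEN characters.  Return 0 on success, -1 on a
   malformed entry.  */
int parse_entry(const char *s, size_t len, struct pool_config *out)
{
  char buf[64];
  char *end, *at;

  if (len == 0 || len >= sizeof buf)
    return -1;
  memcpy(buf, s, len);
  buf[len] = '\0';

  /* The scheduler name follows the '@' and must not be empty.  */
  at = strchr(buf, '@');
  if (at == NULL || at[1] == '\0' || strlen(at + 1) >= sizeof out->scheduler)
    return -1;
  *at = '\0';

  out->count = strtoul(buf, &end, 10);
  if (end == buf)
    return -1;

  out->has_priority = 0;
  out->priority = 0;
  if (*end == '$') {
    char *prio = end + 1;
    out->priority = strtoul(prio, &end, 10);
    if (end == prio)
      return -1;
    out->has_priority = 1;
  }
  if (*end != '\0')
    return -1;

  strcpy(out->scheduler, at + 1);
  return 0;
}

/* Split VALUE at ':' and parse at most MAX entries into OUT.  Return the
   number of entries parsed, or -1 on error.  Empty entries are skipped
   (an assumption of this sketch).  */
int parse_thread_pools(const char *value, struct pool_config *out, int max)
{
  int n = 0;

  while (*value != '\0') {
    size_t len = strcspn(value, ":");
    if (len > 0) {
      if (n == max || parse_entry(value, len, &out[n]) != 0)
        return -1;
      n++;
    }
    value += len;
    if (*value == ':')
      value++;
  }
  return n;
}
```

Applied to the value from the example above, the sketch yields two entries: one pool for WRK0 without a priority, and three pools for WRK1 at priority four. The scheduler instance IO does not appear and therefore has no restriction.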
2337
2338
2339
cdf6119d
JN
2340@c ---------------------------------------------------------------------
2341@c Enabling OpenACC
2342@c ---------------------------------------------------------------------
2343
2344@node Enabling OpenACC
2345@chapter Enabling OpenACC
2346
2347To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
2348flag @option{-fopenacc} must be specified. This enables the OpenACC directive
c1030b5c 2349@code{#pragma acc} in C/C++ and @code{!$acc} directives in free form,
cdf6119d
JN
2350@code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
2351@code{!$} conditional compilation sentinels in free form and @code{c$},
2352@code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
2353arranges for automatic linking of the OpenACC runtime library
2354(@ref{OpenACC Runtime Library Routines}).
2355
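As a minimal sketch of what the flag changes, the following C code contains one OpenACC directive. The function names scale and openacc_enabled are made up for this example. The sketch relies on the _OPENACC macro, which the OpenACC specification requires a compiler to define when OpenACC support is enabled. Without -fopenacc the pragma is ignored and the loop simply runs on the host.

```c
#include <stddef.h>

/* Scale the N elements of X by FACTOR in place.  With -fopenacc the loop
   becomes an OpenACC compute region; without it the pragma is ignored and
   the loop runs sequentially on the host.  */
void scale(float *x, size_t n, float factor)
{
#pragma acc parallel loop copy(x[0:n])
  for (size_t i = 0; i < n; i++)
    x[i] *= factor;
}

/* Return 1 if this translation unit was compiled with OpenACC enabled,
   0 otherwise.  */
int openacc_enabled(void)
{
#ifdef _OPENACC
  return 1;
#else
  return 0;
#endif
}
```

Compiling the same file once with and once without the flag is a simple way to confirm that a result does not depend on offloading.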
8d1a1cb1
TB
2356See @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
2357
cdf6119d 2358A complete description of all OpenACC directives accepted may be found in
9651fbaf 2359the @uref{https://www.openacc.org, OpenACC} Application Programming
e464fc90 2360Interface manual, version 2.6.
cdf6119d 2361
cdf6119d
JN
2362
2363
2364@c ---------------------------------------------------------------------
2365@c OpenACC Runtime Library Routines
2366@c ---------------------------------------------------------------------
2367
2368@node OpenACC Runtime Library Routines
2369@chapter OpenACC Runtime Library Routines
2370
2371The runtime routines described here are defined by section 3 of the OpenACC
e464fc90 2372specifications in version 2.6.
cdf6119d
JN
2373They have C linkage, and do not throw exceptions.
2374Generally, they are available only for the host, with the exception of
2375@code{acc_on_device}, which is available for both the host and the
2376acceleration device.
2377
de9f5e0c
JB
2378This list has not yet been updated for the OpenACC specification in
2379version 2.6.
2380
cdf6119d
JN
2381@menu
2382* acc_get_num_devices:: Get number of devices for the given device
2383 type.
2384* acc_set_device_type:: Set type of device accelerator to use.
2385* acc_get_device_type:: Get type of device accelerator to be used.
2386* acc_set_device_num:: Set device number to use.
2387* acc_get_device_num:: Get device number to be used.
6c84c8bf 2388* acc_get_property:: Get device property.
cdf6119d
JN
2389* acc_async_test:: Tests for completion of a specific asynchronous
2390 operation.
c1030b5c 2391* acc_async_test_all:: Tests for completion of all asynchronous
cdf6119d
JN
2392 operations.
2393* acc_wait:: Wait for completion of a specific asynchronous
2394 operation.
c1030b5c 2395* acc_wait_all:: Waits for completion of all asynchronous
cdf6119d
JN
2396 operations.
2397* acc_wait_all_async:: Wait for completion of all asynchronous
2398 operations.
2399* acc_wait_async:: Wait for completion of asynchronous operations.
2400* acc_init:: Initialize runtime for a specific device type.
2401* acc_shutdown:: Shuts down the runtime for a specific device
2402 type.
2403* acc_on_device:: Whether executing on a particular device
2404* acc_malloc:: Allocate device memory.
2405* acc_free:: Free device memory.
2406* acc_copyin:: Allocate device memory and copy host memory to
2407 it.
2408* acc_present_or_copyin:: If the data is not present on the device,
2409 allocate device memory and copy from host
2410 memory.
2411* acc_create:: Allocate device memory and map it to host
2412 memory.
2413* acc_present_or_create:: If the data is not present on the device,
2414 allocate device memory and map it to host
2415 memory.
2416* acc_copyout:: Copy device memory to host memory.
2417* acc_delete:: Free device memory.
2418* acc_update_device:: Update device memory from mapped host memory.
2419* acc_update_self:: Update host memory from mapped device memory.
2420* acc_map_data:: Map previously allocated device memory to host
2421 memory.
2422* acc_unmap_data:: Unmap device memory from host memory.
2423* acc_deviceptr:: Get device pointer associated with specific
2424 host address.
2425* acc_hostptr:: Get host pointer associated with specific
2426 device address.
93d90219 2427* acc_is_present:: Indicate whether host variable / array is
cdf6119d
JN
2428 present on device.
2429* acc_memcpy_to_device:: Copy host memory to device memory.
2430* acc_memcpy_from_device:: Copy device memory to host memory.
e464fc90
TB
2431* acc_attach:: Let device pointer point to device-pointer target.
2432* acc_detach:: Let device pointer point to host-pointer target.
cdf6119d
JN
2433
2434API routines for target platforms.
2435
2436* acc_get_current_cuda_device:: Get CUDA device handle.
2437* acc_get_current_cuda_context:: Get CUDA context handle.
2438* acc_get_cuda_stream:: Get CUDA stream handle.
2439* acc_set_cuda_stream:: Set CUDA stream handle.
5fae049d
TS
2440
2441API routines for the OpenACC Profiling Interface.
2442
2443* acc_prof_register:: Register callbacks.
2444* acc_prof_unregister:: Unregister callbacks.
2445* acc_prof_lookup:: Obtain inquiry functions.
2446* acc_register_library:: Library registration.
cdf6119d
JN
2447@end menu
2448
2449
2450
2451@node acc_get_num_devices
2452@section @code{acc_get_num_devices} -- Get number of devices for given device type
2453@table @asis
2454@item @emph{Description}
2455This function returns a value indicating the number of devices available
2456for the device type specified in @var{devicetype}.
2457
2458@item @emph{C/C++}:
2459@multitable @columnfractions .20 .80
2460@item @emph{Prototype}: @tab @code{int acc_get_num_devices(acc_device_t devicetype);}
2461@end multitable
2462
2463@item @emph{Fortran}:
2464@multitable @columnfractions .20 .80
2465@item @emph{Interface}: @tab @code{integer function acc_get_num_devices(devicetype)}
2466@item @tab @code{integer(kind=acc_device_kind) devicetype}
2467@end multitable
2468
2469@item @emph{Reference}:
e464fc90 2470@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
24713.2.1.
2472@end table
2473
2474
2475
2476@node acc_set_device_type
2477@section @code{acc_set_device_type} -- Set type of device accelerator to use.
2478@table @asis
2479@item @emph{Description}
c1030b5c 2480This function indicates to the runtime library which device type, specified
cdf6119d
JN
2481in @var{devicetype}, to use when executing a parallel or kernels region.
2482
2483@item @emph{C/C++}:
2484@multitable @columnfractions .20 .80
2485@item @emph{Prototype}: @tab @code{void acc_set_device_type(acc_device_t devicetype);}
2486@end multitable
2487
2488@item @emph{Fortran}:
2489@multitable @columnfractions .20 .80
2490@item @emph{Interface}: @tab @code{subroutine acc_set_device_type(devicetype)}
2491@item @tab @code{integer(kind=acc_device_kind) devicetype}
2492@end multitable
2493
2494@item @emph{Reference}:
e464fc90 2495@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
24963.2.2.
2497@end table
2498
2499
2500
2501@node acc_get_device_type
2502@section @code{acc_get_device_type} -- Get type of device accelerator to be used.
2503@table @asis
2504@item @emph{Description}
2505This function returns what device type will be used when executing a
2506parallel or kernels region.
2507
b52643ab
KCY
2508This function returns @code{acc_device_none} if
2509@code{acc_get_device_type} is called from
2510@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
2511callbacks of the OpenACC Profiling Interface (@ref{OpenACC Profiling
2512Interface}), that is, if the device is currently being initialized.
2513
cdf6119d
JN
2514@item @emph{C/C++}:
2515@multitable @columnfractions .20 .80
2516@item @emph{Prototype}: @tab @code{acc_device_t acc_get_device_type(void);}
2517@end multitable
2518
2519@item @emph{Fortran}:
2520@multitable @columnfractions .20 .80
2521@item @emph{Interface}: @tab @code{function acc_get_device_type()}
2522@item @tab @code{integer(kind=acc_device_kind) acc_get_device_type}
2523@end multitable
2524
2525@item @emph{Reference}:
e464fc90 2526@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
25273.2.3.
2528@end table
2529
2530
2531
2532@node acc_set_device_num
2533@section @code{acc_set_device_num} -- Set device number to use.
2534@table @asis
2535@item @emph{Description}
2536This function will indicate to the runtime which device number,
8d1a1cb1 2537specified by @var{devicenum}, associated with the specified device
cdf6119d
JN
2538type @var{devicetype}, to use.
2539
2540@item @emph{C/C++}:
2541@multitable @columnfractions .20 .80
8d1a1cb1 2542@item @emph{Prototype}: @tab @code{void acc_set_device_num(int devicenum, acc_device_t devicetype);}
cdf6119d
JN
2543@end multitable
2544
2545@item @emph{Fortran}:
2546@multitable @columnfractions .20 .80
2547@item @emph{Interface}: @tab @code{subroutine acc_set_device_num(devicenum, devicetype)}
2548@item @tab @code{integer devicenum}
2549@item @tab @code{integer(kind=acc_device_kind) devicetype}
2550@end multitable
2551
2552@item @emph{Reference}:
e464fc90 2553@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
25543.2.4.
2555@end table
2556
2557
2558
2559@node acc_get_device_num
2560@section @code{acc_get_device_num} -- Get device number to be used.
2561@table @asis
2562@item @emph{Description}
2563This function returns the device number, associated with the specified device
2564type @var{devicetype}, that will be used when executing a parallel or kernels
2565region.
2566
2567@item @emph{C/C++}:
2568@multitable @columnfractions .20 .80
2569@item @emph{Prototype}: @tab @code{int acc_get_device_num(acc_device_t devicetype);}
2570@end multitable
2571
2572@item @emph{Fortran}:
2573@multitable @columnfractions .20 .80
2574@item @emph{Interface}: @tab @code{function acc_get_device_num(devicetype)}
2575@item @tab @code{integer(kind=acc_device_kind) devicetype}
2576@item @tab @code{integer acc_get_device_num}
2577@end multitable
2578
2579@item @emph{Reference}:
e464fc90 2580@uref{https://www.openacc.org, OpenACC specification v2.6}, section
cdf6119d
JN
25813.2.5.
2582@end table
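
The device query and selection routines above are typically used together. The following is a hedged sketch rather than a recommended pattern. The function name select_accelerator is hypothetical, zero-based device numbering is assumed, and the code falls back to returning -1 when built without -fopenacc so that it still compiles as plain C.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Try to select non-host device number DEVICENUM.  Return the device number
   reported by the runtime afterwards, or -1 if DEVICENUM is out of range,
   no non-host device is available, or OpenACC is not enabled.  */
int select_accelerator(int devicenum)
{
#ifdef _OPENACC
  int n = acc_get_num_devices(acc_device_not_host);

  /* Assumes device numbers run from 0 to n - 1.  */
  if (devicenum < 0 || devicenum >= n)
    return -1;

  acc_set_device_num(devicenum, acc_device_not_host);
  return acc_get_device_num(acc_device_not_host);
#else
  (void) devicenum;
  return -1;
#endif
}
```

Checking the count first avoids asking the runtime for a device that does not exist.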
2583
2584
2585
6c84c8bf
MR
2586@node acc_get_property
2587@section @code{acc_get_property} -- Get device property.
2588@cindex acc_get_property
2589@cindex acc_get_property_string
2590@table @asis
2591@item @emph{Description}
2592These routines return the value of the specified @var{property} for the
2593device being queried according to @var{devicenum} and @var{devicetype}.
2594Integer-valued and string-valued properties are returned by
2595@code{acc_get_property} and @code{acc_get_property_string} respectively.
2596The Fortran @code{acc_get_property_string} subroutine returns the string
2597retrieved in its fourth argument while the remaining entry points are
2598functions, which pass the return value as their result.
2599
8d1a1cb1
TB
2600Note for Fortran, only: the OpenACC technical committee corrected and, hence,
2601modified the interface introduced in OpenACC 2.6. The kind-value parameter
2602@code{acc_device_property} has been renamed to @code{acc_device_property_kind}
2603for consistency and the return type of the @code{acc_get_property} function is
2604now a @code{c_size_t} integer instead of a @code{acc_device_property} integer.
2605The parameter @code{acc_device_property} will continue to be provided,
2606but might be removed in a future version of GCC.
2607
6c84c8bf
MR
2608@item @emph{C/C++}:
2609@multitable @columnfractions .20 .80
2610@item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
2611@item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
2612@end multitable
2613
2614@item @emph{Fortran}:
2615@multitable @columnfractions .20 .80
2616@item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
2617@item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
8d1a1cb1 2618@item @tab @code{use ISO_C_Binding, only: c_size_t}
6c84c8bf
MR
2619@item @tab @code{integer devicenum}
2620@item @tab @code{integer(kind=acc_device_kind) devicetype}
8d1a1cb1
TB
2621@item @tab @code{integer(kind=acc_device_property_kind) property}
2622@item @tab @code{integer(kind=c_size_t) acc_get_property}
6c84c8bf
MR
2623@item @tab @code{character(*) string}
2624@end multitable
2625
2626@item @emph{Reference}:
2627@uref{https://www.openacc.org, OpenACC specification v2.6}, section
26283.2.6.
2629@end table
2630
2631
2632
cdf6119d
JN
2633@node acc_async_test
2634@section @code{acc_async_test} -- Test for completion of a specific asynchronous operation.
2635@table @asis
2636@item @emph{Description}
93d90219 2637This function tests for completion of the asynchronous operation specified
cdf6119d
JN
2638in @var{arg}. In C/C++, a non-zero value is returned to indicate that
2639the specified asynchronous operation has completed, while Fortran returns
93d90219 2640@code{true}. If the asynchronous operation has not completed, C/C++ returns
cdf6119d
JN
2641a zero and Fortran returns a @code{false}.
2642
2643@item @emph{C/C++}:
2644@multitable @columnfractions .20 .80
2645@item @emph{Prototype}: @tab @code{int acc_async_test(int arg);}
2646@end multitable
2647
2648@item @emph{Fortran}:
2649@multitable @columnfractions .20 .80
2650@item @emph{Interface}: @tab @code{function acc_async_test(arg)}
2651@item @tab @code{integer(kind=acc_handle_kind) arg}
2652@item @tab @code{logical acc_async_test}
2653@end multitable
2654
2655@item @emph{Reference}:
e464fc90
TB
2656@uref{https://www.openacc.org, OpenACC specification v2.6}, section
26573.2.9.
cdf6119d
JN
2658@end table
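
A possible use of this routine is to poll an asynchronous queue while the host does other work. The sketch below assumes async queue number 1 and a hypothetical function name double_async. Without -fopenacc the loop runs synchronously on the host and no polling takes place.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Double the N elements of X on async queue 1, then poll until the queue has
   drained.  Return the number of polls that found the work still pending.  */
long double_async(float *x, int n)
{
  long pending_polls = 0;

#pragma acc parallel loop copy(x[0:n]) async(1)
  for (int i = 0; i < n; i++)
    x[i] *= 2.0f;

#ifdef _OPENACC
  /* acc_async_test returns non-zero once everything on queue 1 is done.  */
  while (!acc_async_test(1))
    pending_polls++;   /* other host work could be done here */
#endif

  return pending_polls;
}
```

When there is no useful host work to overlap, blocking with acc_wait on the same queue is the simpler choice.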
2659
2660
2661
2662@node acc_async_test_all
2663@section @code{acc_async_test_all} -- Tests for completion of all asynchronous operations.
2664@table @asis
2665@item @emph{Description}
93d90219 2666This function tests for completion of all asynchronous operations.
cdf6119d
JN
2667In C/C++, a non-zero value is returned to indicate that all asynchronous
2668operations have completed, while Fortran returns @code{true}. If
2669any asynchronous operation has not completed, C/C++ returns a zero and
2670Fortran returns a @code{false}.
2671
2672@item @emph{C/C++}:
2673@multitable @columnfractions .20 .80
2674@item @emph{Prototype}: @tab @code{int acc_async_test_all(void);}
2675@end multitable
2676
2677@item @emph{Fortran}:
2678@multitable @columnfractions .20 .80
2679@item @emph{Interface}: @tab @code{function acc_async_test_all()}
2680@item @tab @code{logical acc_async_test_all}
2681@end multitable
2682
2683@item @emph{Reference}:
e464fc90
TB
2684@uref{https://www.openacc.org, OpenACC specification v2.6}, section
26853.2.10.
cdf6119d
JN
2686@end table
2687
2688
2689
2690@node acc_wait
2691@section @code{acc_wait} -- Wait for completion of a specific asynchronous operation.
2692@table @asis
2693@item @emph{Description}
2694This function waits for completion of the asynchronous operation
2695specified in @var{arg}.
2696
2697@item @emph{C/C++}:
2698@multitable @columnfractions .20 .80
2699@item @emph{Prototype}: @tab @code{void acc_wait(int arg);}
7ce64403 2700@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait(int arg);}
cdf6119d
JN
2701@end multitable
2702
2703@item @emph{Fortran}:
2704@multitable @columnfractions .20 .80
2705@item @emph{Interface}: @tab @code{subroutine acc_wait(arg)}
2706@item @tab @code{integer(acc_handle_kind) arg}
7ce64403
TS
2707@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait(arg)}
2708@item @tab @code{integer(acc_handle_kind) arg}
cdf6119d
JN
2709@end multitable
2710
2711@item @emph{Reference}:
e464fc90
TB
2712@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27133.2.11.
cdf6119d
JN
2714@end table
2715
2716
2717
2718@node acc_wait_all
2719@section @code{acc_wait_all} -- Waits for completion of all asynchronous operations.
2720@table @asis
2721@item @emph{Description}
2722This function waits for the completion of all asynchronous operations.
2723
2724@item @emph{C/C++}:
2725@multitable @columnfractions .20 .80
2726@item @emph{Prototype}: @tab @code{void acc_wait_all(void);}
7ce64403 2727@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{void acc_async_wait_all(void);}
cdf6119d
JN
2728@end multitable
2729
2730@item @emph{Fortran}:
2731@multitable @columnfractions .20 .80
7ce64403
TS
2732@item @emph{Interface}: @tab @code{subroutine acc_wait_all()}
2733@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait_all()}
cdf6119d
JN
2734@end multitable
2735
2736@item @emph{Reference}:
e464fc90
TB
2737@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27383.2.13.
cdf6119d
JN
2739@end table
2740
2741
2742
2743@node acc_wait_all_async
2744@section @code{acc_wait_all_async} -- Wait for completion of all asynchronous operations.
2745@table @asis
2746@item @emph{Description}
2747This function enqueues a wait operation on the queue @var{async} for any
2748and all asynchronous operations that have been previously enqueued on
2749any queue.
2750
2751@item @emph{C/C++}:
2752@multitable @columnfractions .20 .80
2753@item @emph{Prototype}: @tab @code{void acc_wait_all_async(int async);}
2754@end multitable
2755
2756@item @emph{Fortran}:
2757@multitable @columnfractions .20 .80
2758@item @emph{Interface}: @tab @code{subroutine acc_wait_all_async(async)}
2759@item @tab @code{integer(acc_handle_kind) async}
2760@end multitable
2761
2762@item @emph{Reference}:
e464fc90
TB
2763@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27643.2.14.
cdf6119d
JN
2765@end table
2766
2767
2768
2769@node acc_wait_async
2770@section @code{acc_wait_async} -- Wait for completion of asynchronous operations.
2771@table @asis
2772@item @emph{Description}
2773This function enqueues a wait operation on queue @var{async} for any and all
2774asynchronous operations enqueued on queue @var{arg}.
2775
2776@item @emph{C/C++}:
2777@multitable @columnfractions .20 .80
2778@item @emph{Prototype}: @tab @code{void acc_wait_async(int arg, int async);}
2779@end multitable
2780
2781@item @emph{Fortran}:
2782@multitable @columnfractions .20 .80
2783@item @emph{Interface}: @tab @code{subroutine acc_wait_async(arg, async)}
2784@item @tab @code{integer(acc_handle_kind) arg, async}
2785@end multitable
2786
2787@item @emph{Reference}:
e464fc90
TB
2788@uref{https://www.openacc.org, OpenACC specification v2.6}, section
27893.2.12.
cdf6119d
JN
2790@end table
2791
2792
2793
2794@node acc_init
2795@section @code{acc_init} -- Initialize runtime for a specific device type.
2796@table @asis
2797@item @emph{Description}
2798This function initializes the runtime for the device type specified in
2799@var{devicetype}.
2800
2801@item @emph{C/C++}:
2802@multitable @columnfractions .20 .80
2803@item @emph{Prototype}: @tab @code{void acc_init(acc_device_t devicetype);}
2804@end multitable
2805
2806@item @emph{Fortran}:
2807@multitable @columnfractions .20 .80
2808@item @emph{Interface}: @tab @code{subroutine acc_init(devicetype)}
2809@item @tab @code{integer(acc_device_kind) devicetype}
2810@end multitable
2811
2812@item @emph{Reference}:
e464fc90
TB
2813@uref{https://www.openacc.org, OpenACC specification v2.6}, section
28143.2.7.
cdf6119d
JN
2815@end table
2816
2817
2818
2819@node acc_shutdown
2820@section @code{acc_shutdown} -- Shuts down the runtime for a specific device type.
2821@table @asis
2822@item @emph{Description}
2823This function shuts down the runtime for the device type specified in
2824@var{devicetype}.
2825
2826@item @emph{C/C++}:
2827@multitable @columnfractions .20 .80
2828@item @emph{Prototype}: @tab @code{void acc_shutdown(acc_device_t devicetype);}
2829@end multitable
2830
2831@item @emph{Fortran}:
2832@multitable @columnfractions .20 .80
2833@item @emph{Interface}: @tab @code{subroutine acc_shutdown(devicetype)}
2834@item @tab @code{integer(acc_device_kind) devicetype}
2835@end multitable
2836
2837@item @emph{Reference}:
e464fc90
TB
2838@uref{https://www.openacc.org, OpenACC specification v2.6}, section
28393.2.8.
cdf6119d
JN
2840@end table
2841
2842
2843
2844@node acc_on_device
2845@section @code{acc_on_device} -- Whether executing on a particular device
2846@table @asis
2847@item @emph{Description}:
2848This function returns whether the program is executing on a particular
2849device specified in @var{devicetype}. In C/C++ a non-zero value is
93d90219 2850returned to indicate the program is executing on the specified device type.
cdf6119d
JN
2851In Fortran, @code{true} will be returned. If the program is not executing
2852on the specified device type C/C++ will return a zero, while Fortran will
2853return @code{false}.
2854
2855@item @emph{C/C++}:
2856@multitable @columnfractions .20 .80
2857@item @emph{Prototype}: @tab @code{int acc_on_device(acc_device_t devicetype);}
2858@end multitable
2859
2860@item @emph{Fortran}:
2861@multitable @columnfractions .20 .80
2862@item @emph{Interface}: @tab @code{function acc_on_device(devicetype)}
2863@item @tab @code{integer(acc_device_kind) devicetype}
2864@item @tab @code{logical acc_on_device}
2865@end multitable
2866
2867
2868@item @emph{Reference}:
e464fc90
TB
2869@uref{https://www.openacc.org, OpenACC specification v2.6}, section
28703.2.17.
cdf6119d
JN
2871@end table
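
Because this routine is also callable from device code, it can report where a compute region actually ran. The following sketch uses a hypothetical function name, region_runs_on_host. Treating a build without -fopenacc as running on the host is an assumption of the example.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Return 1 if a compute region launched from here executes on the host,
   either as a fallback or because OpenACC is not enabled, and 0 if it
   executes on some other device.  */
int region_runs_on_host(void)
{
  int on_host = 1;

#ifdef _OPENACC
#pragma acc parallel num_gangs(1) copy(on_host)
  {
    /* Evaluated inside the compute region, so it reflects where the
       region is running.  */
    on_host = acc_on_device(acc_device_host) != 0;
  }
#endif

  return on_host;
}
```

A check like this is mostly useful for diagnostics, for example to notice that offloading silently fell back to host execution.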
2872
2873
2874
2875@node acc_malloc
2876@section @code{acc_malloc} -- Allocate device memory.
2877@table @asis
2878@item @emph{Description}
2879This function allocates @var{len} bytes of device memory. It returns
2880the device address of the allocated memory.
2881
2882@item @emph{C/C++}:
2883@multitable @columnfractions .20 .80
2884@item @emph{Prototype}: @tab @code{d_void* acc_malloc(size_t len);}
2885@end multitable
2886
2887@item @emph{Reference}:
e464fc90
TB
2888@uref{https://www.openacc.org, OpenACC specification v2.6}, section
28893.2.18.
cdf6119d
JN
2890@end table
2891
2892
2893
2894@node acc_free
2895@section @code{acc_free} -- Free device memory.
2896@table @asis
2897@item @emph{Description}
2898Free previously allocated device memory at the device address @code{a}.
2899
2900@item @emph{C/C++}:
2901@multitable @columnfractions .20 .80
2902@item @emph{Prototype}: @tab @code{void acc_free(d_void *a);}
2903@end multitable
2904
2905@item @emph{Reference}:
e464fc90
TB
2906@uref{https://www.openacc.org, OpenACC specification v2.6}, section
29073.2.19.
cdf6119d
JN
2908@end table
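
The allocation routines acc_malloc and acc_free are meant to be used as a pair. The sketch below uses a hypothetical function name, device_alloc_roundtrip, and assumes that a null pointer from acc_malloc signals failure. When built without -fopenacc it reports failure without touching any device.

```c
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Allocate LEN bytes of device memory and release them again.  Return 1 if
   the allocation succeeded, 0 if it failed or OpenACC is not enabled.  */
int device_alloc_roundtrip(size_t len)
{
#ifdef _OPENACC
  void *d = acc_malloc(len);   /* device address, not usable on the host */

  if (d == NULL)               /* assumed failure indication */
    return 0;

  acc_free(d);
  return 1;
#else
  (void) len;
  return 0;
#endif
}
```

The returned address refers to device memory, so host code should only pass it back to the runtime and not dereference it.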
2909
2910
2911
2912@node acc_copyin
2913@section @code{acc_copyin} -- Allocate device memory and copy host memory to it.
2914@table @asis
2915@item @emph{Description}
2916In C/C++, this function allocates @var{len} bytes of device memory,
2917maps it to the specified host address in @var{a}, and copies the host data
2918to it. The device address of the newly allocated device memory is returned.
2919
2920In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2921a contiguous array section. In the second form, @var{a} specifies a
2922variable or array element and @var{len} specifies the length in bytes.
2923
2924@item @emph{C/C++}:
2925@multitable @columnfractions .20 .80
2926@item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);}
e464fc90 2927@item @emph{Prototype}: @tab @code{void *acc_copyin_async(h_void *a, size_t len, int async);}
cdf6119d
JN
2928@end multitable
2929
2930@item @emph{Fortran}:
2931@multitable @columnfractions .20 .80
2932@item @emph{Interface}: @tab @code{subroutine acc_copyin(a)}
2933@item @tab @code{type, dimension(:[,:]...) :: a}
2934@item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)}
2935@item @tab @code{type, dimension(:[,:]...) :: a}
2936@item @tab @code{integer len}
e464fc90
TB
2937@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, async)}
2938@item @tab @code{type, dimension(:[,:]...) :: a}
2939@item @tab @code{integer(acc_handle_kind) :: async}
2940@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, len, async)}
2941@item @tab @code{type, dimension(:[,:]...) :: a}
2942@item @tab @code{integer len}
2943@item @tab @code{integer(acc_handle_kind) :: async}
cdf6119d
JN
2944@end multitable
2945
2946@item @emph{Reference}:
e464fc90
TB
2947@uref{https://www.openacc.org, OpenACC specification v2.6}, section
29483.2.20.
cdf6119d
JN
2949@end table
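
A typical pairing is to copy data in with this routine, operate on it in a compute region that expects the data to be present, and copy it back with acc_copyout. The following is a sketch under that assumption. The function name increment_on_device is hypothetical, and without -fopenacc the loop simply runs on the host data.

```c
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Add one to each of the N elements of X, moving the data to the device and
   back with explicit runtime calls.  */
void increment_on_device(int *x, int n)
{
  if (n <= 0)
    return;   /* avoid passing an empty range to the runtime */

#ifdef _OPENACC
  acc_copyin(x, (size_t) n * sizeof *x);    /* allocate and copy host to device */
#endif

#pragma acc parallel loop present(x[0:n])
  for (int i = 0; i < n; i++)
    x[i] += 1;

#ifdef _OPENACC
  acc_copyout(x, (size_t) n * sizeof *x);   /* copy device to host */
#endif
}
```

Separating the transfers from the compute region in this way allows several regions to reuse the same device copy between the two calls.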
2950
2951
2952
2953@node acc_present_or_copyin
2954@section @code{acc_present_or_copyin} -- If the data is not present on the device, allocate device memory and copy from host memory.
2955@table @asis
2956@item @emph{Description}
c1030b5c 2957This function tests if the host data specified by @var{a} and of length
cdf6119d
JN
2958@var{len} is present or not. If it is not present, then device memory
2959will be allocated and the host memory copied. The device address of
2960the newly allocated device memory is returned.
2961
2962In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2963a contiguous array section. In the second form, @var{a} specifies a variable or
2964array element and @var{len} specifies the length in bytes.
2965
e464fc90
TB
2966Note that @code{acc_present_or_copyin} and @code{acc_pcopyin} exist for
2967backward compatibility with OpenACC 2.0; use @ref{acc_copyin} instead.
2968
cdf6119d
JN
2969@item @emph{C/C++}:
2970@multitable @columnfractions .20 .80
2971@item @emph{Prototype}: @tab @code{void *acc_present_or_copyin(h_void *a, size_t len);}
2972@item @emph{Prototype}: @tab @code{void *acc_pcopyin(h_void *a, size_t len);}
2973@end multitable
2974
2975@item @emph{Fortran}:
2976@multitable @columnfractions .20 .80
2977@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a)}
2978@item @tab @code{type, dimension(:[,:]...) :: a}
2979@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a, len)}
2980@item @tab @code{type, dimension(:[,:]...) :: a}
2981@item @tab @code{integer len}
2982@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a)}
2983@item @tab @code{type, dimension(:[,:]...) :: a}
2984@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a, len)}
2985@item @tab @code{type, dimension(:[,:]...) :: a}
2986@item @tab @code{integer len}
2987@end multitable
2988
2989@item @emph{Reference}:
e464fc90
TB
2990@uref{https://www.openacc.org, OpenACC specification v2.6}, section
29913.2.20.
cdf6119d
JN
2992@end table
2993
2994
2995
2996@node acc_create
2997@section @code{acc_create} -- Allocate device memory and map it to host memory.
2998@table @asis
2999@item @emph{Description}
3000This function allocates device memory and maps it to host memory specified
3001by the host address @var{a} with a length of @var{len} bytes. In C/C++,
3002the function returns the device address of the allocated device memory.
3003
3004In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
3005a contiguous array section. In the second form, @var{a} specifies a variable or
3006array element and @var{len} specifies the length in bytes.
3007
3008@item @emph{C/C++}:
3009@multitable @columnfractions .20 .80
3010@item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);}
e464fc90 3011@item @emph{Prototype}: @tab @code{void *acc_create_async(h_void *a, size_t len, int async);}
cdf6119d
JN
3012@end multitable
3013
3014@item @emph{Fortran}:
3015@multitable @columnfractions .20 .80
3016@item @emph{Interface}: @tab @code{subroutine acc_create(a)}
3017@item @tab @code{type, dimension(:[,:]...) :: a}
3018@item @emph{Interface}: @tab @code{subroutine acc_create(a, len)}
3019@item @tab @code{type, dimension(:[,:]...) :: a}
3020@item @tab @code{integer len}
e464fc90
TB
3021@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, async)}
3022@item @tab @code{type, dimension(:[,:]...) :: a}
3023@item @tab @code{integer(acc_handle_kind) :: async}
3024@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, len, async)}
3025@item @tab @code{type, dimension(:[,:]...) :: a}
3026@item @tab @code{integer len}
3027@item @tab @code{integer(acc_handle_kind) :: async}
cdf6119d
JN
3028@end multitable
3029
3030@item @emph{Reference}:
e464fc90
TB
3031@uref{https://www.openacc.org, OpenACC specification v2.6}, section
30323.2.21.
cdf6119d
JN
3033@end table
3034
3035
3036
3037@node acc_present_or_create
3038@section @code{acc_present_or_create} -- If the data is not present on the device, allocate device memory and map it to host memory.
3039@table @asis
3040@item @emph{Description}
c1030b5c 3041This function tests if the host data specified by @var{a} and of length
cdf6119d
JN
3042@var{len} is present or not. If it is not present, then device memory
3043will be allocated and mapped to host memory. In C/C++, the device address
3044of the newly allocated device memory is returned.
3045
3046In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
3047a contiguous array section. In the second form, @var{a} specifies a variable or
3048array element and @var{len} specifies the length in bytes.
3049
e464fc90
TB
3050Note that @code{acc_present_or_create} and @code{acc_pcreate} exist for
3051backward compatibility with OpenACC 2.0; use @ref{acc_create} instead.
cdf6119d
JN
3052
3053@item @emph{C/C++}:
3054@multitable @columnfractions .20 .80
3055@item @emph{Prototype}: @tab @code{void *acc_present_or_create(h_void *a, size_t len);}
3056@item @emph{Prototype}: @tab @code{void *acc_pcreate(h_void *a, size_t len);}
3057@end multitable
3058
3059@item @emph{Fortran}:
3060@multitable @columnfractions .20 .80
3061@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a)}
3062@item @tab @code{type, dimension(:[,:]...) :: a}
3063@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a, len)}
3064@item @tab @code{type, dimension(:[,:]...) :: a}
3065@item @tab @code{integer len}
3066@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a)}
3067@item @tab @code{type, dimension(:[,:]...) :: a}
3068@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a, len)}
3069@item @tab @code{type, dimension(:[,:]...) :: a}
3070@item @tab @code{integer len}
3071@end multitable
3072
3073@item @emph{Reference}:
e464fc90
TB
3074@uref{https://www.openacc.org, OpenACC specification v2.6}, section
30753.2.21.
cdf6119d
JN
3076@end table
3077
3078
3079
3080@node acc_copyout
3081@section @code{acc_copyout} -- Copy device memory to host memory.
3082@table @asis
3083@item @emph{Description}
3084This function copies mapped device memory to host memory which is specified
3085by host address @var{a} for a length of @var{len} bytes in C/C++.
3086
3087In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
3088a contiguous array section. In the second form, @var{a} specifies a variable or
3089array element and @var{len} specifies the length in bytes.
3090
3091@item @emph{C/C++}:
3092@multitable @columnfractions .20 .80
3093@item @emph{Prototype}: @tab @code{void acc_copyout(h_void *a, size_t len);}
e464fc90
TB
3094@item @emph{Prototype}: @tab @code{void acc_copyout_async(h_void *a, size_t len, int async);}
3095@item @emph{Prototype}: @tab @code{void acc_copyout_finalize(h_void *a, size_t len);}
3096@item @emph{Prototype}: @tab @code{void acc_copyout_finalize_async(h_void *a, size_t len, int async);}
cdf6119d
JN
3097@end multitable
3098
3099@item @emph{Fortran}:
3100@multitable @columnfractions .20 .80
3101@item @emph{Interface}: @tab @code{subroutine acc_copyout(a)}
3102@item @tab @code{type, dimension(:[,:]...) :: a}
3103@item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)}
3104@item @tab @code{type, dimension(:[,:]...) :: a}
3105@item @tab @code{integer len}
3106@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, async)}
3107@item @tab @code{type, dimension(:[,:]...) :: a}
3108@item @tab @code{integer(acc_handle_kind) :: async}
3109@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, len, async)}
3110@item @tab @code{type, dimension(:[,:]...) :: a}
3111@item @tab @code{integer len}
3112@item @tab @code{integer(acc_handle_kind) :: async}
3113@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a)}
3114@item @tab @code{type, dimension(:[,:]...) :: a}
3115@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a, len)}
3116@item @tab @code{type, dimension(:[,:]...) :: a}
3117@item @tab @code{integer len}
3118@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, async)}
3119@item @tab @code{type, dimension(:[,:]...) :: a}
3120@item @tab @code{integer(acc_handle_kind) :: async}
3121@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, len, async)}
3122@item @tab @code{type, dimension(:[,:]...) :: a}
3123@item @tab @code{integer len}
3124@item @tab @code{integer(acc_handle_kind) :: async}
3125@end multitable
3126
3127@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.22.
3130@end table
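
As an illustration of the device-to-host copy described above, the following
minimal sketch pairs @code{acc_copyin} with @code{acc_copyout}. The function
name @code{scale_on_device} and the host-only stand-ins in the @code{#else}
branch are assumptions made for this example so that it also builds without
@option{-fopenacc}; they are not part of libgomp.

```c
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins: without -fopenacc the pragma below is
   ignored and the host array doubles as the "device" copy.  */
static void *acc_copyin (void *a, size_t len) { (void) len; return a; }
static void acc_copyout (void *a, size_t len) { (void) a; (void) len; }
#endif

/* Double every element of X (N floats) using its device copy.  */
void
scale_on_device (float *x, int n)
{
  size_t bytes = (size_t) n * sizeof (float);

  /* Map X and copy its contents to the device.  */
  acc_copyin (x, bytes);

#pragma acc parallel loop present (x[0:n])
  for (int i = 0; i < n; i++)
    x[i] *= 2.0f;

  /* Copy the device data back to the host and release the mapping.  */
  acc_copyout (x, bytes);
}
```

The @code{present} clause is used here because the data was mapped through
the runtime routine rather than through a data clause on the construct.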
3131
3132
3133
3134@node acc_delete
3135@section @code{acc_delete} -- Free device memory.
3136@table @asis
3137@item @emph{Description}
This function frees previously allocated device memory that corresponds to
the host address @var{a} and a length of @var{len} bytes.

In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.
3144
3145@item @emph{C/C++}:
3146@multitable @columnfractions .20 .80
3147@item @emph{Prototype}: @tab @code{acc_delete(h_void *a, size_t len);}
3148@item @emph{Prototype}: @tab @code{acc_delete_async(h_void *a, size_t len, int async);}
3149@item @emph{Prototype}: @tab @code{acc_delete_finalize(h_void *a, size_t len);}
3150@item @emph{Prototype}: @tab @code{acc_delete_finalize_async(h_void *a, size_t len, int async);}
3151@end multitable
3152
3153@item @emph{Fortran}:
3154@multitable @columnfractions .20 .80
3155@item @emph{Interface}: @tab @code{subroutine acc_delete(a)}
3156@item @tab @code{type, dimension(:[,:]...) :: a}
3157@item @emph{Interface}: @tab @code{subroutine acc_delete(a, len)}
3158@item @tab @code{type, dimension(:[,:]...) :: a}
3159@item @tab @code{integer len}
3160@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, async)}
3161@item @tab @code{type, dimension(:[,:]...) :: a}
3162@item @tab @code{integer(acc_handle_kind) :: async}
3163@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, len, async)}
3164@item @tab @code{type, dimension(:[,:]...) :: a}
3165@item @tab @code{integer len}
3166@item @tab @code{integer(acc_handle_kind) :: async}
3167@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a)}
3168@item @tab @code{type, dimension(:[,:]...) :: a}
3169@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a, len)}
3170@item @tab @code{type, dimension(:[,:]...) :: a}
3171@item @tab @code{integer len}
@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer(acc_handle_kind) :: async}
@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, len, async)}
@item @tab @code{type, dimension(:[,:]...) :: a}
@item @tab @code{integer len}
@item @tab @code{integer(acc_handle_kind) :: async}
3179@end multitable
3180
3181@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.23.
3184@end table
3185
3186
3187
3188@node acc_update_device
3189@section @code{acc_update_device} -- Update device memory from mapped host memory.
3190@table @asis
3191@item @emph{Description}
3192This function updates the device copy from the previously mapped host memory.
3193The host memory is specified with the host address @var{a} and a length of
3194@var{len} bytes.
3195
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.
3199
3200@item @emph{C/C++}:
3201@multitable @columnfractions .20 .80
3202@item @emph{Prototype}: @tab @code{acc_update_device(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{acc_update_device_async(h_void *a, size_t len, int async);}
3204@end multitable
3205
3206@item @emph{Fortran}:
3207@multitable @columnfractions .20 .80
3208@item @emph{Interface}: @tab @code{subroutine acc_update_device(a)}
3209@item @tab @code{type, dimension(:[,:]...) :: a}
3210@item @emph{Interface}: @tab @code{subroutine acc_update_device(a, len)}
3211@item @tab @code{type, dimension(:[,:]...) :: a}
3212@item @tab @code{integer len}
3213@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, async)}
3214@item @tab @code{type, dimension(:[,:]...) :: a}
3215@item @tab @code{integer(acc_handle_kind) :: async}
3216@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, len, async)}
3217@item @tab @code{type, dimension(:[,:]...) :: a}
3218@item @tab @code{integer len}
3219@item @tab @code{integer(acc_handle_kind) :: async}
3220@end multitable
3221
3222@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.24.
3225@end table
3226
3227
3228
3229@node acc_update_self
3230@section @code{acc_update_self} -- Update host memory from mapped device memory.
3231@table @asis
3232@item @emph{Description}
3233This function updates the host copy from the previously mapped device memory.
3234The host memory is specified with the host address @var{a} and a length of
3235@var{len} bytes.
3236
In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes.
3240
3241@item @emph{C/C++}:
3242@multitable @columnfractions .20 .80
3243@item @emph{Prototype}: @tab @code{acc_update_self(h_void *a, size_t len);}
@item @emph{Prototype}: @tab @code{acc_update_self_async(h_void *a, size_t len, int async);}
3245@end multitable
3246
3247@item @emph{Fortran}:
3248@multitable @columnfractions .20 .80
3249@item @emph{Interface}: @tab @code{subroutine acc_update_self(a)}
3250@item @tab @code{type, dimension(:[,:]...) :: a}
3251@item @emph{Interface}: @tab @code{subroutine acc_update_self(a, len)}
3252@item @tab @code{type, dimension(:[,:]...) :: a}
3253@item @tab @code{integer len}
3254@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, async)}
3255@item @tab @code{type, dimension(:[,:]...) :: a}
3256@item @tab @code{integer(acc_handle_kind) :: async}
3257@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, len, async)}
3258@item @tab @code{type, dimension(:[,:]...) :: a}
3259@item @tab @code{integer len}
3260@item @tab @code{integer(acc_handle_kind) :: async}
3261@end multitable
3262
3263@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.25.
3266@end table
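
The two update routines can be combined as in the following minimal sketch,
which pushes a host-side change to the device, modifies the data there, and
pulls the result back. The function name @code{refresh_example} and the
host-only stand-ins in the @code{#else} branch are assumptions made for this
example so that it also builds without @option{-fopenacc}; they are not part
of libgomp.

```c
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins: without -fopenacc the pragma below is
   ignored and the host array doubles as the "device" copy.  */
static void *acc_copyin (void *a, size_t len) { (void) len; return a; }
static void acc_update_device (void *a, size_t len) { (void) a; (void) len; }
static void acc_update_self (void *a, size_t len) { (void) a; (void) len; }
static void acc_delete (void *a, size_t len) { (void) a; (void) len; }
#endif

/* Set X[0] on the host, then add one to every element on the device.  */
void
refresh_example (float *x, int n)
{
  size_t bytes = (size_t) n * sizeof (float);

  acc_copyin (x, bytes);	/* Map X; the device copy holds the old data.  */

  x[0] = 42.0f;			/* Host-side change made after mapping.  */
  acc_update_device (x, bytes);	/* Push the change to the device copy.  */

#pragma acc parallel loop present (x[0:n])
  for (int i = 0; i < n; i++)
    x[i] += 1.0f;

  acc_update_self (x, bytes);	/* Pull the device result back to the host.  */
  acc_delete (x, bytes);	/* Release the mapping without copying.  */
}
```

Unlike @code{acc_copyout}, the update routines leave the mapping in place,
which is why the sketch ends with an explicit @code{acc_delete}.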
3267
3268
3269
3270@node acc_map_data
3271@section @code{acc_map_data} -- Map previously allocated device memory to host memory.
3272@table @asis
3273@item @emph{Description}
This function maps previously allocated device memory to previously allocated
host memory. The device memory is specified with the device address @var{d}.
The host memory is specified with the host address @var{h} and a length of
@var{len} bytes.
3277
3278@item @emph{C/C++}:
3279@multitable @columnfractions .20 .80
3280@item @emph{Prototype}: @tab @code{acc_map_data(h_void *h, d_void *d, size_t len);}
3281@end multitable
3282
3283@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.26.
3286@end table
3287
3288
3289
3290@node acc_unmap_data
3291@section @code{acc_unmap_data} -- Unmap device memory from host memory.
3292@table @asis
3293@item @emph{Description}
This function unmaps previously mapped device and host memory. The latter
is specified by the host address @var{h}.
3296
3297@item @emph{C/C++}:
3298@multitable @columnfractions .20 .80
3299@item @emph{Prototype}: @tab @code{acc_unmap_data(h_void *h);}
3300@end multitable
3301
3302@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.27.
3305@end table
3306
3307
3308
3309@node acc_deviceptr
3310@section @code{acc_deviceptr} -- Get device pointer associated with specific host address.
3311@table @asis
3312@item @emph{Description}
3313This function returns the device address that has been mapped to the
3314host address specified by @var{h}.
3315
3316@item @emph{C/C++}:
3317@multitable @columnfractions .20 .80
3318@item @emph{Prototype}: @tab @code{void *acc_deviceptr(h_void *h);}
3319@end multitable
3320
3321@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.28.
3324@end table
3325
3326
3327
3328@node acc_hostptr
3329@section @code{acc_hostptr} -- Get host pointer associated with specific device address.
3330@table @asis
3331@item @emph{Description}
3332This function returns the host address that has been mapped to the
3333device address specified by @var{d}.
3334
3335@item @emph{C/C++}:
3336@multitable @columnfractions .20 .80
3337@item @emph{Prototype}: @tab @code{void *acc_hostptr(d_void *d);}
3338@end multitable
3339
3340@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.29.
3343@end table
3344
3345
3346
3347@node acc_is_present
3348@section @code{acc_is_present} -- Indicate whether host variable / array is present on device.
3349@table @asis
3350@item @emph{Description}
This function indicates whether the host memory specified by the host address
@var{a} and a length of @var{len} bytes is present on the device. In C/C++, a
non-zero value is returned to indicate the presence of the mapped memory on the
device. A zero is returned to indicate the memory is not mapped on the
device.

In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
a contiguous array section. In the second form, @var{a} specifies a variable or
array element and @var{len} specifies the length in bytes. If the host
memory is mapped to device memory, then @code{true} is returned. Otherwise,
@code{false} is returned to indicate the mapped memory is not present.
3362
3363@item @emph{C/C++}:
3364@multitable @columnfractions .20 .80
3365@item @emph{Prototype}: @tab @code{int acc_is_present(h_void *a, size_t len);}
3366@end multitable
3367
3368@item @emph{Fortran}:
3369@multitable @columnfractions .20 .80
3370@item @emph{Interface}: @tab @code{function acc_is_present(a)}
3371@item @tab @code{type, dimension(:[,:]...) :: a}
3372@item @tab @code{logical acc_is_present}
3373@item @emph{Interface}: @tab @code{function acc_is_present(a, len)}
3374@item @tab @code{type, dimension(:[,:]...) :: a}
3375@item @tab @code{integer len}
3376@item @tab @code{logical acc_is_present}
3377@end multitable
3378
3379@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.30.
3382@end table
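
A presence query is typically made while data is mapped, as in the following
minimal sketch. The function name @code{present_while_mapped} and the
host-only stand-ins in the @code{#else} branch are assumptions made for this
example so that it also builds without @option{-fopenacc}; they are not part
of libgomp.

```c
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins: every non-null host address is
   treated as present, since host and "device" share memory here.  */
static void *acc_copyin (void *a, size_t len) { (void) len; return a; }
static void acc_delete (void *a, size_t len) { (void) a; (void) len; }
static int acc_is_present (void *a, size_t len) { (void) len; return a != NULL; }
#endif

/* Return non-zero if BUF is reported present while it is mapped.  */
int
present_while_mapped (void)
{
  static float buf[16];
  int present;

  acc_copyin (buf, sizeof buf);			/* Map BUF on the device.  */
  present = acc_is_present (buf, sizeof buf);	/* Query the mapping.  */
  acc_delete (buf, sizeof buf);			/* Release the mapping.  */

  return present;
}
```

Code that may receive either mapped or unmapped data can use the same query
to decide whether a @code{present} clause is safe to use.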
3383
3384
3385
3386@node acc_memcpy_to_device
3387@section @code{acc_memcpy_to_device} -- Copy host memory to device memory.
3388@table @asis
3389@item @emph{Description}
3390This function copies host memory specified by host address of @var{src} to
3391device memory specified by the device address @var{dest} for a length of
3392@var{bytes} bytes.
3393
3394@item @emph{C/C++}:
3395@multitable @columnfractions .20 .80
3396@item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void *src, size_t bytes);}
3397@end multitable
3398
3399@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.31.
3402@end table
3403
3404
3405
3406@node acc_memcpy_from_device
3407@section @code{acc_memcpy_from_device} -- Copy device memory to host memory.
3408@table @asis
3409@item @emph{Description}
This function copies device memory specified by the device address @var{src}
to host memory specified by the host address @var{dest} for a length of
@var{bytes} bytes.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{acc_memcpy_from_device(h_void *dest, d_void *src, size_t bytes);}
3417@end multitable
3418
3419@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.32.
3422@end table
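
The two @code{acc_memcpy} routines operate on unmapped device memory, for
example memory obtained from @code{acc_malloc}. The following minimal sketch
copies a host buffer to the device and back; the host buffer is the
destination and the device pointer is the source in the second call. The
function name @code{memcpy_round_trip} and the host-only stand-ins in the
@code{#else} branch are assumptions made for this example so that it also
builds without @option{-fopenacc}; they are not part of libgomp.

```c
#include <stdlib.h>
#include <string.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Hypothetical host-only stand-ins that treat host memory as the device.  */
static void *acc_malloc (size_t bytes) { return malloc (bytes); }
static void acc_free (void *d) { free (d); }
static void acc_memcpy_to_device (void *dest, void *src, size_t bytes)
{ memcpy (dest, src, bytes); }
static void acc_memcpy_from_device (void *dest, void *src, size_t bytes)
{ memcpy (dest, src, bytes); }
#endif

/* Return 1 if data survives a host -> device -> host round trip.  */
int
memcpy_round_trip (void)
{
  float in[8], out[8];
  void *d;

  for (int i = 0; i < 8; i++)
    {
      in[i] = (float) i;
      out[i] = -1.0f;
    }

  d = acc_malloc (sizeof in);		/* Unmapped device allocation.  */
  if (d == NULL)
    return 0;

  acc_memcpy_to_device (d, in, sizeof in);	/* Host IN to device D.  */
  acc_memcpy_from_device (out, d, sizeof out);	/* Device D to host OUT.  */
  acc_free (d);

  return memcmp (in, out, sizeof in) == 0;
}
```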
3423
3424
3425
3426@node acc_attach
3427@section @code{acc_attach} -- Let device pointer point to device-pointer target.
3428@table @asis
3429@item @emph{Description}
3430This function updates a pointer on the device from pointing to a host-pointer
3431address to pointing to the corresponding device data.
3432
3433@item @emph{C/C++}:
3434@multitable @columnfractions .20 .80
3435@item @emph{Prototype}: @tab @code{acc_attach(h_void **ptr);}
3436@item @emph{Prototype}: @tab @code{acc_attach_async(h_void **ptr, int async);}
3437@end multitable
3438
3439@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.34.
3442@end table
3443
3444
3445
3446@node acc_detach
3447@section @code{acc_detach} -- Let device pointer point to host-pointer target.
3448@table @asis
3449@item @emph{Description}
3450This function updates a pointer on the device from pointing to a device-pointer
3451address to pointing to the corresponding host data.
3452
3453@item @emph{C/C++}:
3454@multitable @columnfractions .20 .80
3455@item @emph{Prototype}: @tab @code{acc_detach(h_void **ptr);}
3456@item @emph{Prototype}: @tab @code{acc_detach_async(h_void **ptr, int async);}
3457@item @emph{Prototype}: @tab @code{acc_detach_finalize(h_void **ptr);}
3458@item @emph{Prototype}: @tab @code{acc_detach_finalize_async(h_void **ptr, int async);}
3459@end multitable
3460
3461@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
3.2.35.
3464@end table
3465
3466
3467
3468@node acc_get_current_cuda_device
3469@section @code{acc_get_current_cuda_device} -- Get CUDA device handle.
3470@table @asis
3471@item @emph{Description}
This function returns the CUDA device handle. This handle is the same
as used by the CUDA Runtime or Driver APIs.
3474
3475@item @emph{C/C++}:
3476@multitable @columnfractions .20 .80
3477@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_device(void);}
3478@end multitable
3479
3480@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.1.
3483@end table
3484
3485
3486
3487@node acc_get_current_cuda_context
3488@section @code{acc_get_current_cuda_context} -- Get CUDA context handle.
3489@table @asis
3490@item @emph{Description}
This function returns the CUDA context handle. This handle is the same
as used by the CUDA Runtime or Driver APIs.
3493
3494@item @emph{C/C++}:
3495@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_context(void);}
3497@end multitable
3498
3499@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.2.
3502@end table
3503
3504
3505
3506@node acc_get_cuda_stream
3507@section @code{acc_get_cuda_stream} -- Get CUDA stream handle.
3508@table @asis
3509@item @emph{Description}
This function returns the CUDA stream handle for the queue @var{async}.
This handle is the same as used by the CUDA Runtime or Driver APIs.
3512
3513@item @emph{C/C++}:
3514@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *acc_get_cuda_stream(int async);}
3516@end multitable
3517
3518@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.3.
3521@end table
3522
3523
3524
3525@node acc_set_cuda_stream
3526@section @code{acc_set_cuda_stream} -- Set CUDA stream handle.
3527@table @asis
3528@item @emph{Description}
This function associates the stream handle specified by @var{stream} with
the queue @var{async}.

This cannot be used to change the stream handle associated with
@code{acc_async_sync}.

The return value is not specified.
3536
3537@item @emph{C/C++}:
3538@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_set_cuda_stream(int async, void *stream);}
3540@end multitable
3541
3542@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
A.2.1.4.
3545@end table
3546
3547
3548
3549@node acc_prof_register
3550@section @code{acc_prof_register} -- Register callbacks.
3551@table @asis
3552@item @emph{Description}:
3553This function registers callbacks.
3554
3555@item @emph{C/C++}:
3556@multitable @columnfractions .20 .80
3557@item @emph{Prototype}: @tab @code{void acc_prof_register (acc_event_t, acc_prof_callback, acc_register_t);}
3558@end multitable
3559
3560@item @emph{See also}:
3561@ref{OpenACC Profiling Interface}
3562
3563@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
3566@end table
3567
3568
3569
3570@node acc_prof_unregister
3571@section @code{acc_prof_unregister} -- Unregister callbacks.
3572@table @asis
3573@item @emph{Description}:
3574This function unregisters callbacks.
3575
3576@item @emph{C/C++}:
3577@multitable @columnfractions .20 .80
3578@item @emph{Prototype}: @tab @code{void acc_prof_unregister (acc_event_t, acc_prof_callback, acc_register_t);}
3579@end multitable
3580
3581@item @emph{See also}:
3582@ref{OpenACC Profiling Interface}
3583
3584@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
3587@end table
3588
3589
3590
3591@node acc_prof_lookup
3592@section @code{acc_prof_lookup} -- Obtain inquiry functions.
3593@table @asis
3594@item @emph{Description}:
This function looks up an inquiry function by name.
3596
3597@item @emph{C/C++}:
3598@multitable @columnfractions .20 .80
3599@item @emph{Prototype}: @tab @code{acc_query_fn acc_prof_lookup (const char *);}
3600@end multitable
3601
3602@item @emph{See also}:
3603@ref{OpenACC Profiling Interface}
3604
3605@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
3608@end table
3609
3610
3611
3612@node acc_register_library
3613@section @code{acc_register_library} -- Library registration.
3614@table @asis
3615@item @emph{Description}:
3616Function for library registration.
3617
3618@item @emph{C/C++}:
3619@multitable @columnfractions .20 .80
3620@item @emph{Prototype}: @tab @code{void acc_register_library (acc_prof_reg, acc_prof_reg, acc_prof_lookup_func);}
3621@end multitable
3622
3623@item @emph{See also}:
3624@ref{OpenACC Profiling Interface}, @ref{ACC_PROFLIB}
3625
3626@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
5.3.
3629@end table
3630
3631
3632
3633@c ---------------------------------------------------------------------
3634@c OpenACC Environment Variables
3635@c ---------------------------------------------------------------------
3636
3637@node OpenACC Environment Variables
3638@chapter OpenACC Environment Variables
3639
The variables @env{ACC_DEVICE_TYPE}, @env{ACC_DEVICE_NUM}, and
@env{ACC_PROFLIB} are defined by section 4 of the OpenACC specification,
version 2.6.
The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
3645
3646@menu
3647* ACC_DEVICE_TYPE::
3648* ACC_DEVICE_NUM::
* ACC_PROFLIB::
3650* GCC_ACC_NOTIFY::
3651@end menu
3652
3653
3654
3655@node ACC_DEVICE_TYPE
3656@section @code{ACC_DEVICE_TYPE}
3657@table @asis
3658@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.1.
3661@end table
3662
3663
3664
3665@node ACC_DEVICE_NUM
3666@section @code{ACC_DEVICE_NUM}
3667@table @asis
3668@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.2.
3671@end table
3672
3673
3674
3675@node ACC_PROFLIB
3676@section @code{ACC_PROFLIB}
3677@table @asis
3678@item @emph{See also}:
3679@ref{acc_register_library}, @ref{OpenACC Profiling Interface}
3680
3681@item @emph{Reference}:
@uref{https://www.openacc.org, OpenACC specification v2.6}, section
4.3.
3684@end table
3685
3686
3687
3688@node GCC_ACC_NOTIFY
3689@section @code{GCC_ACC_NOTIFY}
3690@table @asis
3691@item @emph{Description}:
3692Print debug information pertaining to the accelerator.
3693@end table
3694
3695
3696
3697@c ---------------------------------------------------------------------
3698@c CUDA Streams Usage
3699@c ---------------------------------------------------------------------
3700
3701@node CUDA Streams Usage
3702@chapter CUDA Streams Usage
3703
3704This applies to the @code{nvptx} plugin only.
3705
3706The library provides elements that perform asynchronous movement of
3707data and asynchronous operation of computing constructs. This
3708asynchronous functionality is implemented by making use of CUDA
3709streams@footnote{See "Stream Management" in "CUDA Driver API",
3710TRM-06703-001, Version 5.5, for additional information}.
3711
The primary means by which the asynchronous functionality is accessed
3713is through the use of those OpenACC directives which make use of the
3714@code{async} and @code{wait} clauses. When the @code{async} clause is
3715first used with a directive, it creates a CUDA stream. If an
3716@code{async-argument} is used with the @code{async} clause, then the
3717stream is associated with the specified @code{async-argument}.
3718
3719Following the creation of an association between a CUDA stream and the
3720@code{async-argument} of an @code{async} clause, both the @code{wait}
3721clause and the @code{wait} directive can be used. When either the
3722clause or directive is used after stream creation, it creates a
3723rendezvous point whereby execution waits until all operations
3724associated with the @code{async-argument}, that is, stream, have
3725completed.
3726
3727Normally, the management of the streams that are created as a result of
3728using the @code{async} clause, is done without any intervention by the
3729caller. This implies the association between the @code{async-argument}
3730and the CUDA stream will be maintained for the lifetime of the program.
3731However, this association can be changed through the use of the library
3732function @code{acc_set_cuda_stream}. When the function
3733@code{acc_set_cuda_stream} is called, the CUDA stream that was
3734originally associated with the @code{async} clause will be destroyed.
3735Caution should be taken when changing the association as subsequent
3736references to the @code{async-argument} refer to a different
3737CUDA stream.
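
The interplay of the @code{async} clause and the @code{wait} directive
described above can be sketched as follows. The function name
@code{saxpy_async} and the choice of @code{1} as the @code{async-argument}
are assumptions made for this example. When compiled without
@option{-fopenacc}, the pragmas are ignored and the loop runs synchronously
on the host.

```c
/* Compute Y = A * X + Y for N elements on asynchronous queue 1.  */
void
saxpy_async (int n, float a, const float *x, float *y)
{
  /* The data movement and the kernel are enqueued on the queue that is
     associated with async-argument 1; the host does not wait here.  */
#pragma acc parallel loop async (1) copyin (x[0:n]) copy (y[0:n])
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];

  /* Independent host work could be placed here.  */

  /* Rendezvous point: block until all operations that were enqueued
     with async-argument 1 have completed, so Y is valid on the host.  */
#pragma acc wait (1)
}
```

Using the same @code{async-argument} on the compute construct and on the
@code{wait} directive is what ties the rendezvous to that particular queue.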
3738
3739
3740
3741@c ---------------------------------------------------------------------
3742@c OpenACC Library Interoperability
3743@c ---------------------------------------------------------------------
3744
3745@node OpenACC Library Interoperability
3746@chapter OpenACC Library Interoperability
3747
3748@section Introduction
3749
3750The OpenACC library uses the CUDA Driver API, and may interact with
3751programs that use the Runtime library directly, or another library
3752based on the Runtime library, e.g., CUBLAS@footnote{See section 2.26,
3753"Interactions with the CUDA Driver API" in
3754"CUDA Runtime API", Version 5.5, and section 2.27, "VDPAU
3755Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
3756for additional information on library interoperability.}.
3757This chapter describes the use cases and what changes are
3758required in order to use both the OpenACC library and the CUBLAS and Runtime
3759libraries within a program.
3760
3761@section First invocation: NVIDIA CUBLAS library API
3762
In this first use case (see below), a function in the CUBLAS library,
namely @code{cublasCreate()}, is called prior to any of the functions in
the OpenACC library.
3766
3767When invoked, the function initializes the library and allocates the
3768hardware resources on the host and the device on behalf of the caller. Once
the initialization and allocation have completed, a handle is returned to the
3770caller. The OpenACC library also requires initialization and allocation of
3771hardware resources. Since the CUBLAS library has already allocated the
3772hardware resources for the device, all that is left to do is to initialize
3773the OpenACC library and acquire the hardware resources on the host.
3774
Prior to calling the OpenACC function that initializes the library and
allocates the host hardware resources, you need to acquire the device number
that was allocated during the call to @code{cublasCreate()}. Calling the
runtime library function @code{cudaGetDevice()} accomplishes this. Once
acquired, the device number is passed along with the device type as
parameters to the OpenACC library function @code{acc_set_device_num()}.
3781
3782Once the call to @code{acc_set_device_num()} has completed, the OpenACC
3783library uses the context that was created during the call to
3784@code{cublasCreate()}. In other words, both libraries will be sharing the
3785same context.
3786
3787@smallexample
3788 /* Create the handle */
3789 s = cublasCreate(&h);
3790 if (s != CUBLAS_STATUS_SUCCESS)
3791 @{
3792 fprintf(stderr, "cublasCreate failed %d\n", s);
3793 exit(EXIT_FAILURE);
3794 @}
3795
3796 /* Get the device number */
3797 e = cudaGetDevice(&dev);
3798 if (e != cudaSuccess)
3799 @{
3800 fprintf(stderr, "cudaGetDevice failed %d\n", e);
3801 exit(EXIT_FAILURE);
3802 @}
3803
3804 /* Initialize OpenACC library and use device 'dev' */
3805 acc_set_device_num(dev, acc_device_nvidia);
3806
3807@end smallexample
3808@center Use Case 1
3809
3810@section First invocation: OpenACC library API
3811
In this second use case (see below), a function in the OpenACC library,
namely @code{acc_set_device_num()}, is called prior to any of the functions
in the CUBLAS library.
3815
3816In the use case presented here, the function @code{acc_set_device_num()}
3817is used to both initialize the OpenACC library and allocate the hardware
3818resources on the host and the device. In the call to the function, the
3819call parameters specify which device to use and what device
3820type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
3821is but one method to initialize the OpenACC library and allocate the
3822appropriate hardware resources. Other methods are available through the
3823use of environment variables and these will be discussed in the next section.
3824
3825Once the call to @code{acc_set_device_num()} has completed, other OpenACC
3826functions can be called as seen with multiple calls being made to
3827@code{acc_copyin()}. In addition, calls can be made to functions in the
3828CUBLAS library. In the use case a call to @code{cublasCreate()} is made
3829subsequent to the calls to @code{acc_copyin()}.
3830As seen in the previous use case, a call to @code{cublasCreate()}
3831initializes the CUBLAS library and allocates the hardware resources on the
3832host and the device. However, since the device has already been allocated,
3833@code{cublasCreate()} will only initialize the CUBLAS library and allocate
3834the appropriate hardware resources on the host. The context that was created
3835as part of the OpenACC initialization is shared with the CUBLAS library,
3836similarly to the first use case.
3837
3838@smallexample
3839 dev = 0;
3840
3841 acc_set_device_num(dev, acc_device_nvidia);
3842
3843 /* Copy the first set to the device */
3844 d_X = acc_copyin(&h_X[0], N * sizeof (float));
3845 if (d_X == NULL)
3846 @{
3847 fprintf(stderr, "copyin error h_X\n");
3848 exit(EXIT_FAILURE);
3849 @}
3850
3851 /* Copy the second set to the device */
3852 d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
3853 if (d_Y == NULL)
3854 @{
3855 fprintf(stderr, "copyin error h_Y1\n");
3856 exit(EXIT_FAILURE);
3857 @}
3858
3859 /* Create the handle */
3860 s = cublasCreate(&h);
3861 if (s != CUBLAS_STATUS_SUCCESS)
3862 @{
3863 fprintf(stderr, "cublasCreate failed %d\n", s);
3864 exit(EXIT_FAILURE);
3865 @}
3866
3867 /* Perform saxpy using CUBLAS library function */
3868 s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
3869 if (s != CUBLAS_STATUS_SUCCESS)
3870 @{
3871 fprintf(stderr, "cublasSaxpy failed %d\n", s);
3872 exit(EXIT_FAILURE);
3873 @}
3874
3875 /* Copy the results from the device */
3876 acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));
3877
3878@end smallexample
3879@center Use Case 2
3880
3881@section OpenACC library and environment variables
3882
There are two environment variables associated with the OpenACC library
that may be used to control the device type and device number:
@env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}, respectively. These two
environment variables can be used as an alternative to calling
@code{acc_set_device_num()}. As seen in the second use case, the device
type and device number were specified using @code{acc_set_device_num()}.
If, however, the aforementioned environment variables were set, then the
call to @code{acc_set_device_num()} would not be required.
3891
3892
The use of the environment variables is only relevant when an OpenACC function
is called prior to a call to @code{cublasCreate()}. If @code{cublasCreate()}
is called prior to a call to an OpenACC function, then you must call
@code{acc_set_device_num()}@footnote{More complete information
about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
sections 4.1 and 4.2 of the @uref{https://www.openacc.org, OpenACC}
Application Programming Interface, Version 2.6.}.
3900
3901
3902
3903@c ---------------------------------------------------------------------
3904@c OpenACC Profiling Interface
3905@c ---------------------------------------------------------------------
3906
3907@node OpenACC Profiling Interface
3908@chapter OpenACC Profiling Interface
3909
3910@section Implementation Status and Implementation-Defined Behavior
3911
3912We're implementing the OpenACC Profiling Interface as defined by the
3913OpenACC 2.6 specification. We're clarifying some aspects here as
3914@emph{implementation-defined behavior}, while they're still under
3915discussion within the OpenACC Technical Committee.
3916
3917This implementation is tuned to keep the performance impact as low as
3918possible for the (very common) case that the Profiling Interface is
3919not enabled. This is relevant, as the Profiling Interface affects all
3920the @emph{hot} code paths (in the target code, not in the offloaded
code). Users of the OpenACC Profiling Interface can be expected to
understand that performance will be impacted to some degree once the
Profiling Interface is enabled: for example, because the
@emph{runtime} (libgomp) calls into a third-party @emph{library} for
every event that has been registered.
3926
3927We're not yet accounting for the fact that @cite{OpenACC events may
3928occur during event processing}.
We just handle one case specially, as required by CUDA 9.0
@command{nvprof}, that @code{acc_get_device_type}
(@ref{acc_get_device_type}) may be called from
@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
callbacks.

As currently there are no inquiry functions defined, calls to
@code{acc_prof_lookup} will always return @code{NULL}.
3937
3938There aren't separate @emph{start}, @emph{stop} events defined for the
3939event types @code{acc_ev_create}, @code{acc_ev_delete},
3940@code{acc_ev_alloc}, @code{acc_ev_free}. It's not clear if these
3941should be triggered before or after the actual device-specific call is
3942made. We trigger them after.
3943
3944Remarks about data provided to callbacks:
3945
3946@table @asis
3947
3948@item @code{acc_prof_info.event_type}
3949It's not clear if for @emph{nested} event callbacks (for example,
3950@code{acc_ev_enqueue_launch_start} as part of a parent compute
3951construct), this should be set for the nested event
3952(@code{acc_ev_enqueue_launch_start}), or if the value of the parent
3953construct should remain (@code{acc_ev_compute_construct_start}). In
3954this implementation, the value will generally correspond to the
3955innermost nested event type.
3956
3957@item @code{acc_prof_info.device_type}
3958@itemize
3959
3960@item
3961For @code{acc_ev_compute_construct_start}, and in presence of an
3962@code{if} clause with @emph{false} argument, this will still refer to
3963the offloading device type.
3964It's not clear if that's the expected behavior.
3965
3966@item
3967Complementary to the item before, for
3968@code{acc_ev_compute_construct_end}, this is set to
3969@code{acc_device_host} in presence of an @code{if} clause with
3970@emph{false} argument.
3971It's not clear if that's the expected behavior.
3972
3973@end itemize
3974
3975@item @code{acc_prof_info.thread_id}
3976Always @code{-1}; not yet implemented.
3977
3978@item @code{acc_prof_info.async}
3979@itemize
3980
3981@item
3982Not yet implemented correctly for
3983@code{acc_ev_compute_construct_start}.
3984
3985@item
3986In a compute construct, for host-fallback
3987execution/@code{acc_device_host} it will always be
3988@code{acc_async_sync}.
3989It's not clear if that's the expected behavior.
3990
3991@item
3992For @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end},
3993it will always be @code{acc_async_sync}.
3994It's not clear if that's the expected behavior.
3995
3996@end itemize
3997
3998@item @code{acc_prof_info.async_queue}
3999There is no @cite{limited number of asynchronous queues} in libgomp.
4000This will always have the same value as @code{acc_prof_info.async}.
4001
4002@item @code{acc_prof_info.src_file}
4003Always @code{NULL}; not yet implemented.
4004
4005@item @code{acc_prof_info.func_name}
4006Always @code{NULL}; not yet implemented.
4007
4008@item @code{acc_prof_info.line_no}
4009Always @code{-1}; not yet implemented.
4010
4011@item @code{acc_prof_info.end_line_no}
4012Always @code{-1}; not yet implemented.
4013
4014@item @code{acc_prof_info.func_line_no}
4015Always @code{-1}; not yet implemented.
4016
4017@item @code{acc_prof_info.func_end_line_no}
4018Always @code{-1}; not yet implemented.
4019
4020@item @code{acc_event_info.event_type}, @code{acc_event_info.*.event_type}
4021Relating to @code{acc_prof_info.event_type} discussed above, in this
4022implementation, this will always be the same value as
4023@code{acc_prof_info.event_type}.
4024
4025@item @code{acc_event_info.*.parent_construct}
4026@itemize
4027
4028@item
4029Will be @code{acc_construct_parallel} for all OpenACC compute
4030constructs as well as many OpenACC Runtime API calls; should be the
4031one matching the actual construct, or
4032@code{acc_construct_runtime_api}, respectively.
4033
4034@item
4035Will be @code{acc_construct_enter_data} or
4036@code{acc_construct_exit_data} when processing variable mappings
4037specified in OpenACC @emph{declare} directives; should be
4038@code{acc_construct_declare}.
4039
4040@item
4041For implicit @code{acc_ev_device_init_start},
4042@code{acc_ev_device_init_end}, and explicit as well as implicit
4043@code{acc_ev_alloc}, @code{acc_ev_free},
4044@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
4045@code{acc_ev_enqueue_download_start}, and
4046@code{acc_ev_enqueue_download_end}, will be
4047@code{acc_construct_parallel}; should reflect the real parent
4048construct.
4049
4050@end itemize
4051
4052@item @code{acc_event_info.*.implicit}
4053For @code{acc_ev_alloc}, @code{acc_ev_free},
4054@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
4055@code{acc_ev_enqueue_download_start}, and
4056@code{acc_ev_enqueue_download_end}, this currently will be @code{1}
4057also for explicit usage.
4058
4059@item @code{acc_event_info.data_event.var_name}
4060Always @code{NULL}; not yet implemented.
4061
4062@item @code{acc_event_info.data_event.host_ptr}
4063For @code{acc_ev_alloc}, and @code{acc_ev_free}, this is always
4064@code{NULL}.
4065
4066@item @code{typedef union acc_api_info}
4067@dots{} as printed in @cite{5.2.3. Third Argument: API-Specific
4068Information}. This should obviously be @code{typedef @emph{struct}
4069acc_api_info}.
4070
4071@item @code{acc_api_info.device_api}
4072Possibly not yet implemented correctly for
4073@code{acc_ev_compute_construct_start},
4074@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}:
4075will always be @code{acc_device_api_none} for these event types.
4076For @code{acc_ev_enter_data_start}, it will be
4077@code{acc_device_api_none} in some cases.
4078
4079@item @code{acc_api_info.device_type}
4080Always the same as @code{acc_prof_info.device_type}.
4081
4082@item @code{acc_api_info.vendor}
4083Always @code{-1}; not yet implemented.
4084
4085@item @code{acc_api_info.device_handle}
4086Always @code{NULL}; not yet implemented.
4087
4088@item @code{acc_api_info.context_handle}
4089Always @code{NULL}; not yet implemented.
4090
4091@item @code{acc_api_info.async_handle}
4092Always @code{NULL}; not yet implemented.
4093
4094@end table
4095
4096Remarks about certain event types:
4097
4098@table @asis
4099
4100@item @code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
4101@itemize
4102
4103@item
4104@c See 'DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT' in
4105@c 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c',
4106@c 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.
When a compute construct triggers implicit
@code{acc_ev_device_init_start} and @code{acc_ev_device_init_end}
events, they currently aren't @emph{nested within} the corresponding
@code{acc_ev_compute_construct_start} and
@code{acc_ev_compute_construct_end}, but they're currently observed
@emph{before} @code{acc_ev_compute_construct_start}.
It's not clear what to do: the standard asks us to provide a lot of
details to the @code{acc_ev_compute_construct_start} callback, without
(implicitly) initializing a device before?
4116
4117@item
4118Callbacks for these event types will not be invoked for calls to the
4119@code{acc_set_device_type} and @code{acc_set_device_num} functions.
4120It's not clear if they should be.
4121
4122@end itemize
4123
4124@item @code{acc_ev_enter_data_start}, @code{acc_ev_enter_data_end}, @code{acc_ev_exit_data_start}, @code{acc_ev_exit_data_end}
4125@itemize
4126
4127@item
4128Callbacks for these event types will also be invoked for OpenACC
4129@emph{host_data} constructs.
4130It's not clear if they should be.
4131
4132@item
4133Callbacks for these event types will also be invoked when processing
4134variable mappings specified in OpenACC @emph{declare} directives.
4135It's not clear if they should be.
4136
4137@end itemize
4138
4139@end table
4140
4141Callbacks for the following event types will be invoked, but dispatch
4142and information provided therein has not yet been thoroughly reviewed:
4143
4144@itemize
4145@item @code{acc_ev_alloc}
4146@item @code{acc_ev_free}
4147@item @code{acc_ev_update_start}, @code{acc_ev_update_end}
4148@item @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end}
4149@item @code{acc_ev_enqueue_download_start}, @code{acc_ev_enqueue_download_end}
4150@end itemize
4151
4152During device initialization, and finalization, respectively,
4153callbacks for the following event types will not yet be invoked:
4154
4155@itemize
4156@item @code{acc_ev_alloc}
4157@item @code{acc_ev_free}
4158@end itemize
4159
4160Callbacks for the following event types have not yet been implemented,
4161so currently won't be invoked:
4162
4163@itemize
4164@item @code{acc_ev_device_shutdown_start}, @code{acc_ev_device_shutdown_end}
4165@item @code{acc_ev_runtime_shutdown}
4166@item @code{acc_ev_create}, @code{acc_ev_delete}
4167@item @code{acc_ev_wait_start}, @code{acc_ev_wait_end}
4168@end itemize
4169
4170For the following runtime library functions, not all expected
4171callbacks will be invoked (mostly concerning implicit device
4172initialization):
4173
4174@itemize
4175@item @code{acc_get_num_devices}
4176@item @code{acc_set_device_type}
4177@item @code{acc_get_device_type}
4178@item @code{acc_set_device_num}
4179@item @code{acc_get_device_num}
4180@item @code{acc_init}
4181@item @code{acc_shutdown}
4182@end itemize
4183
4184Aside from implicit device initialization, for the following runtime
4185library functions, no callbacks will be invoked for shared-memory
4186offloading devices (it's not clear if they should be):
4187
4188@itemize
4189@item @code{acc_malloc}
4190@item @code{acc_free}
4191@item @code{acc_copyin}, @code{acc_present_or_copyin}, @code{acc_copyin_async}
4192@item @code{acc_create}, @code{acc_present_or_create}, @code{acc_create_async}
4193@item @code{acc_copyout}, @code{acc_copyout_async}, @code{acc_copyout_finalize}, @code{acc_copyout_finalize_async}
4194@item @code{acc_delete}, @code{acc_delete_async}, @code{acc_delete_finalize}, @code{acc_delete_finalize_async}
4195@item @code{acc_update_device}, @code{acc_update_device_async}
4196@item @code{acc_update_self}, @code{acc_update_self_async}
4197@item @code{acc_map_data}, @code{acc_unmap_data}
4198@item @code{acc_memcpy_to_device}, @code{acc_memcpy_to_device_async}
4199@item @code{acc_memcpy_from_device}, @code{acc_memcpy_from_device_async}
4200@end itemize
4201
4202
4203
4204@c ---------------------------------------------------------------------
4205@c The libgomp ABI
4206@c ---------------------------------------------------------------------
4207
4208@node The libgomp ABI
4209@chapter The libgomp ABI
4210
The following sections present notes on the external ABI as
presented by libgomp. Only maintainers should need them.
4213
4214@menu
4215* Implementing MASTER construct::
4216* Implementing CRITICAL construct::
4217* Implementing ATOMIC construct::
4218* Implementing FLUSH construct::
4219* Implementing BARRIER construct::
4220* Implementing THREADPRIVATE construct::
4221* Implementing PRIVATE clause::
4222* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
4223* Implementing REDUCTION clause::
4224* Implementing PARALLEL construct::
4225* Implementing FOR construct::
4226* Implementing ORDERED construct::
4227* Implementing SECTIONS construct::
4228* Implementing SINGLE construct::
* Implementing OpenACC's PARALLEL construct::
4230@end menu
4231
4232
4233@node Implementing MASTER construct
4234@section Implementing MASTER construct
4235
4236@smallexample
4237if (omp_get_thread_num () == 0)
4238 block
4239@end smallexample
4240
Alternatively, we could generate two copies of the parallel subfunction
and only include this in the version run by the primary thread.
Surely this is not worthwhile though...
4244
4245
4246
4247@node Implementing CRITICAL construct
4248@section Implementing CRITICAL construct
4249
4250Without a specified name,
4251
4252@smallexample
4253 void GOMP_critical_start (void);
4254 void GOMP_critical_end (void);
4255@end smallexample
4256
4257so that we don't get COPY relocations from libgomp to the main
4258application.
4259
With a specified name, use @code{omp_set_lock} and @code{omp_unset_lock}
with the name being transformed into a variable declared like
4262
4263@smallexample
4264 omp_lock_t gomp_critical_user_<name> __attribute__((common))
4265@end smallexample
4266
Ideally the ABI would specify that all zero is a valid unlocked
state, so that we wouldn't need to initialize this at
startup.
4270
4271
4272
4273@node Implementing ATOMIC construct
4274@section Implementing ATOMIC construct
4275
4276The target should implement the @code{__sync} builtins.
4277
4278Failing that we could add
4279
4280@smallexample
4281 void GOMP_atomic_enter (void)
4282 void GOMP_atomic_exit (void)
4283@end smallexample
4284
4285which reuses the regular lock code, but with yet another lock
4286object private to the library.
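
As a minimal sketch of the first option, an update such as
@code{#pragma omp atomic} followed by @code{x += n} could be expressed
with the @code{__sync_fetch_and_add} builtin, assuming a target on which
GCC provides the @code{__sync} builtins for @code{long}. The helper name
@code{atomic_add_long} is hypothetical and is not part of libgomp.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: what "#pragma omp atomic  x += n" might lower to
   when the target implements the __sync builtins.
   atomic_add_long is a hypothetical helper name.  */
long
atomic_add_long (long *x, long n)
{
  assert (x != NULL);   /* a null pointer would be a caller bug */

  /* Atomically performs *x += n and returns the value that *x
     held before the addition.  */
  return __sync_fetch_and_add (x, n);
}
```

The returned old value is unused by a plain @code{atomic} update; it is
returned here only so that the effect is easy to observe.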
4287
4288
4289
4290@node Implementing FLUSH construct
4291@section Implementing FLUSH construct
4292
4293Expands to the @code{__sync_synchronize} builtin.
4294
4295
4296
4297@node Implementing BARRIER construct
4298@section Implementing BARRIER construct
4299
4300@smallexample
4301 void GOMP_barrier (void)
4302@end smallexample
4303
4304
4305@node Implementing THREADPRIVATE construct
4306@section Implementing THREADPRIVATE construct
4307
In @emph{most} cases we can map this directly to @code{__thread}, except
that OpenMP allows constructors for C++ objects. We can either
refuse to support this (how often is it used?) or we can
implement something akin to .ctors.
4312
4313Even more ideally, this ctor feature is handled by extensions
4314to the main pthreads library. Failing that, we can have a set
4315of entry points to register ctor functions to be called.
4316
4317
4318
4319@node Implementing PRIVATE clause
4320@section Implementing PRIVATE clause
4321
4322In association with a PARALLEL, or within the lexical extent
4323of a PARALLEL block, the variable becomes a local variable in
4324the parallel subfunction.
4325
4326In association with FOR or SECTIONS blocks, create a new
4327automatic variable within the current function. This preserves
4328the semantic of new variable creation.
4329
4330
4331
4332@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
4333@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
4334
This seems simple enough for PARALLEL blocks. Create a private
struct for communicating between the parent and subfunction.
In the parent, copy in values for scalars and "small" structs;
copy in addresses for other (TREE_ADDRESSABLE) types. In the
subfunction, copy the value into the local variable.
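
The hand-off between parent and subfunction described above can be
sketched in plain C as follows. All names (@code{struct omp_data_s},
@code{subfunction}, @code{parent}) are hypothetical, and the subfunction
is called directly rather than through the runtime, so this only
illustrates the data marshalling, not the threading.

```c
#include <assert.h>

/* Hypothetical communication struct built by the parent.  */
struct omp_data_s
{
  int x;        /* firstprivate scalar: passed by value */
  long *big;    /* larger, addressable object: passed by address */
};

/* Hypothetical outlined body of the parallel region.  */
static void
subfunction (void *data)
{
  struct omp_data_s *d = (struct omp_data_s *) data;
  int x = d->x;          /* private copy, initialized from the parent */

  x += 1;                /* changes stay local to this invocation */
  d->big[0] += x;        /* shared object reached through its address */
}

/* Hypothetical parent: fills in the struct and runs the body once.  */
long
parent (int x_init)
{
  long big[4] = { 0, 0, 0, 0 };
  struct omp_data_s data;

  data.x = x_init;       /* copy in the value */
  data.big = big;        /* copy in the address */
  subfunction (&data);

  assert (data.x == x_init);   /* the parent's value is untouched */
  return big[0];
}
```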
4340
It is not clear what to do with bare FOR or SECTIONS blocks.
The only thing I can figure is that we do something like:
4343
4344@smallexample
4345#pragma omp for firstprivate(x) lastprivate(y)
4346for (int i = 0; i < n; ++i)
4347 body;
4348@end smallexample
4349
4350which becomes
4351
4352@smallexample
4353@{
4354 int x = x, y;
4355
4356 // for stuff
4357
4358 if (i == n)
4359 y = y;
4360@}
4361@end smallexample
4362
4363where the "x=x" and "y=y" assignments actually have different
4364uids for the two variables, i.e. not something you could write
4365directly in C. Presumably this only makes sense if the "outer"
4366x and y are global variables.
4367
4368COPYPRIVATE would work the same way, except the structure
4369broadcast would have to happen via SINGLE machinery instead.
4370
4371
4372
4373@node Implementing REDUCTION clause
4374@section Implementing REDUCTION clause
4375
The private struct mentioned in the previous section should have
a pointer to an array of the type of the variable, indexed by the
thread's @var{team_id}. The thread stores its final value into the
array, and after the barrier, the primary thread iterates over the
array to collect the values.
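
A minimal single-threaded sketch of this scheme, for a
@code{reduction(+:sum)} clause on a @code{long} variable, might look as
follows. The struct layout and the names @code{store_partial} and
@code{combine_partials} are assumptions made for illustration; the
barrier itself is not modelled.

```c
#include <assert.h>

/* Hypothetical layout: one slot per thread, indexed by team id.  */
struct reduction_data
{
  long *partial;   /* array with nthreads elements */
  int nthreads;
};

/* What each thread would do at the end of its share of the work.  */
void
store_partial (struct reduction_data *r, int team_id, long value)
{
  assert (team_id >= 0 && team_id < r->nthreads);
  r->partial[team_id] = value;
}

/* What the primary thread would do after the barrier.  */
long
combine_partials (const struct reduction_data *r, long initial)
{
  long sum = initial;

  for (int i = 0; i < r->nthreads; i++)
    sum += r->partial[i];
  return sum;
}
```

Because every thread writes only its own slot, no locking is needed
until the primary thread reads the array after the barrier.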
4381
4382
4383@node Implementing PARALLEL construct
4384@section Implementing PARALLEL construct
4385
4386@smallexample
4387 #pragma omp parallel
4388 @{
4389 body;
4390 @}
4391@end smallexample
4392
4393becomes
4394
4395@smallexample
4396 void subfunction (void *data)
4397 @{
4398 use data;
4399 body;
4400 @}
4401
4402 setup data;
4403 GOMP_parallel_start (subfunction, &data, num_threads);
4404 subfunction (&data);
4405 GOMP_parallel_end ();
4406@end smallexample
4407
4408@smallexample
4409 void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
4410@end smallexample
4411
4412The @var{FN} argument is the subfunction to be run in parallel.
4413
4414The @var{DATA} argument is a pointer to a structure used to
4415communicate data in and out of the subfunction, as discussed
above with respect to FIRSTPRIVATE et al.
4417
4418The @var{NUM_THREADS} argument is 1 if an IF clause is present
4419and false, or the value of the NUM_THREADS clause, if
4420present, or 0.
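
The rule for the @var{NUM_THREADS} argument can be written down as a
small helper. This is only a sketch of the rule as stated above; the
function @code{num_threads_argument} and its parameters are hypothetical
and do not exist in libgomp or in the compiler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the rule above.  */
unsigned
num_threads_argument (bool has_if, bool if_value,
                      bool has_num_threads, unsigned num_threads_value)
{
  /* A NUM_THREADS clause must evaluate to a positive value.  */
  assert (!has_num_threads || num_threads_value > 0);

  if (has_if && !if_value)
    return 1;                    /* IF clause is false: run serially */
  if (has_num_threads)
    return num_threads_value;    /* explicit NUM_THREADS clause */
  return 0;                      /* let the runtime choose */
}
```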
4421
4422The function needs to create the appropriate number of
4423threads and/or launch them from the dock. It needs to
4424create the team structure and assign team ids.
4425
4426@smallexample
4427 void GOMP_parallel_end (void)
4428@end smallexample
4429
4430Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
4431
4432
4433
4434@node Implementing FOR construct
4435@section Implementing FOR construct
4436
4437@smallexample
4438 #pragma omp parallel for
4439 for (i = lb; i <= ub; i++)
4440 body;
4441@end smallexample
4442
4443becomes
4444
4445@smallexample
4446 void subfunction (void *data)
4447 @{
4448 long _s0, _e0;
4449 while (GOMP_loop_static_next (&_s0, &_e0))
4450 @{
4451 long _e1 = _e0, i;
4452 for (i = _s0; i < _e1; i++)
4453 body;
4454 @}
4455 GOMP_loop_end_nowait ();
4456 @}
4457
4458 GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
4459 subfunction (NULL);
4460 GOMP_parallel_end ();
4461@end smallexample
4462
4463@smallexample
4464 #pragma omp for schedule(runtime)
4465 for (i = 0; i < n; i++)
4466 body;
4467@end smallexample
4468
4469becomes
4470
4471@smallexample
  @{
    long i, _s0, _e0;
    if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
      do @{
        long _e1 = _e0;
        for (i = _s0; i < _e1; i++)
          body;
      @} while (GOMP_loop_runtime_next (&_s0, &_e0));
    GOMP_loop_end ();
  @}
4482@end smallexample
4483
Note that while it looks like there is trickiness to propagating
a non-constant STEP, there isn't really. We're explicitly allowed
to evaluate it as many times as we want, and any variables involved
should automatically be handled as PRIVATE or SHARED like any other
variables. So the expression should remain evaluable in the
subfunction. We can also pull it into a local variable if we like,
but since it's supposed to remain unchanged, we can equally leave it
where it is.
4491
4492If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
4493able to get away with no work-sharing context at all, since we can
4494simply perform the arithmetic directly in each thread to divide up
4495the iterations. Which would mean that we wouldn't need to call any
4496of these routines.
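
The arithmetic in question can be sketched as below, assuming a step
of 1 and one contiguous block of iterations per thread. The helper
@code{static_chunk} is hypothetical, and the exact way libgomp splits
the iteration space may differ from this.

```c
#include <assert.h>

/* Hypothetical helper: computes the half-open range [*start, *end)
   of the iterations in [lb, ub) that thread TID out of NTHREADS
   would run.  Assumes a step of 1.  */
void
static_chunk (long lb, long ub, int tid, int nthreads,
              long *start, long *end)
{
  assert (nthreads > 0 && tid >= 0 && tid < nthreads);

  long n = ub > lb ? ub - lb : 0;   /* total iteration count */
  long q = n / nthreads;            /* base block size */
  long r = n % nthreads;            /* first R threads get one extra */

  /* Offset of this thread's block from LB.  */
  long s = tid * q + (tid < r ? tid : r);

  *start = lb + s;
  *end = *start + q + (tid < r ? 1 : 0);
}
```

Each thread can evaluate this from its own thread number and the team
size alone, which is why no shared work-sharing state would be needed.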
4497
4498There are separate routines for handling loops with an ORDERED
4499clause. Bookkeeping for that is non-trivial...
4500
4501
4502
4503@node Implementing ORDERED construct
4504@section Implementing ORDERED construct
4505
4506@smallexample
4507 void GOMP_ordered_start (void)
4508 void GOMP_ordered_end (void)
4509@end smallexample
4510
4511
4512
4513@node Implementing SECTIONS construct
4514@section Implementing SECTIONS construct
4515
A block like
4517
4518@smallexample
4519 #pragma omp sections
4520 @{
4521 #pragma omp section
4522 stmt1;
4523 #pragma omp section
4524 stmt2;
4525 #pragma omp section
4526 stmt3;
4527 @}
4528@end smallexample
4529
4530becomes
4531
4532@smallexample
4533 for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
4534 switch (i)
4535 @{
4536 case 1:
4537 stmt1;
4538 break;
4539 case 2:
4540 stmt2;
4541 break;
4542 case 3:
4543 stmt3;
4544 break;
4545 @}
4546 GOMP_barrier ();
4547@end smallexample
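
To see the dispatch loop run without libgomp, the two entry points can
be replaced by single-threaded stand-ins. The @code{mock_} functions
below are hypothetical mock-ups that hand out section numbers 1 to
@var{count} in order and then 0; they are not the real implementations,
which distribute the sections among the threads of the team.

```c
#include <assert.h>

static unsigned mock_count;  /* number of sections in the construct */
static unsigned mock_next;   /* next section number to hand out */

/* Stand-in for the "next" entry point: 0 means no work is left.  */
static unsigned
mock_sections_next (void)
{
  return mock_next <= mock_count ? mock_next++ : 0;
}

/* Stand-in for the "start" entry point.  */
static unsigned
mock_sections_start (unsigned count)
{
  mock_count = count;
  mock_next = 1;
  return mock_sections_next ();
}

/* Runs the dispatch loop, records the order in which the sections
   ran, and returns how many of them ran.  */
unsigned
run_sections (unsigned order[3])
{
  unsigned ran = 0;

  for (unsigned i = mock_sections_start (3); i != 0;
       i = mock_sections_next ())
    switch (i)
      {
      case 1: order[ran++] = 1; break;   /* stmt1 */
      case 2: order[ran++] = 2; break;   /* stmt2 */
      case 3: order[ran++] = 3; break;   /* stmt3 */
      }

  assert (ran == 3);   /* a single thread sees every section */
  return ran;
}
```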
4548
4549
4550@node Implementing SINGLE construct
4551@section Implementing SINGLE construct
4552
4553A block like
4554
4555@smallexample
4556 #pragma omp single
4557 @{
4558 body;
4559 @}
4560@end smallexample
4561
4562becomes
4563
4564@smallexample
4565 if (GOMP_single_start ())
4566 body;
4567 GOMP_barrier ();
4568@end smallexample
4569
4570while
4571
4572@smallexample
4573 #pragma omp single copyprivate(x)
4574 body;
4575@end smallexample
4576
4577becomes
4578
4579@smallexample
4580 datap = GOMP_single_copy_start ();
4581 if (datap == NULL)
4582 @{
4583 body;
4584 data.x = x;
4585 GOMP_single_copy_end (&data);
4586 @}
4587 else
4588 x = datap->x;
4589 GOMP_barrier ();
4590@end smallexample
4591
4592
4593
4594@node Implementing OpenACC's PARALLEL construct
4595@section Implementing OpenACC's PARALLEL construct
4596
4597@smallexample
4598 void GOACC_parallel ()
4599@end smallexample
4600
4601
4602
@c ---------------------------------------------------------------------
@c Reporting Bugs
@c ---------------------------------------------------------------------
4606
4607@node Reporting Bugs
4608@chapter Reporting Bugs
4609
Bugs in the GNU Offloading and Multi Processing Runtime Library should
be reported via @uref{https://gcc.gnu.org/bugzilla/, Bugzilla}. Please add
"openacc", or "openmp", or both to the keywords field in the bug
report, as appropriate.
4614
4615
4616
4617@c ---------------------------------------------------------------------
4618@c GNU General Public License
4619@c ---------------------------------------------------------------------
4620
@include gpl_v3.texi
4622
4623
4624
4625@c ---------------------------------------------------------------------
4626@c GNU Free Documentation License
4627@c ---------------------------------------------------------------------
4628
4629@include fdl.texi
4630
4631
4632
4633@c ---------------------------------------------------------------------
4634@c Funding Free Software
4635@c ---------------------------------------------------------------------
4636
4637@include funding.texi
4638
4639@c ---------------------------------------------------------------------
4640@c Index
4641@c ---------------------------------------------------------------------
4642
4643@node Library Index
4644@unnumbered Library Index
4645
4646@printindex cp
4647
4648@bye