This applies to the nvptx plugin only.
The library provides facilities for asynchronous movement of data and asynchronous execution of compute constructs. This asynchronous functionality is implemented using CUDA streams [1].
The primary means by which the asynchronous functionality is accessed is through those OpenACC directives that make use of the async and wait clauses. When the async clause is first used with a directive, it creates a CUDA stream. If an async-argument is used with the async clause, then the stream is associated with the specified async-argument.
Following the creation of an association between a CUDA stream and the async-argument of an async clause, both the wait clause and the wait directive can be used. When either the clause or the directive is used after stream creation, it creates a rendezvous point at which execution waits until all operations associated with the async-argument, that is, with its stream, have completed.
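As an illustration of the async clause, the wait clause, and the wait directive working together, here is a minimal sketch in C. The function name update_arrays and the queue identifiers QUEUE_A and QUEUE_B are hypothetical names chosen for this example. When the code is built without OpenACC support, the pragmas are ignored and the loops run sequentially with the same result.

```c
/* Hypothetical queue identifiers used as async-arguments. */
enum { QUEUE_A = 1, QUEUE_B = 2 };

/* Doubles every element of a, increments every element of b, and then
   adds b into a.  The first two loops are independent of each other,
   so each one is placed on its own asynchronous queue.  */
void update_arrays(float *a, float *b, int n)
{
  /* First use of async(QUEUE_A): with the nvptx plugin, a CUDA stream
     is created and associated with this async-argument.  */
#pragma acc parallel loop copy(a[0:n]) async(QUEUE_A)
  for (int i = 0; i < n; i++)
    a[i] *= 2.0f;

  /* A different async-argument, hence a different stream.  */
#pragma acc parallel loop copy(b[0:n]) async(QUEUE_B)
  for (int i = 0; i < n; i++)
    b[i] += 1.0f;

  /* wait clause: this construct does not start until all operations
     queued on QUEUE_A and QUEUE_B have completed.  It has no async
     clause, so it runs synchronously.  */
#pragma acc parallel loop copy(a[0:n]) copyin(b[0:n]) wait(QUEUE_A, QUEUE_B)
  for (int i = 0; i < n; i++)
    a[i] += b[i];

  /* wait directive: an explicit rendezvous point for QUEUE_A.  Nothing
     is pending on that queue here, so it returns immediately; it is
     shown only to illustrate the directive form.  */
#pragma acc wait(QUEUE_A)
}
```

Calling update_arrays with a = {1, 2, 3} and b = {10, 20, 30} leaves b = {11, 21, 31} and a = {13, 25, 37}.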
Normally, the streams that are created as a result of using the async clause are managed without any intervention by the caller. This implies that the association between the async-argument and the CUDA stream is maintained for the lifetime of the program. However, this association can be changed through the use of the library function acc_set_cuda_stream. When the function acc_set_cuda_stream is called, the CUDA stream that was originally associated with the async clause is destroyed. Caution should be taken when changing the association, as subsequent references to the async-argument refer to a different CUDA stream.
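The following is a hedged sketch of how a caller might replace the stream behind an async-argument. It assumes the GCC prototypes int acc_set_cuda_stream(int async, void *stream) and void *acc_get_cuda_stream(int async), and it assumes that calling acc_init first makes a CUDA context current on the calling thread so that the driver call cuStreamCreate can succeed. The function name rebind_queue and the build macro USE_CUDA_DRIVER are hypothetical; without OpenACC and that macro, the function only reports that the feature is unavailable.

```c
#if defined(_OPENACC) && defined(USE_CUDA_DRIVER) /* hypothetical build flag */
#include <openacc.h>
#include <cuda.h>
#endif

/* Associates a freshly created CUDA stream with the given async-argument.
   Returns 1 if the new stream is now in use, 0 on failure, and -1 if
   the program was built without OpenACC and CUDA driver support.  */
int rebind_queue(int queue)
{
#if defined(_OPENACC) && defined(USE_CUDA_DRIVER)
  CUstream stream;

  /* Assumption: initializing the device makes its CUDA context current
     on this thread.  */
  acc_init(acc_device_nvidia);

  if (cuStreamCreate(&stream, CU_STREAM_DEFAULT) != CUDA_SUCCESS)
    return 0;

  /* The stream previously associated with this async-argument is
     destroyed by the library; later async(queue) clauses use the new
     stream.  The return value is not relied upon here.  */
  acc_set_cuda_stream(queue, stream);

  return acc_get_cuda_stream(queue) == (void *) stream;
#else
  (void) queue;
  return -1;
#endif
}
```

Any work still pending on the old stream should be waited for before making such a call, since the old stream no longer exists afterwards.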
[1] See "Stream Management" in "CUDA Driver API", TRM-06703-001, Version 5.5, for additional information.