This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Document OpenACC status for GCC 6
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Thomas Schwinge <thomas at codesourcery dot com>
- Cc: Sandra Loosemore <sandra at codesourcery dot com>, Gerald Pfeifer <gerald at pfeifer dot com>, gcc-patches at gcc dot gnu dot org
- Date: Mon, 25 Apr 2016 18:46:35 +0200
- Subject: Re: Document OpenACC status for GCC 6
- Authentication-results: sourceware.org; auth=none
- References: <87h9ev0w6c dot fsf at kepler dot schwinge dot homeip dot net> <571919B3 dot 8060609 at codesourcery dot com> <87wpnq0zbg dot fsf at hertz dot schwinge dot homeip dot net>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Fri, Apr 22, 2016 at 11:26:11AM +0200, Thomas Schwinge wrote:
> Index: htdocs/gcc-6/changes.html
> ===================================================================
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
> retrieving revision 1.75
> diff -u -p -r1.75 changes.html
LGTM.
> --- htdocs/gcc-6/changes.html 21 Apr 2016 15:57:43 -0000 1.75
> +++ htdocs/gcc-6/changes.html 22 Apr 2016 09:22:19 -0000
> @@ -124,6 +124,52 @@ For more information, see the
> <!-- .................................................................. -->
> <h2 id="languages">New Languages and Language specific improvements</h2>
>
> +<!-- <ul>
> + <li> -->Compared to GCC 5, the GCC 6 release series includes a much improved
> + implementation of the <a href="http://www.openacc.org/">OpenACC 2.0a
> + specification</a>. Highlights are:
> + <ul>
> + <li>In addition to single-threaded host-fallback execution, offloading is
> + supported for nvptx (Nvidia GPUs) on x86_64 and PowerPC 64-bit
> + little-endian GNU/Linux host systems. For nvptx offloading, with the
> + OpenACC parallel construct, the execution model allows for an arbitrary
> + number of gangs, up to 32 workers, and 32 vectors.</li>
> + <li>Initial support for parallelized execution of OpenACC kernels
> + constructs:
> + <ul>
> + <li>Parallelization of a kernels region is switched on
> + by <code>-fopenacc</code> combined with <code>-O2</code> or
> + higher.</li>
> + <li>Code is offloaded onto multiple gangs, but executes with just one
> + worker, and a vector length of 1.</li>
> + <li>Directives inside a kernels region are not supported.</li>
> + <li>Loops with reductions can be parallelized.</li>
> + <li>Only kernels regions with one loop nest are parallelized.</li>
> + <li>Only the outermost loop of a loop nest can be parallelized.</li>
> + <li>Loop nests containing sibling loops are not parallelized.</li>
> + </ul>
> + Typically, using the OpenACC parallel construct gives much better
> + performance, compared to the initial support of the OpenACC kernels
> + construct.</li>
> + <li>The <code>device_type</code> clause is not supported.
> + The <code>bind</code> and <code>nohost</code> clauses are not
> + supported. The <code>host_data</code> directive is not supported in
> + Fortran.</li>
> + <li>Nested parallelism (cf. CUDA dynamic parallelism) is not
> + supported.</li>
> + <li>Usage of OpenACC constructs inside multithreaded contexts (such as
> + created by OpenMP, or pthread programming) is not supported.</li>
> + <li>If a call to the <code>acc_on_device</code> function has a
> + compile-time constant argument, the function call evaluates to a
> + compile-time constant value for C and C++, but not for
> + Fortran.</li>
> + </ul>
> + See the <a href="https://gcc.gnu.org/wiki/OpenACC">OpenACC</a>
> + and <a href="https://gcc.gnu.org/wiki/Offloading">Offloading</a> wiki pages
> + for further information.
> + <!-- </li>
> +</ul> -->
> +
> <!-- <h3 id="ada">Ada</h3> -->
>
> <h3 id="c-family">C family</h3>
Jakub