This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH 0/6] [og9] OpenACC worker partitioning in middle end (AMD GCN)
- From: Julian Brown <julian at codesourcery dot com>
- To: <gcc-patches at gcc dot gnu dot org>
- Cc: Andrew Stubbs <andrew_stubbs at mentor dot com>
- Date: Wed, 4 Sep 2019 18:45:49 -0700
This patch series provides support for worker partitioning in the middle
end. The OpenACC device-lowering pass (oaccdevlow) is split into three
passes: the first assigns parallelism levels to loops, the second (new)
pass rewrites basic blocks to implement a neutering/broadcasting scheme
for the OpenACC worker-partitioned execution mode, and the third pass
performs the rest of the previous device-lowering pass.
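For readers unfamiliar with the terminology, the sketch below shows the
kind of OpenACC source in which worker-single and worker-partitioned
code are mixed, which is where a neutering/broadcasting scheme comes
into play. It is an illustration only: the function name scaled_sum and
its parameters are hypothetical and do not come from the patches, and
the comments describe the scheme conceptually rather than the code the
new pass actually generates.

```cpp
// Hypothetical illustration; none of these names appear in the patch series.
// Built without -fopenacc, the pragmas are ignored and the loops simply run
// sequentially on the host, producing the same result.
int scaled_sum(const int *a, int rows, int cols)
{
  int total = 0;

#pragma acc parallel loop gang num_workers(4) \
    copyin(a[0:rows * cols]) reduction(+:total)
  for (int i = 0; i < rows; i++)
    {
      // Worker-single mode: conceptually only one worker per gang executes
      // this statement, while the remaining workers are "neutered" and
      // skip it.
      int scale = i + 1;

      // Worker-partitioned mode: each worker takes a share of the j
      // iterations, so every worker needs the value of 'scale' computed
      // above.  That value has to be broadcast to the other workers, or
      // kept in memory that the whole gang can see.
#pragma acc loop worker reduction(+:total)
      for (int j = 0; j < cols; j++)
        total += scale * a[i * cols + j];
    }

  return total;
}
```

Calling scaled_sum on a 2x3 array holding 1 to 6 gives 36 on the host
(row 0 contributes 6, row 1 contributes 2 * 15). An offloaded build
should agree, assuming the reduction and the hand-off of 'scale' to the
workers are implemented correctly.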
Also included are patches to add support for placing gang-private
variables in special memory (e.g. LDS, "local-data share", on AMD GCN),
and to rewrite reductions targeting reference variables to use temporary
local scalar variables instead.
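As a rough picture of what reduction localization means at the source
level, the sketch below shows a reduction whose target is a C++
reference, followed by a hand-written equivalent that reduces into a
local scalar and copies the result back. Both functions and their names
are hypothetical. The series performs its rewrite inside the compiler,
and the details there may well differ from this manual version.

```cpp
// Hypothetical illustration; none of these names appear in the patch series.
// Built without -fopenacc, the pragmas are ignored and the code runs on the
// host with the same result.

// Reduction whose target is a reference variable.
void sum_into(int &result, const int *a, int n)
{
#pragma acc parallel loop copyin(a[0:n]) reduction(+:result)
  for (int i = 0; i < n; i++)
    result += a[i];
}

// Hand-written equivalent: reduce into a temporary local scalar, then store
// the final value back through the reference.
void sum_into_localized(int &result, const int *a, int n)
{
  int local = result;  // start from the incoming value, as the reduction does

#pragma acc parallel loop copyin(a[0:n]) reduction(+:local)
  for (int i = 0; i < n; i++)
    local += a[i];

  result = local;
}
```

Starting from result == 10 and a == {1, 2, 3}, both versions leave
result == 16. The localized form never dereferences the reference
inside the parallel region, which is presumably what makes it easier to
handle on an offload target.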
Further commentary is provided alongside individual patches.
Tested with offloading to AMD GCN. I will apply to the
openacc-gcc-9-branch shortly.
Thanks,
Julian
Julian Brown (6):
  [og9] Target-dependent gang-private variable decl rewriting
  [og9] OpenACC middle-end worker-partitioning support
  [og9] AMD GCN adjustments for middle-end worker partitioning
  [og9] Fix up tests for oaccdevlow pass splitting
  [og9] Reference reduction localization
  [og9] Enable worker partitioning for AMD GCN
gcc/ChangeLog.openacc | 83 +
gcc/Makefile.in | 1 +
gcc/config/gcn/gcn-protos.h | 2 +-
gcc/config/gcn/gcn-tree.c | 6 +-
gcc/config/gcn/gcn.c | 15 +-
gcc/config/gcn/gcn.opt | 2 +-
gcc/doc/tm.texi | 14 +
gcc/doc/tm.texi.in | 6 +
gcc/gimplify.c | 102 +
gcc/omp-builtins.def | 8 +
gcc/omp-low.c | 47 +-
gcc/omp-offload.c | 290 ++-
gcc/omp-offload.h | 1 +
gcc/omp-sese.c | 2036 +++++++++++++++++
gcc/omp-sese.h | 26 +
gcc/passes.def | 2 +
gcc/target.def | 19 +
gcc/targhooks.h | 1 +
gcc/testsuite/ChangeLog.openacc | 12 +
.../goacc/classify-kernels-unparallelized.c | 8 +-
.../c-c++-common/goacc/classify-kernels.c | 8 +-
.../c-c++-common/goacc/classify-parallel.c | 8 +-
.../c-c++-common/goacc/classify-routine.c | 8 +-
.../goacc/classify-kernels-unparallelized.f95 | 8 +-
.../gfortran.dg/goacc/classify-kernels.f95 | 8 +-
.../gfortran.dg/goacc/classify-parallel.f95 | 8 +-
.../gfortran.dg/goacc/classify-routine.f95 | 8 +-
gcc/tree-core.h | 4 +-
gcc/tree-pass.h | 2 +
gcc/tree.c | 11 +-
gcc/tree.h | 2 +
libgomp/ChangeLog.openacc | 5 +
libgomp/plugin/plugin-gcn.c | 4 +-
33 files changed, 2660 insertions(+), 105 deletions(-)
create mode 100644 gcc/omp-sese.c
create mode 100644 gcc/omp-sese.h
--
2.22.0