This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] nvptx -mgomp multilib
- From: Alexander Monakov <amonakov at ispras dot ru>
- To: gcc-patches at gcc dot gnu dot org
- Cc: Nathan Sidwell <nathan at acm dot org>
- Date: Wed, 20 Apr 2016 20:03:48 +0300 (MSK)
- Subject: [PATCH] nvptx -mgomp multilib
- Authentication-results: sourceware.org; auth=none
Wire up -mgomp multilib for OpenMP offloading, in a straightforward way.
Changes in nvptx.opt and invoke.texi adding -msoft-stack and -muniform-simt
are originally from the patches that introduced those.
Doc additions in invoke.texi will probably change; I've asked Sandra Loosemore
to have a look.
Previously posted here:
[gomp-nvptx 4/9] nvptx backend: add -mgomp option and multilib
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00115.html
2015-12-09 Alexander Monakov <amonakov@ispras.ru>
* config/nvptx/nvptx.c (nvptx_option_override): Handle TARGET_GOMP.
* config/nvptx/nvptx.opt (mgomp): New option.
* config/nvptx/t-nvptx (MULTILIB_OPTIONS): New.
* doc/invoke.texi (mgomp): Document.
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 2d4dad1..e9e4d06 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -184,6 +171,9 @@ nvptx_option_override (void)
worker_red_sym = gen_rtx_SYMBOL_REF (Pmode, "__worker_red");
SET_SYMBOL_DATA_AREA (worker_red_sym, DATA_AREA_SHARED);
worker_red_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+
+ if (TARGET_GOMP)
+ target_flags |= MASK_SOFT_STACK | MASK_UNIFORM_SIMT;
}
/* Return a ptx type for MODE. If PROMOTE, then use .u32 for QImode to
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index 056b9b2..1a7608b 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -32,3 +32,15 @@ Link in code for a __main kernel.
moptimize
Target Report Var(nvptx_optimize) Init(-1)
Optimize partition neutering
+
+msoft-stack
+Target Report Mask(SOFT_STACK)
+Use custom stacks instead of local memory for automatic storage.
+
+muniform-simt
+Target Report Mask(UNIFORM_SIMT)
+Generate code that executes all threads in a warp as if one was active.
+
+mgomp
+Target Report Mask(GOMP)
+Generate code for OpenMP offloading: enables -msoft-stack and -muniform-simt.
diff --git a/gcc/config/nvptx/t-nvptx b/gcc/config/nvptx/t-nvptx
index e2580c9..6c1010d 100644
--- a/gcc/config/nvptx/t-nvptx
+++ b/gcc/config/nvptx/t-nvptx
@@ -8,3 +8,5 @@ ALL_HOST_OBJS += mkoffload.o
mkoffload$(exeext): mkoffload.o collect-utils.o libcommon-target.a $(LIBIBERTY) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
mkoffload.o collect-utils.o libcommon-target.a $(LIBIBERTY) $(LIBS)
+
+MULTILIB_OPTIONS = mgomp
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d281975..a02a852 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -19341,6 +19341,32 @@ offloading execution.
Apply partitioned execution optimizations. This is the default when any
level of optimization is selected.
+@item -msoft-stack
+@opindex msoft-stack
+Do not use @code{.local} memory for automatic storage. Instead, use pointer
+in shared memory array @code{char *__nvptx_stacks[]} at position @code{tid.y}
+as the stack pointer. This is for placing automatic variables into storage
+that can be accessed from other threads, or modified with atomic instructions.
+
+@item -muniform-simt
+@opindex muniform-simt
+Switch to code generation variant that allows to execute all threads in each
+warp, while maintaining memory state and side effects as if only one thread
+in each warp was active outside of OpenMP SIMD regions. All atomic operations
+and calls to runtime (malloc, free, vprintf) are conditionally executed (iff
+current lane index equals the master lane index), and the register being
+assigned is copied via a shuffle instruction from the master lane. Outside of
+SIMD regions lane 0 is the master; inside, each thread sees itself as the
+master. Shared memory array @code{int __nvptx_uni[]} stores all-zeros or
+all-ones bitmasks for each warp, indicating current mode (0 outside of SIMD
+regions). Each thread can bitwise-and the bitmask at position @code{tid.y}
+with current lane index to compute the master lane index.
+
+@item -mgomp
+@opindex mgomp
+Generate code for use in OpenMP offloading: enables @option{-msoft-stack} and
+@option{-muniform-simt} options, and selects corresponding multilib variant.
+
@end table
@node PDP-11 Options