This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[nvptx, libgomp, testsuite, PR85519] Reduce recursion depth in declare_target-{1,2}.f90


Hi,

when running the libgomp tests with nvptx accelerator on an Nvidia Titan V, we run into these failures:
...
FAIL: libgomp.fortran/examples-4/declare_target-1.f90   -O1  execution test
FAIL: libgomp.fortran/examples-4/declare_target-1.f90   -O2  execution test
FAIL: libgomp.fortran/examples-4/declare_target-1.f90   -Os  execution test
FAIL: libgomp.fortran/examples-4/declare_target-2.f90   -O1  execution test
FAIL: libgomp.fortran/examples-4/declare_target-2.f90   -O2  execution test
FAIL: libgomp.fortran/examples-4/declare_target-2.f90   -Os  execution test
...

These tests contain recursive functions, and the failures occur because execution runs out of thread stack. The symptom is:
...
libgomp: cuCtxSynchronize error: an illegal memory access was encountered
...
which we can turn into this symptom:
...
libgomp: cuStreamSynchronize error: an illegal instruction was encountered
...
by using GOMP_NVPTX_JIT=-O0, which inserts a valid thread stack check after the thread stack decrement at the start of each function.

The thread stack limit defaults to 1024 bytes on all the boards that I've checked, including Titan V. The tests have a recursion depth of ~25, so when the frame size of the recursive function exceeds ~40 bytes, we can be sure to run out of thread stack. [ It may also happen at a smaller frame size, given that some thread stack space may have already been consumed before calling the recursive function. ]
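The arithmetic behind the ~40 figure can be checked with a small sketch. This is only an illustrative model, not part of the patch or of libgomp: the helper names max_frame_size and fits are hypothetical, and the model assumes every recursive call consumes the same frame size and that the recursion depth is exactly the fib argument.

```python
def max_frame_size(stack_limit: int, depth: int) -> int:
    """Largest per-call frame size (same unit as stack_limit) that still
    lets `depth` nested frames fit, assuming nothing else uses the stack."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return stack_limit // depth


def fits(stack_limit: int, depth: int, frame_size: int,
         already_used: int = 0) -> bool:
    """True if `depth` nested frames of `frame_size` fit in the stack that
    remains after `already_used` has been consumed by the callers."""
    return already_used + depth * frame_size <= stack_limit


# Default thread stack limit quoted in the message (assumed constant here).
STACK_LIMIT = 1024

print(max_frame_size(STACK_LIMIT, 25))  # per-frame budget at depth 25
print(max_frame_size(STACK_LIMIT, 23))  # per-frame budget at depth 23
```

Under these assumptions, a depth of 25 leaves a budget of 40 per frame, and the reduced depth of 23 raises it to 44. The already_used parameter models the bracketed remark: any stack consumed before the first recursive call lowers the frame size at which the overflow starts.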

[ The nvptx libgomp port uses a 128k per-warp stack in global memory, avoiding the use of the .local directive in offloading functions, which would be mapped onto thread stack. But doing so does not eliminate thread stack usage. For instance, device routine parameters can be stored on thread stack. ]


In conclusion, these tests run out of thread stack on Nvidia Titan V because the recursive functions have a larger frame size than we've seen for the Nvidia architecture flavours that we've tested before.

The patch fixes this by reducing the recursion depth.

OK for stage4 trunk?

Thanks,
- Tom
[nvptx, libgomp, testsuite] Reduce recursion depth in declare_target-{1,2}.f90

2018-04-25  Tom de Vries  <tom@codesourcery.com>

	PR target/85519
	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce
	recursion depth from 25 to 23.
	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.

---
 libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90 | 4 +++-
 libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90 | 6 ++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90 b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
index df941ee..51de6b2 100644
--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
@@ -27,5 +27,7 @@ end module
 program e_53_1
   use e_53_1_mod, only : fib, fib_wrapper
   if (fib (15) /= fib_wrapper (15)) STOP 1
-  if (fib (25) /= fib_wrapper (25)) STOP 2
+  ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+  ! Nvidia Titan V.
+  if (fib (23) /= fib_wrapper (23)) STOP 2
 end program
diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90 b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
index 9c31569..76cce01 100644
--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
@@ -4,9 +4,11 @@ program e_53_2
   !$omp declare target (fib)
   integer :: x, fib
   !$omp target map(from: x)
-    x = fib (25)
+    ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+    ! Nvidia Titan V.
+    x = fib (23)
   !$omp end target
-  if (x /= fib (25)) STOP 1
+  if (x /= fib (23)) STOP 1
 end program
 
 integer recursive function fib (n) result (f)
