The following code when run using OpenMP crashes with a "stack size exceeded" error (and, sometimes, a segfault) under gfortran 7.0.0 (r242500). This happens most times I run the code, but not every time. When compiled with gfortran 6.0.0 it runs successfully every time. program p double precision :: b b=0.1d0 !$omp parallel call s(b) !$omp end parallel contains subroutine s(a) implicit none double precision, intent(in) :: a character(len=10) :: f,c write (f,'(a5,i1,a1)') "(e10.",2,")" write (c,f) a write (0,*) trim(c) return end subroutine s end program p $ gfortran -v Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/home/abenson/Galacticus/Tools/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-trunk/configure --prefix=/home/abenson/Galacticus/Tools --enable-languages=c,c++,fortran --disable-multilib Thread model: posix gcc version 7.0.0 20161116 (experimental) (GCC) $ gfortran t.F90 -g -fopenmp [abenson@mies v0.9.4]$ a.out 0.10E+00 0.10E+00 0.10E+00 0.10E+00 At line 12 of file t.F90 Internal Error: stash_internal_unit(): Stack Size Exceeded Error termination. Backtrace: 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 #0 0x400a7b in s at /home/abenson/Galacticus/v0.9.4/t.F90:12 #1 0x400c53 in MAIN__._omp_fn.0 at /home/abenson/Galacticus/v0.9.4/t.F90:5 #2 0x7f786cbc955d in gomp_thread_start at ../../../gcc-trunk/libgomp/team.c:119 #3 0x7f786bdbdc39 in start_thread at /data001/abenson/Galacticus/Tools/glibc-2.12.1/nptl/pthread_create.c:301 #4 0x7f786c4c361c in ??? at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 #5 0xffffffffffffffff in ??? The frequency with which the code crashes increases as the number of OpenMP threads is increased (running with 2 threads it almost always runs successfully, with 4 threads it crashes on 1 run in 10-20, with 16 threads it crashes almost every time). The error is sometimes reported at line 12, sometimes at line 13 - but always the same "stack size exceeded" error.
Hi Andrew, I/O operations are not thread safe. If you enclose them in !$omp critical, you will get the expected result: ig25@linux-fd1f:/tmp> cat omp.f90 program p double precision :: b b=0.1d0 !$omp parallel call s(b) !$omp end parallel contains subroutine s(a) implicit none double precision, intent(in) :: a character(len=10) :: f,c !$omp critical write (f,'(a5,i1,a1)') "(e10.",2,")" write (c,f) a write (0,*) trim(c) !$omp end critical return end subroutine s end program p ig25@linux-fd1f:/tmp> gfortran -fopenmp omp.f90 ig25@linux-fd1f:/tmp> ./a.out 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00 0.10E+00
OK - thanks. I hadn't realized that the internal I/O operations weren't threadsafe - I guess I've just been fortunate to avoid this with previous versions of gfortran. I'll update my code to use critical sections as necessary. Thanks for the fast response!
I thought they are supposed to be, at least that is what we had unit_lock, u->lock etc. for. So has something in libgfortran changed so that it doesn't properly lock the units anymore?
(In reply to Jakub Jelinek from comment #3) > I thought they are supposed to be, at least that is what we had unit_lock, > u->lock etc. for. So has something in libgfortran changed so that it > doesn't properly lock the units anymore? Our intent was to maintain the correct locking. Recent changes for the DTIO patch may have broken something. We did touch a little there and thought we had it correct.
I couldn't find anything in the OpenMP specifications which addresses this issue - so presumably it's undefined behavior as far as OpenMP is concerned. But it seems that internal file writes were intended to be thread safe in gfortran - am I correct in understanding that? From what I can tell Intel fortran also makes internal file writes thread safe, for what it's worth.
OK, in that case, maybe I was hasty :-) The original test case has an external write, but the error does not depend on this. So, confirming.
The testcase can trigger the bug fixed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81195 if the number of threads is high enough. It also triggers an unrelated problem which gives the stack size exceeded error. Looking at this issue, I see that in libgfortran/io/unit.c, the newunit_alloc function starts with newunits = xcalloc (16, 1); newunit_size = 16; and then if we need more of them, it does newunit_size *= 2; newunits = xrealloc (newunits, newunit_size); However, for stash_internal_unit, we have as global definitions #define NEWUNIT_STACK_SIZE 16 static gfc_saved_unit newunit_stack[NEWUNIT_STACK_SIZE]; and inside the function we have if (newunit_tos >= NEWUNIT_STACK_SIZE) When the testcase fails, newunit_size is 32. So we have allocated more units than we can stash. It looks like the fix is to change stash_internal_unit to dynamically resize newunit_stack, same as the newunit_alloc function already does for newunits. This appears related to bug 48587, which added the code to increase the size of newunits array.
(In reply to Jim Wilson from comment #7) --- snip --- > > However, for stash_internal_unit, we have as global definitions > #define NEWUNIT_STACK_SIZE 16 > static gfc_saved_unit newunit_stack[NEWUNIT_STACK_SIZE]; > and inside the function we have > if (newunit_tos >= NEWUNIT_STACK_SIZE) That size is fairly arbitrary. Short term we can bump this limit up to some higher value. Longer term, I wonder if this needs to be more dynamic. I will get on it.
Related to pr81195 and pr81241? Works on darwin with 8 threads.
newunit_stack needs to be allocated dynamically, just like the newunits array. I modified libgfortran to compute the highest value of newunit_tos, and print it when the program exits. I set NEWUNIT_STACK_SIZE to 128 so I could go up to 128 threads. weathertop:2188$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2188$ OMP_NUM_THREADS=16 ./a.out 2> /dev/null newunit_tos_max = 16 weathertop:2189$ OMP_NUM_THREADS=32 ./a.out 2> /dev/null newunit_tos_max = 32 weathertop:2190$ OMP_NUM_THREADS=64 ./a.out 2> /dev/null newunit_tos_max = 64 weathertop:2191$ OMP_NUM_THREADS=128 ./a.out 2> /dev/null newunit_tos_max = 128 Segmentation fault (core dumped) weathertop:2192$ So the size of newunit stack depends on the number of threads, and can be any number. If you don't want to put a hard limit of the number of threads that gfortran can support, then you need to dynamically allocate it. There is some variation from one run to the next, so using 16 threads doesn't guarantee failure, and using 8 threads doesn't guarantee success. weathertop:2192$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 3 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 5 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2193$ OMP_NUM_THREADS=8 ./a.out 2> /dev/null newunit_tos_max = 8 weathertop:2193$
After thinking about this, I think I have a better idea which gets rid of the stash completely and will avoid allocating a unit structure for internal units. To accomplish this, I intend to pass the string, string length and string array descriptor directly to the dtio child procedures vis hidden arguments. Just want to capture the idea here for others heads up.
Patch submitted here: https://gcc.gnu.org/ml/fortran/2017-08/msg00045.html
Author: jvdelisle Date: Mon Aug 28 03:42:47 2017 New Revision: 251374 URL: https://gcc.gnu.org/viewcvs?rev=251374&root=gcc&view=rev Log: 2017-08-27 Jerry DeLisle <jvdelisle@gcc.gnu.org> PR libgfortran/78387 * io/list_read.c (nml_read_obj): Remove use of stash. * io/transfer.c (st_read_done, st_write_done): Likewise. * io/unit.c (stash_internal_unit): Delete function. (get_unit): Remove use of stash. (init_units): Likewise. (close_units): Likewise. * io/write.c (nml_write_obj): Likewise: Modified: trunk/libgfortran/ChangeLog trunk/libgfortran/io/list_read.c trunk/libgfortran/io/transfer.c trunk/libgfortran/io/unit.c trunk/libgfortran/io/write.c
Author: aldyh Date: Wed Sep 13 17:31:53 2017 New Revision: 252585 URL: https://gcc.gnu.org/viewcvs?rev=252585&root=gcc&view=rev Log: 2017-08-27 Jerry DeLisle <jvdelisle@gcc.gnu.org> PR libgfortran/78387 * io/list_read.c (nml_read_obj): Remove use of stash. * io/transfer.c (st_read_done, st_write_done): Likewise. * io/unit.c (stash_internal_unit): Delete function. (get_unit): Remove use of stash. (init_units): Likewise. (close_units): Likewise. * io/write.c (nml_write_obj): Likewise: Modified: branches/range-gen2/libgfortran/ChangeLog branches/range-gen2/libgfortran/io/list_read.c branches/range-gen2/libgfortran/io/transfer.c branches/range-gen2/libgfortran/io/unit.c branches/range-gen2/libgfortran/io/write.c
Author: jvdelisle Date: Wed Sep 20 01:32:59 2017 New Revision: 252992 URL: https://gcc.gnu.org/viewcvs?rev=252992&root=gcc&view=rev Log: 2017-09-19 Jerry DeLisle <jvdelisle@gcc.gnu.org> Backport from trunk PR libgfortran/78387 * io/list_read.c (nml_read_obj): Remove use of stash. * io/transfer.c (st_read_done, st_write_done): Likewise. * io/unit.c (stash_internal_unit): Delete function. (get_unit): Remove use of stash. (init_units): Likewise. (close_units): Likewise. * io/write.c (nml_write_obj): Likewise: Modified: branches/gcc-7-branch/libgfortran/ChangeLog branches/gcc-7-branch/libgfortran/io/list_read.c branches/gcc-7-branch/libgfortran/io/transfer.c branches/gcc-7-branch/libgfortran/io/unit.c branches/gcc-7-branch/libgfortran/io/write.c
Fixed on trunk and 7. Closing
*** Bug 82555 has been marked as a duplicate of this bug. ***