This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug fortran/66640] New: Symbolic (addr2line) backtrace handler sometimes does not terminate when using OpenMP
- From: "bugs at stellardeath dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 23 Jun 2015 13:20:51 +0000
- Subject: [Bug fortran/66640] New: Symbolic (addr2line) backtrace handler sometimes does not terminate when using OpenMP
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66640
Bug ID: 66640
Summary: Symbolic (addr2line) backtrace handler sometimes does
not terminate when using OpenMP
Product: gcc
Version: 5.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: bugs at stellardeath dot org
Target Milestone: ---
Symbolic backtraces seem to be implemented by a fork()/execve() of addr2line.
When this is done from within an OpenMP parallel region, the fork()ed
addr2line sometimes never terminates and the program hangs forever in the
backtrace handler.
Small example program that triggers a divide-by-zero:
#######################################################
program test
use, intrinsic :: iso_c_binding
real(kind=C_DOUBLE) :: a
integer i
!$omp parallel private(a)
a = 2.0_C_DOUBLE
do i = 2, 0, -1
a = a / i
end do
write(*,*) a
!$omp end parallel
end program
#######################################################
Compile with
gfortran -g -fopenmp test.F90 -ffpe-trap=zero
With one thread it produces a backtrace and terminates, as expected:
#######################################################
$> OMP_NUM_THREADS=1 ./test
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic
operation.
Backtrace for this error:
#0 0x7F76637514B7
#1 0x7F76637506B0
#2 0x7F766282D43F
#3 0x400A6D in MAIN__._omp_fn.0 at test.F90:9
#4 0x400985 in test at test.F90:6
[1] 17635 floating point exception OMP_NUM_THREADS=1 ./test
$>
#######################################################
With more than one thread, however, it _sometimes_ does not terminate
(reproduced here by rerunning it in a "while true" loop until it hangs):
#######################################################
$> while true; do clear; OMP_NUM_THREADS=2 ./test; done
^L
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic
operation.
Backtrace for this error:
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic
operation.
Backtrace for this error:
#0 0x7F7B3D2834B7
#1 0x7F7B3D2826B0
#2 0x7F7B3C35F43F
#0 0x7F7B3D2834B7
#1 0x7F7B3D2826B0
#3 0x400A6D in MAIN__._omp_fn.0 at test.F90:9
#2 0x7F7B3C35F43F
#4 0x7F7B3CD4FFAD
#5 0x7F7B3C6D4483
#6 0x7F7B3C412A4C
#3 0x400A6D in MAIN__._omp_fn.0 at test.F90:9
#7 0xFFFFFFFFFFFFFFFF
#4 0x400985 in test at test.F90:6
[Hangs here]
#######################################################
The process list and gdb backtraces of the hung processes might also be
interesting:
#######################################################
$> ps uf | grep -n "addr2line\|test"
12:lorenz 18346 0.0 0.0 29532 1392 pts/4 Sl+ 15:10 0:00 \_ ./test
13:lorenz 18348 0.0 0.0 13832 2000 pts/4 S+ 15:10 0:00 \_ /usr/bin/addr2line -e /home/lorenz/dev/addr2line_bug/test -f -s -C
14:lorenz 18349 0.0 0.0 13832 2048 pts/4 S+ 15:10 0:00 \_ /usr/bin/addr2line -e /home/lorenz/dev/addr2line_bug/test -f -s -C
$>
$> gdb -p 18348 -ex bt -ex detach -ex q 2>/dev/null | tail -n 13
#0 0x00007f04bb171580 in __read_nocancel () at
../sysdeps/unix/syscall-template.S:81
#1 0x00007f04bb109f00 in _IO_new_file_underflow (fp=0x7f04bb4324c0
<_IO_2_1_stdin_>) at fileops.c:580
#2 0x00007f04bb10ad6e in __GI__IO_default_uflow (fp=0x7f04bb4324c0
<_IO_2_1_stdin_>) at genops.c:426
#3 0x00007f04bb0ffc94 in __GI__IO_getline_info (fp=fp@entry=0x7f04bb4324c0
<_IO_2_1_stdin_>, buf=buf@entry=0x7ffdd8139e60 'F' <repeats 16 times>, "\n",
n=99, delim=delim@entry=10,
extract_delim=extract_delim@entry=1, eof=eof@entry=0x0) at iogetline.c:69
#4 0x00007f04bb0ffd88 in __GI__IO_getline (fp=fp@entry=0x7f04bb4324c0
<_IO_2_1_stdin_>, buf=buf@entry=0x7ffdd8139e60 'F' <repeats 16 times>, "\n",
n=<optimized out>,
delim=delim@entry=10, extract_delim=extract_delim@entry=1) at
iogetline.c:38
#5 0x00007f04bb0fec34 in _IO_fgets (buf=0x7ffdd8139e60 'F' <repeats 16 times>,
"\n", n=<optimized out>, fp=0x7f04bb4324c0 <_IO_2_1_stdin_>) at iofgets.c:56
#6 0x000000000040230b in ?? ()
#7 0x00007f04bb0b78c5 in __libc_start_main (main=0x401fc0, argc=6,
argv=0x7ffdd8139fe8, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>,
stack_end=0x7ffdd8139fd8) at libc-start.c:289
#8 0x0000000000402855 in ?? ()
Detaching from program: /usr/bin/addr2line, process 18348
$>
$> gdb -p 18346 -ex bt -ex "info threads" -ex detach -ex q 2>/dev/null | tail -n 15
0x00007fda31d43f44 in __libc_wait (stat_loc=0x0) at
../sysdeps/unix/sysv/linux/wait.c:35
#0 0x00007fda31d43f44 in __libc_wait (stat_loc=0x0) at
../sysdeps/unix/sysv/linux/wait.c:35
#1 0x00007fda328eb4e9 in _gfortrani_backtrace () at
../../../libgfortran/runtime/backtrace.c:263
#2 0x00007fda328ea6b1 in _gfortrani_backtrace_handler (signum=8) at
../../../libgfortran/runtime/compile_options.c:129
#3 <signal handler called>
#4 0x0000000000400a6d in MAIN__._omp_fn.0 () at test.F90:9
#5 0x0000000000400986 in test () at test.F90:6
#6 0x00000000004009cb in main (argc=1, argv=0x7ffc70796e8f) at test.F90:15
#7 0x00007fda319b48c5 in __libc_start_main (main=0x40098d <main>, argc=1,
argv=0x7ffc707969c8, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>,
stack_end=0x7ffc707969b8) at libc-start.c:289
#8 0x0000000000400899 in _start () at ../sysdeps/x86_64/start.S:118
Id Target Id Frame
2 Thread 0x7fda3178f700 (LWP 18347) "test" 0x00007fda31d43f44 in
__libc_wait (stat_loc=0x0) at ../sysdeps/unix/sysv/linux/wait.c:35
* 1 Thread 0x7fda32dd8780 (LWP 18346) "test" 0x00007fda31d43f44 in
__libc_wait (stat_loc=0x0) at ../sysdeps/unix/sysv/linux/wait.c:35
Detaching from program: /home/lorenz/dev/addr2line_bug/test, process 18346
$>
#######################################################
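Both threads of the test program are blocked in wait(), while addr2line
(PID 18348) is blocked reading from stdin. My guess is that
libgfortran/runtime/backtrace.c is not thread-safe.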
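One way the symptoms above could arise, offered as an assumption rather than
a reading of backtrace.c: if each thread feeds its addr2line child through a
pipe, a fork() in the other thread can hand a copy of that pipe's write end
to the other child. addr2line would then never see EOF on stdin, which would
match the __read_nocancel frame in the gdb output. The C sketch below only
illustrates that pipe behaviour. The function name
leaked_write_end_blocks_reader is hypothetical, a plain read() loop stands in
for addr2line, and dup() stands in for the descriptor inherited across the
second fork().

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo, not libgfortran code.
   Returns 1 if the reader child stayed blocked while a stray copy of the
   pipe's write end was open and exited cleanly once that copy was closed,
   0 if it behaved differently, -1 on setup errors. */
int leaked_write_end_blocks_reader(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: stands in for addr2line reading addresses until EOF. */
        char buf[64];
        close(p[1]);
        while (read(p[0], buf, sizeof buf) > 0)
            ;
        _exit(0);
    }

    /* Stray copy of the write end: stands in for the descriptor that a
       child forked by another thread would inherit (assumption). */
    int leaked = dup(p[1]);
    close(p[0]);
    close(p[1]);            /* the owning thread signals "end of input" */
    if (leaked < 0) {
        waitpid(pid, NULL, 0);
        return -1;
    }

    /* Give the child time to reach read(); it cannot get EOF yet. */
    struct timespec delay = { 0, 200 * 1000 * 1000 };
    nanosleep(&delay, NULL);

    int status = 0;
    int still_blocked = (waitpid(pid, &status, WNOHANG) == 0);

    /* Closing the last write end finally delivers EOF to the reader. */
    close(leaked);
    int reaped = (waitpid(pid, &status, 0) == pid);

    return still_blocked && reaped
           && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Called from a main(), this is expected to return 1 on Linux. If this is
indeed the mechanism, creating the pipes close-on-exec (for example with
pipe2() and O_CLOEXEC) or serializing the fork() calls might be worth
looking at, but I have not verified this against the libgfortran sources.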
Kind regards,
Lorenz