This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libfortran/81195] New: SPEC CPU2017 621.wrf_s failure with 40+ openmp threads
- From: "wilson at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sat, 24 Jun 2017 03:01:45 +0000
- Subject: [Bug libfortran/81195] New: SPEC CPU2017 621.wrf_s failure with 40+ openmp threads
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81195
Bug ID: 81195
Summary: SPEC CPU2017 621.wrf_s failure with 40+ openmp threads
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libfortran
Assignee: unassigned at gcc dot gnu.org
Reporter: wilson at gcc dot gnu.org
Target Milestone: ---
I'm seeing many different kinds of failures when running a wrf_s binary
compiled with gcc mainline: double free aborts, segfaults, "Fortran runtime
error: End of file", etc. The benchmark uses OpenMP, and I'm running on an
aarch64 machine with over 40 processors, using over 40 threads.
Debugging, I tracked it down to a problem in libgfortran/io/unit.c. There is a
stack, newunit_stack, used to hold malloc'ed structures not currently in use,
apparently to avoid lots of malloc/free calls. The code locks this stack when
pushing. However, it does not lock the stack when popping. If multiple
threads try to pop at the same time, they can end up using the same structure,
and then bad things happen. I have confirmed this behavior in gdb. The more
threads you have, the more likely you are to run into the problem.
wrf_s works if I add code to lock newunit_stack when popping. We also need to
lock around uses of newunit_tos. I'm not sure if my patch is the best solution
though.
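
To illustrate the kind of fix described above, here is a minimal sketch in C
of a free-structure stack where both push and pop, and every read or write of
the top-of-stack index, happen under the same lock. This is not the
libgfortran code or the actual patch: the names cache_stack, cache_tos,
cache_lock, cached_unit and CACHE_DEPTH are hypothetical stand-ins, and a
plain pthread mutex is assumed for the locking.

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_DEPTH 16   /* hypothetical capacity, chosen for illustration */

/* Hypothetical stand-in for the cached structure. */
typedef struct { int unit_number; } cached_unit;

static cached_unit *cache_stack[CACHE_DEPTH];
static int cache_tos = 0;   /* index of the next free slot */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Push a structure for later reuse.  Returns 1 if it was stored, or 0 if
   the stack is full (the caller would then free the structure itself).  */
int cache_push(cached_unit *u)
{
    int stored = 0;

    pthread_mutex_lock(&cache_lock);
    if (cache_tos < CACHE_DEPTH) {
        cache_stack[cache_tos++] = u;
        stored = 1;
    }
    pthread_mutex_unlock(&cache_lock);
    return stored;
}

/* Pop a structure for reuse, or return NULL if the stack is empty.  The
   emptiness check and the decrement are done under the lock, so two
   threads can never be handed the same pointer.  */
cached_unit *cache_pop(void)
{
    cached_unit *u = NULL;

    pthread_mutex_lock(&cache_lock);
    if (cache_tos > 0)
        u = cache_stack[--cache_tos];
    pthread_mutex_unlock(&cache_lock);
    return u;
}
```

Without the lock in cache_pop, two threads could both read the same value of
cache_tos before either decrements it, and both would then receive the same
pointer, which matches the double free and corruption symptoms reported here.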
I haven't used Fortran in 30+ years, so I don't know how to write a testcase.
GCC 7 appears to have the same code, so it may have the same problem. I haven't
tried to reproduce with GCC 7.