Bug 102845 - Memory leak with nested OpenMP parallelism
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran
Version: 12.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: openmp
Reported: 2021-10-19 18:42 UTC by Andrew Benson
Modified: 2021-10-20 07:11 UTC
CC List: 0 users

Description Andrew Benson 2021-10-19 18:42:00 UTC
The following code appears to leak memory under nested OpenMP parallelism when compiled with gfortran 12 and run.

module nestedMod

  type :: n
     integer, allocatable, dimension(:) :: i
  end type n

contains

  subroutine nested()
    implicit none
    type(n), save :: a
    !$omp threadprivate(a)
    
    !$omp parallel
    if (.not.allocated(a%i)) allocate(a%i(10000))    
    !$omp end parallel
    return
  end subroutine nested

end module nestedMod
  
program nestedLeak
  use :: OMP_Lib, only :  OMP_Set_Nested
  use :: nestedMod
  implicit none
  integer :: i,  unit, valueRSS
  character(len=80)  :: line
  call OMP_Set_Nested(.true.)
  !$omp parallel
  do i=1,1000
     call nested()
     !$omp single
     open(NEWUNIT=unit, FILE='/proc/self/status', ACTION='read')
     do
        read (unit, '(a)', END=120) line
        if (line(1:6) == 'VmRSS:') then
           read (line(7:), *) valueRSS
           exit
        endif
     enddo
120  continue
     close(unit)
     write (0,*) i,valueRSS
     !$omp end single
  end do
  !$omp end parallel

end program


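The program calls nested() from within an OpenMP parallel region. The subroutine then opens its own nested parallel region, in which each thread allocates the allocatable component of the threadprivate variable a if it is not already allocated.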

The output is (middle parts removed for brevity; the columns are the iteration number and VmRSS in kB, as read from /proc/self/status):

           1      197900
           2      376964
           3      385280
.
.
.
         998     1463436
         999     1466208
        1000     1475456


If I call OMP_Set_Nested(.false.) instead, the output is:

           1      308304
           2      337852
           3      306816
.
.
.
         998      376840
         999      352196
        1000      376848
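
As a rough way to compare the two runs, the sketch below computes the average resident-set growth per iteration using only the first and last samples shown in each listing. The program and parameter names are illustrative. The non-nested samples fluctuate, so an end-point estimate is crude for that run. On these figures it works out to roughly 1279 kB per iteration with nesting enabled versus roughly 69 kB without.

```fortran
program rssGrowth
  implicit none
  ! First and last VmRSS samples (kB) copied from the two listings above.
  integer, parameter :: nestedFirst = 197900, nestedLast = 1475456
  integer, parameter :: flatFirst   = 308304, flatLast   = 376848
  ! Number of iterations between sample 1 and sample 1000.
  integer, parameter :: steps = 999

  ! Average growth per iteration, end point to end point.
  write (*,'(a,f8.1)') 'nested     kB/iteration: ', &
       real(nestedLast-nestedFirst)/steps
  write (*,'(a,f8.1)') 'non-nested kB/iteration: ', &
       real(flatLast-flatFirst)/steps
end program rssGrowth
```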

Running the reproducer through valgrind (with the number of iterations reduced to 2 so that it does not take too long to run), the output is:

==25999== Memcheck, a memory error detector
==25999== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==25999== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==25999== Command: ./a.out
==25999== 
           1      291692
           2      298840
==25999== 
==25999== HEAP SUMMARY:
==25999==     in use at exit: 19,850,144 bytes in 516 blocks
==25999==   total heap usage: 979 allocs, 463 frees, 20,201,536 bytes allocated
==25999== 
==25999== 4,800 bytes in 15 blocks are possibly lost in loss record 6 of 10
==25999==    at 0x4810218: calloc (vg_replace_malloc.c:1117)
==25999==    by 0x3810C11952: _dl_allocate_tls (in /lib64/ld-2.12.so)
==25999==    by 0x55845EE: allocate_stack (allocatestack.c:570)
==25999==    by 0x55845EE: pthread_create@@GLIBC_2.2.5 (pthread_create.c:453)
==25999==    by 0x4EFA4D3: gomp_team_start (team.c:841)
==25999==    by 0x4EF251C: GOMP_parallel (parallel.c:169)
==25999==    by 0x4017C8: MAIN__ (nested_leak.F90:29)
==25999==    by 0x4017FF: main (nested_leak.F90:23)
==25999== 
==25999== 120,000 bytes in 3 blocks are possibly lost in loss record 8 of 10
==25999==    at 0x480B7AB: malloc (vg_replace_malloc.c:380)
==25999==    by 0x40188E: __nestedmod_MOD_nested._omp_fn.0 (nested_leak.F90:15)
==25999==    by 0x4EF9E09: gomp_thread_start (team.c:108)
==25999==    by 0x5583C39: start_thread (pthread_create.c:301)
==25999==    by 0x586A61C: clone (clone.S:115)
==25999== 
==25999== 19,080,000 bytes in 477 blocks are definitely lost in loss record 10 of 10
==25999==    at 0x480B7AB: malloc (vg_replace_malloc.c:380)
==25999==    by 0x40188E: __nestedmod_MOD_nested._omp_fn.0 (nested_leak.F90:15)
==25999==    by 0x4EF9E09: gomp_thread_start (team.c:108)
==25999==    by 0x5583C39: start_thread (pthread_create.c:301)
==25999==    by 0x586A61C: clone (clone.S:115)
==25999== 
==25999== LEAK SUMMARY:
==25999==    definitely lost: 19,080,000 bytes in 477 blocks
==25999==    indirectly lost: 0 bytes in 0 blocks
==25999==      possibly lost: 124,800 bytes in 18 blocks
==25999==    still reachable: 645,344 bytes in 21 blocks
==25999==         suppressed: 0 bytes in 0 blocks
==25999== Reachable blocks (those to which a pointer was found) are not shown.
==25999== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==25999== 
==25999== For lists of detected and suppressed errors, rerun with: -s
==25999== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 4 from 4)

Here valgrind reports the allocations made in nested() (nested_leak.F90:15) as definitely lost.
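
The block sizes in the valgrind record can be checked against the array allocated in nested(). The sketch below does so under the assumption that a%i uses the default integer kind (4 bytes with gfortran's default options). The program and variable names are illustrative. Under that assumption, both figures come to 40,000 bytes, i.e. each lost block is one a%i(10000) allocation.

```fortran
program leakArithmetic
  implicit none
  ! Figures copied from the "definitely lost" record in the valgrind log.
  integer, parameter :: lostBytes  = 19080000
  integer, parameter :: lostBlocks = 477
  ! Extent of a%i as allocated in nested().
  integer, parameter :: elements   = 10000
  integer :: bytesPerElement, bytesPerArray, bytesPerBlock

  ! storage_size returns bits; assumes a%i has the default integer kind.
  bytesPerElement = storage_size(0)/8
  bytesPerArray   = elements*bytesPerElement
  bytesPerBlock   = lostBytes/lostBlocks

  write (*,*) 'bytes per a%i allocation: ', bytesPerArray
  write (*,*) 'bytes per lost block:     ', bytesPerBlock
  write (*,*) 'exact multiple:           ', mod(lostBytes,lostBlocks) == 0
  write (*,*) 'sizes match:              ', bytesPerArray == bytesPerBlock
end program leakArithmetic
```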

I'm aware that valgrind can report "possibly lost" and "still reachable" blocks of memory when OpenMP is used (e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36298), but as far as I can tell this is an actual memory leak.
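
A possible diagnostic, offered only as an untested sketch: the variant below keeps the same allocation pattern as nested() but has each thread free its threadprivate component before the nested region ends. The names nestedFreeMod, nestedFree, nestedFreeTest and iterations are hypothetical and not part of the reproducer above. If memory use stays flat with this variant, that would be consistent with the lost blocks being threadprivate copies of a%i that are never released. That cause is an assumption here, not something this report establishes.

```fortran
module nestedFreeMod
  implicit none

  type :: n
     integer, allocatable, dimension(:) :: i
  end type n

contains

  subroutine nestedFree()
    ! Hypothetical variant of nested(): identical allocation, but each
    ! thread deallocates its own copy before leaving the nested region.
    type(n), save :: a
    !$omp threadprivate(a)

    !$omp parallel
    if (.not.allocated(a%i)) allocate(a%i(10000))
    ! Any work on a%i would go here.
    if (allocated(a%i)) deallocate(a%i)
    !$omp end parallel
  end subroutine nestedFree

end module nestedFreeMod

program nestedFreeTest
  use :: OMP_Lib, only : OMP_Set_Nested
  use :: nestedFreeMod
  implicit none
  ! Reduce this (e.g. to 2) when running under valgrind.
  integer, parameter :: iterations = 1000
  integer :: i

  call OMP_Set_Nested(.true.)
  !$omp parallel
  do i=1,iterations
     call nestedFree()
  end do
  !$omp end parallel
end program nestedFreeTest
```

Note that freeing a%i on every call gives up the persistence that the threadprivate, saved variable was presumably meant to provide, so this is a way to narrow down the problem rather than a fix.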