I am having trouble identifying a problem in an OpenMP parallelized Fortran code. In doing so, I created a small reproducible that mimics the derived types I am using in the code. The small reproducible shows some unexpected behavior that could point to the root of the problem, so I would like to understand what is happening. I am posting the code below:
module my_subroutines
use, intrinsic :: iso_fortran_env
implicit none
type array_int32
integer(kind=int32), dimension(:), allocatable :: arr
end type array_int32
type array_int32_ptr
type(array_int32), pointer :: p => NULL()
end type array_int32_ptr
contains
subroutine print_rows(tid, row_ptr)
implicit none
integer(int32), intent(in) :: tid
type(array_int32_ptr), dimension(2), target, intent(inout) :: row_ptr
integer(kind=int32), pointer :: index => NULL()
integer(kind=int32) :: i, j
j = 1
row_ptr(j)%p%arr(1) = 1000
index => row_ptr(j)%p%arr(1)
write(*,'(2(A,I0))') ' - row_ptr(1)p%arr(1) on thread ', tid, ': ', index
j = 2
row_ptr(j)%p%arr(1) = 1000
index => row_ptr(j)%p%arr(1)
write(*,'(2(A,I0))') ' - row_ptr(2)p%arr(1) on thread ', tid, ': ', index
end subroutine print_rows
end module my_subroutines
program test
use, intrinsic :: iso_fortran_env
use my_subroutines
use omp_lib
implicit none
type(array_int32_ptr), dimension(2) :: row_ptr
type(array_int32), dimension(2), target :: row
integer(kind=int32) :: i, tid
!$omp parallel default(none) &
!$omp private(i, tid, row, row_ptr)
tid = omp_get_thread_num()
allocate(row(1)%arr(10))
allocate(row(2)%arr(20))
row(1)%arr = (/ (i, i=1,size(row(1)%arr) )/)
row(2)%arr = (/ (i, i=1,size(row(2)%arr) )/)
row_ptr(1)%p => row(1)
row_ptr(2)%p => row(2)
call print_rows(tid, row_ptr)
row_ptr(1)%p => row(2)
row_ptr(2)%p => row(1)
call print_rows(tid, row_ptr)
deallocate(row(1)%arr, row(2)%arr)
!$omp end parallel
end program test
and here is an example output running on 3 threads (which is not deterministic, pointing to some memory leak within the code):
- row_ptr(1)p%arr(1) on thread 0: 1000
- row_ptr(1)p%arr(1) on thread 1: 1000
- row_ptr(2)p%arr(1) on thread 1: 1000
- row_ptr(1)p%arr(1) on thread 1: 1000
- row_ptr(2)p%arr(1) on thread 1: 1000
- row_ptr(1)p%arr(1) on thread 2: 1987284000
- row_ptr(2)p%arr(1) on thread 2: 1000
- row_ptr(1)p%arr(1) on thread 2: 1000
- row_ptr(2)p%arr(1) on thread 2: 1000
- row_ptr(2)p%arr(1) on thread 0: 2121468960
- row_ptr(1)p%arr(1) on thread 0: 1000
- row_ptr(2)p%arr(1) on thread 0: 1000
indexin theprint_rowssubroutine, so giving it the save attribute, so making it shared. Hence the race. There's a duplicate of this somewhere .... But stackoverflow.com/questions/35347944/… might be useful