Bug 103039 - [12 Regression] Segfault with openmp block within a select type block since r12-1016-g0e3b3b77e13cac76
Summary: [12 Regression] Segfault with openmp block within a select type block since r...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran (show other bugs)
Version: 12.0
: P1 normal
Target Milestone: 12.0
Assignee: Not yet assigned to anyone
URL:
Keywords: openmp, wrong-code
Depends on:
Blocks:
 
Reported: 2021-11-02 09:08 UTC by martin
Modified: 2022-03-21 11:59 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-03-09 00:00:00


Attachments
seltype.f90 (214 bytes, text/plain)
2021-11-02 09:08 UTC, martin
Details
seemingly same problem but with associate instead of select type (245 bytes, text/plain)
2021-11-02 10:26 UTC, martin
Details

Note You need to log in before you can comment on or make changes to this bug.
Description martin 2021-11-02 09:08:49 UTC
Created attachment 51720 [details]
seltype.f90

The attached code compiles and executes fine with gfortran-11.1, but it first shows warnings during compilation and then fails with a segfault with the current master branch.
Compilation is done with: gfortran-12 -fopenmp -Wall seltype.f90
Output should be just the number 110.

In our code I have also seen lots of similar warnings with openmp blocks within associate(.. => ...) blocks and related segfaults, but I could not come up with a simple testcase.
Comment 1 martin 2021-11-02 09:12:55 UTC
The warning which is shown during compilation is:
seltype.f90:16:59:

   16 | !$omp parallel do default(shared) private(k) reduction(+:s)
      |                                                           ^
Warning: ‘__tmp_class_a’ is used uninitialized [-Wuninitialized]
seltype.f90:15:12:

   15 | class is (a)
      |            ^
note: ‘__tmp_class_a’ declared here
Comment 2 Martin Liška 2021-11-02 09:20:47 UTC
Started with r12-1016-g0e3b3b77e13cac76.
Comment 3 martin 2021-11-02 10:26:00 UTC
Created attachment 51722 [details]
seemingly same problem but with associate instead of select type

A similar problem with associate shows the same problem and probably has the seem root cause. It is slightly more complicated than the variant with select type, as it requires a class(..) component provided in the target of the associate statement.
Otherwise warnings and subsequent segfault are similar.
Comment 4 martin 2022-01-25 08:28:55 UTC
Is there any chance of fixing this regression (in particular the associate block variant) till gfortran-12 release? Having an openmp block within an associate block should not be that uncommon in modern code, right?
Comment 5 Richard Biener 2022-03-09 14:27:49 UTC
in .ompexp I see

void MAIN__._omp_fn.0 (struct .omp_data_s.9 & restrict .omp_data_i)
{
...
  <bb 6> :
  s = 0;
  D.4352 = .omp_data_i->__tmp_class_a;
  __tmp_class_a.10 = __tmp_class_a;
  D.4355 = D.4352->_vptr;
  __tmp_class_a.10->_vptr = D.4355;
  D.4356 = .omp_data_i->__tmp_class_a;
  D.4357 = D.4356->_vptr;
  D.4358 = D.4357->_size;
  D.4359 = (unsigned long) D.4358;
  __tmp_class_a.11 = __tmp_class_a;
  D.4361 = __builtin_malloc (D.4359);
  __tmp_class_a.11->_data = D.4361;
  D.4362 = .omp_data_i->__tmp_class_a;
  D.4363 = D.4362->_vptr;
  D.4364 = D.4363->_copy;
  __tmp_class_a.12 = __tmp_class_a;
  D.4366 = .omp_data_i->__tmp_class_a;
  D.4367 = D.4366->_data;
  D.4364 (D.4367, __tmp_class_a.12);

so .omp_data_i->__tmp_class_a is accessed but then the local __tmp_class_a
is used in the end.  So sth is amiss here.

I suspect the unshare_expr is playing havoc with some later lookups.
Comment 6 Jakub Jelinek 2022-03-17 16:41:28 UTC
unshare_expr doesn't do anything wrong.
The problem is that because of the select we have firstprivate(__tmp_class_a)
clause where __tmp_class_a has type which has struct __class_seltype_A_p *
type.  Before the r12-1016 change, we clearly firstprivatized that var
because gfc_omp_clause_copy_ctor just did:
return build2_v (MODIFY_EXPR, dest, src);
for that case, so emitted:
                    D.4011 = .omp_data_i->__tmp_class_a;
                    __tmp_class_a = D.4011;
which essentially means that the __tmp_class_a var (the pointer) was privatized, but what it points to was shared.
Now, since r12-1016 it copies various elements the pointer points to
from what the source pointer points to to what the destination pointer points to.
But we don't really initialize what dest points to first.
I bet we want to privatize what the pointer points to, so we need to emit something like:
  __class_seltype_A_p temp;
  __tmp_class_a = &temp;
  __tmp_class_a->vptr = .omp_data_i->__tmp_class_a->vptr;
etc. (not gimplified form).
I think the way to achieve this would be make sure to return true from the
gfc_omp_privatize_by_reference hook.

Tobias, can you please have a look?
Comment 7 Tobias Burnus 2022-03-18 12:29:49 UTC
(In reply to Jakub Jelinek from comment #6)
> I think the way to achieve this would be make sure to return true from the
> gfc_omp_privatize_by_reference hook.

What puzzles me is actually the reverse:

  class(a), pointer :: x
  ...
  select type (x)
  class is (a)
  !$omp parallel do default(shared) private(k) reduction(+:s)

Looking at the dump, one sees in the gimple dump:

  #pragma omp parallel private(k) reduction(+:s) default(shared)
              firstprivate(__tmp_class_a)

I think it is okay that __tmp_class_a is firstprivate – but in any case, no deep copy should then be done in gfc_omp_clause_copy_ctor.

 * * *

Regarding why the firstprivate shows up:
gfc_omp_predetermined_sharing returns OMP_CLAUSE_DEFAULT_FIRSTPRIVATE because of:

  /* Associate names preserve the association established during ASSOCIATE.
     As they are implemented either as pointers to the selector or array
     descriptor and shouldn't really change in the ASSOCIATE region,
     this decl can be either shared or firstprivate.  If it is a pointer,
     use firstprivate, as it is cheaper that way, otherwise make it shared.  */
  if (GFC_DECL_ASSOCIATE_VAR_P (decl))
    {
      if (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE)
        return OMP_CLAUSE_DEFAULT_FIRSTPRIVATE;
      else
        return OMP_CLAUSE_DEFAULT_SHARED;
    }

Added in commit r5-1190-g92d28cbb59cc5a611af41342c5b224fbf779a44d

 * * *

BTW, using:   firstprivate(x)   fails with
  Error: ASSOCIATE name ‘__tmp_class_a’ in FIRSTPRIVATE clause at (1)

While

  select type (y => x)
  class is (a)
  !$omp parallel do default(shared) private(k) reduction(+:s) firstprivate(x)

is accepted but while the original dump has 'firstprivate(x)', it is removed
(as unused) during gimplify, resulting the output above.

Note: Associate names can appear for 'associate', 'select type', 'select rank' and 'change team'. — For 'select type/rank', the selector name is the same as the selector (wihch must be a variable) when left out.

 * * *

OpenMP has for "predetermined data-sharing attributes":
"An associate name that may appear in a variable definition context is shared if its association occurs outside of the construct and otherwise it has the same data-sharing attribute as the selector with which it is associated."
Comment 8 Jakub Jelinek 2022-03-18 12:37:09 UTC
So it means the firstprivate clause is added implicitly during gimplification (with OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT?) for it.
If we want to firstprivatize just the pointer and not what it points to,
perhaps we should arrange for OMP_CLAUSE_FIRSTPRIVATE_NO_REFERENCE to be set on it and then take it into account in gfc_omp_clause_copy_ctor etc. and just do what we used to do before (i.e. copy the pointer and nothing else).
Comment 9 GCC Commits 2022-03-18 13:50:55 UTC
The master branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>:

https://gcc.gnu.org/g:6393122d271a92d5d9d8656a57ea167e92498871

commit r12-7701-g6393122d271a92d5d9d8656a57ea167e92498871
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Fri Mar 18 14:50:36 2022 +0100

    Fortran/OpenMP: Improve associate-name diagnostic [PR103039]
    
    gcc/fortran/ChangeLog:
    
            PR fortran/103039
            * openmp.cc (resolve_omp_clauses): Improve associate-name diagnostic
            for select type/rank.
    
    gcc/testsuite/ChangeLog:
    
            PR fortran/103039
            * gfortran.dg/gomp/associate1.f90: Update dg-error.
            * gfortran.dg/gomp/associate2.f90: New test.
Comment 10 GCC Commits 2022-03-18 16:40:55 UTC
The master branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>:

https://gcc.gnu.org/g:c133bdfa9e7d9225510d00dbb7270cc052e4e4ee

commit r12-7709-gc133bdfa9e7d9225510d00dbb7270cc052e4e4ee
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Fri Mar 18 17:40:22 2022 +0100

    Fortran/OpenMP: Fix privatization of associated names
    
    gfc_omp_predetermined_sharing cases the associate-name pointer variable
    to be OMP_CLAUSE_DEFAULT_FIRSTPRIVATE, which is fine. However, the associated
    selector is shared. Thus, the target of associate-name pointer should not get
    copied. (It was before but because of gfc_omp_privatize_by_reference returning
    false, the selector was not only wrongly copied but this was also not done
    properly.)
    
    gcc/fortran/ChangeLog:
    
            PR fortran/103039
            * trans-openmp.cc (gfc_omp_clause_copy_ctor, gfc_omp_clause_dtor):
            Only privatize pointer for associate names.
    
    libgomp/ChangeLog:
    
            PR fortran/103039
            * testsuite/libgomp.fortran/associate4.f90: New test.
Comment 11 Tobias Burnus 2022-03-18 16:42:48 UTC
Should be FIXED (on mainline, which will become 12.1)

Thanks for the report, Martin – sorry for fixing it not more quickly.
Thanks Richard & Jakub for the debugging!
Comment 12 martin 2022-03-21 11:59:35 UTC
Thanks for fixing! I just checked with the real code and there it also looks good.