Bug 46408 - [OOP] Segfault when running gfortran.dg/class_allocate_6.f03
Summary: [OOP] Segfault when running gfortran.dg/class_allocate_6.f03
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran (show other bugs)
Version: 4.6.0
: P3 normal
Target Milestone: ---
Assignee: janus
URL:
Keywords: wrong-code
: 47085 (view as bug list)
Depends on:
Blocks:
 
Reported: 2010-11-10 10:39 UTC by Tobias Burnus
Modified: 2011-01-02 21:20 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-11-10 13:05:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Burnus 2010-11-10 10:39:23 UTC
This might be a known problem, but the test case
  gfortran.dg/class_allocate_6.f03
fails for me. (The test was added for PR 45451 / PR 46174, which does a deep copy.)

To be precise, I get a segfault using when executing the line:
  allocate(y, source=x)

A "PRINT" statement placed directly after that line will not be printed, i.e. it is not a finally/cleanup issue.

The segfault disappears if I unset MALLOC_PERTURB_ which indicates the use of an uninitialized variable.

Valgrind shows:
 Conditional jump or move depends on uninitialised value(s)
    at 0x400C37: __copy_MAIN___t2.1547 (class_allocate_6.f03:7)
    by 0x400ED4: MAIN__ (class_allocate_6.f03:48)
    by 0x400FD7: main (class_allocate_6.f03:48)


Related: PR 46321 (deep-freeing of polymorphic variables)
Comment 1 janus 2010-11-10 13:05:00 UTC
(In reply to comment #0)
> This might be a known problem, but the test case
>   gfortran.dg/class_allocate_6.f03
> fails for me.

I was not aware of this problem up to now. The test case does not fail for me, but I can reproduce the valgrind error.


> To be precise, I get a segfault using when executing the line:
>   allocate(y, source=x)

The error goes away when removing the SOURCE tag or making x TYPE(t), but not when making x TYPE(t2).


> Valgrind shows:
>  Conditional jump or move depends on uninitialised value(s)
>     at 0x400C37: __copy_MAIN___t2.1547 (class_allocate_6.f03:7)
>     by 0x400ED4: MAIN__ (class_allocate_6.f03:48)
>     by 0x400FD7: main (class_allocate_6.f03:48)

Apparently this happens inside the copying routine, but I'm not quite sure where exactly the problem is.
Comment 2 Tobias Burnus 2010-11-11 10:09:15 UTC
(In reply to comment #1)
> Apparently this happens inside the copying routine, but I'm not quite sure
> where exactly the problem is.

Well, that's simple. If one looks at the dump (for a scalar "a" to make it more readable), one finds in __MAIN for "allocate(t2 :: x)":


  if (y._data == 0B)
    {
      D.1558 = (integer(kind=8)) x._vptr->_size;
      D.1559 = (void * restrict) __builtin_malloc (MAX_EXPR <D.1558, 1>);
      D.1557 = (struct t *) D.1559;
      y._data = D.1557;
    }
  x._vptr->_copy (x._data, y._data);

And in __copy_MAIN___t2:

    D.1520 = *dst;
    if (D.1520.a != 0B)
      __builtin_free ((void *) D.1520.a);

Thus, if MALLOC returns a nullified data, "(y._data->a)" will be false and everything is fine. However, if "y._data->a" contains random data, one tries to free the memory for y._data->a, which leads to a crash.

Thus, it is not surprising that it worked for you as GLIBC by default returns NULL memory. However, if one uses the environment variable

  export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))

The memory returned by malloc will be prefilled by the bytes of MALLOC_PERTURB_ and the program will crash.

As also due to other reasons the memory might be uninitialized and valgrind also complains, the bug should be fixed. Possibly such that that there is no overhead (also not with Paul's (re)allocate patch.)
Comment 3 John David Anglin 2010-11-20 21:49:50 UTC
Segmentation fault also occurs ob hppa*-*-hpux*.
Comment 4 janus 2011-01-02 15:58:13 UTC
Reduced test case:

type t
end type

type, extends(t) :: t2
  integer, allocatable :: a
end type

class(t), allocatable :: x, y

allocate(t2 :: x)
allocate(y, source=x)

end



As far as I can see, the copying routine itself is alright. The problem seems to be with the default initialization of x%a. From the dump:


MAIN__ ()
{
  static struct t __def_init_MAIN___t;
  static struct t2 __def_init_MAIN___t2 = {};
  static struct __vtype_MAIN___t __vtab_MAIN___t = {._hash=6232021, ._size=0, ._extends=0B, ._def_init=&__def_init_MAIN___t, ._copy=__copy_MAIN___t};
  static struct __vtype_MAIN___t2 __vtab_MAIN___t2 = {._hash=43807140, ._size=8, ._extends=&__vtab_MAIN___t, ._def_init=&__def_init_MAIN___t2, ._copy=__copy_MAIN___t2};
  static struct __class_MAIN___t_a x = {};
  static struct __class_MAIN___t_a y = {};
  static void __copy_MAIN___t (struct t & restrict, struct t & restrict);
  static void __copy_MAIN___t2 (struct t2 & restrict, struct t2 & restrict);

  try
    {
      x._data = 0B;
      y._data = 0B;

      ! [ ... allocation of x._data ... ]

      (void) __builtin_memcpy ((void *) x._data, (void *) x._vptr->_def_init, (<unnamed-unsigned:64>) x._vptr->_size);

      ! [ ... allocation of y.data ... ]

      x._vptr->_copy (x._data, y._data);
      (struct __vtype_MAIN___t *) y._vptr = (struct __vtype_MAIN___t *) x._vptr;
    }
  finally
    {
      [...]
    }
}


x%a should get default-initialized to NULL via the memcpy call from x._vptr->_def_init. The memcpy itself is done alright, but apparently the _def_init variable is not properly initialized (is it?):

  static struct t2 __def_init_MAIN___t2 = {};
Comment 5 janus 2011-01-02 16:24:00 UTC
(In reply to comment #4)
> x%a should get default-initialized to NULL via the memcpy call from
> x._vptr->_def_init. The memcpy itself is done alright, but apparently the
> _def_init variable is not properly initialized (is it?):
> 
>   static struct t2 __def_init_MAIN___t2 = {};


No, it's not. The problem is:  __def_init_MAIN___t2 is declared as SAVE in gfc_find_derived_vtab. Therefore it's allocatable components do not get default-initialized to NULL. However, if we don't give it the SAVE attribute, we get:

Error: Pointer initialization target at (1) must have the SAVE attribute

Because the def_init symbol is used as pointer init target for the vtab.

:(
Comment 6 janus 2011-01-02 17:51:39 UTC
*** Bug 47085 has been marked as a duplicate of this bug. ***
Comment 7 janus 2011-01-02 19:02:31 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > x%a should get default-initialized to NULL via the memcpy call from
> > x._vptr->_def_init. The memcpy itself is done alright, but apparently the
> > _def_init variable is not properly initialized (is it?):
> > 
> >   static struct t2 __def_init_MAIN___t2 = {};
> 
> 
> No, it's not. The problem is:  __def_init_MAIN___t2 is declared as SAVE in
> gfc_find_derived_vtab. Therefore it's allocatable components do not get
> default-initialized to NULL.

However, if I add a static initializer for this guy, i.e.

  static struct t2 __def_init_MAIN___t2 = {.a=0B};

I still get the same valgrind error. What the hell?


Conclusion: The error must come from the second part of the copying routine's dump:

        if (D.1545.a != 0B)
          {
            __builtin_free ((void *) D.1545.a);
          }
        D.1545.a = 0B;

Here D.1545 is the old value of the destination. Since this was just freshly allocated before calling the copy routine, it's technically undefined (unless the malloc zeroes it). Now, that really is it!

Two possible solutions:

1) Initialize with _def_init, before calling _copy (which means extra work since we initialize twice).

2) Forget about freeing the old value of 'dst' (which is fine as long as we only call _copy on allocation, where dst is undefined; this is currently the case).
Comment 8 janus 2011-01-02 19:23:55 UTC
(In reply to comment #7)
> 2) Forget about freeing the old value of 'dst' (which is fine as long as we
> only call _copy on allocation, where dst is undefined; this is currently the
> case).

This is definitely the proper thing to do: We only need _copy for ALLOCATE with SOURCE. And the best thing is, it's really easily done:

Index: gcc/fortran/class.c
===================================================================
--- gcc/fortran/class.c	(revision 168379)
+++ gcc/fortran/class.c	(working copy)
@@ -548,7 +548,7 @@ gfc_find_derived_vtab (gfc_symbol *derived)
 		  copy->formal->next->sym = dst;
 		  /* Set up code.  */
 		  sub_ns->code = gfc_get_code ();
-		  sub_ns->code->op = EXEC_ASSIGN;
+		  sub_ns->code->op = EXEC_INIT_ASSIGN;
 		  sub_ns->code->expr1 = gfc_lval_expr_from_sym (dst);
 		  sub_ns->code->expr2 = gfc_lval_expr_from_sym (src);
 		  /* Set initializer.  */


It took me a long while to see this, but in the end it's completely trivial. I'll commit after regtesting.
Comment 9 janus 2011-01-02 21:01:53 UTC
Author: janus
Date: Sun Jan  2 21:01:50 2011
New Revision: 168409

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168409
Log:
2011-01-02  Janus Weil  <janus@gcc.gnu.org>

	PR fortran/46408
	* class.c (gfc_find_derived_vtab): Use EXEC_INIT_ASSIGN for __copy_
	routine.


2011-01-02  Janus Weil  <janus@gcc.gnu.org>

	PR fortran/46408
	* gfortran.dg/class_19.f03: Adjust counting of __builtin_free.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/class.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gfortran.dg/class_19.f03
Comment 10 janus 2011-01-02 21:20:33 UTC
Fixed with r168409. Closing.