Hello,
After a lot (a LOT) of work, I've come up with this test case. The test case *appears* to run fine, but valgrind shows something is amiss, and in the full application (much more complex) multiple segfaults follow. I have not found (so far) a way to make a test case that is both small and exhibits the error outside of valgrind; I hope you'll pardon me.
The error: the ALLOCATABLE components of the ATX%A variable are detected as ALLOCATED, and seemingly are printed OK, but VALGRIND says they were already freed.

===============================================================================
[sfilippo@donald bug23]$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/local/gnu46/libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure --prefix=/usr/local/gnu46 --enable-languages=c,c++,fortran
Thread model: posix
gcc version 4.6.0 20100830 (experimental) (GCC)
[sfilippo@donald bug23]$ gfortran -ggdb -o bug23 bug23.f03
[sfilippo@donald bug23]$ ./bug23
 New version
 Allocation status:  T T T
 0 0 4 5
 0 0 2 3 0 0
 0 0 0 0 0 0
 0.0000000000000000 1.0000000000000000 2.0000000000000000
 3.0000000000000000 0.0000000000000000 0.0000000000000000
 0.0000000000000000 0.0000000000000000 0.0000000000000000
 0.0000000000000000 0.0000000000000000 0.0000000000000000
[sfilippo@donald bug23]$ valgrind ./bug23
==6940== Memcheck, a memory error detector
==6940== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==6940== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==6940== Command: ./bug23
==6940==
 New version
 Allocation status:  T T T
==6940== Invalid read of size 4
==6940==    at 0x4CC19B8: extract_int (write.c:450)
==6940==    by 0x4CC27B1: write_integer (write.c:1260)
==6940==    by 0x4CC5BDE: _gfortrani_list_formatted_write (write.c:1552)
==6940==    by 0x4CBACA7: _gfortran_transfer_array (transfer.c:2000)
==6940==    by 0x401834: MAIN__ (bug23.f03:308)
==6940==    by 0x40197A: main (bug23.f03:284)
==6940==  Address 0x5141450 is 0 bytes inside a block of size 16 free'd
==6940==    at 0x4A04D72: free (vg_replace_malloc.c:325)
==6940==    by 0x4CCB5A8: _gfortran_move_alloc (move_alloc.c:41)
==6940==    by 0x400B90: __psb_d_csr_mat_mod_MOD_psb_d_mv_csr_from_fmt (bug23.f03:238)
==6940==    by 0x401449: __psb_d_mat_mod_MOD_psb_d_mv_from (bug23.f03:277)
==6940==    by 0x4016C6: MAIN__ (bug23.f03:302)
==6940==    by 0x40197A: main (bug23.f03:284)
==6940==
 1 3 4 5
 1 1 2 3 0 0
 0 0 0 0 0 0
==6940== Invalid read of size 8
==6940==    at 0x4CC3795: write_float (write_float.def:1050)
==6940==    by 0x4CC508C: _gfortrani_write_real (write.c:1470)
==6940==    by 0x4CC5BAE: _gfortrani_list_formatted_write (write.c:1561)
==6940==    by 0x4CBACA7: _gfortran_transfer_array (transfer.c:2000)
==6940==    by 0x401904: MAIN__ (bug23.f03:310)
==6940==    by 0x40197A: main (bug23.f03:284)
==6940==  Address 0x5141510 is 0 bytes inside a block of size 96 free'd
==6940==    at 0x4A04D72: free (vg_replace_malloc.c:325)
==6940==    by 0x4CCB5A8: _gfortran_move_alloc (move_alloc.c:41)
==6940==    by 0x400BDC: __psb_d_csr_mat_mod_MOD_psb_d_mv_csr_from_fmt (bug23.f03:240)
==6940==    by 0x401449: __psb_d_mat_mod_MOD_psb_d_mv_from (bug23.f03:277)
==6940==    by 0x4016C6: MAIN__ (bug23.f03:302)
==6940==    by 0x40197A: main (bug23.f03:284)
==6940==
 1.0000000000000000 1.0000000000000000 2.0000000000000000
 3.0000000000000000 0.0000000000000000 0.0000000000000000
 0.0000000000000000 0.0000000000000000 0.0000000000000000
 0.0000000000000000 0.0000000000000000 0.0000000000000000
==6940==
==6940== HEAP SUMMARY:
==6940==     in use at exit: 0 bytes in 0 blocks
==6940==   total heap usage: 33 allocs, 33 frees, 4,938 bytes allocated
==6940==
==6940== All heap blocks were freed -- no leaks are possible
==6940==
==6940== For counts of detected and suppressed errors, rerun with: -v
==6940== ERROR SUMMARY: 28 errors from 2 contexts (suppressed: 6 from 6)
Created attachment 21592: test case
(In reply to comment #0)
> The error: the ALLOCATABLE components of the ATX%A variable are detected as
> ALLOCATED, and seemingly are printed OK, but VALGRIND says they were already
> freed.

I should have been more careful before writing: the output is visibly wrong.

> [sfilippo@donald bug23]$ ./bug23
>  New version
>  Allocation status:  T T T
>  0 0 4 5
>  0 0 2 3 0 0
>  0 0 0 0 0 0
>  0.0000000000000000 1.0000000000000000 2.0000000000000000
>  3.0000000000000000 0.0000000000000000 0.0000000000000000
>  0.0000000000000000 0.0000000000000000 0.0000000000000000
>  0.0000000000000000 0.0000000000000000 0.0000000000000000
And here is the (expected) output with XLF.

[snfilip@josquin ~]$ xlf2003_r -o bug23 bug23.f03
** psb_const_mod   === End of Compilation 1 ===
** psb_base_mat_mod   === End of Compilation 2 ===
** psb_d_base_mat_mod   === End of Compilation 3 ===
** psb_d_csr_mat_mod   === End of Compilation 4 ===
** psb_d_mat_mod   === End of Compilation 5 ===
** bug23   === End of Compilation 6 ===
1501-510  Compilation successful for file bug23.f03.
[snfilip@josquin ~]$ ./bug23
 New version
 Allocation status:  T T T
 1 3 4 5
 1 1 2 3 0 0 0 0 0 0 0 0
 1.00000000000000000 1.00000000000000000 2.00000000000000000 3.00000000000000000
 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00
Ok, I could reduce this quite a bit:

program bug23

  implicit none

  type :: psb_base_sparse_mat
    integer, allocatable :: irp(:)
  end type psb_base_sparse_mat

  class(psb_base_sparse_mat), allocatable :: a
  type(psb_base_sparse_mat) :: acsr

  allocate(acsr%irp(4))
  acsr%irp(1:4) = (/1,3,4,5/)

  write(*,*) acsr%irp(:)

  allocate(a,source=acsr)

  write(*,*) a%irp(:)

  call move_alloc(acsr%irp, a%irp)

  write(*,*) a%irp(:)

end program bug23

Executing gives:

 1 3 4 5
 1 3 4 5
 0 0 4 5

The last line here should be the same as the first two. Changing the CLASS variable into a TYPE makes it work. Running through valgrind shows:

==11502== Command: ./a.out
==11502==
 1 3 4 5
 1 3 4 5
==11502== Invalid read of size 4
==11502==    at 0x4EE59B8: extract_int (write.c:450)
==11502==    by 0x4EE67B1: write_integer (write.c:1260)
==11502==    by 0x4EE9BDE: _gfortrani_list_formatted_write (write.c:1552)
==11502==    by 0x4EDECA7: _gfortran_transfer_array (transfer.c:2000)
==11502==    by 0x400B07: MAIN__ (in /home/jweil/GSoC/PRs/45451/a.out)
==11502==    by 0x400BA8: main (in /home/jweil/GSoC/PRs/45451/a.out)
==11502==  Address 0x5934360 is 0 bytes inside a block of size 16 free'd
==11502==    at 0x4C280BD: free (vg_replace_malloc.c:366)
==11502==    by 0x4EEF5A8: _gfortran_move_alloc (move_alloc.c:41)
==11502==    by 0x400AAC: MAIN__ (in /home/jweil/GSoC/PRs/45451/a.out)
==11502==    by 0x400BA8: main (in /home/jweil/GSoC/PRs/45451/a.out)
==11502==
 1 3 4 5
==11502==
==11502== HEAP SUMMARY:
==11502==     in use at exit: 0 bytes in 0 blocks
==11502==   total heap usage: 18 allocs, 18 frees, 3,844 bytes allocated
==11502==
==11502== All heap blocks were freed -- no leaks are possible
==11502==
==11502== For counts of detected and suppressed errors, rerun with: -v
==11502== ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 4 from 4)
(In reply to comment #4)
> Ok, I could reduce this quite a bit:
>

Good :)

In the meantime, I tried with MOLD= in place of SOURCE=, and in the full application it still gives a segfault; I think that variant should be checked as well. Actually, MOLD= is preferable for the kind of thing I am doing, but since it's an F2008 feature I had to put it under IFDEF for the time being.

Salvatore
(In reply to comment #5) > In the meantime, I tried with MOLD= in place of SOURCE=, and in the full > application it still gives a segfault; I think that variant should be checked > as well. Note that for MOLD there is PR 44541 left (which I am about to fix). Up to now MOLD works only with non-polymorphic expressions. Once the PR is fixed, polymorphics should work too. Until this has happened, please refrain from opening further PRs on MOLD.
(In reply to comment #6) > > Note that for MOLD there is PR 44541 left (which I am about to fix). Up to now > MOLD works only with non-polymorphic expressions. Once the PR is fixed, > polymorphics should work too. Until this has happened, please refrain from > opening further PRs on MOLD. > Fine. Waiting for it....
(In reply to comment #7)
> (In reply to comment #6)
> >
> Fine. Waiting for it....
>

Consider the following variation: upon exit from DOIT, the ACSR variable should be deallocated (since it was MOVE_ALLOCed to atx%a) but it is not, hence a double free.

===============================================================================
[sfilippo@localhost bug23]$ ./bug23_1
 Allocation status acsr:  T
 Allocation status atx:  T T T
 1 3 4 5
 1 1 2 3 0 0
 0 0 0 0 0 0
 1.0000000000000000 1.0000000000000000 2.0000000000000000
 3.0000000000000000 0.0000000000000000 0.0000000000000000
 0.0000000000000000 0.0000000000000000 0.0000000000000000
 0.0000000000000000 0.0000000000000000 0.0000000000000000
*** glibc detected *** ./bug23_1: double free or corruption (!prev): 0x00000000023bbfe0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3d69675676]
./bug23_1[0x401876]
./bug23_1[0x4018da]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3d6961ec5d]
./bug23_1[0x400869]
======= Memory map: ========
00400000-00402000 r-xp 00000000 08:05 2187330 /home/sfilippo/NUMERICAL/NewPSBLAS/GNUbugs/bug23/bug2
........................
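For readers following the double-free argument, here is a minimal C sketch of the ownership rule that MOVE_ALLOC is expected to obey: after the call, the source must read as unallocated, so that the automatic deallocation at scope exit has nothing left to free. This is only a simplified model under stated assumptions; `int_array`, `model_move_alloc`, `model_auto_dealloc` and `model_allocated` are hypothetical names, not the real gfortran descriptor or the libgfortran routine.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an allocatable array component.
   This is NOT the real gfortran array descriptor. */
typedef struct {
    int *data;   /* NULL means "not allocated" */
    size_t n;    /* number of elements */
} int_array;

/* Model of MOVE_ALLOC(from, to): release whatever 'to' held, hand the
   storage of 'from' over to 'to', and leave 'from' unallocated. */
void model_move_alloc(int_array *from, int_array *to)
{
    assert(from != NULL && to != NULL && from != to);
    free(to->data);        /* free(NULL) is a no-op */
    to->data = from->data;
    to->n = from->n;
    from->data = NULL;     /* without this, both sides would own the block */
    from->n = 0;
}

/* Model of the automatic deallocation done when a variable goes out of scope. */
void model_auto_dealloc(int_array *a)
{
    free(a->data);
    a->data = NULL;
    a->n = 0;
}

/* Model of the ALLOCATED() intrinsic. */
int model_allocated(const int_array *a)
{
    return a->data != NULL;
}
```

In this model, calling `model_auto_dealloc` on the source after `model_move_alloc` is harmless because its pointer has been reset. If the source still reported "allocated", as ACSR does in the run above, the same block would be freed twice.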
Created attachment 21613: test case
I think this is a variant of PR 42647: Allocatable components of allocatable scalars are not correctly handled.
(In reply to comment #4)
> Ok, I could reduce this quite a bit:
>
>
>  1 3 4 5
>  1 3 4 5
>  0 0 4 5
>
> The last line here should be the same as the first two. Changing the CLASS
> variable into a TYPE makes it work. Running through valgrind shows:
>

The strange thing is that guarding with a SELECT TYPE statement, as in

===========================
  allocate(a,source=acsr)

  write(*,*) acsr%irp(:)

  select type(aa=>a)
  type is (psb_base_sparse_mat)
    call move_alloc(acsr%irp, aa%irp)

    write(*,*) aa%irp(:)
  class default
    write(*,*) 'Wrong class default'
  end select
================================

still gives the wrong answer:

[sfilippo@localhost bug23]$ ./bug23_janus
 1 3 4 5
 1 3 4 5
 0 0 4 5
(In reply to comment #4)
> Ok, I could reduce this quite a bit:
>   allocate(acsr%irp(4))
>   allocate(a,source=acsr)

If one looks at the dump, one sees a similar problem to PR 43018. For the first allocate, one has:

  D.1530 = (void * restrict) __builtin_malloc (16);

which makes sense: 4 * integer(4) = 16 bytes. However, for the second allocate one sees:

  D.1532 = (void * restrict) __builtin_malloc (48);

which might be OK. However, the following is definitely invalid - and causes a segfault here:

  (void) __builtin_memcpy ((void *) a.$data, (void *) &acsr, 48);
(In reply to comment #12)
> If one looks at the dump

Scratch that part. In one case one allocates the component (integer(4)*4), in the other case the type, i.e. the array descriptor.

The actual issue is that only a shallow copy is done:

  (void) __builtin_memcpy ((void *) a.$data, (void *) &acsr, 48);

Thus, a.$data->irp.data and acsr.irp.data point to the same memory. Of course, this will fail at the final cleanup:

  if (acsr.irp.data != 0B)
    __builtin_free ((void *) acsr.irp.data);
  if (a.$data != 0B)
    if (a.$data->irp.data != 0B)
      __builtin_free ((void *) a.$data->irp.data);

The problem is in trans-stmt.c's gfc_trans_allocate:

  if (code->expr3 && !code->expr3->mold)
    {
      /* Initialization via SOURCE block
         (or static default initializer).  */
      gfc_expr *rhs = gfc_copy_expr (code->expr3);
      if (al->expr->ts.type == BT_CLASS)
        [...]
          tmp = gfc_build_memcpy_call (dst.expr, src.expr, memsz);
        }
      else
        tmp = gfc_trans_assignment (gfc_expr_to_initialize (expr),
                                    rhs, false, false);

gfc_trans_assignment properly takes care of allocatable components (cf. PR 43018), while a simple memcpy does not.

The following works - though I am not sure whether it is the correct patch. Janus, what do you think?

--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -4489,15 +4489,9 @@ gfc_trans_allocate (gfc_code * code)
          gfc_expr *rhs = gfc_copy_expr (code->expr3);
          if (al->expr->ts.type == BT_CLASS)
            {
-             gfc_se dst,src;
              if (rhs->ts.type == BT_CLASS)
                gfc_add_component_ref (rhs, "$data");
-             gfc_init_se (&dst, NULL);
-             gfc_init_se (&src, NULL);
-             gfc_conv_expr (&dst, expr);
-             gfc_conv_expr (&src, rhs);
-             gfc_add_block_to_block (&block, &src.pre);
-             tmp = gfc_build_memcpy_call (dst.expr, src.expr, memsz);
+             tmp = gfc_trans_assignment (expr, rhs, false, false);
            }
          else
            tmp = gfc_trans_assignment (gfc_expr_to_initialize (expr),
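The difference between the shallow memcpy and a deep copy can be illustrated with a small C sketch. It is a simplified model only: `int_array`, `sparse_mat`, `copy_shallow` and `copy_deep` are hypothetical names chosen for illustration, the struct is not the real 48-byte gfortran descriptor, and the code is not what gfc_trans_assignment generates.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified descriptor for an allocatable integer array. */
typedef struct {
    int *data;   /* NULL means "not allocated" */
    size_t n;    /* number of elements */
} int_array;

/* Models a derived type with one allocatable component (irp). */
typedef struct {
    int_array irp;
} sparse_mat;

/* Shallow copy: a plain memcpy of the object. The new object's irp.data
   ends up pointing at the SAME block as the source's irp.data. */
sparse_mat *copy_shallow(const sparse_mat *src)
{
    sparse_mat *dst = malloc(sizeof *dst);
    if (dst != NULL)
        memcpy(dst, src, sizeof *dst);
    return dst;
}

/* Deep copy: the allocatable component gets its own storage.
   Zero-sized allocated arrays are not modelled here. */
sparse_mat *copy_deep(const sparse_mat *src)
{
    sparse_mat *dst = malloc(sizeof *dst);
    if (dst == NULL)
        return NULL;
    dst->irp.data = NULL;
    dst->irp.n = 0;
    if (src->irp.data != NULL && src->irp.n > 0) {
        size_t bytes = src->irp.n * sizeof *src->irp.data;
        dst->irp.data = malloc(bytes);
        if (dst->irp.data == NULL) {
            free(dst);
            return NULL;
        }
        memcpy(dst->irp.data, src->irp.data, bytes);
        dst->irp.n = src->irp.n;
    }
    return dst;
}
```

With `copy_shallow`, freeing the component through either object leaves the other with a dangling pointer, which matches the invalid reads and the double free reported above. With `copy_deep`, each object can release its own component independently.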
(In reply to comment #13)
> The following works - though I am not sure whether it is the correct patch.
> Janus, what do you think?

I mean in particular gfc_expr_to_initialize (expr) vs. expr, and questions regarding the default initializer.

Further comments:

a) The first example (attachment 21592) produces the correct numbers (or at least the same as Cray and as in comment 3). Also, in comment 0 the numbers are OK for the valgrind run; only those of the non-valgrind run were wrong. If one uses GLIBC's MALLOC checks, the program still segfaults; valgrind shows a failure for:

  Conditional jump or move depends on uninitialised value(s)
     at 0x4EF7FD2: _gfortran_move_alloc (move_alloc.c:40)
     by 0x400C8C: __psb_d_csr_mat_mod_MOD_psb_d_mv_csr_from_fmt (long1.f90:238)

which is

  if (to->data)

and

  call move_alloc(b%irp, a%irp)

If one checks the allocation status of "b%irp" after the move_alloc call, the result is "T" rather than the expected "F".

b) The second example (attachment 21613) fails with crayftn as it is invalid. (In "call doit(atx,acsr)" the second argument is TYPE instead of CLASS, as the *allocatable* dummy is.) gfortran does not detect this. -- See new PR 46161.
Author: burnus
Date: Tue Oct 26 06:49:43 2010
New Revision: 165936

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=165936
Log:
2010-10-26  Tobias Burnus  <burnus@net-b.de>

        PR fortran/45451
        * trans-stmt.c (gfc_trans_allocate): Do a deep-copy for SOURCE=.

        PR fortran/43018
        * trans-array.c (duplicate_allocatable): Use size of type and
        not the size of the pointer to the type.

2010-10-26  Tobias Burnus  <burnus@net-b.de>

        PR fortran/45451
        * gfortran.dg/class_allocate_5.f90: New.

Added:
    trunk/gcc/testsuite/gfortran.dg/class_allocate_5.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/trans-array.c
    trunk/gcc/fortran/trans-stmt.c
    trunk/gcc/testsuite/ChangeLog
The test case of comment 4 is now fixed, namely:

  ALLOCATE (polymorphic, SOURCE=non-polymorphic)

Unfixed is a polymorphic SOURCE=, as this requires a deep copy of the effective type, cf. PR 46174.

Consequently, the first test case (attachment 21592) will fail at

  allocate(a%a,source=b, stat=info)

if "b" has any allocatable components. That's the case if "b" is of the effective type "psb_d_csr_sparse_mat". Interestingly, wrapping the ALLOCATE in

  select type (b)
  type is (psb_d_csr_sparse_mat)
    allocate(a%a,source=b, stat=info)
  end select

does not help :-(

I do not know whether this SOURCE= problem is the only bug or only the main bug exposed by the first test case.

* * *

Unfixed is the diagnostic for passing a non-polymorphic actual argument to an *allocatable* polymorphic dummy. That accepts-invalid bug is PR 46161 and affects the second attachment (attachment 21613).
Author: janus
Date: Fri Nov  5 18:14:52 2010
New Revision: 166368

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=166368
Log:
2010-11-05  Janus Weil  <janus@gcc.gnu.org>

        PR fortran/45451
        PR fortran/46174
        * class.c (gfc_find_derived_vtab): Improved search for existing vtab.
        Add component '$copy' to vtype symbol for polymorphic deep copying.
        * expr.c (gfc_check_pointer_assign): Make sure the vtab is generated
        during resolution stage.
        * resolve.c (resolve_codes): Don't resolve code if namespace is already
        resolved.
        * trans-stmt.c (gfc_trans_allocate): Call '$copy' procedure for
        polymorphic ALLOCATE statements with SOURCE.

2010-11-05  Janus Weil  <janus@gcc.gnu.org>

        PR fortran/45451
        PR fortran/46174
        * gfortran.dg/class_19.f03: Modified.
        * gfortran.dg/class_allocate_6.f03: New.

Added:
    trunk/gcc/testsuite/gfortran.dg/class_allocate_6.f03
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/class.c
    trunk/gcc/fortran/expr.c
    trunk/gcc/fortran/resolve.c
    trunk/gcc/fortran/trans-stmt.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gfortran.dg/class_19.f03
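The idea of a per-type copy procedure reached through the vtab can be sketched in C as follows. This is only an illustration of the general technique (dispatching the deep copy on the dynamic type), under the assumption that each type's table carries its size and a copy routine. The names `vtab`, `class_obj`, `base_mat`, `csr_mat`, `base_copy`, `csr_copy` and `class_allocate_source` are hypothetical; the layout is not gfortran's actual vtype.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-type table: size of the type plus a deep-copy routine. */
typedef struct vtab {
    size_t size;
    void (*copy)(const void *src, void *dst);
} vtab;

/* Hypothetical polymorphic container: dynamic type plus payload. */
typedef struct {
    const vtab *vptr;
    void *data;
} class_obj;

/* Base type without allocatable components. */
typedef struct { int m; } base_mat;

/* Extended type with one allocatable component. */
typedef struct { base_mat base; int *irp; size_t n; } csr_mat;

void base_copy(const void *src, void *dst)
{
    memcpy(dst, src, sizeof(base_mat));
}

void csr_copy(const void *src, void *dst)
{
    const csr_mat *s = src;
    csr_mat *d = dst;
    *d = *s;            /* copies scalars; the pointer is fixed up below */
    d->irp = NULL;
    d->n = 0;
    if (s->irp != NULL && s->n > 0) {
        d->irp = malloc(s->n * sizeof *d->irp);
        if (d->irp != NULL) {
            memcpy(d->irp, s->irp, s->n * sizeof *d->irp);
            d->n = s->n;
        }
    }
}

const vtab base_vtab = { sizeof(base_mat), base_copy };
const vtab csr_vtab  = { sizeof(csr_mat),  csr_copy };

/* Model of ALLOCATE(dst, SOURCE=src) with a polymorphic src: size and
   copy routine come from the dynamic type, not the declared type.
   Returns 0 on success, 1 if the allocation failed. */
int class_allocate_source(class_obj *dst, const class_obj *src)
{
    dst->data = malloc(src->vptr->size);
    if (dst->data == NULL)
        return 1;
    dst->vptr = src->vptr;
    src->vptr->copy(src->data, dst->data);
    return 0;
}
```

The design point is that the caller never needs to know the effective type: whichever table the source carries decides how much memory to allocate and how deep the copy goes.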
r166368 fixes the deep copy issue and makes the original test case give the correct output.