Created attachment 54062
a simple reproducer

The attached program crashes when compiled with gfortran-11 on Ubuntu 22.04 (jammy), which is based on Debian bookworm:

$ gfortran bug.f90
$ ./a.out
Internal Error: Invalid type in descriptor

Error termination. Backtrace:
#0  0x7f04aefe9ad0 in ???
#1  0x7f04aefea649 in ???
#2  0x7f04aefeae38 in ???
#3  0x7f04af22c8a4 in ???
#4  0x56490815c24e in ???
#5  0x56490815c1f3 in ???
#6  0x56490815c316 in ???
#7  0x56490815c352 in ???
#8  0x7f04aedc7d8f in __libc_start_call_main
	at ../sysdeps/nptl/libc_start_call_main.h:58
#9  0x7f04aedc7e3f in __libc_start_main_impl
	at ../csu/libc-start.c:392
#10  0x56490815c0c4 in ???
#11  0xffffffffffffffff in ???

After some investigation, I found that gfortran-11 uses the libgfortran built for gfortran-12. I guess the fine folks at Debian chose to do so since the library version of libgfortran is 5.0.0 (from gfortran-8 up to gfortran-12; I did not test gfortran-13), and they assume or expect ABI and semantic compatibility.

The issue can be reproduced by compiling the attached reproducer with gfortran-11 and running it after tweaking LD_LIBRARY_PATH so that it uses libgfortran from gfortran-12.

I think there are basically two ways to see this issue:
- it should work, and this is a bug in libgfortran-12
- libgfortran does not ensure such backward compatibility, and because of the semantic change, the library version should be/have been bumped (e.g. 5.1.0 or 6.0.0) from gfortran-12.

Once it is decided how to move forward, I will be more than happy to report this to the Debian folks.

For the record, this issue came to my attention at https://stackoverflow.com/questions/74738981/the-use-mpi-f08-statement-causes-fortran-mpi-programs-to-crash-on-ubuntu-22-04/74742760#74742760
If you compile with gfortran-11 and link with the library that comes with that version of gfortran does it crash? If you compile with gfortran-12 and link with the library that comes with that version of gfortran does it crash? Can you get an actual backtrace?
Ubuntu does not ship libgfortran.so from gfortran-11.

I tried on a RedHat box, and the issue only occurs when
- I compile with gfortran-11
- *and* I force libgfortran-12
(so if I use the same gfortran and libgfortran versions, there is no issue).

Here is a stack trace in these conditions:

#0  0x00002aaaaba2dd70 in _exit () from /lib64/libc.so.6
#1  0x00002aaaab9a1cab in __run_exit_handlers () from /lib64/libc.so.6
#2  0x00002aaaab9a1d37 in exit () from /lib64/libc.so.6
#3  0x00002aaaaacf0976 in _gfortrani_exit_error (status=3) at ../../../../src/gcc-12.1.0/libgfortran/runtime/error.c:218
#4  0x00002aaaaacf12af in _gfortrani_internal_error (cmp=0x0, message=0x2aaaaafdf2a5 "Invalid type in descriptor") at ../../../../src/gcc-12.1.0/libgfortran/runtime/error.c:534
#5  0x00002aaaaaf787e2 in _gfortran_gfc_desc_to_cfi_desc (d_ptr=0x7fffffffcfa8, s=0x7fffffffcfd0) at ../../../../src/gcc-12.1.0/libgfortran/runtime/ISO_Fortran_binding.c:219
#6  0x0000000000400808 in pub_f08ts (a=<unknown type in /home/usersup/gilles/build/gcc-11.3.0/x86_64-pc-linux-gnu/libgfortran/a.out, CU 0x0, DIE 0xc1>, b=<unknown type in /home/usersup/gilles/build/gcc-11.3.0/x86_64-pc-linux-gnu/libgfortran/a.out, CU 0x0, DIE 0xce>) at /home/usersup/gilles/y.f90:29
#7  0x00000000004007ad in bugsub (a=1, b=-3.67578065e-13) at /home/usersup/gilles/y.f90:35
#8  0x00000000004008d0 in bug () at /home/usersup/gilles/y.f90:43
#9  0x0000000000400907 in main (argc=1, argv=0x7fffffffd4cd) at /home/usersup/gilles/y.f90:44
#10 0x00002aaaab98a555 in __libc_start_main () from /lib64/libc.so.6
#11 0x0000000000400699 in _start ()

FWIW, here is the value of s when _gfortran_gfc_desc_to_cfi_desc is invoked:

(gdb) p *s
$6 = {base_addr = 0x7fffffffd04c, offset = 38, dtype = {elem_len = 4, version = 0, rank = 0 '\000', type = 11 '\v', attribute = 2}, span = 4, dim = 0x7fffffffcff8}

I also noted that if the reproducer is built with gcc-12, _gfortran_gfc_desc_to_cfi_desc() is not invoked at all.
For the sake of completeness, Debian/Ubuntu ships libgfortran.a (read: the static library) from gfortran-11, so I can get this reproducer to work by compiling with -static-libgfortran.

I also manually rebuilt gfortran-11 on Debian (I did my best to reuse the Debian patches) and it worked. It took me some time to figure out that it worked because I was then implicitly using libgfortran-11.
$ gfortran11 -g -fno-backtrace pr108056.f90
$ ./a.out && echo works
works

<--copy executable to system with gcc-13 trunk-->

$ ./a.out
Internal Error: Invalid type in descriptor

$ gdb ./a.out
(gdb) b _gfortrani_internal_error
(gdb) b ISO_Fortran_binding.c:219
(gdb) r
_gfortran_gfc_desc_to_cfi_desc(d_ptr=0x7fffffffe948, s=0x7fffffffe970) at /gcc_trunk/libgfortran/runtime/ISO_Fortran_binding.c:219
219	      internal_error (NULL, "Invalid type in descriptor");
(gdb) where
#0  _gfortran_gfc_desc_to_cfi_desc (d_ptr=0x7fffffffe948, s=0x7fffffffe970) at /gcc_trunk/libgfortran/runtime/ISO_Fortran_binding.c:219
#1  0x000055555555524b in pub_f08ts (a=<unknown type in /tmp/pr108056/a.out, CU 0x53d, DIE 0x5fe>, b=<unknown type in /tmp/pr108056/a.out, CU 0x53d, DIE 0x60b>) at pr108056.f90:29
#2  0x00005555555551f0 in bugsub (a=1, b=-3.08878791e-13) at pr108056.f90:35
#3  0x0000555555555313 in bug () at pr108056.f90:43
(gdb) p d->type
$1 = 11
(gdb) p type
$2 = 11 '\v'
(gdb) p *s
$3 = {base_addr = 0x7fffffffe9ec, offset = 0, dtype = {elem_len = 4, version = 0, rank = 0 '\000', type = 11 '\v', attribute = 2}, span = 4, dim = 0x7fffffffe998}
(gdb) p/d BT_ASSUMED
$4 = 11

/* NOTE: Since GCC 12, the FE generates code to do the conversion
   directly without calling this function.  */
void
gfc_desc_to_cfi_desc (CFI_cdesc_t **d_ptr, const gfc_array_void *s)
{

Looks to be a backwards compatibility issue: BT_ASSUMED not handled?
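To make the gdb output above easier to read, here is a minimal C sketch that decodes the dtype fields from the dump. The struct and all demo_ names are hypothetical stand-ins, not the real libgfortran declarations; only the two type codes confirmed in this report (BT_REAL = 3, BT_ASSUMED = 11) are named.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mirror of the dtype fields printed by gdb in this report.
   This is NOT the real libgfortran declaration. */
struct demo_dtype {
    size_t elem_len;
    int version;
    signed char rank;
    signed char type;
    short attribute;
};

/* Only the two type codes confirmed in this report are named here. */
enum { DEMO_BT_REAL = 3, DEMO_BT_ASSUMED = 11 };

/* Returns 1 if the descriptor carries the assumed-type code, else 0. */
static int demo_is_assumed(const struct demo_dtype *d)
{
    assert(d != NULL);
    return d->type == DEMO_BT_ASSUMED;
}

/* Prints a one-line summary of the descriptor's dtype. */
static void demo_report(const struct demo_dtype *d)
{
    assert(d != NULL);
    printf("elem_len=%zu rank=%d type=%d (%s)\n",
           d->elem_len, d->rank, d->type,
           demo_is_assumed(d) ? "BT_ASSUMED, effective type lost"
           : d->type == DEMO_BT_REAL ? "BT_REAL"
           : "not named in this sketch");
}
```

Filling a struct demo_dtype with the values from the dump (elem_len 4, version 0, rank 0, type 11, attribute 2) and passing it to demo_report flags the descriptor as assumed-type, which is the case the GCC 12 library rejects.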
CCing folks doing the CFI work for GCC 12.
(In reply to Gilles Gouaillardet from comment #2)
> ubuntu does not ship libgfortran.so from gfortran-11.
>
> I tried on a RedHat box, and the issue only occurs when
> - I compile with gfortran-11
> - *and* I force libgfortran-12

Isn't that forward compatibility? Backward compatibility would be something compiled with 12 being able to use libgfortran from 11. But I suspect that is also broken, because ...

> (so if i use the same gfortran and libgfortran versions, there is no issue).
>
> Here is a stack trace in these conditions:
>
> #0  0x00002aaaaba2dd70 in _exit () from /lib64/libc.so.6
> #1  0x00002aaaab9a1cab in __run_exit_handlers () from /lib64/libc.so.6
> #2  0x00002aaaab9a1d37 in exit () from /lib64/libc.so.6
> #3  0x00002aaaaacf0976 in _gfortrani_exit_error (status=3) at
> ../../../../src/gcc-12.1.0/libgfortran/runtime/error.c:218
> #4  0x00002aaaaacf12af in _gfortrani_internal_error (cmp=0x0,
> message=0x2aaaaafdf2a5 "Invalid type in descriptor") at
> ../../../../src/gcc-12.1.0/libgfortran/runtime/error.c:534
> #5  0x00002aaaaaf787e2 in _gfortran_gfc_desc_to_cfi_desc
> (d_ptr=0x7fffffffcfa8, s=0x7fffffffcfd0) at
> ../../../../src/gcc-12.1.0/libgfortran/runtime/ISO_Fortran_binding.c:219

A lot of work went into fixing problems with ISO_Fortran_binding.[ch]. It seems that this work either was not merged into 11 or was only partially merged. Either way, it appears the ABI of the library has been broken. If your application is using ISO_Fortran_binding.h, then you'll want to use gfortran 12 or newer.
I've swapped out just about all the details on this work after more than a year, but.... we shouldn't be trying to create a CFI descriptor with BT_ASSUMED at all, should we? If the compiler is generating a CFI descriptor for an assumed-type argument it's supposed to use the actual type of the argument passed, not BT_ASSUMED, right? If gcc 11 had a bug that caused it to do that incorrectly, is it necessary to retain ABI compatibility by continuing to reproduce the bug in newer versions of libgfortran? Maybe we should just remove the functions that are allegedly there for compatibility so that users will get a link error instead?
(In reply to sandra from comment #7)
> If gcc 11 had a bug that
> caused it to do that incorrectly, is it necessary to retain ABI
> compatibility by continuing to reproduce the bug in newer versions of
> libgfortran? Maybe we should just remove the functions that are allegedly
> there for compatibility so that users will get a link error instead?

No, certainly not. As long as libgfortran.so keeps its SONAME (libgfortran.so.5 right now), it should remain backwards compatible (libgfortran.so.5 from a newer gcc should handle programs compiled by an older gcc, as long as the programs were valid).
The important question is if correct code compiled by gcc 11 was working correctly with libgfortran 11, if yes, then libgfortran 12+ should maintain compatibility (of course, when soname is bumped, that compatibility code can be thrown away).
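The SONAME rule discussed here can be illustrated with a small C sketch. It only parses SONAME strings such as the libgfortran.so.5 mentioned above; the function names are hypothetical, and "libgfortran.so.6" in the usage note is an invented example of a bumped SONAME, not an existing library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the major version encoded in a SONAME such as "libgfortran.so.5",
   or -1 if the string has no ".so.<number>" part. */
static long soname_major(const char *soname)
{
    assert(soname != NULL);
    const char *p = strstr(soname, ".so.");
    if (p == NULL)
        return -1;
    p += 4;                               /* skip past ".so." */
    if (*p < '0' || *p > '9')
        return -1;
    char *end;
    long major = strtol(p, &end, 10);
    /* Accept "libfoo.so.5" and file names such as "libfoo.so.5.0.0". */
    return (*end == '\0' || *end == '.') ? major : -1;
}

/* Two builds with the same SONAME are interchangeable to the dynamic
   loader, so the newer build has to keep older binaries working.
   Returns 1 if both strings are the same SONAME, else 0. */
static int shares_soname(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

With these helpers, shares_soname("libgfortran.so.5", "libgfortran.so.5") is true for the GCC 11 and GCC 12 builds discussed in this report, which is why the loader happily substitutes one for the other; a bumped name such as "libgfortran.so.6" would compare unequal and the compatibility code could be dropped.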
On Mon, 12 Dec 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108056
>
> --- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> The important question is if correct code compiled by gcc 11 was working
> correctly with libgfortran 11, if yes, then libgfortran 12+ should maintain
> compatibility (of course, when soname is bumped, that compatibility code can be
> thrown away).

If GCC 11 behaves incorrectly we can of course also fix that on the branch. Nevertheless, breaking old working executables isn't a good idea, so if we can keep them working then please do so.
Code compiled with gfortran-11 runs correctly when it uses libgfortran-11. To be perfectly clear, compilation always works: the issue occurs at runtime, when gfortran-11 compiled code uses libgfortran-12.
First, several issues related to using CFI_ were fixed in GCC 12. Thus, using GCC 12 is highly recommended. This can be seen when implementing the function using the following code (and removing ', name="sync"', which would otherwise call the 'void sync(void)' system function):

#include <ISO_Fortran_binding.h>

void
bar_ts (CFI_cdesc_t *a, CFI_cdesc_t *b)
{
  __builtin_printf ("a = %s, b = %s\n",
                    (a->type == CFI_type_float) ? "float" : "something else",
                    (b->type == CFI_type_float) ? "float" : "something else");
}

This prints the expected value:
  a = float, b = float

  * * *

In GCC 11, the value that arrives at
  type = GFC_DESCRIPTOR_TYPE (s);
and is then used for
  d->type = (CFI_type_t)type;
is BT_ASSUMED (= 11) instead of the expected BT_REAL (= 3), losing the data type.

As d->type is now BT_ASSUMED and this case is not handled, we run into the code:
  switch (d->type)
  ...
    default:
      internal_error (NULL, "Invalid type in descriptor");

  * * *

I want to note that both functions,
  _gfortran_cfi_desc_to_gfc_desc
  _gfortran_gfc_desc_to_cfi_desc
are only in GCC 12's libgfortran.so to provide backward compatibility with GCC <= 11.

Thus, we have two options:

(A) We change those two functions back to the GCC 11 version; the new check was added in Sandra's commit r12-3321-g93b6b2f614eb692d1d8126ec6cb946984a9d01d7 back when those functions were still used in GCC 12.

(B) I think we have two possibilities to map this:

BT_ASSUMED -> CFI_type_cptr or CFI_type_other; using the latter, that's the following (untested but it should work):

-------------------------------------------
--- a/libgfortran/runtime/ISO_Fortran_binding.c
+++ b/libgfortran/runtime/ISO_Fortran_binding.c
@@ -182,4 +182,7 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_ptr, const gfc_array_void *s)
       d->type = CFI_type_struct;
       break;
+    case BT_ASSUMED:
+      d->type = CFI_type_other;
+      break;
     case BT_VOID:
       /* FIXME: PR 100915.  GFC descriptors do not distinguish between
--------------------------------------------

Thoughts whether (A) or (B) is better?
In any case, we should check whether the testcase of comment 0 plus the C code above in this comment should be added as a new testcase. But it might very well already be covered in our testsuite.
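For readers weighing option (B), here is a self-contained C sketch of the difference between a strict type switch and one that tolerates BT_ASSUMED. It is a simplified model, not the libgfortran code: the demo_ names and the DEMO_CFI_ values are hypothetical, the real conversion also encodes the kind, and a -1 return stands in for the internal_error call. Only BT_REAL = 3 and BT_ASSUMED = 11 are taken from this report.

```c
#include <assert.h>
#include <stddef.h>

/* Type codes confirmed in this report. */
enum demo_bt { DEMO_BT_REAL = 3, DEMO_BT_ASSUMED = 11 };

/* Hypothetical stand-ins for CFI_type_float and CFI_type_other; the real
   macro values are implementation-defined and not reproduced here. */
enum demo_cfi { DEMO_CFI_FLOAT = 1, DEMO_CFI_OTHER = 2 };

/* Strict variant: any type code without an explicit case is an error.
   Returns 0 on success, -1 where the library would call internal_error. */
static int demo_convert_strict(int bt, int *cfi_out)
{
    assert(cfi_out != NULL);
    switch (bt) {
    case DEMO_BT_REAL:
        *cfi_out = DEMO_CFI_FLOAT;
        return 0;
    default:
        return -1;
    }
}

/* Option (B) variant: BT_ASSUMED is mapped to a catch-all CFI type
   instead of being rejected. */
static int demo_convert_lenient(int bt, int *cfi_out)
{
    assert(cfi_out != NULL);
    switch (bt) {
    case DEMO_BT_REAL:
        *cfi_out = DEMO_CFI_FLOAT;
        return 0;
    case DEMO_BT_ASSUMED:
        *cfi_out = DEMO_CFI_OTHER;
        return 0;
    default:
        return -1;
    }
}
```

Under this model, a descriptor arriving with type code 11 fails in the strict variant and succeeds with a generic type in the lenient one. Note that the lenient variant still does not recover the effective type (real), so the callee would see "other" rather than "float".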
On Mon, 12 Dec 2022, burnus at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108056
>
> --- Comment #12 from Tobias Burnus <burnus at gcc dot gnu.org> ---
[...]
> Thus, we have two options:
>
> (A) We change those two functions back to the GCC 11 version; the new check was
> added in Sandra's commit r12-3321-g93b6b2f614eb692d1d8126ec6cb946984a9d01d7
> back when those functions were still used in GCC 12.
>
> (B) I think we have two possibilities to map this:
>
> BT_ASSUMED -> CFI_type_cptr or CFI_type_other; using the latter, that's the
> following (untested but it should work):
>
> -------------------------------------------
> --- a/libgfortran/runtime/ISO_Fortran_binding.c
> +++ b/libgfortran/runtime/ISO_Fortran_binding.c
> @@ -182,4 +182,7 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_ptr, const
> gfc_array_void *s)
>        d->type = CFI_type_struct;
>        break;
> +    case BT_ASSUMED:
> +      d->type = CFI_type_other;
> +      break;
>      case BT_VOID:
>        /* FIXME: PR 100915.  GFC descriptors do not distinguish between
> --------------------------------------------
>
> Thoughts whether (A) or (B) is better?

I'd go with (A) if the functions are just for legacy code and not used by GCC 12+ at all.
The master branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>:

https://gcc.gnu.org/g:e205ec03f0794aeac3e8a89e947c12624d5a274e

commit r13-4716-ge205ec03f0794aeac3e8a89e947c12624d5a274e
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Thu Dec 15 12:25:07 2022 +0100

    libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

    Since GCC 12, the conversion between the array descriptor formats - the
    internal (GFC) and the C binding one (CFI) - moved to the compiler itself
    such that the cfi_desc_to_gfc_desc/gfc_desc_to_cfi_desc functions are only
    used with older code (GCC 9 to 11). The newly added checks caused asserts
    as older code did not pass the proper values (e.g. real(4) as effective
    argument arrived as BT_ASSUMED type as the effective type got lost
    in between).

    As proposed in the PR, revert to the GCC 11 version - known bugs are better
    than some fixes and new issues. Still, GCC 12 is much better in terms of
    TS29113 support and should really be used.

    This patch uses the current libgfortran version of the GCC 11 branch, except
    it fixes the GFC version number (which is 0), uses calloc instead of malloc,
    and sets the lower bound to 1 instead of keeping it as is for
    CFI_attribute_other.

    libgfortran/ChangeLog:

            PR libfortran/108056
            * runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc,
            gfc_desc_to_cfi_desc): Mostly revert to GCC 11 version for
            those backward-compatibility-only functions.
Hi Tobias,

My script shows that this commit causes the following testcase failures (it is still running, and you might get an email from gcc-regression afterwards):

FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 19)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 19)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 19)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 25)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 25)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 25)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 31)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 31)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 31)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 34)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 34)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 34)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 37)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 37)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 37)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 40)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 40)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 40)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for excess errors)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for excess errors)
FAIL: libgomp.fortran/allocate-4.f90 -O (test for excess errors)

Apologies for not debugging this myself; I am not familiar with Fortran. Could you help to see why they fail, or can we just ignore them?
(In reply to Haochen Jiang from comment #15)
> My script shows that this commit cause testcase fail following:
> (It is still running and you might get a email from gcc-regression
> afterwards)
> FAIL: libgomp.fortran/allocate-4.f90 -O (test for errors, line 19)

Sorry, it looks as if I accidentally committed one file that should not have been committed. (It is part of the pending patch at https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608401.html )

I will revert it now and triple-check next time.

What puzzles me is that the commit hook no longer rejects commits where files aren't listed in the ChangeLog. I will follow up with Martin Liška and see why that's no longer the case.

Thanks for reporting!
The master branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>:

https://gcc.gnu.org/g:18af26fc375398f0a7cd7bae5aabebd447f8c899

commit r13-4737-g18af26fc375398f0a7cd7bae5aabebd447f8c899
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Fri Dec 16 08:56:03 2022 +0100

    Remove libgomp/testsuite/libgomp.fortran/allocate-4.f90 [PR108056]

    Commit r13-4716-ge205ec03f0794aeac3e8a89e947c12624d5a274e accidentally
    included a testcase of another patch that is pending review:
    https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608401.html

    libgomp/
            PR libfortran/108056
            * testsuite/libgomp.fortran/allocate-4.f90: Remove accidentally
            added file.
So, fixed on trunk, to be backported later?
Yep. Making P1 so we remember to fix it for 12.3.
The releases/gcc-12 branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>:

https://gcc.gnu.org/g:ed3e8a988e0ec5b926093e26dfeef1d8b7504d1f

commit r12-9003-ged3e8a988e0ec5b926093e26dfeef1d8b7504d1f
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Wed Dec 21 07:55:22 2022 +0100

    libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

    Since GCC 12, the conversion between the array descriptor formats - the
    internal (GFC) and the C binding one (CFI) - moved to the compiler itself
    such that the cfi_desc_to_gfc_desc/gfc_desc_to_cfi_desc functions are only
    used with older code (GCC 9 to 11). The newly added checks caused asserts
    as older code did not pass the proper values (e.g. real(4) as effective
    argument arrived as BT_ASSUMED type as the effective type got lost
    in between).

    As proposed in the PR, revert to the GCC 11 version - known bugs are better
    than some fixes and new issues. Still, GCC 12 is much better in terms of
    TS29113 support and should really be used.

    This patch uses the current libgfortran version of the GCC 11 branch, except
    it fixes the GFC version number (which is 0), uses calloc instead of malloc,
    and sets the lower bound to 1 instead of keeping it as is for
    CFI_attribute_other.

    (cherry picked from commit e205ec03f0794aeac3e8a89e947c12624d5a274e)

    (This cherry pick excludes an accidentally committed file, which was
    removed in follow-up commit 18af26fc375398f0a7cd7bae5aabebd447f8c899.)

    libgfortran/ChangeLog:

            PR libfortran/108056
            * runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc,
            gfc_desc_to_cfi_desc): Mostly revert to GCC 11 version for
            those backward-compatibility-only functions.
Fixed now.