The fix for PR 87689 (ABI violation on POWER) has led to problems with what some C programs expect as the Fortran calling convention to be. It seems that many C codes assume that no hidden string length is passed for a one-character string. This is wrong (and has been wrong since the days of the original Fortran compiler for Unix), but the code exists, and we should recommend a solution for it. This is important because C interfaces to LAPACK, as central a piece of software as you can find, is impacted by this. This has been quite extensively debugged by the R people, see https://gcc.gnu.org/ml/fortran/2019-05/msg00021.html . On x86_64, Tomas Kalibera found that, while things "mostly" worked, a tail call was causing problems because the stack space for the length was not allocated. -fno-optimize-sibling-calls seems to cure it, but that may not be the only thing affected by this. Note that this applies to gcc 7, gcc 8, gcc 9 and gcc 10, since the patch was applied to all open branches at the time (an ABI violation on a primary platform was deemed important enough).
(A reply to https://gcc.gnu.org/ml/fortran/2019-05/msg00021.html ; I want to have this in the PR's audit trail). Toon, I have added you to this because of your experience going back to g77, and also because you're a member of the steering committee. > I've debugged one of the packages and I > confirm that the breakage is related to passing of strings from C to > Fortran. Indeed, BLAS and LAPACK define a large number of subroutines > that take one or more explicit single-character strings as arguments. > Other than that, BLAS has only one function (xerbla), which takes a > string of unspecified length, LAPACK only has four (ilaenv, > ilaenv2stage, lsamen, xerbla). The C interfaces to BLAS/LAPACK from > Netlib depend on the historic behavior that explicit single-character > strings are interoperable, concretely CBLAS and LAPACKE provide C > interfaces/code that calls into Fortran BLAS/LAPACK without passing the > 1s as lengths of the character strings (functions taking a string of > unspecified length are trivial and re-implemented in C). OUCH. So, basically, people have been depending on C undefined behavior for ages, and this includes recent developments like LAPACKE. Only an accident of calling conventions has kept this "working". Oh my... If we are looking at the System V Application Binary Interface AMD64 Architecture Processor Supplement, I can understand how this worked. I suppose this never worked on Windows stdcall, where the callee cleans up the stack, so it must know the exact number of bytes, and the functions are therefore decorated with the number of bytes in the argument list. (By the way, interoperability has a special meaning in Fortran, where it is reserved for BIND(C) procedures and variables). > This has been > working fine for very many years as the Fortran code never needed to > access the length it knew was 1. R has been using the same practice, > which long predates ISO_C_BINDING/BIND(C), and I've seen online > discussions where people assumed interoperability of length 1 strings, > once mentioning also a citation from Fortran 2003 Handbook that says "A > Fortran character string with a length other than 1 is not > interoperable" (which invites interpretation that length 1 strings were > ). They are interoperable in the sense that they can be an argument to a BIND(C) procedure. This is the terminology used by the Fortran standard. > I am not an expert to say whether the current Fortran standard > requires that interoperability and I assume that it does not given this > gfortran change. What we are discussing here is outside the Fortran standard's scope. > This gfortran change breaks this interoperability: if a C function calls > a Fortran function, passing it a single-character string for a parameter > taking explicit single-character Fortran string, it may crash. I've > debugged one case with R package BDgraph, this example > "library(BDgraph); data.sim <- bdgraph.sim( n = 70, p = 5, size = 7, vis > = TRUE )" crashes due to corruption of C stack by Fortran function > DPOSV, when compiled with the new gfortran and with -O2. To see the > problem, one can just look at the disassembly of DPOSV (LAPACK), neither > the package nor R is not necessary: > > SUBROUTINE DPOSV( UPLO, N, NRHS, A, LDA, B, LDB, INFO ) > CHARACTER UPLO > > In one case, DPOSV calls DPOTRS before returning. The new gfortran with > -O2 performs tail-call optimization, jumping to DPOTRS. In the annotated > disassembly snippet, at 11747f1, DPOSV tries to ensure that there is > constant 1 as string length of UPLO when tail-calling into DPOTRS, so it > writes it to stack where there already should have been 1 as length of > UPLO passed to DPOSV. But this argument has not been passed to DPOSV, so > this causes stack corruption. [assembly elided] > Note that DPOSV never uses the length of the string (UPLO) from the > hidden argument, the compiler clearly knows that its length is 1. In > calls where the length is passed in registers, this does not cause > trouble (like LSAME) and indeed is needed as the registers have > different values > > IF( .NOT.LSAME( UPLO, 'U' ) .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN > > 117448: b9 01 00 00 00 mov $0x1,%ecx > > 11744d: ba 01 00 00 00 mov $0x1,%edx > > 117452: 48 8d 35 bb 12 09 00 lea > 0x912bb(%rip),%rsi # 1a8714 <ipivot.4261+0xd14> > > 117459: 4c 89 f7 mov %r14,%rdi > > 11745c: e8 1f 3d ef ff callq b180 <lsame_@plt> > > but it seems to me that the compiler could just refrain from setting the > length to be 1 on the stack at 1174f1, since it knows it should have > already been there. That would be a nice feature of tail-call optimiation, which would also "fix" this bug at the same time. > It would be a nice property if Fortran code that > never accesses the hidden arguments with the lengths of the strings, > because it knows what those lengths are, would also never write to those > hidden arguments on the stack when it knows what they are (should be). But it would not be clear if the callee also does not access the memory. Hm... I think it might make sense to make a separate optimzation PR out of this. > Before the gfortran change, DPOSV would call to DPOTRS normally (no > tail-call optimization), so this problem would not occur (I tested with > 268974). By disabling tail call optimization via > -fno-optimize-sibling-calls, the problem goes away also for other > packages my colleagues have identified as crashing with the new > gfortran. OK, I can see that. > Did you know of any other optimization that could break this > interoperability of 1-length strings? It would be really nice to users > if this interoperability could be preserved, and if not by default than > at least with some option. LTO is likely to have issues, both positive and negative ones. On the one hand, you will get warnings about LTO mismatches. On the other hand, LTO has more information, so it will probably silently correct the mismatch. It is certainly worth a _LOT_ to be LTO-clean, especially in a project like R where people write a lot of add-on code. (At the moment, you put your version of LAPACK into a shared library, so LTO cannot be used for this.). The tail-call issue is a good catch. I don't know of any other option which might be affected, but I am also not an expert here. Especially, gcc and gfortran support very many platforms with (I presume) many calling conventions, so I am not sure that this will solve all issues. > Traditionally, BLAS/LAPACK implementations are interchangeable at > dynamic linking time, using the Fortran interface that is however also > used from C, without passing lengths for fixed 1-character strings. R > supports this too, at least on some Linux distributions including > Debian/Ubuntu it is packaged so that it runs with the BLAS/LAPACK > implementation installed on the system. Even though this is probably not > correct wrt to the todays Fortran standard (I don't know for sure), this > is the common practice, and fixing this would not be easy - one would > have to create a new interface to be used from C, separate from the > Fortran one, and all software would have to start using that interface > from C. In the current situation when the Fortran interface is used, > confusion will arise with this gfortran change as different BLAS/LAPACK > implementations are built by different Fortran compilers and use a > different mix of Fortran/C for different computational subroutines. Note > CBLAS could not be readily used as it itself breaks with the current > gfortran change as well. OK, I see this could be a major problem. One of the issues is that the old way, calling a non-variadic function in a variadic way, is also not guaranteed to be free of trouble. So, what to do? It might be a possibility to invent some new flag, let's call it -fomit-string-length, which simply does not pass the string lengths. This would be binary compatible with existing calls, but a major ABI change, and would also probably break libgfortran. Another possibility would be to add an option like -fexternal-varargs, which would call procedures without explicit interface with a variadic call. Not ideal (see above), but it would restore the old behavior. Somewhat orthogonal to this, it would also be possible to emit C prototypes for all external subroutines. I wrote this kind of thing for BIND(C) procedures and variables, and it is eminently doable. So any project could just generate the correct prototypes during compilation. Hm...
Hi Thomas and Tomas, I am investigating a problem with the ISO_Fortran_binding implementation and discovered that the hidden string length disappears when the procedure is declared as bind(C). This obviously correct. Would this problem be fixed if all external procedure calls without an explicit interface, where a single character is being passed, were treated likewise? Cheers Paul
Ooof! (Just for the record, I don't think we should revert to the previous behavior. Whatever we do should be robust in the face of LTO etc.) I'd rather not see extra command-line arguments. For something which modifies the ABI and somehow works by accident part of the time, this is just too difficult to keep track of for end users, IMHO. My suggestions: 1) When compiling an external procedure, for character(len=1) arguments we don't generate the hidden string length argument. And similarly when calling an external procedure, if a len=1 character is passed, we omit the hidden string length argument. This, I believe, is what Paul is suggesting in the previous comment? 2) External procedures with character arguments are compiled and called as varargs functions. This is what Thomas is suggesting, except unconditionally and not controlled by an option. I'm not really happy with either of these, but the third option, of fixing all Fortran-calling code out there isn't realistic either.
(In reply to Janne Blomqvist from comment #3) > Ooof! > > (Just for the record, I don't think we should revert to the previous > behavior. Whatever we do should be robust in the face of LTO etc.) I concur. > I'd rather not see extra command-line arguments. For something which > modifies the ABI and somehow works by accident part of the time, this is > just too difficult to keep track of for end users, IMHO. > > My suggestions: > > 1) When compiling an external procedure, for character(len=1) arguments we > don't generate the hidden string length argument. And similarly when calling > an external procedure, if a len=1 character is passed, we omit the hidden > string length argument. This, I believe, is what Paul is suggesting in the > previous comment? Unfortunately, this would not work :-( While looking at the R source, I have also found some occurences where 'F' was spelled out 'Full'. Also, what could be done with CALL FOO ('a', 'bcd') ? > 2) External procedures with character arguments are compiled and called as > varargs functions. This is what Thomas is suggesting, except unconditionally > and not controlled by an option. Not an option, unfortunately :-( This would bring back PR 87689, and this was an ABI breakage on a primary GCC platform (POWER) with perfectly standard-compliant code. > I'm not really happy with either of these, but the third option, of fixing > all Fortran-calling code out there isn't realistic either. Currently, I am leaning towards using an option with a mandatory warning (no way to turn it off) which does not push any string lenghts onto the stack at all, something like -fbroken-character-abi, which we would use for external procedures without an explicit interface. The warning would have to be mandatory to a) point people towards what's wrong with their code, and b) avoid questions of "why does my code no longer work". I can also extend -fc-prototypes so that it also issues prototypes for any global procedures outside of BIND(C), or add a new option.
(In reply to Thomas Koenig from comment #4) > (In reply to Janne Blomqvist from comment #3) > > 1) When compiling an external procedure, for character(len=1) arguments we > > don't generate the hidden string length argument. And similarly when calling > > an external procedure, if a len=1 character is passed, we omit the hidden > > string length argument. This, I believe, is what Paul is suggesting in the > > previous comment? > > Unfortunately, this would not work :-( > > While looking at the R source, I have also found some occurences > where 'F' was spelled out 'Full'. Ah, indeed. > Also, what could be done with CALL FOO ('a', 'bcd') ? My idea would be to not pass the hidden string length for the 'a' argument, but pass it for the 'bcd' argument. But as you mention above, there are indeed several ways in which his heuristic can be broken. I have a vague memory that other Fortran compilers have special-cased length-1 string arguments, though. > > 2) External procedures with character arguments are compiled and called as > > varargs functions. This is what Thomas is suggesting, except unconditionally > > and not controlled by an option. > > Not an option, unfortunately :-( > > This would bring back PR 87689, and this was an ABI breakage on > a primary GCC platform (POWER) with perfectly standard-compliant > code. Wasn't the problem here that the caller was using varargs, while the callee was compiled as a normal function, thus causing the mismatch? So my suggestion was that external procedures be compiled as varargs as well, which would make the caller and callee ABI's to match. But now that I think about it, it wouldn't help this LAPACK case, as here the C side has a non-varargs prototype so the C caller will in any case not use the varargs convention. And I've never seen any C code defining prototypes to Fortran procedures as varargs either, so I guess this option is a non-starter. Bluh. > > I'm not really happy with either of these, but the third option, of fixing > > all Fortran-calling code out there isn't realistic either. > > Currently, I am leaning towards using an option with a mandatory > warning (no way to turn it off) which does not push any string lenghts > onto the stack at all, something like -fbroken-character-abi, > which we would use for external procedures without an explicit interface. Hmm, wouldn't this completely break procedures with "character(len=*)" dummy arguments? Or did mean that it would be used only for length=1 strings? > I can also extend -fc-prototypes so that it also issues prototypes > for any global procedures outside of BIND(C), or add a new option. This might be useful.
(In reply to Janne Blomqvist from comment #5) > (In reply to Thomas Koenig from comment #4) > > Currently, I am leaning towards using an option with a mandatory > > warning (no way to turn it off) which does not push any string lenghts > > onto the stack at all, something like -fbroken-character-abi, > > which we would use for external procedures without an explicit interface. > > Hmm, wouldn't this completely break procedures with "character(len=*)" dummy > arguments? Yes :-( This would be more or less equivalent to a -fbind-c flag. >Or did mean that it would be used only for length=1 strings? Unfortunately, the callee does not know the string length passed by the caller. > > I can also extend -fc-prototypes so that it also issues prototypes > > for any global procedures outside of BIND(C), or add a new option. > > This might be useful. I just noticed that -fc-prototypes does not issue a prototype for a global BIND(C) procedure. I'll probably fix that one soon.
(In reply to Thomas Koenig from comment #6) > (In reply to Janne Blomqvist from comment #5) > > (In reply to Thomas Koenig from comment #4) > > > Currently, I am leaning towards using an option with a mandatory > > > warning (no way to turn it off) which does not push any string lenghts > > > onto the stack at all, something like -fbroken-character-abi, > > > which we would use for external procedures without an explicit interface. > > > > Hmm, wouldn't this completely break procedures with "character(len=*)" dummy > > arguments? > > Yes :-( > > This would be more or less equivalent to a -fbind-c flag. > > >Or did mean that it would be used only for length=1 strings? > > Unfortunately, the callee does not know the string length > passed by the caller. > > > > I can also extend -fc-prototypes so that it also issues prototypes > > > for any global procedures outside of BIND(C), or add a new option. > > > > This might be useful. > > I just noticed that -fc-prototypes does not issue a prototype > for a global BIND(C) procedure. I'll probably fix that one soon. Just for the record, I have long since thought that passing the string length as a hidden argument is a design error just as awkward of the poor array descriptor design. We should be passing the data field of a descriptor as the address of the string, so that there is a known offset to the string length. This could be a generalised improvement, since it could also be used in general for f77 argument passing and would allow us to get rid in the initialisation and hidden descriptors nonsense in dummy handling. Storing all entities as descriptors would eliminate huge quantities of code in trans-*.c :-) Dream on, Paul.... Maybe for 11-branch? Paul
(In reply to Janne Blomqvist from comment #3) > 1) When compiling an external procedure, for character(len=1) arguments we > don't generate the hidden string length argument. And similarly when calling > an external procedure, if a len=1 character is passed, we omit the hidden > string length argument. This, I believe, is what Paul is suggesting in the > previous comment? Having already shot down my (2) proposal, I thought a bit more about this one, that is omitting the hidden length argument for dummy arguments with len=1, and when calling a procedure with a len=1 argument. Unfortunately, beyond being a somewhat fragile heuristic, it's also fundamentally broken. Consider subroutine foo(a) character(len=1) :: a ... end subroutine foo So with this suggestion, this procedure would be generated without the hidden length argument. The brokenness comes from call foo("ab") which would generate a call to a procedure WITH the hidden string length argument. However, this code is perfectly standard conforming, F2018 15.5.2.4 says that a character dummy argument must have a length less than or equal to the actual argument, which is fulfilled by the above (1 <= 2). That is, we can't special case len=1 procedures and calls in any way.
> So with this suggestion, this procedure would be generated without the hidden > length argument. The brokenness comes from > call foo("ab") > which would generate a call to a procedure WITH the hidden string length > argument. However, this code is perfectly standard conforming, F2018 15.5.2.4 > says that a character dummy argument must have a length less than or equal to > the actual argument, which is fulfilled by the above (1 <= 2). That is, we > can't special case len=1 procedures and calls in any way. And this happens *a lot* in the LAPACK code itself. Just do a grep -i sgemm on it and you will see it in action. Perhaps this is the reason why this occurs in the R -> LAPACK boundary code ...
I think the only option is to go forward without any change to gfortran and simply teach people about this "ABI change" and how to fix the C side. Everything else sounds like a hack that will create its own problems when used. So - RESOLVED INVALID? The recommended solution is to fix the C code.
Even restoring the state that LAPACK/BLAS works but without providing guarantees would help short-term, and I think this could be in line with the goal of best performance within the standard. At least in the case I debugged, I think gfortran could do better by not writing the 1 as string length to the place on the stack where the compiler knows there is already 1 as string length.
(In reply to Tomas Kalibera from comment #11) > At least in the case I debugged, I think gfortran could do better by not > writing the 1 as string length to the place on the stack where the compiler > knows there is already 1 as string length. The problem is not that the compiler gets the wrong string length, in this case when the procedure argument is defined to have a length of 1, the compiler never needs to access the hidden string length argument [1]. The problem is that both the caller and the callee need to agree on the number (and type) of arguments. And because in Fortran implicit interfaces ("F77"-style interfaces, without modules, interface blocks or any other newfangled "modern Fortran" stuff) the only information about the interface of a procedure that the caller knows is the name of the procedure. So for an example compiling the code call foo("abc") the compiler has no knowledge of the procedure beyond the name "foo". From this code the compiler can then deduce that it must be a subroutine, and that it takes one argument of type character. But the compiler cannot verify this, it must assume the programmer knows what he's doing. Now, the problem is that the definition of the procedure can be subroutine foo(a) character(len=1) :: a .... end subroutine foo or subroutine foo(a) character(len=3) :: a ! Or len=2 .... end subroutine foo or subroutine foo(a) character(len=*) :: a .... end subroutine foo In the first two the hidden string length argument is not needed, but because the procedure ABI must match for all the possibilities, and for the third case the hidden length IS needed, the hidden string length argument is always needed, although it's unused when the length is fixed. [1] It would be possible to have some optional debugging option which inserts code to verify that the actual argument length >= the dummy argument length. But that's not really related to this issue.
I understand the compiler may not know and typically does not whether the called function accepts "character(len=1)" or "character(len=*)", so it may have to pass the 1. But the situation where I suggest not writing the 1 is more subtle (see my original post - https://gcc.gnu.org/ml/fortran/2019-05/msg00021.html). There DPOSV is tail-calling (jumping) into DPOTRS). DPOSV wants to pass on the length of UPLO (1) to DPOTRS. DPOSV knows it was called with this hidden length argument of 1, and indeed at the same location on the stack as it is needed for DPOTRS, as the length of the same variable UPLO, but it still writes a compile-time constant 1 to that location on the stack that already has (should have) 1.
On Mon, 6 May 2019, tomas.kalibera at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329 > > --- Comment #13 from Tomas Kalibera <tomas.kalibera at gmail dot com> --- > I understand the compiler may not know and typically does not whether the > called function accepts "character(len=1)" or "character(len=*)", so it may > have to pass the 1. But the situation where I suggest not writing the 1 is more > subtle (see my original post - > https://gcc.gnu.org/ml/fortran/2019-05/msg00021.html). > > There DPOSV is tail-calling (jumping) into DPOTRS). DPOSV wants to pass on the > length of UPLO (1) to DPOTRS. DPOSV knows it was called with this hidden length > argument of 1, and indeed at the same location on the stack as it is needed for > DPOTRS, as the length of the same variable UPLO, but it still writes a > compile-time constant 1 to that location on the stack that already has (should > have) 1. I think that's called sibling-calling, not tailcalling (tail-call with same arguments).
Hi Tomas, > I understand the compiler may not know and typically does not whether the > called function accepts "character(len=1)" or "character(len=*)", so it may > have to pass the 1. But the situation where I suggest not writing the 1 is > more subtle (see my original post - > https://gcc.gnu.org/ml/fortran/2019-05/msg00021.html). I have given this some thought, and I don't think this can be done in the general case, unfortunately. Consider program main call foo("ab") end subroutine foo(c) character*1 c call bar(c) end subroutine bar(c) character*(*) c print *,len(c) end This is legal Fortran going back to F77, and it should print 1. If your proposal was implemented, this would print 2, which would be a wrong-code bug :-( So, what can we do? Previously, the way of interfacing C with Fortran was fragile, non-conforming, and worked. Now it is fragile, non-conforming, and does not work. In your (excellent, by the way) debugging work, you have identified an option, -fno-optimize-sibling-calls, which restores the status quo because things would go back to being fragile, nonconforming, and they would work again. Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9, I would suggest that we make -fno-optimize-sibling-calls the default on these branches. Maintaining binary compatibility (even if it is bug compatibility) with existing packages is something we should strive for, especially with such important software packages as BLAS and LAPACK. These packages are one important reason why people still use Fortran, and I would hate to push them towards flang with this. For current trunk, I would recommend keeping the current hehavior and contact the LAPACK maintainers to a) give them a heads-up for this problem, and b) a year to work out the problem. Would this be something that people could agree on?
(In reply to Thomas Koenig from comment #15) > Hi Tomas, > > > I understand the compiler may not know and typically does not whether the > > called function accepts "character(len=1)" or "character(len=*)", so it may > > have to pass the 1. But the situation where I suggest not writing the 1 is > > more subtle (see my original post - > > https://gcc.gnu.org/ml/fortran/2019-05/msg00021.html). > > I have given this some thought, and I don't think this can be done > in the general case, unfortunately. > > Consider > > program main > call foo("ab") > end > > subroutine foo(c) > character*1 c > call bar(c) > end > > subroutine bar(c) > character*(*) c > print *,len(c) > end > > This is legal Fortran going back to F77, and it should print 1. > > If your proposal was implemented, this would print 2, which would > be a wrong-code bug :-( > > So, what can we do? Previously, the way of interfacing C > with Fortran was fragile, non-conforming, and worked. Now > it is fragile, non-conforming, and does not work. > > In your (excellent, by the way) debugging work, you have > identified an option, -fno-optimize-sibling-calls, which > restores the status quo because things would go back to > being fragile, nonconforming, and they would work again. > > Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9, > I would suggest that we make -fno-optimize-sibling-calls > the default on these branches. Maintaining binary compatibility > (even if it is bug compatibility) with existing packages is > something we should strive for, especially with such > important software packages as BLAS and LAPACK. These packages > are one important reason why people still use Fortran, and > I would hate to push them towards flang with this. > > For current trunk, I would recommend keeping the current > hehavior and contact the LAPACK maintainers to a) give them > a heads-up for this problem, and b) a year to work out > the problem. > > Would this be something that people could agree on? Does -fno-optimizing-sibling-calls effect performance? A 1% (or less) degradation may be considered negligible, and an acceptible compromise. 10% would be unacceptable. Guess I'll need to dust off my old copy of the Polyhedron Benchmarks and run a few tests.
On Mon, May 06, 2019 at 04:40:08PM +0000, kargl at gcc dot gnu.org wrote: > > Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9, > > I would suggest that we make -fno-optimize-sibling-calls > > the default on these branches. Maintaining binary compatibility > > (even if it is bug compatibility) with existing packages is > > something we should strive for, especially with such > > important software packages as BLAS and LAPACK. These packages > > are one important reason why people still use Fortran, and > > I would hate to push them towards flang with this. > > > > For current trunk, I would recommend keeping the current > > hehavior and contact the LAPACK maintainers to a) give them > > a heads-up for this problem, and b) a year to work out > > the problem. > > > > Would this be something that people could agree on? > > Does -fno-optimizing-sibling-calls effect performance? > A 1% (or less) degradation may be considered negligible, > and an acceptible compromise. 10% would be unacceptable. > > Guess I'll need to dust off my old copy of the Polyhedron > Benchmarks and run a few tests. > So, I dusted off my old PB code and ran some tests. The system is x86_64-*-freebsd. I saw nothing to suggest that this option would have a negative impact on performance. ================================================================================ Date & Time : 6 May 2019 13:29:24 Test Name : gfcx Compile Command : gfcx -static -ffpe-summary=none -O3 -pipe -mtune=native -march=native -ffast-math -ftree-vectorize -funroll-loops --param max-unroll-times=4 %n.f90 -o %n Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft Maximum Times : 200.0 Target Error % : 0.100 Minimum Repeats : 5 Maximum Repeats : 10 Benchmark Compile Executable Ave Run Number Estim Name (secs) (bytes) (secs) Repeats Err % --------- ------- ---------- ------- ------- ------ ac 1.60 5511576 8.00 6 0.0893 aermod 49.38 6822120 19.18 5 0.0054 air 7.49 5632568 4.57 5 0.0340 capacita 4.97 5639072 40.91 5 0.0384 channel 1.31 5520904 1.86 10 0.2693 doduc 7.53 5655488 19.33 6 0.0978 fatigue 3.42 5618928 4.41 5 0.0170 gas_dyn 3.80 5604440 1.91 5 0.0214 induct 7.59 5771912 6.18 5 0.0209 linpk 1.23 5486240 7.82 5 0.0096 mdbx 2.97 5553200 7.33 5 0.0388 nf 2.73 5533744 8.57 5 0.0095 protein 4.89 5762320 25.13 5 0.0494 rnflow 7.87 5965568 34.26 5 0.0810 test_fpu 5.90 5705216 6.03 10 0.1206 tfft 1.64 5519096 1.81 5 0.0138 Geometric Mean Execution Time = 7.94 seconds ================================================================================ ================================================================================ Date & Time : 6 May 2019 16:11:36 Test Name : gfcx Compile Command : gfcx -fno-optimize-sibling-calls -static -ffpe-summary=none -O3 -pipe -mtune=native -march=native -ffast-math -ftree-vectorize -funroll-loops --param max-unroll-times=4 %n.f90 -o %n Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft Maximum Times : 200.0 Target Error % : 0.100 Minimum Repeats : 5 Maximum Repeats : 10 Benchmark Compile Executable Ave Run Number Estim Name (secs) (bytes) (secs) Repeats Err % --------- ------- ---------- ------- ------- ------ ac 1.58 5511576 7.99 5 0.0139 aermod 49.16 6822120 19.17 5 0.0216 air 7.48 5632568 4.57 5 0.0274 capacita 4.95 5639072 41.24 5 0.3593 channel 1.31 5520904 1.86 10 0.2607 doduc 8.00 5655488 19.29 10 0.0947 fatigue 3.43 5618928 4.40 5 0.0545 gas_dyn 3.81 5604440 1.97 5 0.0328 induct 7.58 5771912 6.18 5 0.0121 linpk 1.23 5486240 7.85 5 0.0699 mdbx 2.97 5553200 7.28 5 0.0703 nf 2.72 5533744 8.62 5 0.0998 protein 4.87 5762320 25.30 8 0.1194 rnflow 7.86 5965568 34.28 6 0.2875 test_fpu 5.88 5705216 6.03 8 0.0980 tfft 1.64 5519096 1.74 6 0.0841 Geometric Mean Execution Time = 7.94 seconds ================================================================================
(In reply to Thomas Koenig from comment #15) > Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9, > I would suggest that we make -fno-optimize-sibling-calls > the default on these branches. Maintaining binary compatibility > (even if it is bug compatibility) with existing packages is > something we should strive for, especially with such > important software packages as BLAS and LAPACK. +1. Especially considering Steve's benchmark suggesting there's practically no difference, although there may of course be other code where sibling call optimization makes a difference. > For current trunk, I would recommend keeping the current > hehavior and contact the LAPACK maintainers to a) give them > a heads-up for this problem, and b) a year to work out > the problem. Yes. Closer to GCC 10, we can revisit this. I suspect we'll have to make -fno-optimize-sibling-calls the default for GCC 10 as well; while we might be able to help LAPACK maintainers fix LAPACKE there's in this timeframe there's certainly a lot of other code out there with custom C-Fortran interfaces which might be affected.
On Tue, 7 May 2019, jb at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329 > > --- Comment #18 from Janne Blomqvist <jb at gcc dot gnu.org> --- > (In reply to Thomas Koenig from comment #15) > > Since we applied the fix for PR 87689 to gcc 7, gcc 8 and gcc 9, > > I would suggest that we make -fno-optimize-sibling-calls > > the default on these branches. Maintaining binary compatibility > > (even if it is bug compatibility) with existing packages is > > something we should strive for, especially with such > > important software packages as BLAS and LAPACK. > > +1. Especially considering Steve's benchmark suggesting there's practically no > difference, although there may of course be other code where sibling call > optimization makes a difference. > > > For current trunk, I would recommend keeping the current > > hehavior and contact the LAPACK maintainers to a) give them > > a heads-up for this problem, and b) a year to work out > > the problem. > > Yes. Closer to GCC 10, we can revisit this. I suspect we'll have to make > -fno-optimize-sibling-calls the default for GCC 10 as well; while we might be > able to help LAPACK maintainers fix LAPACKE there's in this timeframe there's > certainly a lot of other code out there with custom C-Fortran interfaces which > might be affected. I don't see how -fno-optimize-sibling-calls helps in a systematic way. It might obfuscate a specific example enough to make it work, but...?
(In reply to rguenther@suse.de from comment #19) > I don't see how -fno-optimize-sibling-calls helps in a systematic way. > It might obfuscate a specific example enough to make it work, but...? There is only empirical data, but more than one case. I've debugged only one case, but all the other R packages that were newly failing with that gfortran change also started working again with -fno-optimize-sibling-calls (~15 CRAN packages). That -fno-optimize-sibling-calls helps has also been confirmed by Kurt Hornik from the CRAN team.
I filed https://github.com/Reference-LAPACK/lapack/issues/339 to start a discussion about fixing CBLAS and LAPACKE in upstream LAPACK.
Author: tkoenig Date: Wed May 8 21:55:13 2019 New Revision: 271018 URL: https://gcc.gnu.org/viewcvs?rev=271018&root=gcc&view=rev Log: 2019-05-08 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/90351 PR fortran/90329 * gfortran.dg/dump-parse-tree.c: Include version.h. (gfc_dump_external_c_prototypes): New function. (get_c_type_name): Select "char" as a name for a simple char. Adjust to handling external functions. Also handle complex. (write_decl): Add argument bind_c. Adjust for dumping of external procedures. (write_proc): Likewise. (write_interop_decl): Add bind_c argument to call of write_proc. * gfortran.h: Add prototype for gfc_dump_external_c_prototypes. * lang.opt: Add -fc-prototypes-external flag. * parse.c (gfc_parse_file): Move dumping of BIND(C) prototypes. Call gfc_dump_external_c_prototypes if option is set. * invoke.texi: Document -fc-prototypes-external. Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/dump-parse-tree.c trunk/gcc/fortran/gfortran.h trunk/gcc/fortran/invoke.texi trunk/gcc/fortran/lang.opt trunk/gcc/fortran/parse.c
On trunk, it is now possible to generate C prototypes from Fortran source code for external procedures. Applying this to dlapack.f in the R source tree yields something like this (just grabbing a few lines) $ gfortran -fsyntax-only -fc-prototypes-external dlapack.f [...] void dtpqrt2_ (int *m, int *n, int *l, double *a, int *lda, double *b, int *ldb, double *t, int *ldt, int *info); void dlarfg_ (int *n, double *alpha, double *x, int *incx, double *tau); void dlatps_ (char *uplo, char *trans, char *diag, char *normin, int *n, double *ap, double *x, double *scale, double *cnorm, int *info, size_t uplo_len, size_t trans_len, size_t diag_len, size_t normin_len); double dlantp_ (char *norm, char *uplo, char *diag, int *n, double *ap, double *work, size_t norm_len, size_t uplo_len, size_t diag_len); void dlatbs_ (char *uplo, char *trans, char *diag, char *normin, int *n, int *kd, double *ab, int *ldab, double *x, double *scale, double *cnorm, int *info, size_t uplo_len, size_t trans_len, size_t diag_len, size_t normin_len); double dlantb_ (char *norm, char *uplo, char *diag, int *n, int *k, double *ab, int *ldab, double *work, size_t norm_len, size_t uplo_len, size_t diag_len); Looking for _len yields 236 lines, that is probably the number of subroutines and functions which would need to be adapted.
Author: tkoenig Date: Thu May 9 17:40:30 2019 New Revision: 271038 URL: https://gcc.gnu.org/viewcvs?rev=271038&root=gcc&view=rev Log: 2019-05-09 Thomas Koenig <tkoenig@gcc.gnu.org> Backport from trunk PR fortran/90351 PR fortran/90329 * gfortran.dg/dump-parse-tree.c: Include version.h. (gfc_dump_external_c_prototypes): New function. (get_c_type_name): Select "char" as a name for a simple char. Adjust to handling external functions. Also handle complex. (write_decl): Add argument bind_c. Adjust for dumping of external procedures. (write_proc): Likewise. (write_interop_decl): Add bind_c argument to call of write_proc. * gfortran.h: Add prototype for gfc_dump_external_c_prototypes. * lang.opt: Add -fc-prototypes-external flag. * parse.c (gfc_parse_file): Move dumping of BIND(C) prototypes. Call gfc_dump_external_c_prototypes if option is set. * invoke.texi: Document -fc-prototypes-external. Modified: branches/gcc-9-branch/gcc/fortran/ChangeLog branches/gcc-9-branch/gcc/fortran/dump-parse-tree.c branches/gcc-9-branch/gcc/fortran/gfortran.h branches/gcc-9-branch/gcc/fortran/invoke.texi branches/gcc-9-branch/gcc/fortran/lang.opt branches/gcc-9-branch/gcc/fortran/parse.c
(In reply to Thomas Koenig from comment #15) > Hi Tomas, > > > I understand the compiler may not know and typically does not whether the > > called function accepts "character(len=1)" or "character(len=*)", so it may > > have to pass the 1. But the situation where I suggest not writing the 1 is > > more subtle (see my original post - > > https://gcc.gnu.org/ml/fortran/2019-05/msg00021.html). > > I have given this some thought, and I don't think this can be done > in the general case, unfortunately. > > Consider > > program main > call foo("ab") > end > > subroutine foo(c) > character*1 c > call bar(c) > end > > subroutine bar(c) > character*(*) c > print *,len(c) > end > > This is legal Fortran going back to F77, and it should print 1. > > If your proposal was implemented, this would print 2, which would > be a wrong-code bug :-( > > So, what can we do? Previously, the way of interfacing C > with Fortran was fragile, non-conforming, and worked. Now > it is fragile, non-conforming, and does not work. > > In your (excellent, by the way) debugging work, you have > identified an option, -fno-optimize-sibling-calls, which > restores the status quo because things would go back to > being fragile, nonconforming, and they would work again. You don't want to do that, that is way big hammer and in some cases, tail call optimizations are essential for correct program behavior (without the optimizations for deep recursions you won't fit on stack, while when using them you will). If you want some hack to workaround these buggy wrappers (but will their maintainers then actually fix anything rather than ignoring the problem?), then I could see tail-calls.c or calls.c punt if it sees DECL_ARTIFICIAL PARM_DECLs passed on the stack at the end of argument list and it is the fortran length (it would e.g. help if those arguments would be named with an initial dot rather than underscore (I believe that used to be the case in the past)), or we need some way to propagate the info that the argument is hidden string length through LTO.
It could even allow tailcalls in the character(len=*) cases because in those cases if the caller omits the string length hidden argument, I see no reason to try to workaround that, it will simply never work properly. The only reason without tailcalls things appear to work is that the argument is TREE_READONLY and nothing touches it, so there is some hope it won't be modified either, when doing a tailcall if the tail callee needs any of those stack slots for other arguments, it will be written into. We could still allow tail calls if the callee needs fewer arguments than the caller and doesn't need to overwrite any of the hidden string length arguments the buggy callers don't provide.
Note that both ifort and xlc fortran have the hidden string length arguments as well.
More fallout, this time with C++ code: https://bugzilla.redhat.com/show_bug.cgi?id=1709538 +1 for "Maintaining binary compatibility (even if it is bug compatibility) with existing packages"
Created attachment 46359 [details] gcc10-pr90329.patch Untested workaround that isn't a too big hammer. This should just avoid tail calls in functions where the hidden string arguments for character(len=constant) dummy arguments are passed partially or fully on the stack and the callee needs to pass anything on the stack and thus might clobber that.
Hi Jakub, > Untested workaround that isn't a too big hammer. This should just avoid > tail calls in functions where the hidden string arguments for > character(len=constant) dummy arguments are passed partially or fully on the > stack and the callee needs to pass anything on the stack and thus might > clobber that. Thanks for taking this on! Two remarks: It should be possible to have this workaround in place only when there is no explicit interface for the called routine. No need to penalize modern, correct Fortran for this. I can look at this, but I will only be able to find time on the weekend. And I would like to have the Fortran part behind an option, so people could disable this pessimization if they start using the correct prototypes. Ideally, the compiler could warn about this. It already does so if you use -flto, but without LTO, the warning would have to be emitted from the C compiler, and I see no even halfway sane way of doing it.
(In reply to Thomas Koenig from comment #30) > Hi Jakub, > > > Untested workaround that isn't a too big hammer. This should just avoid > > tail calls in functions where the hidden string arguments for > > character(len=constant) dummy arguments are passed partially or fully on the > > stack and the callee needs to pass anything on the stack and thus might > > clobber that. > > Thanks for taking this on! > > Two remarks: It should be possible to have this workaround in place > only when there is no explicit interface for the called routine. > No need to penalize modern, correct Fortran for this. First of all, the thing is that we don't have just caller and callee to consider in this case, but 3 functions, some C/C++ function (C caller below) calls incorrectly prototyped (or for C unprototyped) Fortran function (Fortran caller bellow) and that function calls some other function in a position where tail call is considered (callee bellow). I don't see how having a properly interfaced callee (and it doesn't matter if that one is implemented in Fortran or C or C++ or any other language) matters to whether we need to do this workaround or not. The bug is in the incorrect prototype visible in the C caller and the incorrect (insufficient number) of arguments in that call. Even if the callee function has proper interface, if we tail call it, we will clobber the argument area of the Fortran caller in the C caller. And it doesn't really matter what arguments the callee has, it could very well be: subroutine foo (a, b, c, d, e, f) character(len=1) :: a integer(kind=8) :: b, c, d, e, f interface subroutine bar (g, h, i, j, k, l, m) integer(kind=8) :: g, h, i, j, k, l, m end subroutine bar end interface call bar (b, c, d, e, f, 2, 3) end subroutine foo Here if we tail call bar from foo, which we generally on x86_64 can, because the caller has 6 normal arguments + 1 artificial hidden string length, and the callee has 7 normal arguments, in both cases 6 go in registers, the 7th in a stack slot, then the argument stack area is equal size (for tail call it is enough if the callee stack area is smaller or equal than caller's). > And I would like to have the Fortran part behind an option, so > people could disable this pessimization if they start using the > correct prototypes. Agreed, just no idea how to call that option and how much to express in the option name that it is a workaround for undefined behavior. > Ideally, the compiler could warn about this. It already does so > if you use -flto, but without LTO, the warning would have to > be emitted from the C compiler, and I see no even halfway sane > way of doing it. This can be diagnosed only with LTO (but it already is I think) or perhaps with some debug info parsing and analysis tool that would inspect debug info of the whole app and all libraries and complain. It is the same thing as extern int baz (int x, int y); int foo (int x, int y) { return baz (x + y, 0); } in one C TU and extern int foo (int x); int bar (void) { return foo (10); } in another one, we don't diagnose that without LTO either and can't.
BTW, as can be seen on https://fortran.godbolt.org/z/ckMt1t it is a pure luck that these bogus C/C++ wrappers seem to work, if one tries simple subroutine foo (a, b, c, d, e, f, g, h) integer :: b, c, d, e, f, g, h character(len=1) :: a call bar (a, b, c, d, e, f, g, h) end subroutine foo then at -O2 both older gfortran and ifort emit a tail call, which will overwrite the hidden string length argument, so if one prototypes this as void foo_ (char *a, blas_int b, blas_int c, blas_int d, blas_int e, blas_int f, blas_int g, blas_int h); and calls it with say foo_ ("1", 2, 3, 4, 5, 6, 7, 8); then it will overwrite some variables in the caller both with older gfortran and with ifort.
Author: jakub Date: Thu May 16 09:37:43 2019 New Revision: 271285 URL: https://gcc.gnu.org/viewcvs?rev=271285&root=gcc&view=rev Log: PR fortran/90329 * tree-core.h (struct tree_decl_common): Document decl_nonshareable_flag for PARM_DECLs. * tree.h (DECL_HIDDEN_STRING_LENGTH): Define. * calls.c (expand_call): Don't try tail call if caller has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be passed on the stack and callee needs to pass any arguments on the stack. * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use else if instead of series of mutually exclusive ifs. Handle DECL_HIDDEN_STRING_LENGTH for PARM_DECLs. * tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise. * trans-decl.c (create_function_arglist): Set DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if len is constant. Modified: trunk/gcc/ChangeLog trunk/gcc/calls.c trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/trans-decl.c trunk/gcc/tree-core.h trunk/gcc/tree-streamer-in.c trunk/gcc/tree-streamer-out.c trunk/gcc/tree.h
Even the subroutine foo (a, b, c, d, e, f) integer :: c, e, f character(len=1) :: a, b double precision :: d (e, *) call bar (a, b, c, d, e, f) end subroutine foo is tail call optimized by both older gfortran and ifort, and this one has the same prototype as the lapack routine and so will be broken by both old gfortran and ifort when called incorrectly without the hidden string length arguments.
One more thing, if I add to start of the DTRTRI function INTERFACE SUBROUTINE DTRTI2( UPLO, DIAG, N, A, LDA, INFO ) CHARACTER DIAG, UPLO INTEGER INFO, LDA, N DOUBLE PRECISION A( LDA, * ) END SUBROUTINE END INTERFACE INTERFACE SUBROUTINE XERBLA( FOO, BAR ) CHARACTER FOO INTEGER BAR END SUBROUTINE END INTERFACE INTERFACE SUBROUTINE DTRMM ( S1, S2, S3, S4, I1, I2, D1, A, I3, D2, I4 ) CHARACTER S1, S2, S3, S4 INTEGER I1, I2, I3, I4 DOUBLE PRECISION D1, D2 DOUBLE PRECISION A( I3, * ) END SUBROUTINE END INTERFACE INTERFACE SUBROUTINE DTRSM ( S1, S2, S3, S4, I1, I2, D1, D2, I3, D3, I4 ) CHARACTER S1, S2, S3, S4 INTEGER I1, I2, I3, I4 DOUBLE PRECISION D1, D2, D3 END SUBROUTINE END INTERFACE INTERFACE FUNCTION LSAME ( S1, S2 ) CHARACTER S1, S2 LOGICAL LSAME END FUNCTION END INTERFACE INTERFACE FUNCTION ILAENV ( I1, S1, S2, I2, I3, I4, I5 ) INTEGER I1, I2, I3, I4, I5, ILAENV CHARACTER S1, S2 END FUNCTION END INTERFACE and remove LOGICAL LSAME INTEGER ILAENV EXTERNAL LSAME, ILAENV EXTERNAL DTRMM, DTRSM, DTRTI2, XERBLA then it is tail call optimized back to somewhere in between r164328 (still doesn't tail call it) and r164450 (already tail calls it). What matters is not whether the exact function/subroutine we are calling in the tail call position has interface or not, tail call optimization doesn't care about that, but the presence or absence of TYPE_ARG_TYPES on the function types that are called matters for the alias analysis (r268991 vs. r268992): -ESCAPED = uplo -ESCAPED = &STRING +callarg(30) = uplo +callarg(30) = callarg(30) + UNKNOWN +callarg(30) = *callarg(30) + UNKNOWN +CALLUSED(28) = callarg(30) +CALLCLOBBERED(29) = callarg(30) +*callarg(30) = NONLOCAL +callarg(31) = &STRING +callarg(31) = callarg(31) + UNKNOWN +callarg(31) = *callarg(31) + UNKNOWN +CALLUSED(28) = callarg(31) +CALLCLOBBERED(29) = callarg(31) +*callarg(31) = NONLOCAL and similarly for many other arguments. So, if we wanted to narrow the workaround more, we could do the DECL_HIDDEN_STRING_LENGTH setting conditional on whether the function has any calls to Fortran implicitly prototyped function. Or if we add a command line switch, have different modes, one which would assume sane callers, another one that would do it only for functions that call implicitly prototyped functions and another that would do it always like the committed workaround does. On the other side, those unprototyped calls could come from inlined calls too, though as a heuristics it could work, usually the codebase either is modern fortran, or is not, and given that the workaround is for undefined behavior, we are always free to do anything we want.
Author: tkoenig Date: Sun May 19 08:22:41 2019 New Revision: 271376 URL: https://gcc.gnu.org/viewcvs?rev=271376&root=gcc&view=rev Log: 2019-05-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/90329 * invoke.texi: Document -fbroken-callers. * lang.opt: Add -fbroken-callers. * trans-decl.c (create_function_arglist): Only set DECL_HIDDEN_STRING_LENGTH if flag_broken_callers is set. Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/invoke.texi trunk/gcc/fortran/lang.opt trunk/gcc/fortran/trans-decl.c
Thanks for the workaround. Will the patches be backported to gcc 8.x and 9.x ?
On Wed, May 22, 2019 at 04:38:40AM +0000, conradsand.arma at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90329 > > --- Comment #37 from Conrad S <conradsand.arma at gmail dot com> --- > Thanks for the workaround. > Will the patches be backported to gcc 8.x and 9.x ? > A better question might be: Are you going to fix your code?
> A better question might be: Are you going to fix your code? Yes [1], but that's besides the point here. I can certainly fix my code, but that leaves 99% of other software. Backports to gcc 8.x and 9.x would be very beneficial. [1] in progress: https://gitlab.com/conradsnicta/armadillo-code/issues/123
I must say I don't like -fbroken-callers option name too much, can we use instead something like -ftail-call-workaround={0,1,2} / -f{,no-}tail-call-workaround where -ftail-call-workaround == -ftail-call-workaround=1 would be the default (for now) and enabled only if resolve_symbol finds FL_PROCEDURE if_source == IFSRC_UNKNOWN symbols (remembered in some gfc_ns member and set on containing procedure, not say containing BLOCK), -fno-tail-call-workaround == -ftail-call-workaround=0 would disable this and -ftail-call-workaround=2 would be enabled no matter whether there are any FL_PROCEDURE IFSRC_UNKNOWN symbols or not?
Author: jakub Date: Wed May 29 14:08:57 2019 New Revision: 271738 URL: https://gcc.gnu.org/viewcvs?rev=271738&root=gcc&view=rev Log: PR fortran/90329 * lang.opt (fbroken-callers): Remove. (ftail-call-workaround, ftail-call-workaround=): New options. * gfortran.h (struct gfc_namespace): Add implicit_interface_calls. * interface.c (gfc_procedure_use): Set implicit_interface_calls for calls to implicit interface procedures. * trans-decl.c (create_function_arglist): Use flag_tail_call_workaround instead of flag_broken_callers. If it is not 2, also require sym->ns->implicit_interface_calls. * invoke.texi (fbroken-callers): Remove documentation. (ftail-call-workaround, ftail-call-workaround=): Document. Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/gfortran.h trunk/gcc/fortran/interface.c trunk/gcc/fortran/invoke.texi trunk/gcc/fortran/lang.opt trunk/gcc/fortran/trans-decl.c
Author: jakub Date: Wed May 29 15:55:12 2019 New Revision: 271743 URL: https://gcc.gnu.org/viewcvs?rev=271743&root=gcc&view=rev Log: PR fortran/90329 * lto-streamer.h (LTO_minor_version): Bump to 1. Backported from mainline 2019-05-29 Jakub Jelinek <jakub@redhat.com> PR fortran/90329 * lang.opt (fbroken-callers): Remove. (ftail-call-workaround, ftail-call-workaround=): New options. * gfortran.h (struct gfc_namespace): Add implicit_interface_calls. * interface.c (gfc_procedure_use): Set implicit_interface_calls for calls to implicit interface procedures. * trans-decl.c (create_function_arglist): Use flag_tail_call_workaround instead of flag_broken_callers. If it is not 2, also require sym->ns->implicit_interface_calls. * invoke.texi (fbroken-callers): Remove documentation. (ftail-call-workaround, ftail-call-workaround=): Document. 2019-05-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/90329 * invoke.texi: Document -fbroken-callers. * lang.opt: Add -fbroken-callers. * trans-decl.c (create_function_arglist): Only set DECL_HIDDEN_STRING_LENGTH if flag_broken_callers is set. 2019-05-16 Jakub Jelinek <jakub@redhat.com> PR fortran/90329 * tree-core.h (struct tree_decl_common): Document decl_nonshareable_flag for PARM_DECLs. * tree.h (DECL_HIDDEN_STRING_LENGTH): Define. * calls.c (expand_call): Don't try tail call if caller has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be passed on the stack and callee needs to pass any arguments on the stack. * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use else if instead of series of mutually exclusive ifs. Handle DECL_HIDDEN_STRING_LENGTH for PARM_DECLs. * tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise. * trans-decl.c (create_function_arglist): Set DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if len is constant. Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/calls.c branches/gcc-9-branch/gcc/fortran/ChangeLog branches/gcc-9-branch/gcc/fortran/gfortran.h branches/gcc-9-branch/gcc/fortran/interface.c branches/gcc-9-branch/gcc/fortran/invoke.texi branches/gcc-9-branch/gcc/fortran/lang.opt branches/gcc-9-branch/gcc/fortran/trans-decl.c branches/gcc-9-branch/gcc/lto-streamer.h branches/gcc-9-branch/gcc/tree-core.h branches/gcc-9-branch/gcc/tree-streamer-in.c branches/gcc-9-branch/gcc/tree-streamer-out.c branches/gcc-9-branch/gcc/tree.h
Author: jakub Date: Wed May 29 16:02:56 2019 New Revision: 271744 URL: https://gcc.gnu.org/viewcvs?rev=271744&root=gcc&view=rev Log: PR fortran/90329 * lto-streamer.h (LTO_minor_version): Bump to 2. Backported from mainline 2019-05-29 Jakub Jelinek <jakub@redhat.com> PR fortran/90329 * lang.opt (fbroken-callers): Remove. (ftail-call-workaround, ftail-call-workaround=): New options. * gfortran.h (struct gfc_namespace): Add implicit_interface_calls. * interface.c (gfc_procedure_use): Set implicit_interface_calls for calls to implicit interface procedures. * trans-decl.c (create_function_arglist): Use flag_tail_call_workaround instead of flag_broken_callers. If it is not 2, also require sym->ns->implicit_interface_calls. * invoke.texi (fbroken-callers): Remove documentation. (ftail-call-workaround, ftail-call-workaround=): Document. 2019-05-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/90329 * invoke.texi: Document -fbroken-callers. * lang.opt: Add -fbroken-callers. * trans-decl.c (create_function_arglist): Only set DECL_HIDDEN_STRING_LENGTH if flag_broken_callers is set. 2019-05-16 Jakub Jelinek <jakub@redhat.com> PR fortran/90329 * tree-core.h (struct tree_decl_common): Document decl_nonshareable_flag for PARM_DECLs. * tree.h (DECL_HIDDEN_STRING_LENGTH): Define. * calls.c (expand_call): Don't try tail call if caller has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be passed on the stack and callee needs to pass any arguments on the stack. * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use else if instead of series of mutually exclusive ifs. Handle DECL_HIDDEN_STRING_LENGTH for PARM_DECLs. * tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise. * trans-decl.c (create_function_arglist): Set DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if len is constant. Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/calls.c branches/gcc-8-branch/gcc/fortran/ChangeLog branches/gcc-8-branch/gcc/fortran/gfortran.h branches/gcc-8-branch/gcc/fortran/interface.c branches/gcc-8-branch/gcc/fortran/invoke.texi branches/gcc-8-branch/gcc/fortran/lang.opt branches/gcc-8-branch/gcc/fortran/trans-decl.c branches/gcc-8-branch/gcc/lto-streamer.h branches/gcc-8-branch/gcc/tree-core.h branches/gcc-8-branch/gcc/tree-streamer-in.c branches/gcc-8-branch/gcc/tree-streamer-out.c branches/gcc-8-branch/gcc/tree.h
Workaround added for 8.4+, 9.2+ and 10.1+ so far.
Hi everyone. Pardon my ignorance of C-Fortran bridging matters, but does any of the following make sense? The Fortran subroutine has no idea whether the word where the argument is supposed to be is really the argument, or just some unrelated part of the stack (because the caller didn't pass the argument). Can't the Fortran routine check whether that word has the expected value 1? Then in that case, just pretend it's the argument (whether that is the case, or whether that 1 is just a spurious value that looks like the right argument). In this case, just do all the tail calling to sibling routines happily, who will get the right string length. If the word does not have value 1, then, uh oh: treat it suspiciously! The function can then *call itself* with with a *copy* of the arguments, including the correct length 1. When that call returns, it returns. The recursively invoked activation will of course the value of 1, and then act cool. The downside is that there is a performance penalty with this check and extra recursive call. When the callers are fixed to pass the argument, the penalty is reduced, but the check is still being made.
C pseudocode in light of previous comment: double abused_fortran_fn(double x, double y, char str[1], int len) { if (len != 1) return abused_fortran_fn(x, y, str, 1); /* full call, not tail! */ return tail_call_sibling(x, y, str, len); }
(In reply to Kaz Kylheku from comment #45) > Hi everyone. > > Pardon my ignorance of C-Fortran bridging matters, but does any of the > following make sense? > > The Fortran subroutine has no idea whether the word where the argument is > supposed to be is really the argument, or just some unrelated part of the > stack (because the caller didn't pass the argument). > > Can't the Fortran routine check whether that word has the expected value 1? > Then in that case, just pretend it's the argument (whether that is the case, > or whether that 1 is just a spurious value that looks like the right > argument). In this case, just do all the tail calling to sibling routines > happily, who will get the right string length. I see two problems with this suggestion, one minor and one major. First, there may well be a value > 1 on the stack for a regular call, see comment #15. Second, the value 1 may end up being there by accident, then this method would cause a crash, and this would be even harder to debug than the original case.
(In reply to Thomas Koenig from comment #47) > I see two problems with this suggestion, one minor and one major. > > First, there may well be a value > 1 on the stack for a regular > call, see comment #15. Thus it adds overhead to cases like when "ab" is passed, where the callee treats that as a length 1 string. It has to be suspicious of the value 2 and go through the self-calling hoop. (Oh well; obviously this is a workaround that would have to go through deprecation and removal.) > Second, the value 1 may end up being there by accident, then this That I clearly understand: a value 1 may be there spuriously even if no argument was passed. > method would cause a crash, and this would be even harder to debug > than the original case. That's the thing I'm keenly curious about: if that value is fortuitously the right one, in the right place on the stack, where is the crash? (Of course, we must not mutate that word; it belongs to the C caller's frame.)
(In reply to Kaz Kylheku from comment #48) > That's the thing I'm keenly curious about: if that value is fortuitously the > right one, in the right place on the stack, where is the crash? > > (Of course, we must not mutate that word; it belongs to the C caller's > frame.) In most ABIs, the argument area is owned by the callee, not caller, and this PR is exactly about mutating that area when performing a tail call.
Jakub, should your fix be backported to the gcc-7 branch as well, considering that the PR 87689 fix was applied to that branch as well?
Yes, I just didn't have time for that yet.
Author: jakub Date: Fri Aug 30 12:41:43 2019 New Revision: 275153 URL: https://gcc.gnu.org/viewcvs?rev=275153&root=gcc&view=rev Log: Backported from mainline 2019-05-29 Jakub Jelinek <jakub@redhat.com> PR fortran/90329 * lto-streamer.h (LTO_minor_version): Bump to 1. Backported from mainline 2019-05-16 Jakub Jelinek <jakub@redhat.com> PR fortran/90329 * tree-core.h (struct tree_decl_common): Document decl_nonshareable_flag for PARM_DECLs. * tree.h (DECL_HIDDEN_STRING_LENGTH): Define. * calls.c (expand_call): Don't try tail call if caller has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be passed on the stack and callee needs to pass any arguments on the stack. * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use else if instead of series of mutually exclusive ifs. Handle DECL_HIDDEN_STRING_LENGTH for PARM_DECLs. * tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise. * lang.opt (fbroken-callers): Remove. (ftail-call-workaround, ftail-call-workaround=): New options. * gfortran.h (struct gfc_namespace): Add implicit_interface_calls. * interface.c (gfc_procedure_use): Set implicit_interface_calls for calls to implicit interface procedures. * trans-decl.c (create_function_arglist): Use flag_tail_call_workaround instead of flag_broken_callers. If it is not 2, also require sym->ns->implicit_interface_calls. * invoke.texi (fbroken-callers): Remove documentation. (ftail-call-workaround, ftail-call-workaround=): Document. 2019-05-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/90329 * invoke.texi: Document -fbroken-callers. * lang.opt: Add -fbroken-callers. * trans-decl.c (create_function_arglist): Only set DECL_HIDDEN_STRING_LENGTH if flag_broken_callers is set. 2019-05-16 Jakub Jelinek <jakub@redhat.com> PR fortran/90329 * trans-decl.c (create_function_arglist): Set DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if len is constant. Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/calls.c branches/gcc-7-branch/gcc/fortran/ChangeLog branches/gcc-7-branch/gcc/fortran/gfortran.h branches/gcc-7-branch/gcc/fortran/interface.c branches/gcc-7-branch/gcc/fortran/invoke.texi branches/gcc-7-branch/gcc/fortran/lang.opt branches/gcc-7-branch/gcc/fortran/trans-decl.c branches/gcc-7-branch/gcc/lto-streamer.h branches/gcc-7-branch/gcc/tree-core.h branches/gcc-7-branch/gcc/tree-streamer-in.c branches/gcc-7-branch/gcc/tree-streamer-out.c branches/gcc-7-branch/gcc/tree.h
Backported to 7.x, please test it.
(In reply to Jakub Jelinek from comment #53) > Backported to 7.x, please test it. Thanks! I tested the default behavior with DPOSV and reference LAPACK, where it worked fine. Also with all of CRAN+BIOC R packages - there is one newly failing package with 275369 compared to gfortran 7 from Ubuntu 7.4.0-8ubuntu1 with -fno-optimize-sibling-calls, but I don't think it is related to the change.
It does not happen on every run. My MWE: ``` http://www-look-4.com/tech/honor-magicbook/ // test.cpp #include<iostream> int main() https://komiya-dental.com/property/google-android/ { int a{ 4 }; std::cout << "a = " << a << '\n'; http://www.iu-bloomington.com/property/properties-in-turkey/ return 0; } ``` Simply running the program under GDB yields: ``` https://waytowhatsnext.com/property/disney-at-home/ (gdb) r Starting program: test-gdb/a.out [New Thread 0x1a03 of process 24826] [New Thread 0x1b03 of process 24826] http://www.wearelondonmade.com/technology/van-technology/ warning: unhandled dyld version (17) a = 4 [Inferior 1 (process 24826) exited normally] ``` http://www.jopspeech.com/technology/thunderbolt-4/ However, setting breakpoints triggers it: ``` (gdb) b main http://joerg.li/technology/b-class-cars/ Breakpoint 1 at 0x100003e57: file test.cpp, line 5. (gdb) r Starting program: test-gdb/a.out [New Thread 0x2403 of process 24836] http://connstr.net/technology/nasa-latest/ [New Thread 0x2203 of process 24836] warning: unhandled dyld version (17) http://embermanchester.uk/tech/google-drive/ Thread 2 hit Breakpoint 1, main () at test.cpp:5 5 int a{ 4 }; (gdb) r http://www.slipstone.co.uk/technology/cars-interior/ The program being debugged has been started already. Start it from the beginning? (y or n) n Program not restarted. http://www.logoarts.co.uk/technology/robot-vacuums/ (gdb) q A debugging session is active. Inferior 1 [process 24836] will be killed. http://www.acpirateradio.co.uk/technology/global-warming/ Quit anyway? (y or n) y ../../gdb/target.c:2149: internal-error: void target_mourn_inferior(ptid_t): Assertion `ptid == inferior_ptid' failed. http://www.compilatori.com/technology/download-videos/ A problem internal to GDB has been detected, further debugging may prove unreliable. https://www.webb-dev.co.uk/services/vaccine-services/ Quit this debugging session? (y or n) y