As reported at http://www.rhinocerus.net/forum/lang-fortran/708936-inquire-get-maximum-record-length-bug-gfortran.html One has to distinguish three cases: - Direct access: The RECL= has to be given in OPEN - Stream access: No RECL= is allowed in OPEN - Sequential access: RECL= can be given in OPEN, otherwise the default is used. gfortran handles the case of RECL= given in OPEN correctly by returning that value and it handles stream access by using -1. (As the argument becomes undefined, any value would do.) Additionally, gfortran always returns -1 for sequential access instead of returning the default value. The question is only how to handle sequential access. We have two questions: a) gfortran handles (on most systems) large files, i.e. the default value would be larger than huge(0). Additionally, (cf. PR 44292) gfortran does not handle INT64 variables for RECL. Thus, which value should be returned? HUGE(kind(recl_var))? b) What should happen if the user exceeds the RECL= specified in OPEN. (Possibly, that's handled correctly, but I want to add it to the to-do list for checking.) - That is also related to the question of (a): Handling INQUIRE RECL=int32_var. From the Fortran 2008 standard at "9.10.2.26 RECL= specifier in the INQUIRE statement": "The scalar-int-variable in the RECL= specifier is assigned the value of the record length of a connection for direct access, or the value of the maximum record length of a connection for sequential access. If the connection is for formatted input/output, the length is the number of characters for all records that contain only characters of default kind. If the connection is for unformatted input/output, the length is measured in file storage units. If there is no connection, or if the connection is for stream access, the scalar int-variable becomes undefined." For OPEN, the following applies: "The value of the RECL= specifier shall be positive. It specifies the length of each record in a file being connected for direct access, or specifies the maximum length of a record in a file being connected for sequential access. This specifier shall not appear when a file is being connected for stream access. This specifier shall appear when a file is being connected for direct access. If this specifier is omitted when a file is being connected for sequential access, the default value is processor dependent. If the file is being connected for formatted input/output, the length is the number of characters for all records that contain only characters of default kind. When a record contains any nondefault characters, the effect of the RECL= specifier is processor dependent. If the file is being connected for unformatted input/output, the length is measured in file storage units. For an existing file, the value of the RECL= specifier shall be included in the set of allowed record lengths for the le. For a new file, the processor creates the file with a set of allowed record lengths that includes the specified value."
For comparison, for formatted and unformatted sequential, I get the following result with different compilers: NAG f95: 1024 / unformatted: 65536 Crayftn: 1024 / -1 (with int64 variable: 9223372036854775807) g95: 1000000000 / 1000000000 ifort 132 / 510 PGI -14920 (with int64 variable: 140737488340408) / as formatted gfortran -1 / -1 When exceeding an explicitly given RECL=: NAG f95: Buffer overflow on output g95: Writing more data than the record size (RECL) gfortran: End of record ifort: no error - but starts new record after recl is exceeded* crayftn: as ifort pgi: no error - but a single record* * For formatted; writing a longer string to unformatted will yield a single record. * * * gfortran has: ./libgfortran.h:#define DEFAULT_RECL 1073741824 which gets used with ./runtime/environ.c: {"GFORTRAN_DEFAULT_RECL", DEFAULT_RECL, &options.default_recl, and in io/unit.c's init_units for stdin, stdout and stderr. Others get set via io/open.c's new_unit: u->recl = opp->recl_in; And in transfer, one finds code like: /* For preconnected units with default record length, set bytes left to unit record length and proceed, otherwise error. */ ... && dtp->u.p.current_unit->recl == DEFAULT_RECL)) dtp->u.p.current_unit->bytes_left = dtp->u.p.current_unit->recl;
*** Bug 53799 has been marked as a duplicate of this bug. ***
IIRC recl is an unsigned integer? I will look further on this one.
(In reply to comment #3) > IIRC recl is an unsigned integer? I will look further on this one. But Fortran doesn't have unsigned integers. If the intention is to indicate no practical limit, then I guess the value to return is HUGE(0).
(In reply to comment #4) > But Fortran doesn't have unsigned integers. If the intention is to indicate no > practical limit, then I guess the value to return is HUGE(0). And we should fill an interpretation request regarding this issue - as suggested by Richard Maine. Regarding the limit: I have to admit that I do not understand how GFORTRAN_DEFAULT_RECL is currently handled. Will it be the true upper bound, will it be extended etc.? There is a lot of special code, but that value seems to get only set for STDIN/STDOUT/STDERR. Additionally, using a special value from the allowed range askes (in principle) for trouble. Using a negative value internally for special values might make more sense - though it might require additional conditional checks (-> slower, potential sources for bugs).
Yes, I am talking about our internal representation. I have to go look at the code detail yet, I am just thinking out loud here on the bug report. It is one possibility of what we are doing wrong if we set it to the max unsigned value. There are still other possibilities of whats wrong here. Give me a day or two. ;) I believe it gets set in open.c, or it should.
I have now ask at http://j3-fortran.org/pipermail/j3/2012-June/005446.html
Steve Lionel points at the following: "If an error condition occurs during execution of an INQUIRE statement, all of the inquiry specier variables become undened, except for variables in the IOSTAT= and IOMSG= speciers (if any)." Thus, if the RECL of the connection is larger than huge(kind(recl_var)), one has to raise an error. The question is only how the default should be handled. As written, I do not really understand the current algorithm. Namely, whether we effectively implement huge(0_int64) by extending the chunks. Or whether we apply a lower boundary by default. Additionally, we have the problem that the internal IO descriptor is only of kind int32 (cf. PR 44292).
We have more then one thing to fix here. Try this variation: integer(kind=8) :: s, r open(unit=1, file='testsize.f90', status='old', recl=500) inquire(unit=1, size=s, recl=r) print *, 'size=', s, ' recl=', r end I will continue dabbling with this while we wait for the Comment #7 response.
For completeness, in the case I give in Comment #9, I get Operating system error: Cannot allocate memory Memory allocation failed I have instrumented a few places to see what we are getting with the original test case included here: integer(kind=8) :: s, r open(unit=1, file='testsize.f90', status='old') inquire(unit=1, size=s, recl=r) print *, 'size=', s, ' recl=', r end And I get: $ ./a.out In init_units() max_offset set to 4294967295 In newunit(): 4294967295 Ping! recl=4294967295 size= 135 recl= -1 This confirms that we are setting the unit recl to a large number which is printed as -1 when using signed output, but unsigned is what we might expect. max_offset is calculated at run time as follows: /* Calculate the maximum file offset in a portable manner. max will be the largest signed number for the type gfc_offset. set a 1 in the LSB and keep a running sum, stopping at MSB-1 bit. */ max_offset = 0; for (i = 0; i < sizeof (max_offset) * 8 - 1; i++) max_offset = max_offset + ((gfc_offset) 1 << i); printf("max_offset set to %u\n", max_offset); I think we can decide what to do differently after we get Comment #7 feedback. In the meantime, I will be looking at what we are doing wrong in the case of Comment #9, maybe a separate bug.
Maybe a new PR for this is in order. gdb output with test case in Comment #9 (gdb) 634 if (flags->form == FORM_FORMATTED) (gdb) 636 if ((opp->common.flags & IOPARM_OPEN_HAS_RECL_IN)) (gdb) 637 fbuf_init (u, u->recl); (gdb) _gfortrani_fbuf_init (u=u@entry=0x6eea60, len=-1) at ../../../trunk/libgfortran/io/fbuf.c:38 38 { (gdb) 39 if (len == 0) (gdb) 38 { (gdb) 39 if (len == 0) (gdb) 42 u->fbuf = xmalloc (sizeof (struct fbuf)); (gdb) _gfortrani_xmalloc (n=n@entry=24) at ../../../trunk/libgfortran/runtime/memory.c:33 33 { (gdb) I need to peak at u-recl to see what it is here. 37 n = 1; (gdb) 39 p = malloc (n); (gdb) 41 if (p == NULL) (gdb) 45 } (gdb) _gfortrani_fbuf_init (u=u@entry=0x6eea60, len=-1) at ../../../trunk/libgfortran/io/fbuf.c:43 43 u->fbuf->buf = xmalloc (len); (gdb) 42 u->fbuf = xmalloc (sizeof (struct fbuf)); (gdb) 43 u->fbuf->buf = xmalloc (len); (gdb) _gfortrani_xmalloc (n=n@entry=18446744073709551615) at ../../../trunk/libgfortran/runtime/memory.c:33 33 { (gdb) 37 n = 1; (gdb) 39 p = malloc (n); (gdb) 41 if (p == NULL) (gdb) 42 os_error ("Memory allocation failed"); (gdb) _gfortrani_os_error (message=message@entry=0x4b6a6d "Memory allocation failed") at ../../../trunk/libgfortran/runtime/error.c:294 294 { (gdb)
Jerry > Try this variation: > > integer(kind=8) :: s, r > open(unit=1, file='testsize.f90', status='old', recl=500) > inquire(unit=1, size=s, recl=r) > print *, 'size=', s, ' recl=', r > end Yes, that works as expected, giving r=500. The case that doesn't work as expected is when you don't use RECL in the OPEN statement, and use inquire to find the default size limit. I assume you are right that -1 represents the maximum unsigned integer - just a pity that Fortran interprets it as -1 not the largest +ve value. Regards
Considering that we IMHO want to support unformatted sequential records > 2 GB on 64-bit targets without the user having to explicitly specify RECL= when opening, I'd suggest making DEFAULT_RECL match huge(0_C_SIZE_T). The downside is then that if the user inquires for RECL on a 64-bit target where the RECL= variable is a 32-bit integer, we'll have to generate an error, per Steve Lionel's comments. Wrt. the signed vs. unsigned issue, POSIX specifies that off_t must be a signed type, so all file offset calculations (including record size stuff) should be done in signed arithmetic. Now, gfc_offset is a typedef for off_t (or the equivalent type on mingw), so AFAICS we should be ok.
What is the status of this PR (the last entry is more than three years old)?
AFAICT it remains unfixed. GFortran development resources being what they are, not all bugs can be promptly fixed. But what's the rush to close old unfixed bugs anyway? "Number of open bugs" is a ridiculous metric for software quality, especially if one then closes unfixed bugs in some kind of effort to boost said metric.
> AFAICT it remains unfixed. GFortran development resources being what they are, > not all bugs can be promptly fixed. > But what's the rush to close old unfixed bugs anyway? "Number of open bugs" > is a ridiculous metric for software quality, especially if one then closes > unfixed bugs in some kind of effort to boost said metric. Well, I just moved the PR from UNCONFIRMED to WAITING. I am fine about moving it from WAITING to NEW. However, I don't see the point to keep open PRs about dubious enhancement that will never be considered for the next decades.
Still exists in 6.2.0 . If anyone can believe it, I actually used this feature to work around an ifort bug. Talk about the catch 22. Good to know that recl in open for sequential files still work.
Patch: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01807.html
Author: jb Date: Tue Nov 28 19:28:50 2017 New Revision: 255215 URL: https://gcc.gnu.org/viewcvs?rev=255215&root=gcc&view=rev Log: PR 53796 Improve INQUIRE(RECL=...) handling The current F2018 draft (N2137) specifies behavior of the RECL= specifier in the INQUIRE statement, where it previously was left as undefined. Namely: - If the unit is not connected, RECL= should be given the value -1. - If the unit is connected with stream access, RECL= should be given the value -2. Further, as PR 53796 describes, the handling of RECL= is poor in other ways as well. When the recl is set to the maximum possible (GFC_INTEGER_8_HUGE / LLONG_MAX), which it does by default except for preconnected units, and when INQUIRE(RECL=) is used with a 4 byte integer, the value is truncated and the 4 byte value is thus -1. Fixing this to generate an error is a lot of work, as currently the truncation is done by the frontend, the library sees only an 8 byte value with no indication that the frontend is going to copy it to a 4 byte one. Instead, this patch does a bit twiddling trick such that the truncated 4 byte value is GFC_INTEGER_4_HUGE while still being 0.99999999 * GFC_INTEGER_8_HUGE which is large enough for all practical purposes. Finally, the patch removes GFORTRAN_DEFAULT_RECL which was used only for preconnected units, and instead uses the same approach as describe above. Regtested on x86_64-pc-linux-gnu, Ok for trunk. gcc/fortran/ChangeLog: 2017-11-28 Janne Blomqvist <jb@gcc.gnu.org> PR fortran/53796 * gfortran.texi: Remove mentions of GFORTRAN_DEFAULT_RECL. libgfortran/ChangeLog: 2017-11-28 Janne Blomqvist <jb@gcc.gnu.org> PR fortran/53796 * io/inquire.c (inquire_via_unit): Set recl to -1 for unconnected units. * io/io.h (default_recl): New variable. * io/open.c (new_unit): Set recl to default_recl for sequential, -2 for stream access. * io/transfer.c (read_block_form): Test against default_recl instead of DEFAULT_RECL. (write_block): Likewise. * io/unit.c (init_units): Calculate max_offset, default_recl. * libgfortran.h (DEFAULT_RECL): Remove. * runtime/environ.c: Remove GFORTRAN_DEFAULT_RECL. gcc/testsuite/ChangeLog: 2017-11-28 Janne Blomqvist <jb@gcc.gnu.org> PR fortran/53796 * gfortran.dg/inquire_recl_f2018.f90: New test. Added: trunk/gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90 Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/gfortran.texi trunk/gcc/testsuite/ChangeLog trunk/libgfortran/io/inquire.c trunk/libgfortran/io/io.h trunk/libgfortran/io/open.c trunk/libgfortran/io/transfer.c trunk/libgfortran/io/unit.c trunk/libgfortran/libgfortran.h trunk/libgfortran/runtime/environ.c
Is this fixed now?
Yes, I had forgot to close it.
Author: jb Date: Fri Sep 7 18:59:50 2018 New Revision: 264163 URL: https://gcc.gnu.org/viewcvs?rev=264163&root=gcc&view=rev Log: Remove unused init_unsigned_integer function. As pointed out by Bernhard Reutner-Fischer, this function is unused since the fix for PR 53796 in November 2017. 2018-09-07 Janne Blomqvist <jb@gcc.gnu.org> * runtime/environ.c (init_unsigned_integer): Remove. Modified: trunk/libgfortran/ChangeLog trunk/libgfortran/runtime/environ.c
Author: jb Date: Wed Aug 7 07:34:10 2019 New Revision: 274160 URL: https://gcc.gnu.org/viewcvs?rev=274160&root=gcc&view=rev Log: PR 53796 Make inquire(file=, recl=) conform to F2018 In my original patch to fix PR 53796 I forgot to fix the behavior for unconnected units when inquiring via filename. This patch fixes that. Regtested on x86_64-pc-linux-gnu, committed as obvious. libgfortran/ChangeLog: 2019-08-07 Janne Blomqvist <jb@gcc.gnu.org> PR fortran/53796 * io/inquire.c (inquire_via_filename): Set recl to -1 for unconnected units. gcc/testsuite/ChangeLog: 2019-08-07 Janne Blomqvist <jb@gcc.gnu.org> PR fortran/53796 * gfortran.dg/inquire_recl_f2018.f90: Test for unconnected unit with inquire via filename. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90 trunk/libgfortran/ChangeLog trunk/libgfortran/io/inquire.c