Bug 53796 - I/O INQUIRE of RECL: If not specified in OPEN, the default value should be returned (sequential access)
Summary: I/O INQUIRE of RECL: If not specified in OPEN, the default value should be ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran (show other bugs)
Version: 4.8.0
: P3 normal
Target Milestone: ---
Assignee: Janne Blomqvist
URL: https://gcc.gnu.org/ml/gcc-patches/20...
Keywords:
: 53799 (view as bug list)
Depends on: 44292
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-28 10:26 UTC by Tobias Burnus
Modified: 2019-08-07 07:34 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-10-20 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Burnus 2012-06-28 10:26:19 UTC
As reported at http://www.rhinocerus.net/forum/lang-fortran/708936-inquire-get-maximum-record-length-bug-gfortran.html


One has to distinguish three cases:
- Direct access: The RECL= has to be given in OPEN
- Stream access: No RECL= is allowed in OPEN
- Sequential access: RECL= can be given in OPEN, otherwise the default is used.

gfortran handles the case of RECL= given in OPEN correctly by returning that value and it handles stream access by using -1. (As the argument becomes undefined, any value would do.)

Additionally, gfortran always returns -1 for sequential access instead of returning the default value.


The question is only how to handle sequential access. We have two questions:

a) gfortran handles (on most systems) large files, i.e. the default value would be larger than huge(0). Additionally, (cf. PR 44292) gfortran does not handle INT64  variables for RECL. Thus, which value should be returned? HUGE(kind(recl_var))?

b) What should happen if the user exceeds the RECL= specified in OPEN. (Possibly, that's handled correctly, but I want to add it to the to-do list for checking.) - That is also related to the question of (a): Handling INQUIRE RECL=int32_var.


From the Fortran 2008 standard at "9.10.2.26 RECL= specifier in the INQUIRE statement":

"The scalar-int-variable in the RECL= specifier is assigned the value of the record length of a connection for direct access, or the value of the maximum record length of a connection for sequential access. If the connection is for
formatted input/output, the length is the number of characters for all records that contain only characters of default kind. If the connection is for unformatted input/output, the length is measured in file storage units. If there is no connection, or if the connection is for stream access, the scalar int-variable becomes undefined."


For OPEN, the following applies:

"The value of the RECL= specifier shall be positive. It specifies the length of each record in a file being connected for direct access, or specifies the maximum length of a record in a file being connected for sequential access. This
specifier shall not appear when a file is being connected for stream access. This specifier shall appear when a file is being connected for direct access. If this specifier is omitted when a file is being connected for sequential
access, the default value is processor dependent. If the file is being connected for formatted input/output, the length is the number of characters for all records that contain only characters of default kind. When a record
contains any nondefault characters, the effect of the RECL= specifier is processor dependent. If the file is being connected for unformatted input/output, the length is measured in file storage units. For an existing file, the value of the RECL= specifier shall be included in the set of allowed record lengths for the le. For a new file, the processor creates the file with a set of allowed record lengths that includes the specified value."
Comment 1 Tobias Burnus 2012-06-28 15:42:32 UTC
For comparison, for formatted and unformatted sequential, I get the following result with different compilers:

NAG f95: 1024  / unformatted: 65536
Crayftn: 1024 / -1 (with int64 variable: 9223372036854775807)
g95:     1000000000 / 1000000000
ifort     132 / 510
PGI      -14920 (with int64 variable: 140737488340408) / as formatted
gfortran -1 / -1

When exceeding an explicitly given RECL=:
NAG f95:  Buffer overflow on output
g95:      Writing more data than the record size (RECL)
gfortran: End of record
ifort:    no error - but starts new record after recl is exceeded*
crayftn:  as ifort
pgi:      no error - but a single record*
* For formatted; writing a longer string to unformatted will yield a single record.

 * * * 

gfortran has:

./libgfortran.h:#define DEFAULT_RECL 1073741824

which gets used with

./runtime/environ.c:  {"GFORTRAN_DEFAULT_RECL", DEFAULT_RECL, &options.default_recl,

and in io/unit.c's init_units for stdin, stdout and stderr. Others get set via io/open.c's new_unit: u->recl = opp->recl_in;


And in transfer, one finds code like:

          /* For preconnected units with default record length, set bytes left
             to unit record length and proceed, otherwise error.  */
           ...
                      && dtp->u.p.current_unit->recl == DEFAULT_RECL))
            dtp->u.p.current_unit->bytes_left = dtp->u.p.current_unit->recl;
Comment 2 Clive Page 2012-06-28 18:21:37 UTC
*** Bug 53799 has been marked as a duplicate of this bug. ***
Comment 3 Jerry DeLisle 2012-06-29 09:55:43 UTC
IIRC recl is an unsigned integer?  I will look further on this one.
Comment 4 Clive Page 2012-06-29 10:00:02 UTC
(In reply to comment #3)
> IIRC recl is an unsigned integer?  I will look further on this one.

But Fortran doesn't have unsigned integers.  If the intention is to indicate no practical limit, then I guess the value to return is HUGE(0).
Comment 5 Tobias Burnus 2012-06-29 10:07:10 UTC
(In reply to comment #4)
> But Fortran doesn't have unsigned integers.  If the intention is to indicate no
> practical limit, then I guess the value to return is HUGE(0).

And we should fill an interpretation request regarding this issue - as suggested by Richard Maine.

Regarding the limit: I have to admit that I do not understand how GFORTRAN_DEFAULT_RECL is currently handled. Will it be the true upper bound, will it be extended etc.?

There is a lot of special code, but that value seems to get only set for STDIN/STDOUT/STDERR. Additionally, using a special value from the allowed range askes (in principle) for trouble. Using a negative value internally for special values might make more sense - though it might require additional conditional checks (-> slower, potential sources for bugs).
Comment 6 Jerry DeLisle 2012-06-29 10:13:13 UTC
Yes, I am talking about our internal representation.  I have to go look at the
code detail yet, I am just thinking out loud here on the bug report. It is one
possibility of what we are doing wrong if we set it to the max unsigned value.

There are still other possibilities of whats wrong here.  Give me a day or two.
;)

I believe it gets set in open.c, or it should.
Comment 7 Tobias Burnus 2012-06-29 13:29:36 UTC
I have now ask at http://j3-fortran.org/pipermail/j3/2012-June/005446.html
Comment 8 Tobias Burnus 2012-06-29 15:04:47 UTC
Steve Lionel points at the following:

"If an error condition occurs during execution of an INQUIRE statement, all of the inquiry specier variables become undened, except for variables in the IOSTAT= and IOMSG= speciers (if any)."

Thus, if the RECL of the connection is larger than huge(kind(recl_var)), one has to raise an error.

The question is only how the default should be handled. As written, I do not really understand the current algorithm. Namely, whether we effectively implement huge(0_int64) by extending the chunks. Or whether we apply a lower boundary by default.

Additionally, we have the problem that the internal IO descriptor is only of kind int32 (cf. PR 44292).
Comment 9 Jerry DeLisle 2012-06-29 23:43:23 UTC
We have more then one thing to fix here.

Try this variation:

integer(kind=8) :: s, r
open(unit=1, file='testsize.f90', status='old', recl=500)
inquire(unit=1, size=s, recl=r)
print *, 'size=', s, ' recl=', r
end

I will continue dabbling with this while we wait for the Comment #7 response.
Comment 10 Jerry DeLisle 2012-06-30 00:32:29 UTC
For completeness, in the case I give in Comment #9, I get

Operating system error: Cannot allocate memory
Memory allocation failed

I have instrumented a few places to see what we are getting with the original test case included here:

integer(kind=8) :: s, r
open(unit=1, file='testsize.f90', status='old')
inquire(unit=1, size=s, recl=r)
print *, 'size=', s, ' recl=', r
end

And I get:

$ ./a.out 
In init_units()
max_offset set to 4294967295
In newunit(): 4294967295
Ping! recl=4294967295
 size=                  135  recl=                   -1

This confirms that we are setting the unit recl to a large number which is printed as -1 when using signed output, but unsigned is what we might expect.

max_offset is calculated at run time as follows:

  /* Calculate the maximum file offset in a portable manner.
     max will be the largest signed number for the type gfc_offset.
     set a 1 in the LSB and keep a running sum, stopping at MSB-1 bit.  */
  max_offset = 0;
  for (i = 0; i < sizeof (max_offset) * 8 - 1; i++)
    max_offset = max_offset + ((gfc_offset) 1 << i);
  printf("max_offset set to %u\n", max_offset);

I think we can decide what to do differently after we get Comment #7 feedback.

In the meantime, I will be looking at what we are doing wrong in the case of Comment #9, maybe a separate bug.
Comment 11 Jerry DeLisle 2012-06-30 01:35:12 UTC
Maybe a new PR for this is in order.

gdb output with test case in Comment #9

(gdb) 
634	  if (flags->form == FORM_FORMATTED)
(gdb) 
636	      if ((opp->common.flags & IOPARM_OPEN_HAS_RECL_IN))
(gdb) 
637	        fbuf_init (u, u->recl);
(gdb) 
_gfortrani_fbuf_init (u=u@entry=0x6eea60, len=-1)
    at ../../../trunk/libgfortran/io/fbuf.c:38
38	{
(gdb) 
39	  if (len == 0)
(gdb) 
38	{
(gdb) 
39	  if (len == 0)
(gdb) 
42	  u->fbuf = xmalloc (sizeof (struct fbuf));
(gdb) 
_gfortrani_xmalloc (n=n@entry=24)
    at ../../../trunk/libgfortran/runtime/memory.c:33
33	{
(gdb) 

I need to peak at u-recl to see what it is here.
37	    n = 1;
(gdb) 
39	  p = malloc (n);
(gdb) 
41	  if (p == NULL)
(gdb) 
45	}
(gdb) 
_gfortrani_fbuf_init (u=u@entry=0x6eea60, len=-1)
    at ../../../trunk/libgfortran/io/fbuf.c:43
43	  u->fbuf->buf = xmalloc (len);
(gdb) 
42	  u->fbuf = xmalloc (sizeof (struct fbuf));
(gdb) 
43	  u->fbuf->buf = xmalloc (len);
(gdb) 
_gfortrani_xmalloc (n=n@entry=18446744073709551615)
    at ../../../trunk/libgfortran/runtime/memory.c:33
33	{
(gdb) 
37	    n = 1;
(gdb) 
39	  p = malloc (n);
(gdb) 
41	  if (p == NULL)
(gdb) 
42	    os_error ("Memory allocation failed");
(gdb) 
_gfortrani_os_error (message=message@entry=0x4b6a6d "Memory allocation failed")
    at ../../../trunk/libgfortran/runtime/error.c:294
294	{
(gdb)
Comment 12 Clive Page 2012-06-30 08:40:34 UTC
Jerry

> Try this variation:
>
> integer(kind=8) :: s, r
> open(unit=1, file='testsize.f90', status='old', recl=500)
> inquire(unit=1, size=s, recl=r)
> print *, 'size=', s, ' recl=', r
> end

Yes, that works as expected, giving r=500.   The case that doesn't work 
as expected is when you don't use RECL in the OPEN statement, and use 
inquire to find the default size limit.

I assume you are right that -1 represents the maximum unsigned integer - 
just a pity that Fortran interprets it as -1 not the largest +ve value.

Regards
Comment 13 Janne Blomqvist 2012-06-30 20:09:46 UTC
Considering that we IMHO want to support unformatted sequential records > 2 GB on 64-bit targets without the user having to explicitly specify RECL= when opening, I'd suggest making DEFAULT_RECL match huge(0_C_SIZE_T).

The downside is then that if the user inquires for RECL on a 64-bit target where the RECL= variable is a 32-bit integer, we'll have to generate an error, per Steve Lionel's comments. 

Wrt. the signed vs. unsigned issue, POSIX specifies that off_t must be a signed type, so all file offset calculations (including record size stuff) should be done in signed arithmetic. Now, gfc_offset is a typedef for off_t (or the equivalent type on mingw), so AFAICS we should be ok.
Comment 14 Dominique d'Humieres 2015-10-20 17:14:35 UTC
What is the status of this PR (the last entry is more than three years old)?
Comment 15 Janne Blomqvist 2015-10-20 17:58:25 UTC
AFAICT it remains unfixed. GFortran development resources being what they are, not all bugs can be promptly fixed. 

But what's the rush to close old unfixed bugs anyway? "Number of open bugs" is a ridiculous metric for software quality,  especially if one then closes unfixed bugs in some kind of effort to boost said metric.
Comment 16 Dominique d'Humieres 2015-10-20 18:57:32 UTC
> AFAICT it remains unfixed. GFortran development resources being what they are,
> not all bugs can be promptly fixed. 

> But what's the rush to close old unfixed bugs anyway? "Number of open bugs"
> is a ridiculous metric for software quality,  especially if one then closes
> unfixed bugs in some kind of effort to boost said metric.

Well, I just moved the PR from UNCONFIRMED to WAITING. I am fine about moving it from WAITING to NEW.

However, I don't see the point to keep open PRs about dubious enhancement that will never be considered for the next decades.
Comment 17 toK 2016-11-14 21:22:11 UTC
Still exists in 6.2.0 . If anyone can believe it, I actually used this feature to work around an ifort bug. Talk about the catch 22. Good to know that recl in open for sequential files still work.
Comment 18 Janne Blomqvist 2017-11-20 20:10:23 UTC
Patch: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01807.html
Comment 19 Janne Blomqvist 2017-11-28 19:29:22 UTC
Author: jb
Date: Tue Nov 28 19:28:50 2017
New Revision: 255215

URL: https://gcc.gnu.org/viewcvs?rev=255215&root=gcc&view=rev
Log:
PR 53796 Improve INQUIRE(RECL=...) handling

The current F2018 draft (N2137) specifies behavior of the RECL=
specifier in the INQUIRE statement, where it previously was left as
undefined. Namely:

- If the unit is not connected, RECL= should be given the value -1.
- If the unit is connected with stream access, RECL= should be given
  the value -2.

Further, as PR 53796 describes, the handling of RECL= is poor in other
ways as well. When the recl is set to the maximum possible
(GFC_INTEGER_8_HUGE / LLONG_MAX), which it does by default except for
preconnected units, and when INQUIRE(RECL=) is used with a 4 byte
integer, the value is truncated and the 4 byte value is thus
-1. Fixing this to generate an error is a lot of work, as currently
the truncation is done by the frontend, the library sees only an 8
byte value with no indication that the frontend is going to copy it to
a 4 byte one. Instead, this patch does a bit twiddling trick such that
the truncated 4 byte value is GFC_INTEGER_4_HUGE while still being
0.99999999 * GFC_INTEGER_8_HUGE which is large enough for all
practical purposes.

Finally, the patch removes GFORTRAN_DEFAULT_RECL which was used only
for preconnected units, and instead uses the same approach as describe
above.

Regtested on x86_64-pc-linux-gnu, Ok for trunk.

gcc/fortran/ChangeLog:

2017-11-28  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/53796
	* gfortran.texi: Remove mentions of GFORTRAN_DEFAULT_RECL.

libgfortran/ChangeLog:

2017-11-28  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/53796
	* io/inquire.c (inquire_via_unit): Set recl to -1 for unconnected
	units.
	* io/io.h (default_recl): New variable.
	* io/open.c (new_unit): Set recl to default_recl for sequential,
	-2 for stream access.
	* io/transfer.c (read_block_form): Test against default_recl
	instead of DEFAULT_RECL.
	(write_block): Likewise.
	* io/unit.c (init_units): Calculate max_offset, default_recl.
	* libgfortran.h (DEFAULT_RECL): Remove.
	* runtime/environ.c: Remove GFORTRAN_DEFAULT_RECL.

gcc/testsuite/ChangeLog:

2017-11-28  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/53796
	* gfortran.dg/inquire_recl_f2018.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/gfortran.texi
    trunk/gcc/testsuite/ChangeLog
    trunk/libgfortran/io/inquire.c
    trunk/libgfortran/io/io.h
    trunk/libgfortran/io/open.c
    trunk/libgfortran/io/transfer.c
    trunk/libgfortran/io/unit.c
    trunk/libgfortran/libgfortran.h
    trunk/libgfortran/runtime/environ.c
Comment 20 Thomas Koenig 2017-12-09 12:30:54 UTC
Is this fixed now?
Comment 21 Janne Blomqvist 2017-12-09 12:41:03 UTC
Yes, I had forgot to close it.
Comment 22 Janne Blomqvist 2018-09-07 19:00:27 UTC
Author: jb
Date: Fri Sep  7 18:59:50 2018
New Revision: 264163

URL: https://gcc.gnu.org/viewcvs?rev=264163&root=gcc&view=rev
Log:
Remove unused init_unsigned_integer function.

As pointed out by Bernhard Reutner-Fischer, this function is unused
since the fix for PR 53796 in November 2017.

2018-09-07  Janne Blomqvist  <jb@gcc.gnu.org>

        * runtime/environ.c (init_unsigned_integer): Remove.

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/runtime/environ.c
Comment 23 Janne Blomqvist 2019-08-07 07:34:44 UTC
Author: jb
Date: Wed Aug  7 07:34:10 2019
New Revision: 274160

URL: https://gcc.gnu.org/viewcvs?rev=274160&root=gcc&view=rev
Log:
PR 53796 Make inquire(file=, recl=) conform to F2018

In my original patch to fix PR 53796 I forgot to fix the behavior for
unconnected units when inquiring via filename. This patch fixes that.

Regtested on x86_64-pc-linux-gnu, committed as obvious.

libgfortran/ChangeLog:

2019-08-07  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/53796
	* io/inquire.c (inquire_via_filename): Set recl to -1 for
	unconnected units.

gcc/testsuite/ChangeLog:

2019-08-07  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/53796
	* gfortran.dg/inquire_recl_f2018.f90: Test for unconnected unit
	with inquire via filename.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gfortran.dg/inquire_recl_f2018.f90
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/inquire.c