Consider the testcase:

program largewr
  integer(kind=1) :: a(2_8**31+1)
  a = 0
  a(size(a, kind=8)) = 1
  open(10, file="largewr.dat", access="stream", form="unformatted")
  write (10) a
  close(10)
  a(size(a, kind=8)) = 2
  open(10, file="largewr.dat", access="stream", form="unformatted")
  read (10) a
  if (a(size(a, kind=8)) == 1) then
    print *, "All is well"
  else
    print *, "Oh no"
  end if
end program largewr

This fails on x86_64-pc-linux-gnu with

./a.out
At line 10 of file largewr.f90 (unit = 10, file = 'largewr.dat')
Fortran runtime error: End of file
[snip]

The reason is that Linux never reads or writes more than 2,147,479,552 bytes in a single syscall. For write this is handled correctly, since libgfortran does the writing in a loop. But for read we cannot use such a loop, because it would hang when reading from the terminal, where short reads are perfectly OK.
I forgot a reference; see http://man7.org/linux/man-pages/man2/read.2.html

https://stackoverflow.com/questions/38586144/error-when-trying-to-write-a-file-larger-than-2-gb-on-linux suggests that macOS isn't able to handle writes >= 2 GB either.
Author: jb
Date: Tue Jan 2 13:25:10 2018
New Revision: 256074

URL: https://gcc.gnu.org/viewcvs?rev=256074&root=gcc&view=rev
Log:
PR libgfortran/83649 Chunk large reads and writes

It turns out that Linux never reads or writes more than 2147479552
bytes in a single syscall. For writes this is not a problem as
libgfortran already contains a loop around write() to handle short
writes. But for reads we cannot do this, since then read will hang if
we have a short read when reading from the terminal. Also, there are
reports that macOS fails I/O's larger than 2 GB. Thus, to work around
these issues do large reads/writes in chunks.

The testcase from the PR

program largewr
  integer(kind=1) :: a(2_8**31+1)
  a = 0
  a(size(a, kind=8)) = 1
  open(10, file="largewr.dat", access="stream", form="unformatted")
  write (10) a
  close(10)
  a(size(a, kind=8)) = 2
  open(10, file="largewr.dat", access="stream", form="unformatted")
  read (10) a
  if (a(size(a, kind=8)) == 1) then
    print *, "All is well"
  else
    print *, "Oh no"
  end if
end program largewr

fails on trunk but works with the patch.

Regtested on x86_64-pc-linux-gnu, committed to trunk.

libgfortran/ChangeLog:

2018-01-02  Janne Blomqvist  <jb@gcc.gnu.org>

	PR libgfortran/83649
	* io/unix.c (MAX_CHUNK): New define.
	(raw_read): For reads larger than MAX_CHUNK, loop.
	(raw_write): Write no more than MAX_CHUNK bytes per iteration.

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/unix.c
Should it be marked as ASSIGNED? IMO it's worth a backport (otherwise it should be closed as FIXED).
Yes, assigning to myself. So it works on macOS. Did you also test whether it fails without the patch? It'd be nice to have some test results on win64 as well, but I guess we have no one to run those. At least with the patch it shouldn't be worse than with trunk. Another question is whether the current value of MAX_CHUNK is optimal, or whether some smaller value might be better.
> Yes, assigning to myself. So it works on macOS. Did you also test whether
> it fails without the patch?

Yes, it did fail without the patch.

> Another question is whether the current value of MAX_CHUNK is optimal,
> or whether some smaller value might be better.

At first I feared that MAX_CHUNK was too large by one, but this was not the case.
Author: jb
Date: Wed Jan 3 11:46:38 2018
New Revision: 256172

URL: https://gcc.gnu.org/viewcvs?rev=256172&root=gcc&view=rev
Log:
PR libgfortran/83649 Chunk large reads and writes

Backport from trunk.

It turns out that Linux never reads or writes more than 2147479552
bytes in a single syscall. For writes this is not a problem as
libgfortran already contains a loop around write() to handle short
writes. But for reads we cannot do this, since then read will hang if
we have a short read when reading from the terminal. Also, there are
reports that macOS fails I/O's larger than 2 GB. Thus, to work around
these issues do large reads/writes in chunks.

The testcase from the PR

program largewr
  integer(kind=1) :: a(2_8**31+1)
  a = 0
  a(size(a, kind=8)) = 1
  open(10, file="largewr.dat", access="stream", form="unformatted")
  write (10) a
  close(10)
  a(size(a, kind=8)) = 2
  open(10, file="largewr.dat", access="stream", form="unformatted")
  read (10) a
  if (a(size(a, kind=8)) == 1) then
    print *, "All is well"
  else
    print *, "Oh no"
  end if
end program largewr

fails on trunk but works with the patch.

Regtested on x86_64-pc-linux-gnu, committed to trunk.

libgfortran/ChangeLog:

2018-01-03  Janne Blomqvist  <jb@gcc.gnu.org>

	PR libgfortran/83649
	* io/unix.c (MAX_CHUNK): New define.
	(raw_read): For reads larger than MAX_CHUNK, loop.
	(raw_write): Write no more than MAX_CHUNK bytes per iteration.
---
 libgfortran/io/unix.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/libgfortran/io/unix.c b/libgfortran/io/unix.c
index a07a3c9..7a982b3 100644
--- a/libgfortran/io/unix.c
+++ b/libgfortran/io/unix.c
@@ -292,18 +292,49 @@ raw_flush (unix_stream *s __attribute__ ((unused)))
   return 0;
 }

+/* Write/read at most 2 GB - 4k chunks at a time.  Linux never reads or
+   writes more than this, and there are reports that macOS fails for
+   larger than 2 GB as well.  */
+#define MAX_CHUNK 2147479552
+
 static ssize_t
 raw_read (unix_stream *s, void *buf, ssize_t nbyte)
 {
   /* For read we can't do I/O in a loop like raw_write does, because
      that will break applications that wait for interactive I/O.  We
-     still can loop around EINTR, though.  */
-  while (true)
+     still can loop around EINTR, though.  This however causes a
+     problem for large reads which must be chunked, see comment above.
+     So assume that if the size is larger than the chunk size, we're
+     reading from a file and not the terminal.  */
+  if (nbyte <= MAX_CHUNK)
     {
-      ssize_t trans = read (s->fd, buf, nbyte);
-      if (trans == -1 && errno == EINTR)
-        continue;
-      return trans;
+      while (true)
+        {
+          ssize_t trans = read (s->fd, buf, nbyte);
+          if (trans == -1 && errno == EINTR)
+            continue;
+          return trans;
+        }
+    }
+  else
+    {
+      ssize_t bytes_left = nbyte;
+      char *buf_st = buf;
+      while (bytes_left > 0)
+        {
+          ssize_t to_read = bytes_left < MAX_CHUNK ? bytes_left: MAX_CHUNK;
+          ssize_t trans = read (s->fd, buf_st, to_read);
+          if (trans == -1)
+            {
+              if (errno == EINTR)
+                continue;
+              else
+                return trans;
+            }
+          buf_st += trans;
+          bytes_left -= trans;
+        }
+      return nbyte - bytes_left;
     }
 }

@@ -317,10 +348,13 @@ raw_write (unix_stream *s, const void *buf, ssize_t nbyte)
   buf_st = (char *) buf;

   /* We must write in a loop since some systems don't restart system
-     calls in case of a signal.  */
+     calls in case of a signal.  Also some systems might fail outright
+     if we try to write more than 2 GB in a single syscall, so chunk
+     up large writes.  */
   while (bytes_left > 0)
     {
-      trans = write (s->fd, buf_st, bytes_left);
+      ssize_t to_write = bytes_left < MAX_CHUNK ? bytes_left: MAX_CHUNK;
+      trans = write (s->fd, buf_st, to_write);
       if (trans == -1)
         {
           if (errno == EINTR)
Author: jb
Date: Wed Jan 3 12:08:05 2018
New Revision: 256173

URL: https://gcc.gnu.org/viewcvs?rev=256173&root=gcc&view=rev
Log:
PR libgfortran/83649 Chunk large reads and writes

Backport from trunk.

It turns out that Linux never reads or writes more than 2147479552
bytes in a single syscall. For writes this is not a problem as
libgfortran already contains a loop around write() to handle short
writes. But for reads we cannot do this, since then read will hang if
we have a short read when reading from the terminal. Also, there are
reports that macOS fails I/O's larger than 2 GB. Thus, to work around
these issues do large reads/writes in chunks.

The testcase from the PR

program largewr
  integer(kind=1) :: a(2_8**31+1)
  a = 0
  a(size(a, kind=8)) = 1
  open(10, file="largewr.dat", access="stream", form="unformatted")
  write (10) a
  close(10)
  a(size(a, kind=8)) = 2
  open(10, file="largewr.dat", access="stream", form="unformatted")
  read (10) a
  if (a(size(a, kind=8)) == 1) then
    print *, "All is well"
  else
    print *, "Oh no"
  end if
end program largewr

fails on trunk but works with the patch.

Regtested on x86_64-pc-linux-gnu, committed to trunk.

libgfortran/ChangeLog:

2018-01-03  Janne Blomqvist  <jb@gcc.gnu.org>

	PR libgfortran/83649
	* io/unix.c (MAX_CHUNK): New define.
	(raw_read): For reads larger than MAX_CHUNK, loop.
	(raw_write): Write no more than MAX_CHUNK bytes per iteration.

Modified:
    branches/gcc-6-branch/libgfortran/ChangeLog
    branches/gcc-6-branch/libgfortran/io/unix.c
Fixed on trunk/7/6, closing.