This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[patch, libgfortran] [4.8/4.9/5 Regression] error reading (and writing) large text files in gfortran


With the attached patch I create a special version of fbuf_flush that is only called with list directed I/O. There can be no tabbing back and forth so it is safe to flush the buffer whenever we want. The bug occurs when the buffer keeps growing to no end until no more allocations can be made and the OS gives an error.

fbuf_flush can be called as much as one wants to, but to keep overhead low, I introduce a new version with an arbitrary limit. If below the limit, just return doing no flushing. When the limit is exceeded, fbuf is flushed. The code to do this is duplicated from the original fbuf_flush so it is very safe and well tested.

I played with the allowable maximum position limit for flushing and ran some timing tests. My machine here uses a solid state drive which may bias the results somewhat. Others are welcome to test and see what values they get as well. I settled on 524588 based on these results, favoring writing over reading a little.

The patch has zero impact on any other type of I/O including normal formatted I/O. I also tested to confirm that formatted I/O does not have the problem. It is flushed regularly in the main transfer loop. As far as I can tell only list directed has the issue.

I get the following timings using the attached test program.

WRITING:
 Array Size-->	1000000		100000000
 Buf Limit

  16384		2.107		210.9
  32768		2.026		292.1
  65636		2.232		235.8
  524288	1.958		193.5
 1048576	2.023		203.5


READING:
 Buf Limit

  16384		1.843		184.4
  32768		1.841		186.8
  65636		1.816		197.6
  524288	1.879		186.5
 1048576	1.834		185.2

Regression tested on x86-64 Linux.

OK for trunk followed by backports?

I can not include a specific testsuite test case, it would take way too long to run.

Regards,

Jerry

2015-02-07  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	PR libgfortran/60956
	* io/fbuf.c (fbuf_flush_list): New function that only flushes
	if current fbuf position exceeds a limit.
	* io/fbuf.h: Declare the new function.
	* io/io.h (enum unit_mode): Add two new modes.
	* io/list_read.c (list_formatted_read_scalar): Call new function.
	* io/write.c: Include fbuf.h. (list_formatted_write_scalar):
	Call new function.
Index: fbuf.c
===================================================================
--- fbuf.c	(revision 220315)
+++ fbuf.c	(working copy)
@@ -171,7 +171,43 @@ fbuf_flush (gfc_unit * u, unit_mode mode)
 }
 
 
+/* The mode argument is LIST_WRITING for write mode and LIST_READING for
+   read.  This should only be used for list directed  I/O.
+   Return value is 0 for success, -1 on failure.  */
+
 int
+fbuf_flush_list (gfc_unit * u, unit_mode mode)
+{
+  int nwritten;
+
+  if (!u->fbuf)
+    return 0;
+
+  if (u->fbuf->pos < 524288) /* Upper limit for list writing.  */
+    return 0;
+
+  fbuf_debug (u, "fbuf_flush_list with mode %d: ", mode);
+
+  if (mode == LIST_WRITING)
+    {
+      nwritten = swrite (u->s, u->fbuf->buf, u->fbuf->pos);
+      if (nwritten < 0)
+	return -1;
+    }
+
+  /* Salvage remaining bytes for both reading and writing.  */ 
+  if (u->fbuf->act > u->fbuf->pos)
+    memmove (u->fbuf->buf, u->fbuf->buf + u->fbuf->pos, 
+             u->fbuf->act - u->fbuf->pos);
+
+  u->fbuf->act -= u->fbuf->pos;
+  u->fbuf->pos = 0;
+
+  return 0;
+}
+
+
+int
 fbuf_seek (gfc_unit * u, int off, int whence)
 {
   if (!u->fbuf)
Index: fbuf.h
===================================================================
--- fbuf.h	(revision 220315)
+++ fbuf.h	(working copy)
@@ -59,6 +59,9 @@ internal_proto(fbuf_alloc);
 extern int fbuf_flush (gfc_unit *, unit_mode);
 internal_proto(fbuf_flush);
 
+extern int fbuf_flush_list (gfc_unit *, unit_mode);
+internal_proto(fbuf_flush_list);
+
 extern int fbuf_seek (gfc_unit *, int, int);
 internal_proto(fbuf_seek);
 
Index: io.h
===================================================================
--- io.h	(revision 220315)
+++ io.h	(working copy)
@@ -231,7 +231,7 @@ typedef enum
 unit_advance;
 
 typedef enum
-{READING, WRITING}
+{READING, WRITING, LIST_READING, LIST_WRITING}
 unit_mode;
 
 typedef enum
Index: list_read.c
===================================================================
--- list_read.c	(revision 220315)
+++ list_read.c	(working copy)
@@ -2210,6 +2210,7 @@ cleanup:
       free_line (dtp);
       hit_eof (dtp);
     }
+  fbuf_flush_list (dtp->u.p.current_unit, LIST_READING);
   return err;
 }
 
Index: write.c
===================================================================
--- write.c	(revision 220315)
+++ write.c	(working copy)
@@ -25,6 +25,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 <http://www.gnu.org/licenses/>.  */
 
 #include "io.h"
+#include "fbuf.h"
 #include "format.h"
 #include "unix.h"
 #include <assert.h>
@@ -1585,6 +1586,7 @@ list_formatted_write_scalar (st_parameter_dt *dtp,
       internal_error (&dtp->common, "list_formatted_write(): Bad type");
     }
 
+  fbuf_flush_list (dtp->u.p.current_unit, LIST_WRITING);
   dtp->u.p.char_flag = (type == BT_CHARACTER);
 }
 
PROGRAM test_complex_io
  IMPLICIT NONE
  INTEGER, PARAMETER :: RP = KIND(0.0D0)
  INTEGER, PARAMETER :: CP = RP
  COMPLEX(CP), ALLOCATABLE, DIMENSION(:) :: v, u
  INTEGER :: n
  INTEGER :: k

real :: value1, value2, value3
  DO k = 1, 4
    n = 10**(2*k)

    WRITE(*, *) 'k = ', k, ', n = ', n
    ALLOCATE(u(n), v(n))
    WRITE(*, *) '   ALLOCATE COMPLETE'

    u = 1
    v = 0

call cpu_time (value1)

    OPEN(UNIT=100, FILE='dummy.txt', ACTION='WRITE', STATUS='REPLACE')
    print *, 'Writing:'
    WRITE(100, "(2(f10.2,f10.2))") u
    print *, 'Writing done.'
    CLOSE(100)

call cpu_time (value2)
    
    OPEN(UNIT=101, FILE='dummy.txt', ACTION='READ', STATUS='OLD')
    print *, 'Reading:'
    READ(101, "(2(f10.2,f10.2))") v
    print *, 'Reading done.'
    CLOSE(101)

call cpu_time (value3)
print *, "Write Time =", value2 - value1
print *, " Read Time =", value3 - value2

    print *, 'IO complete' 
    WRITE(*, *) 'error = ', SUM(ABS(u - v))
    
    DEALLOCATE(u, v)

  ENDDO

END PROGRAM test_complex_io

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]