Bug 37754 - [4.4 Regression] READ I/O Performance regression from 4.3 to 4.4/4.5
Summary: [4.4 Regression] READ I/O Performance regression from 4.3 to 4.4/4.5
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libfortran
Version: 4.4.0
Importance: P4 normal
Target Milestone: 4.4.1
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-10-06 19:58 UTC by bartoldeman
Modified: 2009-06-04 04:03 UTC
CC List: 3 users

See Also:
Host:
Target: i586-pc-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2008-10-07 02:55:15


Attachments
Experimental patch (859 bytes, patch)
2008-10-25 23:56 UTC, Jerry DeLisle
The test program used above. (168 bytes, text/plain)
2008-10-25 23:58 UTC, Jerry DeLisle
Program to generate simple test file used above. (168 bytes, text/plain)
2008-10-25 23:59 UTC, Jerry DeLisle

Description bartoldeman 2008-10-06 19:58:59 UTC
gfortran is slower at I/O than g77 was (I think that was already known),
but 4.4 is even slower than 4.3 in certain cases, e.g. this simple program
that counts lines:

countlines.f
---------------
      PROGRAM countlines

C Count lines on stdin

      I=0
      DO
         READ(*,*,END=1)
         I=I+1
      ENDDO
 1    CONTINUE
      PRINT *,I

      END PROGRAM
-----------------
Create a file with 10,000,000 empty lines, for instance like this:

$ python -c "import sys; sys.stdout.write('\n'*10000000)" > temp

Using: gcc version 4.4.0 20081005 (experimental) [trunk revision 140878] (GCC):

$ gfortran -O countlines.f
$ time ./a.out < temp
    10000000

real    0m3.745s
user    0m3.740s
sys     0m0.004s

Using: gcc version 4.3.1 (Debian 4.3.1-9)
    10000000

real    0m2.603s
user    0m2.588s
sys     0m0.016s

Using: g77 (gcc version 3.4.6 (Debian 3.4.6-6))
 10000000

real    0m0.733s
user    0m0.728s
sys     0m0.004s
Comment 1 Jerry DeLisle 2008-10-07 02:55:15 UTC
I am a bit stacked up, but I will explore this one a bit.
Comment 2 Jerry DeLisle 2008-10-07 03:58:51 UTC
strace shows no difference in the number of system calls between 4.3 and 4.4.

gprof has some interesting things to see.

With 4.4:

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 30.93      0.54     0.54                             data_transfer_init
  8.67      0.69     0.15                             fflush
  8.09      0.83     0.14                             get_external_unit
  6.94      0.95     0.12                             finalize_transfer
  6.65      1.06     0.12                             next_char
  6.36      1.17     0.11                             fd_read
  3.47      1.23     0.06                             memcpy
  2.60      1.28     0.05                             _gfortran_st_read
  2.31      1.32     0.04                             fd_sfree

With 4.3:

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 26.25      0.42     0.42                             data_transfer_init
 12.50      0.62     0.20                             fflush
  9.38      0.77     0.15                             finalize_transfer
  6.25      0.87     0.10        1   100.00   100.00  MAIN__
  6.25      0.97     0.10                             get_external_unit
  5.94      1.07     0.10                             next_char
  4.69      1.14     0.08                             _gfortran_st_read
  2.81      1.19     0.05                             fd_alloc_r_at
  2.50      1.23     0.04                             _gfortrani_library_start
  2.19      1.26     0.04                             _gfortran_st_read_done

I will try some of my more aggressive I/O tests and report back.
Comment 3 Jerry DeLisle 2008-10-07 04:25:08 UTC
I created a file to read with the following test program, using a write in place of the read.

program testio
  implicit none
  integer      :: i, k
  real         :: x
  real(kind=8) :: y
  complex      :: c
  character(27) :: a
  integer, parameter :: n = 1000000
  x = 3.14159
  y = exp(1.0)
  c = complex(x,y)
  a = "abcdefghijklmnopqrstuvwxyz1"
  open(10,form="formatted")
  do i=1,n
    read(10, '(i10,1x,f7.5,1x,f12.10,1x,a27,1x,2f12.8)') k, x, y, a, c
  end do
  close(10, status="keep")
end program testio


With 4.4:

$ time ./a.out 

real	0m9.307s
user	0m9.238s
sys	0m0.063s

With 4.3:

$ time ./a.out 

real	0m8.167s
user	0m8.113s
sys	0m0.034s

That's about a 14% slowdown in formatted reads (9.307 s vs. 8.167 s real time).
Comment 4 Jerry DeLisle 2008-10-25 23:56:03 UTC
Created attachment 16547 [details]
Experimental patch

With this patch, I see some improvement with a more realistic test case.  Here are test results using gprof.  I am not sure I can completely trust what I am seeing, especially when gprof is reporting data on read_logical and that is not being used.

Flat profile: trunk 4.4, no patch

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 36.57      1.32     1.32                             next_char
 23.82      2.18     0.86                             fd_read
  9.14      2.51     0.33                             push_char
  8.03      2.80     0.29                             memcpy
  7.48      3.07     0.27                             read_character
  2.22      3.15     0.08                             eat_spaces
  1.94      3.22     0.07                             formatted_transfer_scalar
  1.94      3.29     0.07                             read_logical
  0.83      3.32     0.03                             __read_nocancel
  0.83      3.35     0.03                             _int_free
  0.83      3.38     0.03                             eat_separator
  0.83      3.41     0.03                            list_formatted_read_scalar
  0.83      3.44     0.03                             malloc
  0.55      3.46     0.02        1    20.00    20.00  MAIN__
  0.55      3.48     0.02                           _gfortrani_free_format_data
  0.55      3.50     0.02                             _int_malloc
  0.55      3.52     0.02                             pre_position

1.29  1.25  1.20  1.20  1.21  1.34  1.28 ----> 1.25 average for next_char

3.978  4.005  3.989  3.997  3.986  3.981  4.005 ---> 3.992 for test program.

Flat profile: perf1

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
 26.57      0.89     0.89                             next_char
 22.69      1.65     0.76                             fd_read
 15.52      2.17     0.52                             push_char
 14.93      2.67     0.50                             memcpy
  5.97      2.87     0.20                             read_character
  4.18      3.01     0.14                             read_logical
  2.39      3.09     0.08                             eat_separator
  1.19      3.13     0.04                             __read_nocancel
  1.19      3.17     0.04                             _int_free
  1.19      3.21     0.04                             malloc
  0.90      3.24     0.03                             formatted_transfer_scalar
  0.60      3.26     0.02                       _gfortrani_list_formatted_read
  0.60      3.28     0.02                             _int_malloc
  0.60      3.30     0.02                             unformatted_write
  0.30      3.31     0.01                             _gfortrani_free_ionml
  0.30      3.32     0.01                             fd_sfree
  0.30      3.33     0.01                             get_external_unit

.96  .96  1.04  1.0  1.32  .86  1.05 ----> 1.03 average for next_char

3.732  3.710  3.713  3.717  3.737  3.735  3.704 ---> 3.721 for test program.


Flat profile: gfortran 4.3

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 27.57      0.75     0.75                             format_lex
 27.57      1.50     0.75                             parse_format_list
 12.50      1.84     0.34                             push_char
 10.29      2.12     0.28                             do_read
  8.46      2.35     0.23                             mem_read
  2.21      2.41     0.06                            list_formatted_read_scalar
  1.47      2.45     0.04                             __cache_sysconf
  1.47      2.49     0.04                             fd_alloc
  1.10      2.52     0.03                             malloc_consolidate
  0.74      2.54     0.02                             arena_get2
  0.74      2.56     0.02                             formatted_transfer_scalar
  0.37      2.57     0.01        1    10.00    10.00  MAIN__
  0.37      2.58     0.01                             _gfortran_st_write_done
  0.37      2.59     0.01                             _gfortran_store_exe_path
  0.37      2.60     0.01                         _gfortrani_get_internal_unit
  0.37      2.61     0.01                       _gfortrani_list_formatted_read
Comment 5 Jerry DeLisle 2008-10-25 23:58:22 UTC
Created attachment 16548 [details]
The test program used above.
Comment 6 Jerry DeLisle 2008-10-25 23:59:51 UTC
Created attachment 16549 [details]
Program to generate simple test file used above.
Comment 7 Jerry DeLisle 2008-11-21 05:23:43 UTC
From some experiments I have done, we can make a substantial improvement by streamlining next_char.  What I have in mind is reading a whole or partial block of a file and returning a pointer.  Advancing to the next char is then a matter of incrementing the pointer, and going back, as in unget_char, a matter of decrementing it.

push_char then becomes simply an assignment.  This approach would get rid of all the function calls and do the necessary manipulations with pointer ops and assignments.  It will take some careful rework, but I think it can be done.

Janne, is this what you had in mind? Are you doing this?
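
A minimal sketch of the pointer idea, for illustration only. The type and function names (char_cursor, cursor_next_char, cursor_unget_char) are hypothetical and are not taken from libgfortran; the sketch assumes the block of file data is already in memory and leaves out refilling the block and push_char.

```c
#include <stddef.h>

/* Hypothetical cursor over a block of file data already in memory.
   None of these names come from libgfortran. */
typedef struct {
    const char *start;  /* first byte of the block */
    const char *cur;    /* next byte to hand out */
    const char *end;    /* one past the last valid byte */
} char_cursor;

void cursor_init(char_cursor *c, const char *buf, size_t len)
{
    c->start = buf;
    c->cur = buf;
    c->end = buf + len;
}

/* Return the next byte, or -1 when the block is exhausted. */
int cursor_next_char(char_cursor *c)
{
    if (c->cur == c->end)
        return -1;
    return (unsigned char) *c->cur++;
}

/* Step back one byte; does nothing at the start of the block. */
void cursor_unget_char(char_cursor *c)
{
    if (c->cur > c->start)
        c->cur--;
}
```

Once the block is in memory, reading and ungetting a character cost one comparison and one pointer adjustment each, with no call into the stream layer.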
Comment 8 Janne Blomqvist 2008-11-21 07:43:29 UTC
(In reply to comment #7)
> From some experiments I have done, we can make a substantial improvement by
> streamlining next_char.  What I have in mind is reading a whole or partial
> block of a file and returning a pointer.  Advancing to the next char is then
> a matter of incrementing the pointer, and going back, as in unget_char, a
> matter of decrementing it.
> 
> push_char then becomes simply an assignment.  This approach would get rid of
> all the function calls and do the necessary manipulations with pointer ops
> and assignments.  It will take some careful rework, but I think it can be done.
> 
> Janne, is this what you had in mind? Are you doing this?

Essentially yes. I have converted the read side to use the fbuf_* machinery as well, and I have an fbuf_read(gfc_unit *, size_t *) function that fills the fbuf buffer with a specified number of bytes (by calling sread()) and returns a pointer to the first element; the number of bytes available to iterate over is returned via the pointer argument.

Well, that's the general idea, I still have some nasty bugs to figure out. And as I said, I think the changes are invasive enough that even if I'd get it done soon, I don't think it's 4.4 material.
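
For readers following along, here is a rough sketch of the calling convention described above: the caller passes the number of bytes it wants through a pointer, gets back a pointer to the first byte, and finds the number of bytes actually available in the same argument. This is not the real fbuf_read. The names read_buf and read_buf_fill are hypothetical, a stdio FILE and fread() stand in for the unit's stream and sread(), and unlike a real buffer this one simply overwrites its contents on every call.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical read buffer; not the real struct fbuf from libgfortran. */
typedef struct {
    FILE *src;     /* underlying stream, stands in for the unit's stream */
    char *data;    /* heap buffer holding the bytes of the last fill */
    size_t cap;    /* allocated size of data */
} read_buf;

/* Try to make *len bytes available.  Returns a pointer to the first byte
   and stores the number of bytes actually read in *len.  Returns NULL on
   allocation failure; a short count means end of file or a read error. */
char *read_buf_fill(read_buf *b, size_t *len)
{
    if (*len == 0)
        return b->data;

    if (*len > b->cap) {
        char *p = realloc(b->data, *len);
        if (p == NULL)
            return NULL;
        b->data = p;
        b->cap = *len;
    }

    *len = fread(b->data, 1, *len, b->src);
    return b->data;
}
```

The caller can then walk the returned pointer up to *len bytes without any further function calls, which is the point of the approach discussed in this comment.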
Comment 9 Jerry DeLisle 2008-11-22 05:34:17 UTC
*** Bug 38199 has been marked as a duplicate of this bug. ***
Comment 10 Jerry DeLisle 2008-11-26 03:59:24 UTC
Unassigning myself since I think Janne is working on this and I would hate to duplicate effort. If I need to pick this back up, let me know.
Comment 11 Janne Blomqvist 2009-01-05 22:14:02 UTC
Patch that improves countlines.f test (and a bunch of other things):

http://gcc.gnu.org/ml/gcc-patches/2009-01/msg00222.html
Comment 12 Jerry DeLisle 2009-01-09 05:34:34 UTC
Timings with Janne's patch and some minor tweaks:

countlines.f

gfortran 4.4 patched:  2.2 seconds

gfortran 4.3        :  3.5 seconds

g77                 :  2.7 seconds

ifort               :  1.1 seconds
Comment 13 Jerry DeLisle 2009-03-29 18:55:20 UTC
Subject: Bug 37754

Author: jvdelisle
Date: Sun Mar 29 18:55:05 2009
New Revision: 145258

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=145258
Log:
2009-03-29  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

        PR libfortran/37754
	* io/io.h (format_hash_entry): New structure for hash table.
	(format_hash_table): The hash table itself.
	(free_format_data): Revise function prototype.
	(free_format_hash_table, init_format_hash,
	free_format_hash): New function prototypes.
	* io/unit.c (close_unit_1): Use free_format_hash_table.
	* io/transfer.c (st_read_done, st_write_done): Free format data if
	internal unit.
	* io/format.c (free_format_hash_table): New function that frees any
	memory allocated previously for cached format data.
	(reset_node): New static helper function to reset the format counters
	for a format node.
	(reset_fnode_counters): New static function recursively calls reset_node
	to traverse the fnode tree.
	(format_hash): New simple hash function based on XOR, probabilistic,
	tosses collisions.
	(save_parsed_format): New static function to save the parsed format
	data to use again.
	(find_parsed_format): New static function searches the hash table
	looking for a match.
	(free_format_data): Revised to accept pointer to format data rather than
	the dtp pointer so that the function can be used in more places.
	(format_lex): Editorial.
	(parse_format_list): Set flag used to determine if format data hashing
	is to be used.  Internal units are not persistent enough for this.
	(revert): Move to new location in file.
	(parse_format): Use new functions to look for previously parsed
	format strings and use them rather than re-parse.  If not found, saves
	the parsed format data for later use.
	
2009-03-29  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

        PR libfortran/37754
	* io/transfer.c (formatted_transfer_scalar): Remove this function by
	factoring it into two new functions, one for read and one for write,
	eliminating all the conditionals for read or write mode.
	(formatted_transfer_scalar_read): New function.
	(formatted_transfer_scalar_write): New function.
	(formatted_transfer): Use new functions.

Modified:
    branches/fortran-dev/libgfortran/ChangeLog.dev
    branches/fortran-dev/libgfortran/io/format.c
    branches/fortran-dev/libgfortran/io/io.h
    branches/fortran-dev/libgfortran/io/transfer.c
    branches/fortran-dev/libgfortran/io/unit.c
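
To illustrate the caching scheme the log describes (an XOR-based hash where a colliding entry is simply tossed), here is a small sketch. It is not the code from io/format.c: the table size, the struct layout, and the names xor_hash, cache_save and cache_find are all assumptions made for the example, and the parsed format data is represented by an opaque pointer that the cache does not own or free.

```c
#include <stddef.h>
#include <string.h>

#define CACHE_SIZE 16   /* hypothetical table size, a power of two */

/* Hypothetical cache slot; the real format_hash_entry may differ. */
typedef struct {
    const char *key;    /* format string this slot was built from */
    size_t key_len;
    void *parsed;       /* stands in for the parsed format data */
} cache_entry;

typedef struct {
    cache_entry slot[CACHE_SIZE];
} format_cache;

/* XOR all bytes of the format string and reduce to a slot index. */
unsigned xor_hash(const char *s, size_t len)
{
    unsigned h = 0;
    for (size_t i = 0; i < len; i++)
        h ^= (unsigned char) s[i];
    return h & (CACHE_SIZE - 1);
}

/* Store parsed data; whatever occupied the slot before is replaced. */
void cache_save(format_cache *c, const char *s, size_t len, void *parsed)
{
    cache_entry *e = &c->slot[xor_hash(s, len)];
    e->key = s;
    e->key_len = len;
    e->parsed = parsed;
}

/* Return the cached data for s, or NULL if the slot holds something else. */
void *cache_find(const format_cache *c, const char *s, size_t len)
{
    const cache_entry *e = &c->slot[xor_hash(s, len)];
    if (e->key != NULL && e->key_len == len && memcmp(e->key, s, len) == 0)
        return e->parsed;
    return NULL;
}
```

A lookup miss just means the format string is parsed again, so losing an entry to a collision costs time but never correctness. That is why such a cheap hash is acceptable here.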

Comment 14 Jerry DeLisle 2009-04-05 20:14:22 UTC
Subject: Bug 37754

Author: jvdelisle
Date: Sun Apr  5 20:13:56 2009
New Revision: 145571

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=145571
Log:
2009-04-05  Daniel Kraft  <d@domob.eu>

	PR fortran/38654
	* io/read.c (read_f): Reworked to speed up floating point parsing.
	(convert_real): Use pointer-casting instead of memcpy and temporaries.

2009-04-05  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

        PR libfortran/37754
	* io/io.h (format_hash_entry): New structure for hash table.
	(format_hash_table): The hash table itself.
	(free_format_data): Revise function prototype.
	(free_format_hash_table, init_format_hash,
	free_format_hash): New function prototypes.
	* io/unit.c (close_unit_1): Use free_format_hash_table.
	* io/transfer.c (st_read_done, st_write_done): Free format data if
	internal unit.
	* io/format.c (free_format_hash_table): New function that frees any
	memory allocated previously for cached format data.
	(reset_node): New static helper function to reset the format counters
	for a format node.
	(reset_fnode_counters): New static function recursively calls reset_node
	to traverse the fnode tree.
	(format_hash): New simple hash function based on XOR, probabilistic,
	tosses collisions.
	(save_parsed_format): New static function to save the parsed format
	data to use again.
	(find_parsed_format): New static function searches the hash table
	looking for a match.
	(free_format_data): Revised to accept pointer to format data rather than
	the dtp pointer so that the function can be used in more places.
	(format_lex): Editorial.
	(parse_format_list): Set flag used to determine if format data hashing
	is to be used.  Internal units are not persistent enough for this.
	(revert): Move to new location in file.
	(parse_format): Use new functions to look for previously parsed
	format strings and use them rather than re-parse.  If not found, saves
	the parsed format data for later use.
	
2009-04-05  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

        PR libfortran/37754
	* io/transfer.c (formatted_transfer_scalar): Remove this function by
	factoring it into two new functions, one for read and one for write,
	eliminating all the conditionals for read or write mode.
	(formatted_transfer_scalar_read): New function.
	(formatted_transfer_scalar_write): New function.
	(formatted_transfer): Use new functions.

2009-04-05  Janne Blomqvist  <jb@gcc.gnu.org>

        PR libfortran/25561 libfortran/37754
	* io/io.h (struct stream): Define new stream interface function
	pointers, and inline functions for accessing it.
	(struct fbuf): Use int instead of size_t, remove flushed element.
	(mem_alloc_w): New prototype.
	(mem_alloc_r): New prototype.
	(stream_at_bof): Remove prototype.
	(stream_at_eof): Remove prototype.
	(file_position): Remove prototype.
	(flush): Remove prototype.
	(stream_offset): Remove prototype.
	(unit_truncate): New prototype.
	(read_block_form): Change to return pointer, int* argument.
	(hit_eof): New prototype.
	(fbuf_init): Change prototype.
	(fbuf_reset): Change prototype.
	(fbuf_alloc): Change prototype.
	(fbuf_flush): Change prototype.
	(fbuf_seek): Change prototype.
	(fbuf_read): New prototype.
	(fbuf_getc_refill): New prototype.
	(fbuf_getc): New inline function.
        * io/fbuf.c (fbuf_init): Use int, get rid of flushed.
	(fbuf_debug): New function.
	(fbuf_reset): Flush, and return position offset.
	(fbuf_alloc): Simplify, don't flush, just realloc.
	(fbuf_flush): Make usable for read mode, salvage remaining bytes.
	(fbuf_seek): New whence argument.
	(fbuf_read): New function.
	(fbuf_getc_refill): New function.
	* io/file_pos.c (formatted_backspace): Use new stream interface.
	(unformatted_backspace): Likewise.
	(st_backspace): Make sure format buffer is reset, use new stream
	interface, use unit_truncate.
	(st_endfile): Likewise.
	(st_rewind): Likewise.
	* io/intrinsics.c: Use new stream interface.
	* io/list_read.c (push_char): Don't use u.p.scratch, use realloc
	to resize.
	(free_saved): Don't check u.p.scratch.
	(next_char): Use new stream interface, use fbuf_getc() for external files.
	(finish_list_read): flush format buffer.
	(nml_query): Update to use modified interfaces.
	* io/open.c (test_endfile): Use new stream interface.
	(edit_modes): Likewise.
	(new_unit): Likewise, set bytes_left to 1 for stream files.
	* io/read.c (read_l): Use new read_block_form interface.
	(read_utf8): Likewise.
	(read_utf8_char1): Likewise.
	(read_default_char1): Likewise.
	(read_utf8_char4): Likewise.
	(read_default_char4): Likewise.
	(read_a): Likewise.
	(read_a_char4): Likewise.
	(read_decimal): Likewise.
	(read_radix): Likewise.
	(read_f): Likewise.
	* io/transfer.c (read_sf): Use fbuf_read and mem_alloc_r, remove
	usage of u.p.line_buffer.
	(read_block_form): Update interface to return pointer, use
	fbuf_read for direct access.
	(read_block_direct): Update to new stream interface.
	(write_block): Use mem_alloc_w for internal I/O.
	(write_buf): Update to new stream interface.
	(formatted_transfer_scalar): Don't use u.p.line_buffer, use
	fbuf_seek for external files.
	(us_read): Update to new stream interface.
	(us_write): Likewise.
	(data_transfer_init): Always check if we switch modes and flush.
	(skip_record): Use new stream interface, fix comparison.
	(next_record_r): Check for and reset u.p.at_eof, use new stream
	interface, use fbuf_getc for spacing.
	(write_us_marker): Update to new stream interface, don't inline.
	(next_record_w_unf): Likewise.
	(sset): New function.
	(next_record_w): Use new stream interface, use fbuf for printing
	newline.
	(next_record): Use new stream interface.
	(finalize_transfer): Remove sfree call, use new stream interface.
	(st_iolength_done): Don't use u.p.scratch.
	(st_read): Don't check for end of file.
	(st_read_done): Don't use u.p.scratch, use unit_truncate.
	(hit_eof): New function.
	* io/unit.c (init_units): Always init fbuf for formatted units.
	(update_position): Use new stream interface.
	(unit_truncate): New function.
	(finish_last_advance_record): Use fbuf to print newline.
	* io/unix.c: Remove unused SSIZE_MAX macro.
	(BUFFER_SIZE): Make static const variable rather than macro.
	(struct unix_stream): Remove dirty_offset, len, method,
	small_buffer. Order elements by decreasing size.
	(struct int_stream): Remove.
	(move_pos_offset): Remove usage of dirty_offset.
	(reset_stream): Remove.
	(do_read): Rename to raw_read, update to match new stream
	interface.
	(do_write): Rename to raw_write, update to new stream interface.
	(raw_seek): New function.
	(raw_tell): New function.
	(raw_truncate): New function.
	(raw_close): New function.
	(raw_flush): New function.
	(raw_init): New function.
	(fd_alloc): Remove.
	(fd_alloc_r_at): Remove.
	(fd_alloc_w_at): Remove.
	(fd_sfree): Remove.
	(fd_seek): Remove.
	(fd_truncate): Remove.
	(fd_sset): Remove.
	(fd_read): Remove.
	(fd_write): Remove.
	(fd_close): Remove.
	(fd_open): Remove.
	(fd_flush): Rename to buf_flush, update to new stream interface
	and unix_stream.
	(buf_read): New function.
	(buf_write): New function.
	(buf_seek): New function.
	(buf_tell): New function.
	(buf_truncate): New function.
	(buf_close): New function.
	(buf_init): New function.
	(mem_alloc_r_at): Rename to mem_alloc_r, change prototype.
	(mem_alloc_w_at): Rename to mem_alloc_w, change prototype.
	(mem_read): Change to match new stream interface.
	(mem_write): Likewise.
	(mem_seek): Likewise.
	(mem_tell): Likewise.
	(mem_truncate): Likewise.
	(mem_close): Likewise.
	(mem_flush): New function.
	(mem_sfree): Remove.
	(empty_internal_buffer): Cast to correct type.
	(open_internal): Use correct type, init function pointers.
	(fd_to_stream): Test whether to open file as buffered or raw.
	(output_stream): Remove mode set.
	(error_stream): Likewise.
	(flush_all_units_1): Use new stream interface.
	(flush_all_units): Likewise.
	(stream_at_bof): Remove.
	(stream_at_eof): Remove.
	(file_position): Remove.
	(file_length): Update logic to use stream interface.
	(flush): Remove.
	(stream_offset): Remove.
	* io/write.c (write_utf8_char4): Use int instead of size_t.
	(write_x): Extra safety check.
	(namelist_write_newline): Use new stream interface.

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/fbuf.c
    trunk/libgfortran/io/file_pos.c
    trunk/libgfortran/io/format.c
    trunk/libgfortran/io/intrinsics.c
    trunk/libgfortran/io/io.h
    trunk/libgfortran/io/list_read.c
    trunk/libgfortran/io/open.c
    trunk/libgfortran/io/read.c
    trunk/libgfortran/io/transfer.c
    trunk/libgfortran/io/unit.c
    trunk/libgfortran/io/unix.c
    trunk/libgfortran/io/write.c
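
As a reading aid for the "new stream interface function pointers, and inline functions for accessing it" entry, the following sketch shows the general pattern: a struct of function pointers that each stream type fills in, plus inline wrappers that callers use. It is only an illustration. The names sread and stell appear in this report, but the signatures, the mem_stream type and its helper functions here are assumptions, not the declarations from io/io.h or io/unix.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stream interface: a table of function pointers that each
   concrete stream type fills in.  Only read and tell are shown. */
typedef struct stream {
    long (*read)(struct stream *s, void *buf, long nbytes);
    long (*tell)(struct stream *s);
} stream;

/* Inline wrappers so callers never touch the pointers directly. */
static inline long sread(stream *s, void *buf, long nbytes)
{
    return s->read(s, buf, nbytes);
}

static inline long stell(stream *s)
{
    return s->tell(s);
}

/* One concrete stream type: reads from a block of memory. */
typedef struct {
    stream st;          /* first member, so a stream * can be cast back */
    const char *data;
    long len;
    long pos;
} mem_stream;

static long mem_read_impl(stream *s, void *buf, long nbytes)
{
    mem_stream *m = (mem_stream *) s;
    long left = m->len - m->pos;

    if (nbytes > left)
        nbytes = left;
    if (nbytes <= 0)
        return 0;
    memcpy(buf, m->data + m->pos, (size_t) nbytes);
    m->pos += nbytes;
    return nbytes;
}

static long mem_tell_impl(stream *s)
{
    return ((mem_stream *) s)->pos;
}

void mem_stream_init(mem_stream *m, const char *data, long len)
{
    m->st.read = mem_read_impl;
    m->st.tell = mem_tell_impl;
    m->data = data;
    m->len = len;
    m->pos = 0;
}
```

With this layout the calling code is the same whether the stream is raw, buffered or in-memory; the choice is made once, when the stream is initialized.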

Comment 15 Jerry DeLisle 2009-04-05 22:35:05 UTC
Fixed on 4.5; good for 4.4 after some mainline testing and exercise.
Comment 16 Jerry DeLisle 2009-05-20 00:16:58 UTC
Subject: Bug 37754

Author: jvdelisle
Date: Wed May 20 00:16:38 2009
New Revision: 147725

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147725
Log:
2009-05-19  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	PR libfortran/37754
	* io/write_float.def: Simplify format calculation.

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/write_float.def

Comment 17 Jerry DeLisle 2009-05-27 01:22:19 UTC
Subject: Bug 37754

Author: jvdelisle
Date: Wed May 27 01:21:22 2009
New Revision: 147887

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147887
Log:
2009-05-23  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/37754
	* io/write_float.def: Simplify format calculation.
	
2009-05-23  Francois-Xavier Coudert  <fxcoudert@gcc.gnu.org>

	Backport from mainline:
	PR fortran/22423
	* io/transfer.c (read_block_direct): Avoid warning.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/39667
	* io/file_pos.c (st_rewind): Don't truncate or flush.
	* io/intrinsics.c (fgetc): Flush if switching mode.
	(fputc): Likewise.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/39782
	* io/transfer.c (data_transfer_init): Don't flush before seek.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	* io/io.h (is_preconnected): Remove prototype.
	* io/unix.c (is_preconnected): Remove function.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/38668
	* io/transfer.c (finalize_transfer): Don't flush for advance='no'.

2009-05-23 Danny Smith  <dannysmith@clear.net.nz>

	Backport from mainline:
	* io/write.c (itoa) : Rename back to gfc_itoa.
	(write_i): Adjust call to write_decimal.
	(write_integer):  Use gfc_itoa.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	* io/io.h (move_pos_offset): Remove prototype.
	* io/transfer.c (formatted_transfer_scalar_read): Use sseek
	instead of move_pos_offset.
	* io/unix.c (move_pos_offset): Remove.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/39665 libfortran/39702 libfortran/39709
	* io/io.h (st_parameter_dt): Revert aligned attribute from u.p.value.
	* io/list_read.c (read_complex): Read directly into user pointer.
	(read_real): Likewise.
	(list_formatted_read_scalar): Update read_complex and read_real calls.
	(nml_read_obj): Read directly into user pointer.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/39665
	* io/io.h (st_parameter_dt): Add aligned attribute to u.p.value.
	* io/read.c (convert_real): Add note about alignment requirements.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	* io/open.c (already_open): Test for POSIX close return value.
	* io/unit.c (close_unit_1): Likewise.
	* io/unix.c (raw_close): Return 0 for success for preconnected units.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	* runtime/error.c (gfc_itoa): Move to io/write.c
	(xtoa): Rename to gfc_xtoa.
	* runtime/backtrace.c (show_backtrace): Call gfc_xtoa.
	* libgfortran.h (gfc_itoa): Remove prototype.
	(xtoa): Rename prototype to gfc_xtoa.
	* io/list_read.c (nml_read_obj): Use size_t for string length.
	* io/transfer.c (read_block_direct): Change nbytes arg from
	pointer to value.
	(unformatted_read): Minor cleanup, call read_block_directly properly.
	(skip_record): Use ssize_t.
	(next_record_w_unf): Avoid stell() call by calling sseek with SEEK_CUR.
	(iolength_transfer): Make sure to multiply before cast.
	* io/intrinsics.c (fgetc): Remove unnecessary variable.
	* io/format.c (format_hash): Use gfc_charlen_type.
	* io/write.c (itoa): Move from runtime/error.c:gfc_itoa, rename,
	make static.
	(write_i): Call with pointer to itoa.
	(write_z): Call with pointer to gfc_xtoa.
	(write_integer): Pointer to itoa.
	(nml_write_obj): Type cleanup, don't call strlen in loop.
	
2009-05-23  H.J. Lu  <hongjiu.lu@intel.com>

	Backport from mainline:
	PR libgfortran/39664
	* io/unix.c (raw_close): Don't close STDOUT_FILENO,
	STDERR_FILENO nor STDIN_FILENO.

2009-05-23  David Edelsohn  <edelsohn@gnu.org>
	
	Backport from mainline:
	* io/io.h (struct stream): Rename truncate to trunc.
	(struncate): Same.
	* io/unix.c (raw_init): Rename truncate to trunc.
	(buf_init): Same.
	(open_internal): Same.
	
2009-05-23  Daniel Kraft  <d@domob.eu>

	Backport from mainline:
	PR fortran/38654
	* io/read.c (read_f): Reworked to speed up floating point parsing.
	(convert_real): Use pointer-casting instead of memcpy and temporaries.

2009-05-23  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/37754
	* io/io.h (format_hash_entry): New structure for hash table.
	(format_hash_table): The hash table itself.
	(free_format_data): Revise function prototype.
	(free_format_hash_table, init_format_hash,
	free_format_hash): New function prototypes.
	* io/unit.c (close_unit_1): Use free_format_hash_table.
	* io/transfer.c (st_read_done, st_write_done): Free format data if
	internal unit.
	* io/format.c (free_format_hash_table): New function that frees any
	memory allocated previously for cached format data.
	(reset_node): New static helper function to reset the format counters
	for a format node.
	(reset_fnode_counters): New static function recursively calls reset_node
	to traverse the fnode tree.
	(format_hash): New simple hash function based on XOR, probabilistic,
	tosses collisions.
	(save_parsed_format): New static function to save the parsed format
	data to use again.
	(find_parsed_format): New static function searches the hash table
	looking for a match.
	(free_format_data): Revised to accept pointer to format data rather than
	the dtp pointer so that the function can be used in more places.
	(format_lex): Editorial.
	(parse_format_list): Set flag used to determine if format data hashing
	is to be used.  Internal units are not persistent enough for this.
	(revert): Move to new location in file.
	(parse_format): Use new functions to look for previously parsed
	format strings and use them rather than re-parse.  If not found, saves
	the parsed format data for later use.
	
2009-05-23  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/37754
	* io/transfer.c (formatted_transfer_scalar): Remove this function by
	factoring it into two new functions, one for read and one for write,
	eliminating all the conditionals for read or write mode.
	(formatted_transfer_scalar_read): New function.
	(formatted_transfer_scalar_write): New function.
	(formatted_transfer): Use new functions.

2009-05-23  Janne Blomqvist  <jb@gcc.gnu.org>

	Backport from mainline:
	PR libfortran/25561 libfortran/37754
	* io/io.h (struct stream): Define new stream interface function
	pointers, and inline functions for accessing it.
	(struct fbuf): Use int instead of size_t, remove flushed element.
	(mem_alloc_w): New prototype.
	(mem_alloc_r): New prototype.
	(stream_at_bof): Remove prototype.
	(stream_at_eof): Remove prototype.
	(file_position): Remove prototype.
	(flush): Remove prototype.
	(stream_offset): Remove prototype.
	(unit_truncate): New prototype.
	(read_block_form): Change to return pointer, int* argument.
	(hit_eof): New prototype.
	(fbuf_init): Change prototype.
	(fbuf_reset): Change prototype.
	(fbuf_alloc): Change prototype.
	(fbuf_flush): Change prototype.
	(fbuf_seek): Change prototype.
	(fbuf_read): New prototype.
	(fbuf_getc_refill): New prototype.
	(fbuf_getc): New inline function.
	* io/fbuf.c (fbuf_init): Use int, get rid of flushed.
	(fbuf_debug): New function.
	(fbuf_reset): Flush, and return position offset.
	(fbuf_alloc): Simplify, don't flush, just realloc.
	(fbuf_flush): Make usable for read mode, salvage remaining bytes.
	(fbuf_seek): New whence argument.
	(fbuf_read): New function.
	(fbuf_getc_refill): New function.
	* io/file_pos.c (formatted_backspace): Use new stream interface.
	(unformatted_backspace): Likewise.
	(st_backspace): Make sure format buffer is reset, use new stream
	interface, use unit_truncate.
	(st_endfile): Likewise.
	(st_rewind): Likewise.
	* io/intrinsics.c: Use new stream interface.
	* io/list_read.c (push_char): Don't use u.p.scratch, use realloc
	to resize.
	(free_saved): Don't check u.p.scratch.
	(next_char): Use new stream interface, use fbuf_getc() for external files.
	(finish_list_read): flush format buffer.
	(nml_query): Update to use modified interfaces.
	* io/open.c (test_endfile): Use new stream interface.
	(edit_modes): Likewise.
	(new_unit): Likewise, set bytes_left to 1 for stream files.
	* io/read.c (read_l): Use new read_block_form interface.
	(read_utf8): Likewise.
	(read_utf8_char1): Likewise.
	(read_default_char1): Likewise.
	(read_utf8_char4): Likewise.
	(read_default_char4): Likewise.
	(read_a): Likewise.
	(read_a_char4): Likewise.
	(read_decimal): Likewise.
	(read_radix): Likewise.
	(read_f): Likewise.
	* io/transfer.c (read_sf): Use fbuf_read and mem_alloc_r, remove
	usage of u.p.line_buffer.
	(read_block_form): Update interface to return pointer, use
	fbuf_read for direct access.
	(read_block_direct): Update to new stream interface.
	(write_block): Use mem_alloc_w for internal I/O.
	(write_buf): Update to new stream interface.
	(formatted_transfer_scalar): Don't use u.p.line_buffer, use
	fbuf_seek for external files.
	(us_read): Update to new stream interface.
	(us_write): Likewise.
	(data_transfer_init): Always check if we switch modes and flush.
	(skip_record): Use new stream interface, fix comparison.
	(next_record_r): Check for and reset u.p.at_eof, use new stream
	interface, use fbuf_getc for spacing.
	(write_us_marker): Update to new stream interface, don't inline.
	(next_record_w_unf): Likewise.
	(sset): New function.
	(next_record_w): Use new stream interface, use fbuf for printing
	newline.
	(next_record): Use new stream interface.
	(finalize_transfer): Remove sfree call, use new stream interface.
	(st_iolength_done): Don't use u.p.scratch.
	(st_read): Don't check for end of file.
	(st_read_done): Don't use u.p.scratch, use unit_truncate.
	(hit_eof): New function.
	* io/unit.c (init_units): Always init fbuf for formatted units.
	(update_position): Use new stream interface.
	(unit_truncate): New function.
	(finish_last_advance_record): Use fbuf to print newline.
	* io/unix.c: Remove unused SSIZE_MAX macro.
	(BUFFER_SIZE): Make static const variable rather than macro.
	(struct unix_stream): Remove dirty_offset, len, method,
	small_buffer. Order elements by decreasing size.
	(struct int_stream): Remove.
	(move_pos_offset): Remove usage of dirty_offset.
	(reset_stream): Remove.
	(do_read): Rename to raw_read, update to match new stream
	interface.
	(do_write): Rename to raw_write, update to new stream interface.
	(raw_seek): New function.
	(raw_tell): New function.
	(raw_truncate): New function.
	(raw_close): New function.
	(raw_flush): New function.
	(raw_init): New function.
	(fd_alloc): Remove.
	(fd_alloc_r_at): Remove.
	(fd_alloc_w_at): Remove.
	(fd_sfree): Remove.
	(fd_seek): Remove.
	(fd_truncate): Remove.
	(fd_sset): Remove.
	(fd_read): Remove.
	(fd_write): Remove.
	(fd_close): Remove.
	(fd_open): Remove.
	(fd_flush): Rename to buf_flush, update to new stream interface
	and unix_stream.
	(buf_read): New function.
	(buf_write): New function.
	(buf_seek): New function.
	(buf_tell): New function.
	(buf_truncate): New function.
	(buf_close): New function.
	(buf_init): New function.
	(mem_alloc_r_at): Rename to mem_alloc_r, change prototype.
	(mem_alloc_w_at): Rename to mem_alloc_w, change prototype.
	(mem_read): Change to match new stream interface.
	(mem_write): Likewise.
	(mem_seek): Likewise.
	(mem_tell): Likewise.
	(mem_truncate): Likewise.
	(mem_close): Likewise.
	(mem_flush): New function.
	(mem_sfree): Remove.
	(empty_internal_buffer): Cast to correct type.
	(open_internal): Use correct type, init function pointers.
	(fd_to_stream): Test whether to open file as buffered or raw.
	(output_stream): Remove mode set.
	(error_stream): Likewise.
	(flush_all_units_1): Use new stream interface.
	(flush_all_units): Likewise.
	(stream_at_bof): Remove.
	(stream_at_eof): Remove.
	(file_position): Remove.
	(file_length): Update logic to use stream interface.
	(flush): Remove.
	(stream_offset): Remove.
	* io/write.c (write_utf8_char4): Use int instead of size_t.
	(write_x): Extra safety check.
	(namelist_write_newline): Use new stream interface.
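
For readers skimming the ChangeLog above, the two central ideas (a stream described by a table of function pointers with inline accessors, and a small format buffer whose getc only calls out of line when it needs a refill) can be sketched roughly as follows. This is an illustrative sketch only, not the libgfortran code: every name prefixed with demo_ is hypothetical, the in-memory backend stands in for a real file descriptor, and the buffer is deliberately tiny so that refills actually happen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream interface: a table of function pointers.
   Each backend (raw fd, buffered fd, memory) would fill in its own. */
typedef struct demo_stream {
    long (*read)(struct demo_stream *s, void *buf, long nbytes);
    int (*close)(struct demo_stream *s);
} demo_stream;

/* Inline accessors hide the indirection from callers. */
static inline long demo_sread(demo_stream *s, void *buf, long nbytes)
{
    return s->read(s, buf, nbytes);
}

static inline int demo_sclose(demo_stream *s)
{
    return s->close(s);
}

/* One concrete backend: an in-memory string. The demo_stream member
   comes first so a demo_stream* can be cast back to demo_mem_stream*. */
typedef struct {
    demo_stream st;
    const char *data;
    long len, pos;
} demo_mem_stream;

static long demo_mem_read(demo_stream *s, void *buf, long nbytes)
{
    demo_mem_stream *m = (demo_mem_stream *) s;
    long left = m->len - m->pos;
    if (nbytes > left)
        nbytes = left;                 /* short read at end of data */
    memcpy(buf, m->data + m->pos, (size_t) nbytes);
    m->pos += nbytes;
    return nbytes;
}

static int demo_mem_close(demo_stream *s)
{
    (void) s;                          /* nothing to release here */
    return 0;
}

static void demo_mem_open(demo_mem_stream *m, const char *data, long len)
{
    m->st.read = demo_mem_read;
    m->st.close = demo_mem_close;
    m->data = data;
    m->len = len;
    m->pos = 0;
}

/* Tiny format buffer with int indices. */
enum { DEMO_BUF_SIZE = 8 };

typedef struct {
    char buf[DEMO_BUF_SIZE];
    int act;                           /* number of valid bytes in buf */
    int pos;                           /* index of the next unread byte */
} demo_fbuf;

/* Slow path: refill the buffer from the stream, return -1 at EOF. */
static int demo_getc_refill(demo_fbuf *f, demo_stream *s)
{
    long n = demo_sread(s, f->buf, DEMO_BUF_SIZE);
    if (n <= 0)
        return -1;
    f->act = (int) n;
    f->pos = 0;
    return (unsigned char) f->buf[f->pos++];
}

/* Fast path: one comparison and one load while bytes remain. */
static inline int demo_getc(demo_fbuf *f, demo_stream *s)
{
    if (f->pos < f->act)
        return (unsigned char) f->buf[f->pos++];
    return demo_getc_refill(f, s);
}

/* Count newline characters, as the countlines.f test case does. */
static long demo_count_lines(const char *text, long len)
{
    demo_mem_stream m;
    demo_fbuf f = { .act = 0, .pos = 0 };
    long lines = 0;
    int c;

    assert(len >= 0);
    demo_mem_open(&m, text, len);
    while ((c = demo_getc(&f, &m.st)) != -1)
        if (c == '\n')
            lines++;
    demo_sclose(&m.st);
    return lines;
}
```

Calling demo_count_lines on a buffer and its length returns the number of newline characters in it. In this sketch the caller pays for an indirect call only once per refill rather than once per character, which is the general reason a buffered getc helps a line-counting loop.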


Modified:
    branches/gcc-4_4-branch/libgfortran/ChangeLog
    branches/gcc-4_4-branch/libgfortran/io/fbuf.c
    branches/gcc-4_4-branch/libgfortran/io/file_pos.c
    branches/gcc-4_4-branch/libgfortran/io/format.c
    branches/gcc-4_4-branch/libgfortran/io/intrinsics.c
    branches/gcc-4_4-branch/libgfortran/io/io.h
    branches/gcc-4_4-branch/libgfortran/io/list_read.c
    branches/gcc-4_4-branch/libgfortran/io/open.c
    branches/gcc-4_4-branch/libgfortran/io/read.c
    branches/gcc-4_4-branch/libgfortran/io/transfer.c
    branches/gcc-4_4-branch/libgfortran/io/unit.c
    branches/gcc-4_4-branch/libgfortran/io/unix.c
    branches/gcc-4_4-branch/libgfortran/io/write.c
    branches/gcc-4_4-branch/libgfortran/io/write_float.def
    branches/gcc-4_4-branch/libgfortran/libgfortran.h
    branches/gcc-4_4-branch/libgfortran/runtime/backtrace.c
    branches/gcc-4_4-branch/libgfortran/runtime/error.c

Comment 18 Jerry DeLisle 2009-05-27 01:47:09 UTC
Fixed on 4.4, closing
Comment 19 bartoldeman 2009-06-04 04:03:50 UTC
Thanks for all the work. Another text-processing program, which changes some headers in big ASCII files (and is what inspired this bug), went from around
real    0m2.026s
user    0m1.764s
sys     0m0.148s
to around
real    0m0.657s
user    0m0.392s
sys     0m0.140s
It's hard to beat C or even Python here, but it's certainly very good!