31501 – libgfortran internal unit I/O performance issues

Bug 31501 - libgfortran internal unit I/O performance issues

Summary: libgfortran internal unit I/O performance issues

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	libfortran (show other bugs)
Version:	4.3.0

Importance:	P3 enhancement
Target Milestone:	---
Assignee:	Jerry DeLisle

URL:
Keywords:

Depends on:	20278 32382
Blocks:
	Show dependency tree / graph

Reported:	2007-04-07 01:43 UTC by Jerry DeLisle
Modified:	2007-10-26 22:43 UTC (History)
CC List:	1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2007-04-07 18:28:02

Attachments
Test case used to get a profile (308 bytes, text/plain) 2007-04-07 01:45 UTC, Jerry DeLisle	Details
Patch to improve read-sf (470 bytes, patch) 2007-04-22 20:24 UTC, Jerry DeLisle	Details \| Diff
Modified patch for further improvement (974 bytes, patch) 2007-04-22 21:18 UTC, Jerry DeLisle	Details \| Diff
Refinement on the previous, using macros (1.37 KB, patch) 2007-04-22 22:09 UTC, Jerry DeLisle	Details \| Diff
Show Obsolete (2) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Jerry DeLisle 2007-04-07 01:43:26 UTC

Keeping in mind that correct is better than fast.  I would like to start looking at refactoring the I/O library to eliminate some of the overhead.  Beginning with internal units where there is no need for alloc buffering like we do now,

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 18.02      1.84     1.84                             next_char
 10.87      2.95     1.11                             _gfortrani_read_sf
  7.84      3.75     0.80                             _gfortrani_is_array_io
  5.83      4.35     0.60                             memcpy
  5.83      4.94     0.60                             push_char
  5.68      5.52     0.58                             read_real
  5.58      6.09     0.57                             ____strtod_l_internal
  5.19      6.62     0.53                             fd_alloc_r_at
  4.60      7.09     0.47                             mem_alloc_r_at
  3.92      7.49     0.40                             eat_spaces
  3.77      7.88     0.39                             _gfortrani_is_stream_io
  2.89      8.17     0.30                             list_formatted_read_scalar
  2.55      8.43     0.26                             _gfortrani_is_internal_unit
  2.25      8.66     0.23                             __read_nocancel
  1.13      8.78     0.12                             _gfortran_transfer_array
  1.08      8.89     0.11                             _gfortrani_convert_real
  1.08      9.00     0.11                             nml_bad_return
  0.88      9.09     0.09                             eat_line
  0.88      9.18     0.09                             str_to_mpn
  0.83      9.26     0.09                             _gfortrani_list_formatted_read
  0.78      9.34     0.08                             __lseek_nocancel

Comment 1 Jerry DeLisle 2007-04-07 01:45:33 UTC

Created attachment 13336 [details]
Test case used to get a profile

This is a reference test case we can use to measure progress.

Comment 2 Francois-Xavier Coudert 2007-04-09 11:07:25 UTC

See PR20278. Do we really want to separate PR for these?

Comment 3 Jerry DeLisle 2007-04-10 00:19:47 UTC

I am not sure this is formatted I/O related.  I will investigate further, but I suspect we are allocating buffer memory to write to memory and we should not have to do that.

Comment 4 Jerry DeLisle 2007-04-18 06:12:06 UTC

Looking at sf_read in transfer.c, we can see that we are reading one character at a time with internal units.  This was done for external units because we can not anticipate where the end of the file is until we hit it.  This does not make sense with internal units because we know the record length at run time.

I should be able to improve this quite a bit by doing one larger read.  I also may be able to do a straight memcpy which will further improve this.

Comment 5 Jerry DeLisle 2007-04-22 20:24:30 UTC

Created attachment 13425 [details]
Patch to improve read-sf

This patch knocks read_sf off the profile.  Thats a start.

Comment 6 Jerry DeLisle 2007-04-22 21:18:45 UTC

Created attachment 13426 [details]
Modified patch for further improvement

This modified patch, gets the time for the test case on my system from about 12 seconds down to about 7.6.

before$ time ./a.out

real    0m11.162s
user    0m10.845s
sys     0m0.306s

after$ time ./a.out
real    0m7.680s
user    0m7.666s
sys     0m0.002s

We probably want to make is_array_io a macro so we can retain code readability.  The big culprit here is now push_char in list_read.c

Comment 7 Jerry DeLisle 2007-04-22 22:09:28 UTC

Created attachment 13427 [details]
Refinement on the previous, using macros

With this patch I replaced is_array_io, is_internal_unit, and is_stream_io with macros.  Now on my system I get this down to 7 seconds.

$ time ./a.out

real    0m6.954s
user    0m6.951s
sys     0m0.001s

Comment 8 patchapp@dberlin.org 2007-04-22 23:40:19 UTC

Subject: Bug number PR31501

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2007-04/msg01439.html

Comment 9 Jerry DeLisle 2007-04-29 00:23:49 UTC

Subject: Bug 31501

Author: jvdelisle
Date: Sun Apr 29 00:23:35 2007
New Revision: 124266

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124266
Log:
2007-04-28  Jerry DeLisle  <jvdelisle@gcc.gnu.org>

	PR libfortran/31501
	* io/list_read.c (next_char): Fix whitespace.
	* io/io.h: Remove prototypes and define macros for is_array_io,
	is_stream_io, and is_internal_unit.
	* io/unit.c (is_array_io), (is_internal_unit), (is_stream_io): Delete
	these functions.
	* io/transfer.c (read_sf): Change handling of internal_unit to make a
	single call to salloc_r and use memcpy to transfer the data. 

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/io.h
    trunk/libgfortran/io/list_read.c
    trunk/libgfortran/io/transfer.c
    trunk/libgfortran/io/unit.c

Comment 10 Jerry DeLisle 2007-06-22 20:20:12 UTC

Keeping track of these here.

Comment 11 Francois-Xavier Coudert 2007-10-26 22:43:33 UTC

Probably too generic to be useful, and we're not doing so bad. Closing (with Jerry's approval).