This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: [patch,libgfortran] PR37754 [4.4 Regression] READ I/O Performance regression from 4.3 to 4.4


Jack Howarth wrote:
On Thu, Dec 25, 2008 at 09:32:57AM -0800, Jerry DeLisle wrote:
This is a Merry Christmas patch.

This patch recovers the performance from this regression by creating a stream read_char function which is simply a trimmed down version of sread (fd_read). I was actually surprised when I saw the test results. I suspect that the simplification allows some better optimizations.
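For readers who have not looked at the patch, the general idea of a trimmed-down single-character read can be sketched as below. This is only an illustration under stated assumptions, not the code in the patch: struct demo_stream, DEMO_BUFSIZE, demo_stream_init, demo_refill and demo_read_char are hypothetical names, and the real sread (fd_read) path in libgfortran is not reproduced here. The only system call used is POSIX read().

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define DEMO_BUFSIZE 8192   /* hypothetical buffer size, chosen for the sketch */

/* Hypothetical buffered stream; NOT libgfortran's stream structure. */
struct demo_stream {
    int fd;                           /* descriptor to read from */
    size_t pos;                       /* index of the next unread byte in buf */
    size_t len;                       /* number of valid bytes in buf */
    unsigned char buf[DEMO_BUFSIZE];
};

static void demo_stream_init(struct demo_stream *s, int fd)
{
    assert(s != NULL);
    s->fd = fd;
    s->pos = 0;
    s->len = 0;
}

/* Slow path: refill the buffer with one read() call.
   Returns 0 on success, -1 on end of file or error. */
static int demo_refill(struct demo_stream *s)
{
    ssize_t n = read(s->fd, s->buf, sizeof s->buf);
    if (n <= 0)
        return -1;
    s->pos = 0;
    s->len = (size_t) n;
    return 0;
}

/* Fast path: return the next byte (0..255), or -1 at end of file or on error.
   When the buffer still holds data this is one comparison and one load. */
static inline int demo_read_char(struct demo_stream *s)
{
    if (s->pos >= s->len && demo_refill(s) != 0)
        return -1;
    return s->buf[s->pos++];
}
```

A caller would initialize the stream once with demo_stream_init and then loop on demo_read_char until it returns -1. The sketch leaves out EINTR retries, seeking and record handling, all of which a real runtime library has to deal with.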

The patch also refactors next_char in list_read.c to eliminate gotos and inline a small portion of the "done:" code. The refactoring of next_char alone gains 2.8% over current trunk. The use of the new read_char function gains significant additional performance.
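The kind of restructuring described above can be illustrated with a toy reader. This is a sketch under assumptions and not the actual next_char from list_read.c: struct demo_reader and the functions demo_next_char_goto, demo_finish and demo_next_char are made-up names, and the bookkeeping done after the "done:" label is reduced to a single end-of-line flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reader state, for illustration only. */
struct demo_reader {
    const char *src;   /* NUL-terminated input */
    size_t pos;        /* index of the next unread character */
    int pushed;        /* one pushed-back character, or -1 if none */
    int at_eol;        /* set when the last character returned was '\n' */
};

/* Shape before refactoring: every path jumps to a shared "done:" label. */
static int demo_next_char_goto(struct demo_reader *r)
{
    int c;

    assert(r != NULL && r->src != NULL);
    if (r->pushed != -1) {
        c = r->pushed;
        r->pushed = -1;
        goto done;
    }
    if (r->src[r->pos] == '\0') {
        c = '\n';                 /* treat end of input as end of line */
        goto done;
    }
    c = (unsigned char) r->src[r->pos++];

done:
    r->at_eol = (c == '\n');
    return c;
}

/* Shape after refactoring: the small epilogue is a helper that the
   compiler is free to inline into each return path. */
static inline int demo_finish(struct demo_reader *r, int c)
{
    r->at_eol = (c == '\n');
    return c;
}

static int demo_next_char(struct demo_reader *r)
{
    assert(r != NULL && r->src != NULL);
    if (r->pushed != -1) {
        int c = r->pushed;
        r->pushed = -1;
        return demo_finish(r, c);
    }
    if (r->src[r->pos] == '\0')
        return demo_finish(r, '\n');
    return demo_finish(r, (unsigned char) r->src[r->pos++]);
}
```

Both versions return the same characters and leave the same state behind; the second simply has straight-line return paths, which may give the optimizer more to work with.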

Using the countlines.f test case in the PR for comparison, averaged over 5 runs:

gfortran 4.3: 3.357 seconds

gfortran 4.4 current trunk: 3.821 seconds

gfortran 4.4 patched: 3.164 seconds

This is a 5.7% improvement over 4.3 for this test case and 17% improvement over current trunk.
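The percentages follow directly from the three timings above, taking the improvement as (old - new) / old. A small sketch of that arithmetic; percent_faster and the constant names are illustrative and not part of the patch or the test case:

```c
#include <assert.h>

/* Percentage by which new_time improves on old_time
   (positive means the new time is faster). */
static double percent_faster(double old_time, double new_time)
{
    assert(old_time > 0.0);
    return 100.0 * (old_time - new_time) / old_time;
}

/* Timings quoted in the message, in seconds. */
static const double T_43 = 3.357;          /* gfortran 4.3 */
static const double T_44_TRUNK = 3.821;    /* gfortran 4.4 current trunk */
static const double T_44_PATCHED = 3.164;  /* gfortran 4.4 patched */
```

percent_faster(T_43, T_44_PATCHED) comes out at about 5.7 and percent_faster(T_44_TRUNK, T_44_PATCHED) at about 17.2, which agrees with the figures quoted.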

I also believe this refactoring will make further improvements easier. I don't know the status of Janne's patch, so this patch may end up being short-lived. However, it is not very intrusive, in the sense that it mostly reorganizes our existing code paths in simple ways. Since it involves a regression, I think it would be OK for 4.4.

Regression tested on x86-64.

OK to commit?

Jerry

Jerry, I am seeing about a 10% performance improvement with the patch when using...

gfortran -O countlines.f

to compile the testcase and using the temp4 file created by the maketemp4.f
program in the PR. I used the average of the last five of ten runs each time to
minimize the effects of any disk caching. What did you use for the test file? I
noticed that the temp4 file has identical lines. It may be fair to keep the same
line length, but we should probably randomize the contents of the lines.
              Jack
ps This was on x86_64-apple-darwin10.
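The two measurement ideas in this message (average only the later runs, and keep the line length fixed while randomizing the contents) can be sketched as follows. This is a hypothetical helper, not the maketemp4.f program from the PR: mean_of_last and fill_random_line are invented names, and the small linear congruential generator is just one deterministic way to get varied letters.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of the last k of n timings; the earlier runs are treated as
   warm-up so that disk caching affects every counted run equally. */
static double mean_of_last(const double *t, size_t n, size_t k)
{
    assert(t != NULL && k > 0 && k <= n);
    double sum = 0.0;
    for (size_t i = n - k; i < n; i++)
        sum += t[i];
    return sum / (double) k;
}

/* Fill line[0..len-1] with pseudo-random lowercase letters, followed by
   '\n' and a terminating NUL. line must have room for len + 2 bytes.
   *state is the generator state; the same seed gives the same file. */
static void fill_random_line(char *line, size_t len, unsigned int *state)
{
    assert(line != NULL && state != NULL);
    for (size_t i = 0; i < len; i++) {
        *state = *state * 1103515245u + 12345u;   /* small LCG step */
        line[i] = (char) ('a' + (*state >> 16) % 26u);
    }
    line[len] = '\n';
    line[len + 1] = '\0';
}
```

Calling fill_random_line repeatedly with the same state variable and writing each result out would give a test file whose lines all have the same length but different contents; mean_of_last(times, 10, 5) reproduces the "last five of ten" average.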

Yes, I used the temp4 file.

So it appears we do get some performance benefit, in a system-dependent way. No one is seeing degradation from the patch, so that's a plus.

Thanks for testing. Waiting to hear from Janne.

Jerry

