On Thu, Dec 25, 2008 at 09:32:57AM -0800, Jerry DeLisle wrote:
This patch recovers the performance from this regression by creating a
stream read_char function which is simply a trimmed down version of sread
(fd_read). I was actually surprised when I saw the test results. I
suspect that the simplification allows some better optimizations.
The patch also refactors next_char in list_read.c to eliminate gotos and
inline a small portion of the "done:" code. The refactoring of next_char
alone gains 2.8% over current trunk. The use of the new read_char function
gains significant additional performance.
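
For readers without the patch at hand, here is a minimal sketch of the
general idea behind a trimmed-down, buffered single-character read. It is
not the code from the patch and not libgfortran's stream API: the names
sketch_stream, sketch_init, sketch_read_char and sketch_count_lines, and
the 8192-byte buffer size, are all hypothetical and chosen only for
illustration. It assumes a POSIX read(2).

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define SKETCH_BUFSIZE 8192   /* assumed buffer size, purely illustrative */

/* Hypothetical stream type; NOT libgfortran's stream structure. */
typedef struct {
    int fd;                    /* underlying file descriptor */
    size_t pos;                /* index of the next unread byte in buf */
    size_t len;                /* number of valid bytes in buf */
    char buf[SKETCH_BUFSIZE];
} sketch_stream;

static void sketch_init(sketch_stream *s, int fd)
{
    assert(fd >= 0);
    s->fd = fd;
    s->pos = 0;
    s->len = 0;
}

/* Return the next byte (0..255), or -1 on end of file or read error.
   The common case is one comparison and one load; the buffer is
   refilled only when it has been fully consumed. */
static int sketch_read_char(sketch_stream *s)
{
    if (s->pos == s->len) {
        ssize_t n = read(s->fd, s->buf, sizeof s->buf);
        if (n <= 0)
            return -1;         /* EOF and errors are not distinguished here */
        s->len = (size_t) n;
        s->pos = 0;
    }
    return (unsigned char) s->buf[s->pos++];
}

/* Count newline characters, loosely mirroring a line-counting test. */
static long sketch_count_lines(sketch_stream *s)
{
    long lines = 0;
    int c;

    while ((c = sketch_read_char(s)) != -1)
        if (c == '\n')
            lines++;
    return lines;
}
```

The point of the sketch is only that a narrow, single-purpose read path
leaves the compiler little to do per character beyond a bounds check.
Whether that explains the measured gain is a guess, as noted above.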
Using the countlines.f test case in the PR for comparison (average of 5 runs):
gfortran 4.3:                3.357 seconds
gfortran 4.4 current trunk:  3.821 seconds
gfortran 4.4 patched:        3.164 seconds
This is a 5.7% improvement over 4.3 for this test case and 17% improvement
over current trunk.
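
The percentages can be rechecked from the three timings, assuming
"improvement" means the reduction in run time relative to the older
figure. The helper name percent_faster below is hypothetical.

```c
/* Percent reduction in run time when going from `before` to `after`,
   measured relative to `before`. */
static double percent_faster(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* With the timings quoted above:
     percent_faster(3.357, 3.164)  is about  5.7  (patched vs. 4.3)
     percent_faster(3.821, 3.164)  is about 17.2  (patched vs. trunk) */
```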
Here are some numbers using /usr/bin/time on i686-*-freebsd (times in seconds):

          real    user    sys
4.2.5    34.33   24.91   8.82   dynamic linked
4.3.3    30.49   21.48   8.90   dynamic linked
4.4.0    29.28   20.48   8.71   static linked, w/o patch
4.4.0    30.51   20.83   9.32   static linked, patched

This is reading a 17.2 million line file with lines ranging from
0 to 100 or so characters. The numbers are the averages of 5
consecutive runs. The patch does not appear to either help or
hinder gfortran on FreeBSD. I'll leave it to Janne to review
since he's working in this area.