This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: gcj's IO performance vs blackdown JDK
- From: Dalibor Topic <robilad at kaffe dot org>
- To: Bryce McKinlay <bryce at mckinlay dot net dot nz>
- Cc: gnustuff at thisiscool dot com, GCC-Java Java <java at gcc dot gnu dot org>
- Date: Fri, 09 Jan 2004 16:34:50 +0100
- Subject: Re: gcj's IO performance vs blackdown JDK
- References: <AUA507KHXRDATQ4W1TMKUPGLFZU86.3fec7709@p733> <359666B8-424A-11D8-AB43-003065F97F7C@mckinlay.net.nz>
Hi Bryce,
Bryce McKinlay wrote:
Here's my BufferedReader and InputStreamReader improvements. Perhaps you
would like to bake these versions off against your modifications. In the
case of BufferedReader the new code is a lot simpler as well as
potentially more efficient, since it gets rid of all the special-case
checks for "\r" at the end of each line.
Sorry for coming late into this BufferedReader discussion. Since
according to the CVS you wrote the huge chunk of the \r\n handling code,
it would be interesting to hear why it's possible to get rid of it now.
(A diff would have been better to see what exactly the changes are
between your implementation and the one in Classpath. A ChangeLog, too ;))
public void mark(int readLimit) throws IOException
{
if (readLimit < 0)
throw new IllegalArgumentException();
synchronized (lock)
{
checkStatus();
markPos = pos;
markLimit = readLimit;
}
}
Doesn't seem to handle the case when you put the mark on \r of a \r\n
sequence properly. Citing from Classpath's implementations (i.e. your)
comments:
// In this method we need to be aware of the special case where
// pos + 1 == limit. This indicates that a '\r' was the last char
// in the buffer during a readLine. We'll want to maintain that
// condition after we shift things around and if a larger buffer is
// needed to track readLimit, we'll have to make it one element
// larger to ensure we don't invalidate the mark too early, if the
// char following the '\r' is NOT a '\n'. This is ok because, per
// the spec, we are not required to invalidate when passing readLimit.
/**
* Reset the stream to the point where the <code>mark()</code> method
* was called. Any chars that were read after the mark point was set will
* be re-read during subsequent reads.
* <p>
* This method will throw an IOException if the number of chars read from
* the stream since the call to <code>mark()</code> exceeds the mark limit
* passed when establishing the mark.
*
* @exception IOException If an error occurs;
*/
public void reset() throws IOException
{
synchronized (lock)
{
checkStatus();
if (markPos < 0)
throw new IOException("mark never set or invalidated");
pos = markPos;
}
}
Citing from Classpath's (i.e. your) implementation:
// Need to handle the extremely unlikely case where a readLine was
// done with a '\r' as the last char in the buffer; which was then
// immediately followed by a mark and a reset with NO intervening
// read of any sort. In that case, setting pos to markPos would
// lose that info and a subsequent read would thus not skip a '\n'
// (if one exists).
public int read(char[] buf, int offset, int count) throws IOException
{
synchronized (lock)
{
checkStatus();
if (pos == limit && !refill())
return -1;
int charsTotal = 0;
while (true)
{
int charsInBuffer = limit - pos;
int charsToRead = Math.min(charsInBuffer, count - charsTotal);
System.arraycopy(buffer, pos, buf, offset, charsToRead);
offset += charsToRead;
pos += charsToRead;
charsTotal += charsToRead;
if (charsTotal == count || !refill())
return charsTotal;
}
}
}
You don't seem to do any special-casing for \r\n here. A simple
arraycopy that ends on a '\r' of a '\r\n' sequence would produce wrong
results in a subsequent readLine(). Instead of reading the characters
after the \n, your code would read just the \n that belongs to the
previous \r\n and return an empty line.
/** Read more data into the buffer, updating pos and limit appropriately,
* and taking mark and markLimit into account. Assumes pos==limit initially.
* Return true if chars were added to the buffer, or false on EOF. */
private boolean refill() throws IOException
I think this function could yield a significant perfomance boost if we
used a circular buffer in the BufferedReader. Then the arraycopys to
shuffle bytes around to the beginning may not be necessary.
public String readLine() throws IOException
{
[snip]
// Consume the line termination character(s).
I think that's a bad idea, because it can cause deadlocks. See
http://developer.apple.com/technotes/tn/tn1157.html . Citing from it :
"Fixing the server
If you happen to be developing the server as well, and can make changes
to its code, then there's an even better fix you can make on the server
side instead of changing the client. In the previous section, I
described how the server blocks waiting for a character to follow the CR
so it can tell whether it's an LF or not. A more intelligent way to
handle nonstandard line breaks is for the server to process the input
line immediately once it receives the CR, but to set a flag that
indicates that if the next character received is an LF it should be ignored.
This way the server will not block after it receives the CR, will
process the command and return a response, and the client code will
receive the response and continue running. No deadlock."
cheers,
dalibor topic