[cp-patches] [Patch] Improve/fix gnu.java.net.LineInputStream...

Chris Burdess dog@bluezoo.org
Tue Sep 13 08:15:00 GMT 2005


David Daney wrote:
> gnu.java.net.LineInputStream has at least one bug in it, but think its 
> whole approach is incorrect.
>
> First the bug:
>
>              len = in.available();
>              len = (len < MIN_LENGTH) ? MIN_LENGTH : len;
>
> I think the idea was to read all available bytes into its buffer (but 
> not more that the size of the buffer).  However the conditional was 
> reversed leading it to try to read more than is available.  This can 
> cause LineInputStream to block even if enough data is available to 
> form an entire line.
>
> I have not fully researched it but this was causing HTTP connections 
> to block for several seconds.  I think under the proper circumstances 
> they could block forever, but I am not positive about this.

If that's the case, there is a problem with the underlying stream. len 
is simply initialised to a reasonable value: either 1024 bytes, or more 
if the underlying input stream states that more can be read. The 
underlying stream is not required to read len bytes. It should read as 
many bytes as it can, and return the number of bytes read. The minimum 
value is to prevent LineInputStream trying to allocate a buffer of 0 or 
-1 bytes when the underlying stream doesn't know how many bytes are 
available. We can reduce this minimum value to 1 if it's causing 
problems.

> The main problem I have with LineInputStream, is that in all cases I 
> know about it is either reading from a raw socket or a 
> BufferedInputStream.
>
> In the raw socket case LineInputStream reads one character at a time 
> (as is required).  If the stream supports mark/reset then 
> LineInputStream reads big blocks of data and then resets if it finds 
> that it read too far.
>
> My problem with this is that if the stream supports mark/reset, then 
> it is already buffered and additional buffering is unlikely to produce 
> any additional benefit.  An as an added bad point you are using more 
> memory for the redundant buffer.

What additional buffering? The line buffer? It's hardly redundant, 
since you're not storing the same bytes in two places and we're freeing 
up the underlying stream's buffer to read more bytes. If the line is 
longer than the underlying stream's buffer, or the underlying stream is 
not buffered, it's necessary.

> I did take the liberty of adding my own micro-optimization, in that if 
> the encoding is US-ASCII, we can skip using String's character 
> encoding system and just request hibyte of 0.  I did this because a 
> year ago with  libgcj-3.4.3 we were seeing a vast increase in speed 
> doing this in a different situation.

This micro-optimisation should be applied to 
ByteArrayOutputStream.toString, not here.

> 	* classpath/gnu/java/net/LineInputStream.java (blockReads): Removed.
> 	(Constructor): Don't initialize blockReads.
> 	(bufToString): New method.
> 	(readLine): Removed block reading logic.
> 	(indexOf): Removed.
>
> I suppose if this is OK that I would commit it to classpath (when my 
> CVS access is enabled).
>
> Opinions?

Really bad idea. You're removing block reading capabilities, for what?
-- 
Chris Burdess



More information about the Java-patches mailing list