This is the mail archive of the java-patches@gcc.gnu.org mailing list for the Java project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Patch] Improve/fix gnu.java.net.LineInputStream...


gnu.java.net.LineInputStream has at least one bug in it, but think its whole approach is incorrect.

First the bug:

             len = in.available();
             len = (len < MIN_LENGTH) ? MIN_LENGTH : len;

I think the idea was to read all available bytes into its buffer (but not more that the size of the buffer). However the conditional was reversed leading it to try to read more than is available. This can cause LineInputStream to block even if enough data is available to form an entire line.

I have not fully researched it but this was causing HTTP connections to block for several seconds. I think under the proper circumstances they could block forever, but I am not positive about this.

The main problem I have with LineInputStream, is that in all cases I know about it is either reading from a raw socket or a BufferedInputStream.

In the raw socket case LineInputStream reads one character at a time (as is required). If the stream supports mark/reset then LineInputStream reads big blocks of data and then resets if it finds that it read too far.

My problem with this is that if the stream supports mark/reset, then it is already buffered and additional buffering is unlikely to produce any additional benefit. An as an added bad point you are using more memory for the redundant buffer.

I did take the liberty of adding my own micro-optimization, in that if the encoding is US-ASCII, we can skip using String's character encoding system and just request hibyte of 0. I did this because a year ago with libgcj-3.4.3 we were seeing a vast increase in speed doing this in a different situation.

Note that this micro-optimization is dependent on my ByteArrayOutputStream patch.


Bootstrapped and tested with make -k check in libjava on i686-pc-linux-gnu.


2005-09-12 David Daney <ddaney@avtrex.com>

	* classpath/gnu/java/net/LineInputStream.java (blockReads): Removed.
	(Constructor): Don't initialize blockReads.
	(bufToString): New method.
	(readLine): Removed block reading logic.
	(indexOf): Removed.

I suppose if this is OK that I would commit it to classpath (when my CVS access is enabled).

Opinions?


David Daney.
Index: classpath/gnu/java/net/LineInputStream.java
===================================================================
RCS file: /cvs/gcc/gcc/libjava/classpath/gnu/java/net/LineInputStream.java,v
retrieving revision 1.1.1.1
diff -c -p -r1.1.1.1 LineInputStream.java
*** classpath/gnu/java/net/LineInputStream.java	16 Jul 2005 00:33:56 -0000	1.1.1.1
--- classpath/gnu/java/net/LineInputStream.java	12 Sep 2005 18:30:39 -0000
***************
*** 1,5 ****
  /* LineInputStream.java --
!    Copyright (C) 2002, 2003, 2004  Free Software Foundation, Inc.
  
  This file is part of GNU Classpath.
  
--- 1,5 ----
  /* LineInputStream.java --
!    Copyright (C) 2002, 2003, 2004, 2005  Free Software Foundation, Inc.
  
  This file is part of GNU Classpath.
  
*************** public class LineInputStream
*** 67,77 ****
    private boolean eof;
  
    /**
-    * Whether we can use block reads.
-    */
-   private final boolean blockReads;
- 
-   /**
     * Constructor using the US-ASCII character encoding.
     * @param in the underlying input stream
     */
--- 67,72 ----
*************** public class LineInputStream
*** 91,99 ****
      buf = new ByteArrayOutputStream();
      this.encoding = encoding;
      eof = false;
-     blockReads = in.markSupported();
    }
  
    /**
     * Read a line of input.
     */
--- 86,107 ----
      buf = new ByteArrayOutputStream();
      this.encoding = encoding;
      eof = false;
    }
  
+ 
+   private String bufToString() throws IOException
+   {
+     if ("US-ASCII".equals(encoding))
+       {
+         return buf.toString(0);
+       }
+     else
+       {
+         return buf.toString(encoding);
+       }
+   }
+     
+ 
    /**
     * Read a line of input.
     */
*************** public class LineInputStream
*** 106,198 ****
        }
      do
        {
!         if (blockReads)
!           {
!             // Use mark and reset to read chunks of bytes
!             final int MIN_LENGTH = 1024;
!             int len, pos;
!             
!             len = in.available();
!             len = (len < MIN_LENGTH) ? MIN_LENGTH : len;
!             byte[] b = new byte[len];
!             in.mark(len);
!             // Read into buffer b
!             len = in.read(b, 0, len);
!             // Handle EOF
!             if (len == -1)
!               {
!                 eof = true;
!                 if (buf.size() == 0)
!                   {
!                     return null;
!                   }
!                 else
!                   {
!                     // We don't care about resetting buf
!                     return buf.toString(encoding);
!                   }
!               }
!             // Get index of LF in b
!             pos = indexOf(b, len, (byte) 0x0a);
!             if (pos != -1)
!               {
!                 // Write pos bytes to buf
!                 buf.write(b, 0, pos);
!                 // Reset stream, and read pos + 1 bytes
!                 in.reset();
!                 pos += 1;
!                 while (pos > 0)
!                   {
!                     len = in.read(b, 0, pos);
!                     pos = (len == -1) ? -1 : pos - len;
!                   }
!                 // Return line
!                 String ret = buf.toString(encoding);
!                 buf.reset();
!                 return ret;
!               }
!             else
!               {
!                 // Append everything to buf and fall through to re-read.
!                 buf.write(b, 0, len);
!               }
!           }
!         else
            {
!             // We must use character reads in order not to read too much
!             // from the underlying stream.
!             int c = in.read();
!             switch (c)
                {
!               case -1:
!                 eof = true;
!                 if (buf.size() == 0)
!                   {
!                     return null;
!                   }
!                 // Fall through and return contents of buffer.
!               case 0x0a:                // LF
!                 String ret = buf.toString(encoding);
!                 buf.reset();
!                 return ret;
!               default:
!                 buf.write(c);
                }
            }
        }
      while (true);
    }
- 
-   private int indexOf(byte[] b, int len, byte c)
-   {
-     for (int pos = 0; pos < len; pos++)
-       {
-         if (b[pos] == c)
-           {
-             return pos;
-           }
-       }
-     return -1;
-   }
  }
  
--- 114,140 ----
        }
      do
        {
!         // We must use character reads in order not to read too much
!         // from the underlying stream.
!         int c = in.read();
!         switch (c)
            {
!           case -1:
!             eof = true;
!             if (buf.size() == 0)
                {
!                 return null;
                }
+             // Fall through and return contents of buffer.
+           case 0x0a:                // LF
+             String ret = bufToString();
+             buf.reset();
+             return ret;
+           default:
+             buf.write(c);
            }
        }
      while (true);
    }
  }
  

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]