This is the mail archive of the java-patches@gcc.gnu.org mailing list for the Java project.


[PATCH] Fix PR libgcj/36252, OutOfMemoryError in String constructor.

From: David Daney <ddaney at avtrex dot com>
To: GCJ-patches <java-patches at gcc dot gnu dot org>
Date: Sat, 17 May 2008 22:53:31 -0700
Subject: [PATCH] Fix PR libgcj/36252, OutOfMemoryError in String constructor.

This bug is caused by ambiguity in the return value of gnu.gcj.convert.BytesToUnicode.read().  A return value of zero indicates that no characters were placed in the output buffer.  There are two possible causes:

1) The output buffer is too small.
2) The input buffer contains an incomplete code sequence.

In String(byte[] input_bytes, int offset, int count, String encoding) we assumed that a zero return always meant the output buffer was too small, so we doubled its length and tried again.  If input_bytes ends in an incomplete code sequence, the loop never terminates: it keeps allocating ever larger char[] arrays until memory is exhausted.
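
To make the failure mode concrete, here is a minimal Java sketch of the loop shape described above. It is not the libgcj code: ToyConverter is a hypothetical stand-in for BytesToUnicode that turns every two input bytes into one char, and the maxChars guard exists only so the demonstration stops instead of exhausting memory.

```java
import java.util.Arrays;

public class AmbiguousZeroDemo {
    /** Hypothetical toy converter: every two input bytes become one char. */
    static final class ToyConverter {
        final byte[] in;
        int inpos;

        ToyConverter(byte[] in) { this.in = in; }

        /** Returns chars written; 0 means "no room" OR "incomplete input". */
        int read(char[] out, int outpos, int avail) {
            int written = 0;
            while (in.length - inpos >= 2 && written < avail) {
                out[outpos + written] =
                    (char) (((in[inpos] & 0xff) << 8) | (in[inpos + 1] & 0xff));
                inpos += 2;
                written++;
            }
            return written;
        }
    }

    /**
     * Runs the pre-patch loop shape and returns how many times the output
     * buffer was doubled. The maxChars guard is demo-only; without it the
     * loop would run until memory is exhausted on truncated input.
     */
    static int growthsBeforeCap(byte[] input, int maxChars) {
        ToyConverter c = new ToyConverter(input);
        char[] array = new char[input.length];
        int outpos = 0;
        int avail = array.length;
        int growths = 0;
        while (c.inpos < input.length) {
            int done = c.read(array, outpos, avail);
            if (done == 0) {
                // Zero is always treated as "buffer too small".
                int newSize = 2 * (outpos + avail);
                if (newSize > maxChars)
                    break; // demo-only guard
                array = Arrays.copyOf(array, newSize);
                avail = newSize - outpos;
                growths++;
            } else {
                outpos += done;
                avail -= done;
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        // Complete input: the buffer never needs to grow.
        System.out.println(growthsBeforeCap(new byte[] {0, 65, 0, 66}, 1 << 20));
        // Truncated input: the buffer grows until the demo guard stops it.
        System.out.println(growthsBeforeCap(new byte[] {0, 65, 0}, 1 << 20));
    }
}
```

With complete input the buffer is never grown; with a single trailing byte the only thing that stops the doubling is the demo guard.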

My fix is to assume that the size of the largest atomic output sequence for *all* encodings is bounded.  If the output buffer already has more room than this bound, then the problem is with the input, and we should stop rather than expand the output buffer again.  I arbitrarily decided that this magic bound is at most 20.  The reasoning is that the worst case is several UTF-16 encoded composing code points strung together, and that would certainly be shorter than 20 characters.

A bonus fix is to catch java.io.CharConversionException and stop the conversion.  This keeps the checked exception from escaping the constructor, which does not declare that it throws it.
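
The two changes can be sketched in Java as follows. This is an illustration under stated assumptions, not the patched natString.cc: ToyConverter, its 0xff "invalid lead byte" rule, and the decode helper are all hypothetical. The sketch also compares the free space (avail) directly against the bound, following the comment in the patch, whereas the patch itself writes the test as avail - outpos > 20.

```java
import java.io.CharConversionException;
import java.util.Arrays;

public class BoundedGrowthDemo {
    // The arbitrary bound on the longest atomic output sequence.
    static final int MAX_ATOMIC_OUTPUT = 20;

    /** Hypothetical toy converter: two bytes per char, 0xff lead byte is invalid. */
    static final class ToyConverter {
        final byte[] in;
        int inpos;

        ToyConverter(byte[] in) { this.in = in; }

        int read(char[] out, int outpos, int avail) throws CharConversionException {
            int written = 0;
            while (in.length - inpos >= 2 && written < avail) {
                if (in[inpos] == (byte) 0xff) {
                    if (written > 0)
                        break; // hand back what we have; throw on the next call
                    throw new CharConversionException("invalid lead byte");
                }
                out[outpos + written] =
                    (char) (((in[inpos] & 0xff) << 8) | (in[inpos + 1] & 0xff));
                inpos += 2;
                written++;
            }
            return written;
        }
    }

    /** Decodes as much as possible; never loops forever and never throws. */
    static String decode(byte[] input) {
        ToyConverter c = new ToyConverter(input);
        char[] array = new char[input.length];
        int outpos = 0;
        int avail = array.length;
        while (c.inpos < input.length) {
            int done;
            try {
                done = c.read(array, outpos, avail);
            } catch (CharConversionException e) {
                // Silently drop the offending data and stop converting.
                break;
            }
            if (done == 0) {
                // Plenty of room but still no output: the input is incomplete.
                if (avail > MAX_ATOMIC_OUTPUT)
                    break;
                int newSize = 2 * (outpos + avail);
                array = Arrays.copyOf(array, newSize);
                avail = newSize - outpos;
            } else {
                outpos += done;
                avail -= done;
            }
        }
        return new String(array, 0, outpos);
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0, 65, 0, 66}));
        System.out.println(decode(new byte[] {0, 65, 0}));
        System.out.println(decode(new byte[] {0, 65, (byte) 0xff, 0, 0, 66}));
    }
}
```

In this sketch, truncated input still costs a few rounds of doubling before the free space exceeds the bound, but the loop is guaranteed to terminate.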

Bootstrapped and tested on x86_64-pc-linux-gnu with no failures in libjava both on the trunk and 4.3 branch.

OK to commit to the trunk and branch?

2008-05-17  David Daney  <ddaney@avtrex.com>

	PR libgcj/36252
	* java/lang/natString.cc: Add
	#include <java/io/CharConversionException.h>.
	(init (byte[], int, int, String)): Catch and ignore
	CharConversionException.  Break out of conversion loop
	on incomplete input.
	* testsuite/libjava.lang/PR36252.java: New test.
	* testsuite/libjava.lang/PR36252.out: New file, its expected output.
	* testsuite/libjava.lang/PR36252.jar: New file, its pre-compiled
	jar file.

Index: testsuite/libjava.lang/PR36252.out
===================================================================
--- testsuite/libjava.lang/PR36252.out	(revision 0)
+++ testsuite/libjava.lang/PR36252.out	(revision 0)
@@ -0,0 +1 @@
+ok
Index: testsuite/libjava.lang/PR36252.jar
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream

Property changes on: testsuite/libjava.lang/PR36252.jar
___________________________________________________________________
Name: svn:mime-type
   + application/octet-stream

Index: testsuite/libjava.lang/PR36252.java
===================================================================
--- testsuite/libjava.lang/PR36252.java	(revision 0)
+++ testsuite/libjava.lang/PR36252.java	(revision 0)
@@ -0,0 +1,16 @@
+import java.io.UnsupportedEncodingException;
+
+public class PR36252
+{
+  public static void main(String[] args)
+  {
+    try {
+      byte[] txt = new byte[] {-55, 87, -55, -42, -55, -20};
+      // This new String(...) should not throw an OutOfMemoryError.
+      String s = new String(txt, 0, 6, "MS932");
+    } catch (UnsupportedEncodingException e) {
+      e.printStackTrace();
+    }
+    System.out.println("ok");
+  }
+}
Index: java/lang/natString.cc
===================================================================
--- java/lang/natString.cc	(revision 135124)
+++ java/lang/natString.cc	(working copy)
@@ -1,6 +1,7 @@
 // natString.cc - Implementation of java.lang.String native methods.
 
-/* Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007  Free Software Foundation
+/* Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006,
+   2007, 2008  Free Software Foundation
 
    This file is part of libgcj.
 
@@ -23,6 +24,7 @@ details.  */
 #include <java/lang/NullPointerException.h>
 #include <java/lang/StringBuffer.h>
 #include <java/io/ByteArrayOutputStream.h>
+#include <java/io/CharConversionException.h>
 #include <java/io/OutputStreamWriter.h>
 #include <java/io/ByteArrayInputStream.h>
 #include <java/io/InputStreamReader.h>
@@ -493,9 +495,28 @@ java::lang::String::init (jbyteArray byt
   converter->setInput(bytes, offset, offset+count);
   while (converter->inpos < converter->inlength)
     {
-      int done = converter->read(array, outpos, avail);
+      int done;
+      try
+	{
+	  done = converter->read(array, outpos, avail);
+	}
+      catch (::java::io::CharConversionException *e)
+	{
+	  // Ignore it and silently throw away the offending data.
+	  break;
+	}
       if (done == 0)
 	{
+	  // done is zero if either there is no space available in the
+	  // output *or* the input is incomplete.  We assume that if
+	  // there are 20 characters available in the output, the
+	  // input must be incomplete and there is no more work to do.
+	  // This means we may skip several bytes of input, but that
+	  // is OK as the behavior is explicitly unspecified in this
+	  // case.
+	  if (avail - outpos > 20)
+	    break;
+
 	  jint new_size = 2 * (outpos + avail);
 	  jcharArray new_array = JvNewCharArray (new_size);
 	  memcpy (elements (new_array), elements (array),

Follow-Ups:
- Re: [PATCH] Fix PR libgcj/36252, OutOfMemoryError in String constructor.
  - From: David Daney
- Re: [PATCH] Fix PR libgcj/36252, OutOfMemoryError in String constructor.
  - From: Tom Tromey
