This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.



Re: Bug in natString.cc / command line args wrong


>>>>> "Martin" == Martin Kahlert <martin.kahlert@infineon.com> writes:

Martin> Here is a bug I found with the help of a Solaris 2.8 iconv bug.
Martin> Here is a patch:

Martin> --- natString.cc.old	Thu Sep 19 11:20:47 2002
Martin> +++ natString.cc	Thu Sep 19 11:21:44 2002
Martin> @@ -506,7 +506,7 @@
Martin>    converter->setInput(bytes, offset, offset+count);
Martin>    while (converter->inpos < converter->inlength)
Martin>      {
Martin> -      int done = converter->read(array, outpos, avail);
Martin> +      int done = converter->read(array, outpos, count);
Martin>        if (done == 0)
Martin>  	{
Martin>  	  jint new_size = 2 * (outpos + avail);
Martin> @@ -520,6 +520,7 @@
Martin>  	{
Martin>  	  outpos += done;
Martin>  	  avail -= done;
Martin> +         count -= done;
Martin>  	}
Martin>      }
Martin>    converter->done ();

Martin> converter->read takes the number of chars to convert, not the
Martin> number of bytes available in the output buffer!

I'm afraid I don't understand.

My understanding is that the 3rd argument to converter->read is the
number of characters we can convert -- that is, the number of free
slots in the output buffer.

So to me it looks like the current code is ok.
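
To make that reading concrete, here is a minimal sketch of the loop under the assumption that the third argument to read is the number of free output slots. ToyConverter and decode are hypothetical stand-ins invented for illustration, not the libgcj classes; the toy converter simply widens each input byte to one 16-bit character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the libgcj converter (not the real class):
// widens each input byte to one 16-bit character.
struct ToyConverter {
  const unsigned char *in = nullptr;
  std::size_t inpos = 0, inlength = 0;

  void setInput(const unsigned char *bytes, std::size_t start, std::size_t end) {
    in = bytes;
    inpos = start;
    inlength = end;
  }

  // Converts at most `avail` characters into out[outpos...] and returns
  // how many were written; returns 0 when there are no free slots.
  std::size_t read(std::vector<char16_t> &out, std::size_t outpos,
                   std::size_t avail) {
    std::size_t done = 0;
    while (done < avail && inpos < inlength)
      out[outpos + done++] = in[inpos++];
    return done;
  }
};

// Mirrors the shape of the loop in the quoted hunk: pass the free space
// (`avail`) to read, and double the buffer whenever nothing was converted.
std::u16string decode(const std::string &bytes, std::size_t initial_capacity) {
  assert(initial_capacity > 0);  // a zero-sized buffer could never grow
  ToyConverter conv;
  conv.setInput(reinterpret_cast<const unsigned char *>(bytes.data()), 0,
                bytes.size());
  std::vector<char16_t> array(initial_capacity);
  std::size_t outpos = 0, avail = initial_capacity;
  while (conv.inpos < conv.inlength) {
    std::size_t done = conv.read(array, outpos, avail);
    if (done == 0) {
      std::size_t new_size = 2 * (outpos + avail);
      array.resize(new_size);
      avail = new_size - outpos;
    } else {
      outpos += done;
      avail -= done;
    }
  }
  return std::u16string(array.begin(), array.begin() + outpos);
}
```

With this interpretation, passing avail keeps read from writing past the end of the array, and a return value of 0 is the signal to grow it.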

Martin>   size_t outavail = count * sizeof (jchar);

Martin> Thus outavail is set to twice the number of input characters.

Yes.  iconv() uses byte counts for its arguments.  So we convert from
a character count to a byte count.
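
As a small illustration of that conversion (a sketch only: jchar_t below is an assumed 16-bit stand-in for jchar, and the helper names are made up), the character count is scaled up before the call, and the number of characters produced is recovered from the byte count that iconv() leaves in its out-bytes-left argument:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed stand-in for jchar: an unsigned 16-bit code unit.
using jchar_t = std::uint16_t;

// Free output slots (characters) -> free output space (bytes), which is
// the unit iconv() expects.
std::size_t chars_to_bytes(std::size_t count) {
  return count * sizeof(jchar_t);
}

// After the call, iconv() has decremented the out-bytes-left value, so the
// number of characters written is the consumed byte space divided by the
// character size.
std::size_t chars_written(std::size_t count, std::size_t outavail_after) {
  std::size_t outavail_before = chars_to_bytes(count);
  assert(outavail_after <= outavail_before);
  return (outavail_before - outavail_after) / sizeof(jchar_t);
}
```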

Martin> With the current setting of OUTSIZE = 2 Solaris 2.8 spits out
Martin> "iconv: Arg list too long".  If OUTSIZE >= 4 it works.  Some of
Martin> Sun's people seem to have a problem with their math ;-)

Ok.

Martin> Because there is a fall through in
Martin> gnu::gcj::convert::Input_iconv::read for E2BIG the value
Martin> returned by converter->read for done is 0.

We have some known problems in that code -- workarounds for iconv bugs
in an earlier glibc.  The appended patch, which I still haven't
checked in, will probably fix the bug you are seeing.  Could you try
it?

glibc 2.1.3 is pretty old, and nobody responded to my query asking for
testing when I first sent this patch out (I don't have such a system).
So if this patch works for you, I'll check it in.

Tom


Index: ChangeLog
from  Tom Tromey  <tromey@redhat.com>

	* gnu/gcj/convert/natIconv.cc (write): Handle case where
	output buffer is too small.

Index: gnu/gcj/convert/natIconv.cc
===================================================================
RCS file: /cvs/gcc/gcc/libjava/gnu/gcj/convert/natIconv.cc,v
retrieving revision 1.13
diff -u -r1.13 natIconv.cc
--- gnu/gcj/convert/natIconv.cc 18 Feb 2002 02:52:44 -0000 1.13
+++ gnu/gcj/convert/natIconv.cc 16 Aug 2002 21:37:39 -0000
@@ -1,6 +1,6 @@
-// Input_iconv.java -- Java side of iconv() reader.
+// natIconv.cc -- Java side of iconv() reader.
 
-/* Copyright (C) 2000, 2001  Free Software Foundation
+/* Copyright (C) 2000, 2001, 2002  Free Software Foundation
 
    This file is part of libgcj.
 
@@ -201,25 +201,39 @@
       inbuf = (char *) temp_buffer;
     }
 
-  // If the conversion fails on the very first character, then we
-  // assume that the character can't be represented in the output
-  // encoding.  There's nothing useful we can do here, so we simply
-  // omit that character.  Note that we can't check `errno' because
-  // glibc 2.1.3 doesn't set it correctly.  We could check it if we
-  // really needed to, but we'd have to disable support for 2.1.3.
   size_t loop_old_in = old_in;
   while (1)
     {
       size_t r = iconv_adapter (iconv, (iconv_t) handle,
 				&inbuf, &inavail,
 				&outbuf, &outavail);
-      if (r == (size_t) -1 && inavail == loop_old_in)
+      if (r == (size_t) -1)
 	{
-	  inavail -= 2;
-	  if (inavail == 0)
-	    break;
-	  loop_old_in -= 2;
-	  inbuf += 2;
+	  if (errno == EINVAL)
+	    {
+	      // Incomplete byte sequence at the end of the input
+	      // buffer.  This shouldn't be able to happen here.
+	      break;
+	    }
+	  else if (errno == E2BIG)
+	    {
+	      // Output buffer is too small.
+	      break;
+	    }
+	  else if (errno == EILSEQ || inavail == loop_old_in)
+	    {
+	      // Untranslatable sequence.  Since glibc 2.1.3 doesn't
+	      // properly set errno, we also assume that this is what
+	      // is happening if no conversions took place.  (This can
+	      // be a bogus assumption if in fact the output buffer is
+	      // too small.)  We skip the first character and try
+	      // again.
+	      inavail -= 2;
+	      if (inavail == 0)
+		break;
+	      loop_old_in -= 2;
+	      inbuf += 2;
+	    }
 	}
       else
 	break;
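
For readers who want the decision logic of the new error branch in isolation, here is a rough sketch. The enum and the function name are invented for illustration and are not part of the patch; the function only restates which branch the patched loop would take for a failed iconv() call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical names for the outcomes of the error branch in the patch.
enum class IconvFailure {
  IncompleteInput,  // EINVAL: truncated byte sequence at end of input; stop
  OutputFull,       // E2BIG: output buffer too small; stop so the caller can grow it
  SkipCharacter,    // EILSEQ, or no progress at all: drop one character, retry
  Retry             // none of the above: the loop simply calls iconv() again
};

// `err` is errno after a failed call, `inavail` the input bytes still left,
// `loop_old_in` the input bytes available before the call.
IconvFailure classify_failure(int err, std::size_t inavail,
                              std::size_t loop_old_in) {
  assert(inavail <= loop_old_in);  // iconv() never increases the bytes left
  if (err == EINVAL)
    return IconvFailure::IncompleteInput;
  if (err == E2BIG)
    return IconvFailure::OutputFull;
  // glibc 2.1.3 may not set errno, so "nothing was consumed" is treated
  // the same way as an untranslatable sequence.
  if (err == EILSEQ || inavail == loop_old_in)
    return IconvFailure::SkipCharacter;
  return IconvFailure::Retry;
}
```

The ordering matters: checking E2BIG before the no-progress test is what keeps a too-small output buffer from being mistaken for an untranslatable character on systems that do set errno.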

