This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.
Re: Bug in natString.cc / command line args wrong
- From: Tom Tromey <tromey at redhat dot com>
- To: martin dot kahlert at infineon dot com
- Cc: java at gcc dot gnu dot org, java-patches at gcc dot gnu dot org
- Date: 19 Sep 2002 19:45:58 -0600
- Subject: Re: Bug in natString.cc / command line args wrong
- References: <20020919115213.A19325@keksy.muc.infineon.com>
- Reply-to: tromey at redhat dot com
>>>>> "Martin" == Martin Kahlert <martin.kahlert@infineon.com> writes:
Martin> Here is a bug i found with the help of a Solaris 2.8 iconv bug.
Martin> Here is a patch:
Martin> --- natString.cc.old Thu Sep 19 11:20:47 2002
Martin> +++ natString.cc Thu Sep 19 11:21:44 2002
Martin> @@ -506,7 +506,7 @@
Martin> converter->setInput(bytes, offset, offset+count);
Martin> while (converter->inpos < converter->inlength)
Martin> {
Martin> - int done = converter->read(array, outpos, avail);
Martin> + int done = converter->read(array, outpos, count);
Martin> if (done == 0)
Martin> {
Martin> jint new_size = 2 * (outpos + avail);
Martin> @@ -520,6 +520,7 @@
Martin> {
Martin> outpos += done;
Martin> avail -= done;
Martin> + count -= done;
Martin> }
Martin> }
Martin> converter->done ();
Martin> converter->read takes the number of chars to convert, not the
Martin> number of bytes available in the output buffer!
I'm afraid I don't understand.
My understanding is that the 3rd argument to converter->read is the
number of characters we can convert -- that is, the number of free
slots in the output buffer.
So to me it looks like the current code is ok.
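To make that reading concrete, here is a minimal sketch of the resize loop with a toy converter. ToyConverter and decode are hypothetical names, not libgcj code; the toy just widens each byte to a 16-bit char. The growth step after done == 0 is an assumption, since the patch context only shows the new_size computation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the converter: widens each input byte to a
// 16-bit char and writes at most `avail` chars per call.
struct ToyConverter {
  const unsigned char *in = nullptr;
  std::size_t inpos = 0;
  std::size_t inlength = 0;

  void setInput(const unsigned char *bytes, std::size_t start, std::size_t end) {
    in = bytes;
    inpos = start;
    inlength = end;
  }

  // Third argument: number of free slots in `out` starting at `outpos`.
  std::size_t read(std::vector<char16_t> &out, std::size_t outpos, std::size_t avail) {
    std::size_t done = 0;
    while (inpos < inlength && done < avail)
      out[outpos + done++] = static_cast<char16_t>(in[inpos++]);
    return done;
  }
};

// Decodes `bytes`, starting from an output buffer of `initial_capacity` slots.
std::u16string decode(const std::string &bytes, std::size_t initial_capacity) {
  std::vector<char16_t> array(initial_capacity);
  std::size_t outpos = 0;
  std::size_t avail = array.size();

  ToyConverter conv;
  conv.setInput(reinterpret_cast<const unsigned char *>(bytes.data()), 0, bytes.size());

  while (conv.inpos < conv.inlength) {
    std::size_t done = conv.read(array, outpos, avail);
    if (done == 0) {
      // No free slots left: grow the buffer (assumed doubling strategy).
      std::size_t new_size = 2 * (outpos + avail);
      if (new_size == 0)
        new_size = 1;  // guard for an empty initial buffer
      array.resize(new_size);
      avail = new_size - outpos;
    } else {
      outpos += done;
      avail -= done;  // fewer free slots remain
    }
  }
  return std::u16string(array.begin(), array.begin() + outpos);
}
```

With this reading, passing avail keeps read from writing past the end of the buffer, and a return of 0 means only that the buffer is full.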
Martin> size_t outavail = count * sizeof (jchar);
Martin> Thus outavail is set to twice the number of input characters.
Yes. iconv() uses byte counts for its arguments. So we convert from
a character count to a byte count.
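As a rough illustration of that unit conversion, the sketch below calls iconv() directly. convert_chars is a hypothetical helper, not part of libgcj. It assumes the local iconv knows the encoding names "UTF-8" and "UTF-16LE" and declares its input argument as char ** (glibc does; some platforms use const char **).

```cpp
#include <cassert>
#include <cstddef>
#include <iconv.h>
#include <string>

// Converts UTF-8 input into at most `count` UTF-16 code units.
// Returns the number of code units written, or (size_t) -1 if the
// converter could not be opened.
std::size_t convert_chars(const std::string &utf8, char16_t *out, std::size_t count) {
  iconv_t cd = iconv_open("UTF-16LE", "UTF-8");
  if (cd == (iconv_t) -1)
    return (std::size_t) -1;

  std::string copy = utf8;  // iconv() wants a non-const input pointer
  char *inbuf = &copy[0];
  std::size_t inavail = copy.size();  // already a byte count

  char *outbuf = reinterpret_cast<char *>(out);
  // Character count -> byte count, as iconv() expects.
  std::size_t outavail = count * sizeof(char16_t);

  // On failure (for example E2BIG) outavail still reflects what was written.
  std::size_t r = iconv(cd, &inbuf, &inavail, &outbuf, &outavail);
  (void) r;
  iconv_close(cd);

  // Byte count -> character count for the caller.
  return count - outavail / sizeof(char16_t);
}
```

The caller only ever sees character counts; the multiplication and division by sizeof (char16_t) stay inside the wrapper.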
Martin> With the current setting of OUTSIZE = 2, Solaris 2.8 spits out
Martin> "iconv: Arg list too long". If OUTSIZE >= 4 it works. Some of
Martin> Sun's people seem to have a problem with their math ;-)
Ok.
Martin> Because there is a fall through in
Martin> gnu::gcj::convert::Input_iconv::read for E2BIG the value
Martin> returned by converter->read for done is 0.
We have some known problems in that code -- workarounds for iconv bugs
in an earlier glibc. The appended patch, which I still haven't
checked in, will probably fix the bug you are seeing. Could you try
it?
glibc 2.1.3 is pretty old, and nobody responded to my query asking for
testing when I first sent this patch out (I don't have such a system).
So if this patch works for you, I'll check it in.
Tom
Index: ChangeLog
from Tom Tromey <tromey@redhat.com>
* gnu/gcj/convert/natIconv.cc (write): Handle case where
output buffer is too small.
Index: gnu/gcj/convert/natIconv.cc
===================================================================
RCS file: /cvs/gcc/gcc/libjava/gnu/gcj/convert/natIconv.cc,v
retrieving revision 1.13
diff -u -r1.13 natIconv.cc
--- gnu/gcj/convert/natIconv.cc 18 Feb 2002 02:52:44 -0000 1.13
+++ gnu/gcj/convert/natIconv.cc 16 Aug 2002 21:37:39 -0000
@@ -1,6 +1,6 @@
-// Input_iconv.java -- Java side of iconv() reader.
+// natIconv.cc -- Java side of iconv() reader.
-/* Copyright (C) 2000, 2001 Free Software Foundation
+/* Copyright (C) 2000, 2001, 2002 Free Software Foundation
This file is part of libgcj.
@@ -201,25 +201,39 @@
inbuf = (char *) temp_buffer;
}
- // If the conversion fails on the very first character, then we
- // assume that the character can't be represented in the output
- // encoding. There's nothing useful we can do here, so we simply
- // omit that character. Note that we can't check `errno' because
- // glibc 2.1.3 doesn't set it correctly. We could check it if we
- // really needed to, but we'd have to disable support for 2.1.3.
size_t loop_old_in = old_in;
while (1)
{
size_t r = iconv_adapter (iconv, (iconv_t) handle,
&inbuf, &inavail,
&outbuf, &outavail);
- if (r == (size_t) -1 && inavail == loop_old_in)
+ if (r == (size_t) -1)
{
- inavail -= 2;
- if (inavail == 0)
- break;
- loop_old_in -= 2;
- inbuf += 2;
+ if (errno == EINVAL)
+ {
+ // Incomplete byte sequence at the end of the input
+ // buffer. This shouldn't be able to happen here.
+ break;
+ }
+ else if (errno == E2BIG)
+ {
+ // Output buffer is too small.
+ break;
+ }
+ else if (errno == EILSEQ || inavail == loop_old_in)
+ {
+ // Untranslatable sequence. Since glibc 2.1.3 doesn't
+ // properly set errno, we also assume that this is what
+ // is happening if no conversions took place. (This can
+ // be a bogus assumption if in fact the output buffer is
+ // too small.) We skip the first character and try
+ // again.
+ inavail -= 2;
+ if (inavail == 0)
+ break;
+ loop_old_in -= 2;
+ inbuf += 2;
+ }
}
else
break;
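The error handling in this hunk boils down to a small decision table, sketched here as a standalone function. on_iconv_failure and Action are hypothetical names used only for illustration. The sketch leaves out the inavail bookkeeping done while skipping, and it stops on any other errno value, which is an assumption: the visible part of the patch has no final branch for that case.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>

enum class Action { Stop, SkipCharAndRetry };

// Hypothetical helper: what to do after iconv() has returned (size_t) -1.
// `inavail` is the number of input bytes left, `loop_old_in` the number
// that were left before the call.
Action on_iconv_failure(int err, std::size_t inavail, std::size_t loop_old_in) {
  if (err == EINVAL)
    return Action::Stop;  // incomplete byte sequence at end of input
  if (err == E2BIG)
    return Action::Stop;  // output buffer too small: let the caller grow it
  if (err == EILSEQ || inavail == loop_old_in)
    return Action::SkipCharAndRetry;  // untranslatable, or no progress at all
  return Action::Stop;  // assumption: give up on anything else
}
```

The case that matters for the Solaris report is E2BIG with no progress: it is tested before the "no conversions took place" fallback, so a too-small output buffer no longer causes input characters to be dropped when errno is set correctly.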