This is the mail archive of the
java-patches@gcc.gnu.org
mailing list for the Java project.
Bug in natString.cc / command line args wrong
- From: Martin Kahlert <martin dot kahlert at infineon dot com>
- To: java at gcc dot gnu dot org
- Cc: java-patches at gcc dot gnu dot org
- Date: Thu, 19 Sep 2002 11:52:14 +0200
- Subject: Bug in natString.cc / command line args wrong
- Reply-to: martin dot kahlert at infineon dot com
Hi!
Here is a bug I found with the help of a Solaris 2.8 iconv bug.
Here is a patch:
--- natString.cc.old Thu Sep 19 11:20:47 2002
+++ natString.cc Thu Sep 19 11:21:44 2002
@@ -506,7 +506,7 @@
   converter->setInput(bytes, offset, offset+count);
   while (converter->inpos < converter->inlength)
     {
-      int done = converter->read(array, outpos, avail);
+      int done = converter->read(array, outpos, count);
       if (done == 0)
         {
           jint new_size = 2 * (outpos + avail);
@@ -520,6 +520,7 @@
         {
           outpos += done;
           avail -= done;
+          count -= done;
         }
     }
   converter->done ();
converter->read takes the number of chars to convert, not the number
of bytes available in the output buffer!
Here is the long version of the description:
As you might know, the default encoding on Solaris is "646".
Solaris versions up to 2.7 do not even know what to do with that
encoding (obtained from Solaris' own nl_langinfo (CODESET))
and return an "Invalid argument" error from iconv_open.
This triggers the switch over to the built-in ASCII encoding
in gcj, which works fine for me.
Starting with Solaris 2.8, iconv_open succeeds for "646",
so from then on gcj uses that encoding.
If you look inside gnu/gcj/convert/natIconv.cc you find this:
gnu::gcj::convert::Input_iconv::read (jcharArray outbuffer,
                                      jint outpos, jint count)
{
#ifdef HAVE_ICONV
  jbyte *bytes = elements (inbuffer);
  jchar *out = elements (outbuffer);
  size_t inavail = inlength - inpos;
  size_t old_in = inavail;
  size_t outavail = count * sizeof (jchar);
  size_t old_out = outavail;
  char *inbuf = (char *) &bytes[inpos];
  char *outbuf = (char *) &out[outpos];
  ....
Thus outavail is set to twice count: two bytes of output space for
every character to convert.
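To make that arithmetic concrete, here is a small sketch in plain C. The type
jchar_t is only an assumed stand-in for gcj's 16-bit jchar, and out_bytes_for
is a made-up helper name; neither is part of libgcj.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t jchar_t;   /* assumed stand-in for gcj's 16-bit jchar */

/* Bytes of output space that a request for `count` chars turns into,
   mirroring `outavail = count * sizeof (jchar)` from natIconv.cc. */
static size_t out_bytes_for(size_t count)
{
  /* Guard against overflow in the multiplication. */
  assert(count <= (size_t) -1 / sizeof(jchar_t));
  return count * sizeof(jchar_t);
}
```

A request for a single character therefore offers iconv exactly 2 bytes of
output space, which is the OUTSIZE used in the test program that follows.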
Now look at this program:
#include <iconv.h>
#include <stdlib.h>
#include <stdio.h>

/* Output buffer size in bytes; one UCS-2 character needs 2. */
#define OUTSIZE 2

int main(int argc, char *argv[])
{
  char inbuffer[] = "a";
  char *inbuf = inbuffer;
  size_t inavail = 1;
  char buffer[OUTSIZE];
  size_t outavail = sizeof(buffer);
  char *outbuf = buffer;
  iconv_t h = iconv_open ("UCS-2", "646");
  size_t r;

  if (h == (iconv_t) -1)
    {
      perror("iconv_open");
      return 1;
    }
  /* Convert the single input byte into the OUTSIZE-byte buffer. */
  r = iconv(h, &inbuf, &inavail, &outbuf, &outavail);
  if (r == (size_t) -1)
    {
      perror("iconv");
      iconv_close(h);
      return 1;
    }
  /* size_t is not int, so cast before printing. */
  printf("inavail = %lu outavail = %lu\n",
         (unsigned long) inavail, (unsigned long) outavail);
  iconv_close(h);
  return 0;
}
With the current setting of OUTSIZE = 2, Solaris 2.8 spits out
iconv: Arg list too long
which is the message for E2BIG. If OUTSIZE >= 4 it works. Some of Sun's
people seem to have a problem with their math ;-)
The problem above came up in prims.cc (function JvConvertArgv).
It runs into natString.cc:
java::lang::String::init (jbyteArray bytes, jint offset, jint count,
jstring encoding)
and here into this loop:
  while (converter->inpos < converter->inlength)
    {
      int done = converter->read(array, outpos, avail);
      if (done == 0)
        {
          jint new_size = 2 * (outpos + avail);
          jcharArray new_array = JvNewCharArray (new_size);
          memcpy (elements (new_array), elements (array),
                  outpos * sizeof(jchar));
          array = new_array;
          avail = new_size - outpos;
        }
      else
        {
          outpos += done;
          avail -= done;
        }
    }
Because gnu::gcj::convert::Input_iconv::read falls through for E2BIG,
the value that converter->read returns for done is 0.
The array size is then doubled and the next call succeeds, but with the number
of input characters doubled. The resulting string is strange, of course.
The above patch fixes this bug, but on Solaris 2.8 the result is then an
infinite loop ;-) The array still grows, but count does not, so a request
that was too small for iconv stays too small on every retry.
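The control flow can be played through with a small model in plain C. This is
only a sketch under assumptions: quirky_read is a hypothetical stand-in for
converter->read that reports 0 whenever the request offers fewer than 4 output
bytes (matching the OUTSIZE observation above), run_loop and MAX_CALLS are
made-up names, and no real conversion or array allocation takes place.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLS 16   /* safety cap so the sketch always terminates */

/* Hypothetical model of the behaviour seen on Solaris 2.8: a request
   offering fewer than 4 output bytes (count < 2) converts nothing. */
static int quirky_read(int inleft, int count)
{
  if (count * 2 < 4)
    return 0;
  return inleft < count ? inleft : count;
}

/* Runs the shape of the String::init loop for `count` input bytes.
   patched == 0 passes avail to read, patched != 0 passes count.
   Returns the number of read calls, or -1 if MAX_CALLS was reached. */
static int run_loop(int count, int patched)
{
  int inleft = count, avail = count, outpos = 0, calls = 0;

  assert(count > 0);
  while (inleft > 0)
    {
      int done;
      if (calls++ == MAX_CALLS)
        return -1;                      /* would loop forever */
      done = quirky_read(inleft, patched ? count : avail);
      if (done == 0)
        {
          int new_size = 2 * (outpos + avail);
          avail = new_size - outpos;    /* array grows, count does not */
        }
      else
        {
          outpos += done;
          avail -= done;
          inleft -= done;
          if (patched)
            count -= done;
        }
    }
  return calls;
}
```

For a one-character argument, run_loop(1, 0) finishes after two calls because
the second request is made with the doubled avail, while run_loop(1, 1) keeps
asking for one character and hits the cap.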
Any idea how to handle this Solaris 2.8 iconv bug?
It should only be necessary to enlarge the output buffer, but is this doable
inside gnu::gcj::convert::Input_iconv::read at all?
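For reference, the "enlarge the output buffer" idea looks roughly like this in
plain C, outside of libgcj. It is a minimal sketch, not a proposal for
Input_iconv::read itself: convert_growing is a made-up name, the buffer is
heap-allocated rather than a jcharArray, and the starting size of 2 bytes is
chosen only to force the E2BIG path.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Converts `inlen` bytes from encoding `from` to encoding `to`.
   Returns a malloc'ed buffer (caller frees) and stores its length in
   *outlen, or returns NULL on failure. */
static char *convert_growing(const char *to, const char *from,
                             const char *in, size_t inlen, size_t *outlen)
{
  iconv_t h;
  size_t cap = 2;               /* deliberately tiny to force growth */
  size_t used = 0;
  size_t inleft = inlen;
  char *inp = (char *) in;      /* iconv does not modify the input */
  char *out;

  assert(in != NULL && outlen != NULL);
  h = iconv_open(to, from);
  if (h == (iconv_t) -1)
    return NULL;
  out = malloc(cap);
  if (out == NULL)
    {
      iconv_close(h);
      return NULL;
    }
  while (inleft > 0)
    {
      char *outp = out + used;
      size_t outleft = cap - used;
      size_t r = iconv(h, &inp, &inleft, &outp, &outleft);
      used = cap - outleft;     /* keep whatever was converted */
      if (r == (size_t) -1)
        {
          char *bigger;
          if (errno != E2BIG)   /* a real conversion error: give up */
            {
              free(out);
              iconv_close(h);
              return NULL;
            }
          cap *= 2;             /* output too small: double and retry */
          bigger = realloc(out, cap);
          if (bigger == NULL)
            {
              free(out);
              iconv_close(h);
              return NULL;
            }
          out = bigger;
        }
    }
  iconv_close(h);
  *outlen = used;
  return out;
}
```

A call such as convert_growing("UCS-2BE", "US-ASCII", "abc", 3, &n) exercises
the retry path. Those two encoding names are an assumption for systems that do
not know the Solaris name "646".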
Thanks
Martin.
--
The early bird catches the worm. If you want something else for
breakfast, get up later.