Partial fix for libgcj/9802
Mark Wielaard
mark@klomp.org
Sat Feb 22 18:03:00 GMT 2003
Hi,
The following is a partial fix for libgcj/9802 (Bug in surrogate
handling in Unicode to UTF-8 conversion). It only fixes the case for
UTF-8 surrogates, but as James Clark explains, the same problem can also
occur in other multibyte encodings.
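
For readers who do not have the bug report at hand, here is a minimal sketch of the conversion in question: a UTF-16 surrogate pair has to be combined into one code point and written as a single 4-byte UTF-8 sequence. This is only an illustration; the class and method names are made up and this is not the code in Output_UTF8.java.

```java
final class SurrogateUtf8Sketch {
    // Hypothetical helper, not part of libgcj: encode one surrogate pair
    // as the 4-byte UTF-8 sequence for the code point it represents.
    static byte[] encodePair(char high, char low) {
        if (!Character.isHighSurrogate(high) || !Character.isLowSurrogate(low)) {
            throw new IllegalArgumentException("not a surrogate pair");
        }
        // Combine the two 16-bit units into a code point in U+10000..U+10FFFF.
        int cp = Character.toCodePoint(high, low);
        return new byte[] {
            (byte) (0xF0 | (cp >> 18)),           // 11110xxx
            (byte) (0x80 | ((cp >> 12) & 0x3F)),  // 10xxxxxx
            (byte) (0x80 | ((cp >> 6) & 0x3F)),   // 10xxxxxx
            (byte) (0x80 | (cp & 0x3F))           // 10xxxxxx
        };
    }
}
```

For example, the pair D800 DC00 (U+10000) comes out as F0 90 80 80. The point that matters for this patch is that one pair of input chars produces four output bytes, which may not all fit in the space left in the output buffer.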
In principle the other encoders can also be rewritten to use the new
bytes_todo field to indicate that more output is available. But I am
hoping that converting the encoders to the new java.nio.charset
framework will eliminate this problem since it has explicit support for
this (see CoderResult; Jesse Rosenstock will certainly correct me if I
am wrong). But I do not expect that we can finish that work for 3.3, so
just fixing it now for UTF-8 seems worthwhile.
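
To make the idea of "more output is available" concrete, here is a rough sketch of an encoder that stages the bytes for one character and reports how many could not be delivered yet, so that the caller knows to come back with a fresh buffer. All names here (PendingOutputSketch, hasBytesTodo, drain, encodeAll) are assumptions made for illustration; the real UnicodeToBytes interface in the patch differs.

```java
import java.io.ByteArrayOutputStream;

final class PendingOutputSketch {
    private final byte[] pending = new byte[4]; // staged bytes of one code point
    private int pendingPos = 0;                 // next staged byte to deliver
    private int bytesTodo = 0;                  // staged bytes not yet delivered

    boolean hasBytesTodo() { return bytesTodo > 0; }

    // Copy as many staged bytes as fit into buf[off..off+avail).
    int drain(byte[] buf, int off, int avail) {
        int n = Math.min(avail, bytesTodo);
        System.arraycopy(pending, pendingPos, buf, off, n);
        pendingPos += n;
        bytesTodo -= n;
        return n;
    }

    // Stage the UTF-8 bytes for one code point, then deliver what fits.
    // Lone surrogates are not rejected here, to keep the sketch short.
    int write(int cp, byte[] buf, int off, int avail) {
        if (bytesTodo > 0) throw new IllegalStateException("drain first");
        pendingPos = 0;
        if (cp < 0x80) {
            pending[0] = (byte) cp;
            bytesTodo = 1;
        } else if (cp < 0x800) {
            pending[0] = (byte) (0xC0 | (cp >> 6));
            pending[1] = (byte) (0x80 | (cp & 0x3F));
            bytesTodo = 2;
        } else if (cp < 0x10000) {
            pending[0] = (byte) (0xE0 | (cp >> 12));
            pending[1] = (byte) (0x80 | ((cp >> 6) & 0x3F));
            pending[2] = (byte) (0x80 | (cp & 0x3F));
            bytesTodo = 3;
        } else {
            pending[0] = (byte) (0xF0 | (cp >> 18));
            pending[1] = (byte) (0x80 | ((cp >> 12) & 0x3F));
            pending[2] = (byte) (0x80 | ((cp >> 6) & 0x3F));
            pending[3] = (byte) (0x80 | (cp & 0x3F));
            bytesTodo = 4;
        }
        return drain(buf, off, avail);
    }

    // Caller side: keep draining while the encoder reports leftover output.
    static byte[] encodeAll(int[] codePoints, int bufSize) {
        if (bufSize <= 0) throw new IllegalArgumentException("bufSize must be positive");
        PendingOutputSketch enc = new PendingOutputSketch();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[bufSize];
        for (int cp : codePoints) {
            int n = enc.write(cp, buf, 0, buf.length);
            out.write(buf, 0, n);
            while (enc.hasBytesTodo()) {
                n = enc.drain(buf, 0, buf.length);
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }
}
```

With a 3-byte buffer, a supplementary character is delivered in two steps (three bytes, then one) instead of being dropped or truncated. A caller that stops as soon as all input chars are consumed, without checking for leftover output, would lose the tail of the sequence.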
I also added the testcase that James Clark made to Mauve and it passes
with this patch. Since none of the other encoders use the bytes_todo
field this does not impact any other encoders, just UTF-8.
2003-02-22 Mark Wielaard <mark@klomp.org>
Partial fix for PR libgcj/8738:
* gnu/gcj/convert/UnicodeToBytes.java (bytes_todo): New field.
(done): Reset bytes_todo field.
* gnu/gcj/convert/Output_UTF8.java (bytes_todo): Removed field.
(write): Always decrease avail when count is increased.
* java/lang/natString.cc (getBytes): Check converter->bytes_todo.
OK for branch and mainline?
Cheers,
Mark
-------------- next part --------------
A non-text attachment was scrubbed...
Name: convert.patch
Type: text/x-patch
Size: 2852 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/java-patches/attachments/20030222/b0a6bddd/attachment.bin>