RFC: String.getBytes(String) and Charsets...

David Daney ddaney@avtrex.com
Thu Sep 23 18:59:00 GMT 2004


Short story:

We are running JCIFS (see: http://jcifs.samba.org/ ) on libgcj.  Two
modifications to libgcj were required.  The first I committed yesterday.  See:

http://gcc.gnu.org/ml/java-patches/2004-q3/msg00985.html

The second change is that String.getBytes("UnicodeLittleUnmarked") must
work.  Sun's runtime has this conversion, and libgcj does not.

The initial approach I took for our internal use was to add
gnu.gcj.convert.Output_UnicodeLittleUnmarked to our build of libgcj.

My second (partially implemented) approach was the attached patch plus an
as-yet-unwritten java.nio.charset.Charset that would do the conversion.

I am now having second thoughts.  The attached UnicodeLittleTest.java shows
that Sun's runtime does not have a Charset for UnicodeLittleUnmarked but
can still do the encoding with String.getBytes().
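
A check of this kind can be sketched roughly as follows.  This is not the
scrubbed UnicodeLittleTest.java attachment; the class name UnicodeLittleProbe
and its helpers are hypothetical, and the results depend on the runtime and
its version, so no expected output is shown.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical probe: compares the java.nio.charset lookup with
// String.getBytes(String) for the same encoding name.
class UnicodeLittleProbe {
    // Encoding name discussed in this message.
    static final String NAME = "UnicodeLittleUnmarked";

    // True if the java.nio.charset lookup recognizes the name.
    static boolean hasCharset(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Raised for syntactically illegal charset names.
            return false;
        }
    }

    // Encoded bytes, or null if String.getBytes rejects the name.
    static byte[] encode(String text, String name) {
        try {
            return text.getBytes(name);
        } catch (UnsupportedEncodingException e) {
            return null;
        }
    }

    // Space-separated lowercase hex, or "unsupported" for null.
    static String hex(byte[] bytes) {
        if (bytes == null) {
            return "unsupported";
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Charset.isSupported: " + hasCharset(NAME));
        System.out.println("String.getBytes:     " + hex(encode("A", NAME)));
    }
}
```

If the first line reports false while the second still prints bytes, the
runtime is encoding through some path other than a registered Charset.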

This leads me to believe that Sun's String uses some mechanism similar to
gnu.gcj.convert.Output in addition to using a Charset.

Perhaps the best bet would be just to add
gnu.gcj.convert.Output_UnicodeLittleUnmarked and forget about my second
approach.
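
The conversion itself is small.  A standalone sketch, assuming
UnicodeLittleUnmarked means UTF-16 in little-endian byte order with no
byte-order mark; the class LittleUnmarkedSketch is hypothetical and does not
use the gnu.gcj.convert API, so it only illustrates the byte layout a
converter would have to produce:

```java
// Hypothetical illustration of the byte layout only; a real
// gnu.gcj.convert converter would be structured differently.
class LittleUnmarkedSketch {
    // Writes each UTF-16 code unit low byte first, with no leading BOM.
    static byte[] encode(String s) {
        byte[] out = new byte[s.length() * 2];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out[2 * i] = (byte) (c & 0xff);            // low byte
            out[2 * i + 1] = (byte) ((c >> 8) & 0xff); // high byte
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = encode("A");
        // 'A' is U+0041, so this prints "41 00".
        System.out.println(String.format("%02x %02x", bytes[0] & 0xff, bytes[1] & 0xff));
    }
}
```

The sketch copies code units as they are and makes no attempt to validate
surrogate pairs, which a production converter might want to handle.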

Thoughts?

David Daney.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: charset.patch
URL: <http://gcc.gnu.org/pipermail/java/attachments/20040923/f282d658/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: UnicodeLittleTest.java
URL: <http://gcc.gnu.org/pipermail/java/attachments/20040923/f282d658/attachment-0001.ksh>

