This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: Solaris -vs- iconv
- To: tromey at redhat dot com
- Subject: Re: Solaris -vs- iconv
- From: Per Bothner <per at bothner dot com>
- Date: 03 Apr 2001 12:01:26 -0700
- Cc: java at gcc dot gnu dot org
- References: <Pine.LNX.4.10.10103261550370.9543-100000@mars.deadcafe.org><3ABFE959.458D6691@albatross.co.nz> <8766gqd9fp.fsf@creche.redhat.com><m28zlmn2r3.fsf@kelso.bothner.com> <8766gm6s7a.fsf@creche.redhat.com>
Tom Tromey <tromey@redhat.com> writes:
> I just mean that a UCS-2 value fits in an int. It is easy to
> manipulate in C. A UTF-8 encoded character requires buffer
> manipulation and is a pain.
Of course characters should be UCS-2 ints. (One could make
an argument for UCS-4, which is what glibc does, but for Java
UCS-2 currently makes more sense. The difference is whether
a surrogate pair is treated as two characters or one.
I don't think it matters much.)
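
To make the two-versus-one distinction concrete, here is a minimal sketch in C. It is not taken from libgcj or glibc, and the helper names are hypothetical. It combines a surrogate pair into a single UCS-4 value and counts characters the UCS-4 way, where a valid pair counts once; a UCS-2 count is simply the number of 16-bit units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* High (leading) surrogates occupy U+D800..U+DBFF. */
static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

/* Low (trailing) surrogates occupy U+DC00..U+DFFF. */
static int is_low_surrogate(uint16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

/* Combine a valid surrogate pair into one UCS-4 code point. */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    assert(is_high_surrogate(hi) && is_low_surrogate(lo));
    return 0x10000u + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}

/* Count characters UCS-4 style: a high surrogate followed by a low
   surrogate counts as one character. Unpaired surrogates count as one
   character each. */
static size_t count_code_points(const uint16_t *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_high_surrogate(s[i]) && i + 1 < n && is_low_surrogate(s[i + 1]))
            i++; /* skip the trailing half of the pair */
        count++;
    }
    return count;
}
```

For the three units 0x0041, 0xD83D, 0xDE00, a UCS-2 view sees three characters, while count_code_points reports two.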
But input buffers should, I think, be UTF-8.
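
As a rough sketch of how the two views fit together (UTF-8 bytes in the buffer, UCS-2 ints for individual characters), the function below decodes one character from a byte buffer into an int. The name utf8_decode_ucs2 and its interface are assumptions made for illustration, not an existing libgcj or iconv API. It handles only sequences of one to three bytes, which covers the UCS-2 range, and it does not reject encoded surrogate values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decode one UTF-8 sequence (1 to 3 bytes) from
   buf[0..len). On success, store the UCS-2 value in *out and return the
   number of bytes consumed. Return 0 if the input is empty, truncated,
   malformed, overlong, or a 4-byte sequence (outside UCS-2). */
static size_t utf8_decode_ucs2(const unsigned char *buf, size_t len, int *out)
{
    assert(buf != NULL && out != NULL);
    if (len == 0)
        return 0;

    unsigned char b0 = buf[0];

    if (b0 < 0x80) {                      /* 0xxxxxxx: plain ASCII */
        *out = b0;
        return 1;
    }
    if ((b0 & 0xE0) == 0xC0) {            /* 110xxxxx 10xxxxxx */
        if (len < 2 || (buf[1] & 0xC0) != 0x80)
            return 0;
        int v = ((b0 & 0x1F) << 6) | (buf[1] & 0x3F);
        if (v < 0x80)
            return 0;                     /* overlong encoding */
        *out = v;
        return 2;
    }
    if ((b0 & 0xF0) == 0xE0) {            /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (len < 3 || (buf[1] & 0xC0) != 0x80 || (buf[2] & 0xC0) != 0x80)
            return 0;
        int v = ((b0 & 0x0F) << 12) | ((buf[1] & 0x3F) << 6) | (buf[2] & 0x3F);
        if (v < 0x800)
            return 0;                     /* overlong encoding */
        *out = v;
        return 3;
    }
    return 0; /* stray continuation byte or 4-byte lead: not handled here */
}
```

A caller would loop over the input buffer, advancing by the returned byte count and working with the decoded int from there on.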
--
--Per Bothner
per@bothner.com http://www.bothner.com/~per/