This is the mail archive of the
java-patches@gcc.gnu.org
mailing list for the Java project.
Re: Patch for Review: Replace "Unicode to UTF8 conversions" with "Unicode to 'Win32 locale' conversions" when sending/receiving file names to/from Win32 API.
- From: João Garcia <jgarcia at uk2 dot net>
- To: tromey at redhat dot com
- Cc: java-patches at gcc dot gnu dot org, Mohan Embar <gnustuff at thisiscool dot com>
- Date: Thu, 05 Jun 2003 21:29:25 +0100
- Subject: Re: Patch for Review: Replace "Unicode to UTF8 conversions" with "Unicode to 'Win32 locale' conversions" when sending/receiving file names to/from Win32 API.
- References: <3ED40497.1050104@uk2.net> <87u1b4sddv.fsf@fleche.redhat.com>
Tom Tromey wrote:
> What if, instead of having special code here for Windows, we have new
> charset converters for these things? Then we could just use generic
> code to convert the representations.
If the code is also useful for other platforms then that is the best
solution for all of us.
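
The generic-converter idea discussed above could look roughly like the
following. This is only a sketch under stated assumptions, not code from the
patch: it uses the standard java.nio.charset API, the class and method names
(FileNameCodec, toNative, fromNative) are hypothetical, and the charset name
is passed in by the caller. Unmappable characters are reported instead of
being silently replaced, so a file name is never mangled without notice.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper: converts file names between Java strings and the
// byte encoding of a named charset (for example a locale code page).
final class FileNameCodec {
    private final Charset charset;

    FileNameCodec(String charsetName) {
        // Throws UnsupportedCharsetException if the runtime lacks the charset.
        this.charset = Charset.forName(charsetName);
    }

    // String -> native bytes; fails if a character has no mapping.
    byte[] toNative(String fileName) throws CharacterCodingException {
        ByteBuffer out = charset.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .encode(CharBuffer.wrap(fileName));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    // Native bytes -> String; fails on byte sequences the charset rejects.
    String fromNative(byte[] bytes) throws CharacterCodingException {
        return charset.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .decode(ByteBuffer.wrap(bytes))
            .toString();
    }
}
```

With a codec built as new FileNameCodec("ISO-8859-1"), toNative("a.txt")
gives five bytes, while a name containing the euro sign is rejected with a
CharacterCodingException. ISO-8859-1 is used in this note only because every
Java runtime must provide it; a real Win32 build would pass the name of the
active code page instead.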
> Does Windows not have these tables built in somewhere?
====================
My defence:
Logical question... but we are talking about Windows... "One OS to rule
them all...".
Writing conversion tables (even using automated methods) is not my
favorite sport...
====================
Useful answer:
=============
The Win9x branch doesn't have this kind of built-in conversion, and it uses
"localized code pages" at the file system level (nightmare...).
That is the reason why MS has released this thing:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/mslu/winprog/microsoft_layer_for_unicode_on_windows_95_98_me_systems.asp
But I don't think this is a good solution for libgcj... It would have to
be linked to libgcj (with probable licensing issues...).
If anyone knows of another "win9x iconv equivalent" just speak up! It
would be a much better solution.
The NT branch has UTF-16 support at the file system level. I have a
combined patch almost ready that takes advantage of that when running on
Win NT/2000/XP. I get a "stupid" segmentation fault when using wide-char
strings with more than 19 chars... I didn't have time to debug it (that
is the main reason for not releasing the new version of the patch yet...).
Table memory issues:
Is it a good idea to use a Map or something like that (from libgcj)? Or
should we keep wasting some memory by using arrays (the major problem is
the conversion from Unicode to locale)?
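
To make the trade-off concrete, here is a rough sketch of both options for a
single-byte code page. Everything in it is an assumption for illustration:
the class name CodePageTable is hypothetical, java.util.HashMap stands in for
"a Map from libgcj", and the 256-entry forward table is supplied by the
caller. The locale-to-Unicode direction only needs 256 chars; the reverse
direction is either a sparse map or a dense array with one byte per UTF-16
code unit (64 KB per code page).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical table for one single-byte code page.
final class CodePageTable {
    // Forward table: byte value (0..255) -> Unicode character.
    private final char[] toUnicode;
    // Sparse reverse table: Unicode character -> byte value.
    private final Map<Character, Byte> fromUnicode = new HashMap<>();

    CodePageTable(char[] forward) {
        if (forward.length != 256) {
            throw new IllegalArgumentException("need exactly 256 entries");
        }
        this.toUnicode = forward.clone();
        for (int b = 0; b < 256; b++) {
            // If two bytes map to the same character, keep the lower byte.
            fromUnicode.putIfAbsent(toUnicode[b], (byte) b);
        }
    }

    char decode(byte b) {
        return toUnicode[b & 0xFF];
    }

    // Returns the byte value (0..255) for c, or -1 if c is not in the code page.
    int encode(char c) {
        Byte b = fromUnicode.get(c);
        return b == null ? -1 : (b.byteValue() & 0xFF);
    }

    // Dense alternative: 65536 bytes, constant-time lookup, mostly fallback.
    byte[] denseReverseTable(byte fallback) {
        byte[] dense = new byte[65536];
        Arrays.fill(dense, fallback);
        for (Map.Entry<Character, Byte> e : fromUnicode.entrySet()) {
            dense[e.getKey().charValue()] = e.getValue().byteValue();
        }
        return dense;
    }
}
```

The map holds at most 256 entries per code page but pays for boxing and
hashing on every lookup; the dense array is faster and simpler at the cost
of the memory mentioned above.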
> Anyway, I think this idea is definitely needed. We should probably
> do the same thing for file names in the posix I/O code.
I can take a look at that source if you want me to do this. We will see
what I can manage...
João