This is the mail archive of the java-patches@gcc.gnu.org mailing list for the Java project.


Re: Patch for Review: Replace "Unicode to UTF8 conversions" with "Unicode to 'Win32 locale' conversions" when sending/receiving file names to/from Win32 API.

From: João Garcia <jgarcia at uk2 dot net>
To: tromey at redhat dot com
Cc: java-patches at gcc dot gnu dot org, Mohan Embar <gnustuff at thisiscool dot com>
Date: Thu, 05 Jun 2003 21:29:25 +0100
Subject: Re: Patch for Review: Replace "Unicode to UTF8 conversions" with "Unicode to 'Win32 locale' conversions" when sending/receiving file names to/from Win32 API.
References: <3ED40497.1050104@uk2.net> <87u1b4sddv.fsf@fleche.redhat.com>

Tom Tromey wrote:


> What if, instead of having special code here for Windows, we have new
> charset converters for these things?  Then we could just use generic
> code to convert the representations.

If the code is also useful for other platforms then that is the best solution for all of us.
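
As an illustration of the "generic converter" idea, here is a minimal sketch that uses the standard java.nio.charset API rather than libgcj's own converter classes. The class and method names (FileNameCodec, toLocaleBytes, fromLocaleBytes) are hypothetical, and the charset name is a parameter because the right code page depends on the user's locale.

```java
import java.nio.charset.Charset;

/** Hypothetical helper: converts file names through a named charset, with no Windows-specific code. */
final class FileNameCodec {
    private FileNameCodec() {}

    /** Encodes a Java (UTF-16) file name into the bytes of the given charset. */
    static byte[] toLocaleBytes(String name, String charsetName) {
        // Characters the charset cannot represent are replaced by the
        // charset's default replacement (typically '?'), not rejected.
        return name.getBytes(Charset.forName(charsetName));
    }

    /** Decodes bytes received from the platform back into a Java string. */
    static String fromLocaleBytes(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // ISO-8859-1 is used here only because every Java runtime must provide it.
        byte[] raw = toLocaleBytes("Jo\u00e3o.txt", "ISO-8859-1");
        System.out.println(raw.length + " bytes, round trip: "
                + fromLocaleBytes(raw, "ISO-8859-1"));
    }
}
```

A Windows code page name such as "windows-1252" could be passed instead, assuming the runtime ships that charset; Charset.forName throws UnsupportedCharsetException when it does not.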

> Does Windows not have these tables built in somewhere?

====================
My defence:
Logical question... but we are talking about Windows... "One OS to rule them all...". Writing conversion tables (even using automated methods) is not my favorite sport...
====================

Useful answer:
==============
The Win9x branch doesn't have this kind of built-in conversion. And it uses "localized code pages" at the file system level (nightmare...).

That is the reason why MS has released this thing:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/mslu/winprog/microsoft_layer_for_unicode_on_windows_95_98_me_systems.asp

But I don't think this is a good solution for libgcj... It would have to be linked to libgcj (with probable licensing issues...).

If anyone knows of another "win9x iconv equivalent" just speak up! It would be a much better solution.

The NT branch has UTF-16 support at the file system level. I have a combined patch almost ready that takes advantage of that when running on Windows NT/2000/XP. However, I get a "stupid" segmentation fault when using wide-char strings with more than 19 chars... I haven't had time to debug it (that is the main reason for not releasing the new version of the patch yet...).

Table memory issues: is it a good idea to use a Map or something like that (from libgcj)? Or should we keep wasting some memory by using arrays (the major problem is the conversion from Unicode to locale)?
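
To make the trade-off concrete, here is a rough sketch of one middle ground between a Map and a dense array: two sorted parallel arrays searched with binary search. It is not taken from the patch. The class name SparseCodePageTable is hypothetical, the three sample entries are windows-1252 values chosen for illustration, and the ASCII shortcut assumes a code page whose lower half matches ASCII.

```java
import java.util.Arrays;

/** Hypothetical sketch: sparse Unicode-to-code-page lookup using sorted parallel arrays. */
final class SparseCodePageTable {
    private final char[] unicodeChars; // sorted UTF-16 code units that have a mapping
    private final byte[] localeBytes;  // localeBytes[i] is the code-page byte for unicodeChars[i]

    SparseCodePageTable(char[] sortedUnicodeChars, byte[] bytes) {
        if (sortedUnicodeChars.length != bytes.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        this.unicodeChars = sortedUnicodeChars.clone();
        this.localeBytes = bytes.clone();
    }

    /** Returns the code-page byte (0..255) for c, or -1 when c has no mapping. */
    int encode(char c) {
        if (c < 0x80) {
            return c; // assumption: the code page is ASCII-compatible below 0x80
        }
        int i = Arrays.binarySearch(unicodeChars, c);
        return i >= 0 ? (localeBytes[i] & 0xFF) : -1;
    }

    /** Builds a tiny demo table with three windows-1252 entries. */
    static SparseCodePageTable demo() {
        char[] chars = { '\u00E3', '\u00E7', '\u20AC' }; // a-tilde, c-cedilla, euro sign
        byte[] bytes = { (byte) 0xE3, (byte) 0xE7, (byte) 0x80 };
        return new SparseCodePageTable(chars, bytes);
    }

    public static void main(String[] args) {
        SparseCodePageTable table = demo();
        System.out.println(Integer.toHexString(table.encode('\u20AC')));
    }
}
```

For a single-byte code page this needs at most 128 entries of 3 bytes each, where a dense array indexed by every 16-bit char value would need 65,536 bytes per code page. The cost is a binary search per non-ASCII character, and double-byte code pages would need a wider value type.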

> Anyway, I think this idea is definitely needed.  We should probably
> do the same thing for file names in the posix I/O code.

I can take a look at that source if you want me to do this. We will see what I can manage...

João

Follow-Ups:
- Re: Patch for Review: Replace "Unicode to UTF8 conversions" with "Unicode to 'Win32 locale' conversions" when sending/receiving file names to/from Win32 API.
  - From: Mohan Embar

References:
- Patch for Review: Replace "Unicode to UTF8 conversions" with "Unicode to 'Win32 locale' conversions" when sending/receiving file names to/from Win32 API.
  - From: João Garcia
- Re: Patch for Review: Replace "Unicode to UTF8 conversions" with "Unicode to 'Win32 locale' conversions" when sending/receiving file names to/from Win32 API.
  - From: Tom Tromey
