This is the mail archive of the
libstdc++@sourceware.cygnus.com
mailing list for the libstdc++ project.
FW: Unicode and C++
- To: "libstdc++" <libstdc++ at sourceware dot cygnus dot com>
- Subject: FW: Unicode and C++
- From: "Shiv Shankar Ramakrishnan" <Shiv at pspl dot co dot in>
- Date: Fri, 7 Jul 2000 13:44:56 +0530
- Reply-To: <Shiv at pspl dot co dot in>
Sorry, I have to resend this mail; it did not get through originally
because the mail servers kept rejecting me due to the domain name change.
-----Original Message-----
From: Shiv Shankar Ramakrishnan [mailto:Shiv@pspl.co.in]
Sent: Thursday, July 06, 2000 12:39 PM
To: Owen Taylor; gtk-i18n-list@redhat.com
Cc: libstdc++@sourceware.cygnus.com
Subject: RE: Unicode and C++
| - The encoding of wchar_t
Isn't that implicitly supposed to mean Unicode? I for one do not know of
any system where it means anything other than Unicode. All the other
wide-character schemes are known as MBCS and are encoded in char*'s,
not wchar_t*'s. And in case it is forgotten, let me point out that
wchar_t in C and C++ is *different*. In C it is a standard typedef,
whereas in C++ it is a proper type (and keyword), and hence can be used
for overloading etc., and appears as wchar_t in error messages and so on.
| - The width of wchar_t
Well, this is the most problematic, but can anyone tell me why *NIXes
chose a 32-bit wchar_t? It seems that for most of the living languages,
16-bit UTF-16 or the BMP plane of ISO 10646 is more than enough. Why
waste another 16 bits? For example, if you read this page about the
various Unicode encodings -
http://czyborra.com/utf/
then you'll come away thinking that 16-bit Unicode is a good compromise
between speed and space wastage, as most of normal Unicode can be done
within 16 bits. So I for one really don't understand the 32-bit wchar_t
of most *NIXes. Also, one of the best i18n libraries, ICU -
http://oss.software.ibm.com/developerworks/opensource/icu/project/
also uses 16-bit Unicode.
So is there something obvious that I am missing about 32-bit Unicode?
Thanks,
Shiv