This is the mail archive of the libstdc++@sourceware.cygnus.com mailing list for the libstdc++ project.


Re: FW: Unicode and C++

To: Shiv at pspl dot co dot in
Subject: Re: FW: Unicode and C++
From: "Martin v. Loewis" <martin at loewis dot home dot cs dot tu-berlin dot de>
Date: Fri, 7 Jul 2000 12:00:07 +0200
CC: libstdc++ at sourceware dot cygnus dot com
References: <000601bfe7eb$6fdbcd70$8d02a8c0@intranet.pspl.co.in>

> | - The encoding of wchar_t
> 
> Isn't that implicitly supposed to mean Unicode? 

Not in the C++ standard, which leaves it implementation-defined.

> I for one do not know of any system where it means anything other
> than Unicode.

All of the Unix systems use a non-Unicode, locale-specific wchar_t
encoding in the EUC locales; see
http://cns-web.bu.edu/pub/djohnson/web_files/i18n/euc.html

Linux is an exception to the rule: it uses ISO 10646 for wchar_t in
all locales.

A C99 implementation may define __STDC_ISO_10646__ if wchar_t is
indeed ISO 10646 compliant.
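
To illustrate the macro test, here is a minimal sketch of how a
program might check what the implementation promises about wchar_t.
The helper names (wchar_is_iso10646, wchar_bits, wchar_encoding_name)
are made up for this example, and the sketch assumes the compiler
exposes __STDC_ISO_10646__ to C++ code once a standard header has
been included, which is not guaranteed everywhere.

```cpp
#include <climits>  // CHAR_BIT
#include <cwchar>   // including a standard header lets the library define its macros

// Hypothetical helper: true when the implementation defines
// __STDC_ISO_10646__, i.e. promises ISO 10646 values in wchar_t.
constexpr bool wchar_is_iso10646() {
#ifdef __STDC_ISO_10646__
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: width of wchar_t in bits on this platform.
constexpr unsigned wchar_bits() {
    return static_cast<unsigned>(sizeof(wchar_t) * CHAR_BIT);
}

// Hypothetical helper: a short label for the encoding. When the macro
// is absent, all the program can say is "implementation-defined".
constexpr const char* wchar_encoding_name() {
    return wchar_is_iso10646() ? "ISO 10646" : "implementation-defined";
}
```

Printing wchar_bits() and wchar_encoding_name() from main() shows
what a given platform reports. If the macro is missing, portable code
should not assume that a wchar_t value is a Unicode code point.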

> Well this is the most problematic but can anyone tell me why *NIXes
> chose 32bit wchar_t?

For one thing, ISO 10646 says a character is coded in four
octets. Furthermore, the BMP is not sufficient in the long run.

> It seems that for most of the living languages 16bit UTF-16 or the
> BMP plane of ISO-10646 is more than enough.

It is not nearly enough. Assignments to plane 1 and plane 2 are in
progress; plane 14 is reserved for language tagging. See the Unicode
Consortium pages for details.
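
As a rough illustration of why a 16-bit unit falls short, the sketch
below computes the plane of a code point and the two UTF-16 code
units needed for anything outside the BMP. The names (plane_of,
in_bmp, SurrogatePair, to_utf16_surrogates) are invented for this
example, and the arithmetic is the standard UTF-16 surrogate
construction, not anything specific to libstdc++.

```cpp
#include <cstdint>

// Hypothetical helper: plane number of a code point (plane 0 is the BMP).
constexpr unsigned plane_of(std::uint32_t cp) {
    return static_cast<unsigned>(cp >> 16);
}

// Hypothetical helper: true when the code point fits in one 16-bit unit.
constexpr bool in_bmp(std::uint32_t cp) {
    return cp <= 0xFFFF;
}

// Illustrative type: the two 16-bit code units of a UTF-16 surrogate pair.
struct SurrogatePair {
    std::uint16_t high;
    std::uint16_t low;
};

// Splits a code point outside the BMP into a surrogate pair.
// Precondition (not checked here): 0x10000 <= cp <= 0x10FFFF.
constexpr SurrogatePair to_utf16_surrogates(std::uint32_t cp) {
    return SurrogatePair{
        static_cast<std::uint16_t>(0xD800 + ((cp - 0x10000) >> 10)),
        static_cast<std::uint16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF))};
}
```

For a plane 1 code point such as U+1D11E, in_bmp() is false and
to_utf16_surrogates() yields two units, so a 16-bit wchar_t cannot
hold the character in a single element, whereas a 32-bit wchar_t can.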

Regards,
Martin

Follow-Ups:
- RE: FW: Unicode and C++
  - From: Shiv Shankar Ramakrishnan

References:
- FW: Unicode and C++
  - From: Shiv Shankar Ramakrishnan
