This is the mail archive of the libstdc++@sourceware.cygnus.com mailing list for the libstdc++ project.



Re: FW: Unicode and C++


> | - The encoding of wchar_t
> 
> Isn't that implicitly supposed to mean Unicode? 

Not in the C++ standard, which leaves it implementation-defined.

> I for one do not know of any system where it means anything other
> than Unicode.

All of the Unix systems use a wchar_t encoding other than Unicode in
the EUC locales; see
http://cns-web.bu.edu/pub/djohnson/web_files/i18n/euc.html

Linux is an exception to the rule: it uses ISO 10646 for wchar_t in
all locales.

A C99 implementation may define __STDC_ISO_10646__ if wchar_t is
indeed ISO 10646 compliant.
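
As a rough illustration of how code could test for this, here is a
minimal sketch. It assumes the C++ implementation exposes the C99 macro
once a standard header has been included, which is not guaranteed
everywhere. The helper names wchar_is_iso10646 and to_wchar are
hypothetical and come from no library.

```cpp
#include <cassert>
#include <cwchar>   // WCHAR_MAX; some C libraries also define the macro via their headers

// Hypothetical helper: true when the implementation defines
// __STDC_ISO_10646__, i.e. promises that wchar_t values are
// ISO 10646 code points.
inline bool wchar_is_iso10646()
{
#ifdef __STDC_ISO_10646__
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: store a code point in a wchar_t. The result is
// only meaningful when wchar_is_iso10646() returns true; otherwise the
// numeric value has no portable interpretation.
inline wchar_t to_wchar(unsigned long code_point)
{
    // Precondition: the value must be representable in wchar_t.
    assert(code_point <= static_cast<unsigned long>(WCHAR_MAX));
    return static_cast<wchar_t>(code_point);
}
```

A caller would check wchar_is_iso10646() first and fall back to the
locale's conversion functions (mbstowcs and friends) when it returns
false.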

> Well this is the most problematic but can anyone tell me why *NIXes
> chose 32bit wchar_t?

For one thing, ISO 10646 says a character is coded in four
octets. Furthermore, the BMP is not sufficient in the long run.

> It seems that for most of the living languages 16bit UTF-16 or the
> BMP plane of ISO-10646 is more than enough.

It is nowhere near enough. Assignments to plane 1 and plane 2 are in
progress; plane 14 is reserved for language tagging. See the Unicode
Consortium pages for details.
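
To make the size argument concrete, here is a small sketch of what
happens to code points outside the BMP when only 16-bit units are
available: UTF-16 has to spend two units (a surrogate pair) on each of
them. The names plane_of, Utf16 and encode_utf16 are hypothetical and
exist only for this illustration.

```cpp
#include <cassert>

// Plane number of an ISO 10646 code point; plane 0 is the BMP.
inline unsigned plane_of(unsigned long code_point)
{
    return static_cast<unsigned>(code_point >> 16);
}

// Result of encoding one code point: one or two 16-bit units.
struct Utf16 {
    unsigned short units[2];
    int count;
};

// Encode a single code point as UTF-16.
inline Utf16 encode_utf16(unsigned long cp)
{
    // Preconditions: within the UTF-16 range and not itself a surrogate.
    assert(cp <= 0x10FFFFUL);
    assert(cp < 0xD800UL || cp > 0xDFFFUL);

    Utf16 r = { { 0, 0 }, 0 };
    if (cp <= 0xFFFFUL) {
        // BMP: a single 16-bit unit is enough.
        r.units[0] = static_cast<unsigned short>(cp);
        r.count = 1;
    } else {
        // Planes 1 to 16: split the 20-bit offset into two surrogates.
        unsigned long v = cp - 0x10000UL;
        r.units[0] = static_cast<unsigned short>(0xD800UL + (v >> 10));
        r.units[1] = static_cast<unsigned short>(0xDC00UL + (v & 0x3FFUL));
        r.count = 2;
    }
    return r;
}
```

For example, U+E0001 lies in plane 14 and encodes as the two units
0xDB40 0xDC01, whereas a 32-bit wchar_t holds it as a single value.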

Regards,
Martin


