UTF-8, UTF-16 and UTF-32
Eljay Love-Jensen
eljay@adobe.com
Sun Aug 24 19:11:00 GMT 2008
Hi Dallas,
> Once again, there are no legacy issues because no one is currently using
> 16-bit Unicode in GCC, it does not exist.
I'm using UTF-16 Unicode in GCC. I've done so for years.
I do not use wchar_t to specify UTF-16 Unicode, since that is not portable.
The same code runs on different platforms, the Windows platform being
compiled with MSVC++.
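
As a rough illustration of what I mean by not relying on wchar_t, here is a minimal sketch. The names utf16_unit and append_utf16 are made up for this example and are not from any library or from my actual code. It assumes only that unsigned short is at least 16 bits wide (which the standard guarantees), and it stores one UTF-16 code unit per element, splitting code points above U+FFFF into a surrogate pair.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type name: one UTF-16 code unit.
// unsigned short is at least 16 bits on any conforming compiler,
// unlike wchar_t, whose width differs between GCC on Linux and MSVC++.
typedef unsigned short utf16_unit;

// Hypothetical helper: append one Unicode code point to a UTF-16 sequence.
void append_utf16(std::vector<utf16_unit>& out, unsigned long cp)
{
    assert(cp <= 0x10FFFFul);                // inside the Unicode range
    assert(cp < 0xD800ul || cp > 0xDFFFul);  // surrogates are not code points

    if (cp < 0x10000ul) {
        // Basic Multilingual Plane: a single code unit.
        out.push_back(static_cast<utf16_unit>(cp));
    } else {
        // Supplementary planes: high surrogate, then low surrogate.
        cp -= 0x10000ul;
        out.push_back(static_cast<utf16_unit>(0xD800ul + (cp >> 10)));
        out.push_back(static_cast<utf16_unit>(0xDC00ul + (cp & 0x3FFul)));
    }
}
```

Something along these lines compiles unchanged with GCC and with MSVC++, because nothing in it depends on the width of wchar_t or on a compiler extension.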
That said, what you say is not without merit: C and C++ do not specify the
character set (let alone the encoding of the character set).
> So I have to ask - what are your arguments for not providing support for all
> three, 8-bit, 16-bit and 32-bit Unicode strings?
It is not part of ISO 9899 (for C), nor ISO 14882 (for C++).
There are languages which support UTF-8, UTF-16, and UTF-32 Unicode strings.
C and C++ are not those languages.
There are support libraries for Unicode (UTF-8, UTF-16, and UTF-32) for C
and C++. They work on Linux and on Windows. You are at liberty to use
those.
If you use Microsoft's extensions to C++, your code is no longer C++... it
is MS-C++. Portability will be a problem, at least until Microsoft
comes out with MSVC++ for Linux and OS X and whatever other platform you are
interested in.
Maybe a future version of C and/or C++ will be more Unicode friendly.
Sincerely,
--Eljay