This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.
Re: std::codecvt::out() returns a no-op in marginal situations
Sam Varshavchik wrote:
> Anyone know if there exists some encoding where a multibyte sequence
> produces more than one wchar_t?
Again, going by the information available, here is what the glibc manual says:
But for GNU systems `wchar_t' is always 32 bits wide and,
therefore, capable of representing all UCS-4 values and,
therefore, covering all of ISO 10646. Some Unix systems define
`wchar_t' as a 16-bit type and thereby follow Unicode very
strictly. This definition is perfectly fine with the standard,
but it also means that to represent all characters from Unicode
and ISO 10646 one has to use UTF-16 surrogate characters, which is
in fact a multi-wide-character encoding. But resorting to
multi-wide-character encoding contradicts the purpose of the
`wchar_t' type.
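
To make the "multi-wide-character" point concrete, here is a minimal sketch of how a code point outside the 16-bit range turns into two UTF-16 units. The helper name to_utf16 is made up for illustration (it is not part of libstdc++ or glibc), and it uses char16_t/char32_t rather than wchar_t so the result does not depend on the platform's wchar_t width:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, for illustration only: encode one UCS-4 code
// point as UTF-16 code units.
std::u16string to_utf16(char32_t cp)
{
    // Precondition: a valid scalar value (in range, not itself a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::u16string out;
    if (cp < 0x10000) {
        // Fits in a single 16-bit unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Needs a surrogate pair: split the 20-bit offset into two halves.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return out;
}
```

With this sketch, to_utf16(U'A') yields one unit, while to_utf16(0x1D11E) yields the two units 0xD834 0xDD1E. That second case is the one where a single multibyte sequence would produce more than one wide character if wchar_t were 16 bits.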
Paolo.