This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: std::codecvt::out() returns a no-op in marginal situations


Sam Varshavchik wrote:
> Anyone know if there exists some encoding where a multibyte sequence
> produces more than one wchar_t? 
Again, going by the available information, here is what the glibc docs say:

     But for GNU systems `wchar_t' is always 32 bits wide and,
     therefore, capable of representing all UCS-4 values and,
     therefore, covering all of ISO 10646.  Some Unix systems define
     `wchar_t' as a 16-bit type and thereby follow Unicode very
     strictly.  This definition is perfectly fine with the standard,
     but it also means that to represent all characters from Unicode
     and ISO 10646 one has to use UTF-16 surrogate characters, which is
     in fact a multi-wide-character encoding.  But resorting to
     multi-wide-character encoding contradicts the purpose of the
     `wchar_t' type.
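
To make the surrogate case concrete, here is a minimal sketch of how one
code point outside the 16-bit range turns into two 16-bit units. The helper
name encode_utf16 is hypothetical (it is not part of libstdc++ or glibc),
and char16_t is used here only as a stand-in for a 16-bit `wchar_t'.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-16 code units.
// Code points below U+10000 take one unit; the rest take a surrogate pair.
std::u16string encode_utf16(char32_t cp)
{
    // Precondition: a valid scalar value (not a surrogate, at most U+10FFFF).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::u16string out;
    if (cp < 0x10000) {
        // Fits in a single 16-bit unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Split the 20-bit offset into a high and a low surrogate.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
    }
    return out;
}
```

With this sketch, U+1D11E comes out as the two units 0xD834 0xDD1E, so on a
system with a 16-bit `wchar_t' a single multibyte sequence for that
character would correspond to more than one wide character.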

Paolo.

