This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: codecvt as an abstract base class


> To keep the focus on concrete issues, I would suggest also considering
> 
>    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006
> 
> when returning to codecvt (see in particular Comment 6), it's something
> people would really like to see us delivering.

Oops, I was off on a tangent before, and I think that entire thread was off track. codecvt_byname<char,char> must be a degenerate conversion, because there aren't two distinct encodings to convert between. He got garbage because, in a UTF-8 locale, numpunct of any specialization knows only about Unicode, since that is what the OS gives it. numpunct<char> cannot return UTF-8 results, because decimal_point() and thousands_sep() each return a single char and so cannot return multicharacter strings. The only meaningful way to solve this problem is to use a locale whose encoding is no wider than the desired internal character type. The 0xA0 non-breaking space he says he expected is the ISO 8859 encoding; in UTF-8 the same character is the two bytes 0xC2 0xA0.

Of course, C++ makes it easy to obtain and use multiple locale objects at once.
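
A minimal sketch of what I mean by several locale objects at once. The facet comma_decimal and the helpers format_with, comma_locale and demo_two_locales are made-up names for illustration; I build the second locale from a custom facet rather than a named locale so the example does not depend on which locales are installed.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration only: use ',' as the decimal point.
struct comma_decimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Format a value under the given locale without touching the global locale.
std::string format_with(const std::locale& loc, double value) {
    std::ostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}

// Build a locale that is the classic "C" locale plus the facet above.
// The locale object takes ownership of the facet.
std::locale comma_locale() {
    return std::locale(std::locale::classic(), new comma_decimal);
}

// Two independent locale objects in use side by side.
void demo_two_locales() {
    const std::locale c_loc = std::locale::classic();
    const std::locale comma_loc = comma_locale();
    assert(format_with(c_loc, 1.5) == "1.5");
    assert(format_with(comma_loc, 1.5) == "1,5");
}
```

Each stream carries its own locale, so a program can keep a narrow single-byte locale for numpunct<char> while using a different one elsewhere.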

Hmm, looking at config/locale/gnu/numeric_members.cc, it "casts" a string to a char by dereferencing it. As a security issue, we should sanitize that and check that each string is actually of length 1. Probably he was getting only the UTF-8 lead byte of a multibyte sequence, which then corrupted the formatted string.
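
Roughly the kind of check I have in mind, as a sketch only. single_char_or is a hypothetical helper, not the code in numeric_members.cc, and I am assuming the value arrives as a NUL-terminated const char* from the C library and that falling back to a caller-supplied default is acceptable.

```cpp
#include <cassert>

// Hypothetical helper: return the one char held in s, or fallback if s is
// null, empty, or longer than one byte (for example a UTF-8 multibyte
// sequence, where taking *s would yield only the lead byte).
inline char single_char_or(const char* s, char fallback) {
    if (s == nullptr || s[0] == '\0' || s[1] != '\0')
        return fallback;
    return s[0];
}

void demo_single_char() {
    assert(single_char_or(",", '.') == ',');         // ordinary one-byte value
    assert(single_char_or("\xC2\xA0", ' ') == ' ');  // UTF-8 NBSP is rejected
    assert(single_char_or("", '.') == '.');          // empty string
    assert(single_char_or(nullptr, '.') == '.');     // missing value
}
```

Whether the fallback should be the "C" locale value or something else is a separate policy question; the point is only that the lead byte never gets stored on its own.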

Hmm, looking at Darwin, it really seems that there's simply no (well, very little) libc localization support. You need the CoreFoundation framework.

