This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.
Re: codecvt as an abstract base class
> To keep the focus on concrete issues, I would suggest also considering
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006
>
> when returning to codecvt (see in particular Comment 6), it's something
> people would really like to see us delivering.
Oops, I was looking at a tangent before. I think that entire thread was off track. codecvt_byname<char,char> must be a degenerate conversion, because there aren't two distinct encodings to convert between. He got garbage because, in a UTF-8 locale, numpunct of any specialization knows only about Unicode, since that is what the OS gives it. numpunct<char> cannot return UTF-8 results, because its separators are single chars and cannot hold a multibyte sequence. The only meaningful way to solve this problem is to use a locale whose encoding is no wider than the desired internal character type. The 0xA0 nonbreaking space he says he expected is the ISO 8859 encoding of that character.
Of course, C++ makes it easy to obtain and use multiple locale objects at once.
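To make that concrete, here is a minimal sketch of two locale objects in use side by side. The names grouped_numpunct, format_with and demo are made up for illustration, and the custom facet stands in for a named locale so the example does not depend on which locales happen to be installed.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet, for illustration only: group digits in threes with ','.
struct grouped_numpunct : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return ','; }
    std::string do_grouping() const override { return "\3"; }
};

// Format a number with whichever locale object the caller passes in.
std::string format_with(const std::locale& loc, long value) {
    std::ostringstream out;
    out.imbue(loc);  // affects only this stream, not the global locale
    out << value;
    return out.str();
}

// Two independent locale objects alive at the same time.
void demo() {
    const std::locale plain = std::locale::classic();
    // The new locale takes ownership of the facet and replaces numpunct<char>.
    const std::locale grouped(plain, new grouped_numpunct);

    assert(format_with(plain, 1234567) == "1234567");
    assert(format_with(grouped, 1234567) == "1,234,567");
}
```

Each stream carries its own locale, so one can be imbued with a narrow ISO 8859 locale for char output while another uses a UTF-8 locale with wchar_t, without either touching the global locale.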
Hmm, looking at config/locale/gnu/numeric_members.cc, it "casts" a string to a char by dereferencing it. As a security issue, we should sanitize that and check that each string is actually length 1. Probably he was getting only a UTF-8 lead byte, which then corrupted the string.
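The length check I have in mind could look roughly like the sketch below. This is not the code in numeric_members.cc; single_char_or and check_examples are hypothetical names, and the fallback policy is an assumption.

```cpp
#include <cassert>

// Hypothetical helper: return the one char in s, or fallback when s is
// null, empty, or longer than a single byte (e.g. a UTF-8 sequence).
char single_char_or(const char* s, char fallback) {
    if (s == nullptr || s[0] == '\0' || s[1] != '\0')
        return fallback;
    return s[0];
}

void check_examples() {
    // A one-byte separator passes through unchanged.
    assert(single_char_or(",", '.') == ',');

    // U+00A0 in UTF-8 is the two bytes C2 A0. Plain dereferencing would
    // yield only the lead byte 0xC2; the check rejects the string instead.
    const char utf8_nbsp[] = "\xC2\xA0";
    assert(static_cast<unsigned char>(*utf8_nbsp) == 0xC2);
    assert(single_char_or(utf8_nbsp, ' ') == ' ');

    // Degenerate inputs also fall back.
    assert(single_char_or("", ' ') == ' ');
    assert(single_char_or(nullptr, ' ') == ' ');
}
```

What the fallback should be (the "C" locale value, or an empty grouping) is a separate decision; the point is only that a multibyte string never gets truncated to its first byte.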
Hmm, looking at Darwin, it really seems that there's simply no (well, very little) libc localization support. You need the CoreFoundation framework.