This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug libstdc++/29379] New: bad thousand separator with UTF-8 locales


[forwarded from http://bugs.debian.org/351786]

#include <iostream>
#include <locale>

int main()
{
        std::cout << "cout no locale : " << 1024 << '\n';
        std::cout.imbue(std::locale(""));
        std::cout << "cout with locale : " << 1024 << '\n';
}

$ LC_ALL=cs_CZ.UTF-8 ./a.out | od -c
  0000000   c   o   u   t       n   o       l   o   c   a   l   e   :
  0000020       1   0   2   4  \n   c   o   u   t       w   i   t   h
  0000040   l   o   c   a   l   e       :       1 302   0   2   4  \n
  0000057
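Here 302 (octal, i.e. 0xC2) is only the first byte of U+00A0
(no-break space), which UTF-8 encodes as the two bytes 0xC2 0xA0
(octal 302 240). The second byte, 240, is missing from the output.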
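
For reference, the expected byte sequence can be checked with a small
encoder. This is only a sketch: encode_utf8 is a hypothetical helper
written for this illustration, not part of libstdc++, and it does not
reject surrogates or values above U+10FFFF.

```cpp
#include <string>

// Hypothetical helper (not a library function): encode one Unicode
// code point as UTF-8. No validation of surrogates or out-of-range input.
std::string encode_utf8(char32_t cp)
{
        std::string out;
        if (cp < 0x80) {
                // 1 byte: 0xxxxxxx
                out += static_cast<char>(cp);
        } else if (cp < 0x800) {
                // 2 bytes: 110xxxxx 10xxxxxx
                out += static_cast<char>(0xC0 | (cp >> 6));
                out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
                // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                out += static_cast<char>(0xE0 | (cp >> 12));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
                // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                out += static_cast<char>(0xF0 | (cp >> 18));
                out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        return out;
}
```

encode_utf8(0xA0) yields the two bytes 0xC2 0xA0, i.e. octal 302 240,
so a lone 302 in the od dump is an incomplete sequence.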


The C program below behaves correctly: it outputs both bytes (c2 a0).
So it looks like libstdc++ is truncating the multibyte thousands
separator to its first byte.

And indeed, the facet interface can only return a single char:

virtual char std::numpunct<char>::do_thousands_sep() const;


#include <stdio.h>
#include <locale.h>

int main()
{
        printf("%'d\n", 1024);
        puts(setlocale(LC_ALL, ""));
        printf("%'d\n", 1024);
        return 0;
}

$ LC_ALL=cs_CZ.UTF-8 ./a.out | od -c
0000000   1   0   2   4  \n   c   s   _   C   Z   .   U   T   F   -   8
0000020  \n   1 302 240   0   2   4  \n
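
Since do_thousands_sep() returns a single char, a numpunct<char> facet
cannot carry a two-byte separator at all. One possible workaround, shown
here only as a sketch, is to imbue a custom facet that substitutes a
plain ASCII space. The names ascii_space_punct and format_grouped are
made up for this example, and the 3-digit grouping is an assumption
rather than something read from the cs_CZ locale.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: the separator must fit in one
// char, so use an ASCII space instead of the two-byte U+00A0.
class ascii_space_punct : public std::numpunct<char>
{
protected:
        char do_thousands_sep() const override { return ' '; }
        // Assumed grouping: groups of three digits, repeated.
        std::string do_grouping() const override { return "\3"; }
};

// Hypothetical helper: format an integer through a stream that uses
// the facet above on top of the classic "C" locale.
std::string format_grouped(long value)
{
        std::ostringstream out;
        // The locale takes ownership of the facet and deletes it.
        out.imbue(std::locale(std::locale::classic(), new ascii_space_punct));
        out << value;
        return out.str();
}
```

With this facet, format_grouped(1024) returns "1 024", which is valid
in any encoding. It does not reproduce the locale's real separator, it
only avoids emitting a truncated UTF-8 sequence.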


-- 
           Summary: bad thousand separator with UTF-8 locales
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: debian-gcc at lists dot debian dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29379

