This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug libstdc++/18678] std::time_put<wchar_t> is broken with UTF-8 locales
- From: "rleigh at debian dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 26 Nov 2004 12:13:09 -0000
- Subject: [Bug libstdc++/18678] std::time_put<wchar_t> is broken with UTF-8 locales
- References: <20041126002106.18678.rleigh@debian.org>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Additional Comments From rleigh at debian dot org 2004-11-26 12:13 -------
Yes, I'm using 3.4.3 (and glibc-2.3.2.ds1-18). With respect to the comparisons,
I've now added wcsftime() to the test, and it /does/ match std::time_put<wchar_t>:
$ LANG=ru_RU.UTF-8 LC_ALL=ru_RU.UTF-8 ./date3
asctime: Fri Nov 26 11:32:47 2004
strftime: Птн 26 Ноя 2004 11:32:47
wcsftime: B= 26 >O 2004 11:32:47
std::time_put<char>: Птн 26 Ноя 2004 11:32:47
std::time_put<wchar_t>: B= 26 >O 2004 11:32:47
Viewed as hexadecimal (aligned for comparison):
"Narrow" UTF-8:
      П     т     н       2  6      Н     о     я
==> d0 9f d1 82 d0 bd 20 32 36 20 d0 9d d0 be d1 8f
        2  0  0  4     1  1  :  3  2  :  4  7 \n
==> 20 32 30 30 34 20 31 31 3a 35 31 3a 30 34 0a
"Wide" unknown:
        B  =     2  6        >  O
==> 1f 42 3d 20 32 36 20 1d 3e 4f
        2  0  0  4     1  1  :  3  2  :  4  7 \n
==> 20 32 30 30 34 20 31 31 3a 35 31 3a 30 34 0a
However... I am using a UTF-8-capable terminal (GNOME-terminal, and the Linux
console with LatArCyrHeb font in UTF-8 mode). In both of these cases, the
output appears as above; the "narrow" output to cout is displayed correctly (I
verified this is valid UTF-8), but the "wide" output to wcout is not valid UTF-8.
I expected valid UTF-8 in both cases, since this is what the locale codeset
specifies. I'm not sure what encoding wchar_t would be using, but I assumed I
would get readable output (maybe I am wrong about that?). It looks like the
"wide" output is a different encoding, but for some reason has not affected the
7-bit ASCII range (I would have expected something like padding with \0 if it
were outputting UCS-4).
Regards,
Roger
#include <iostream>
#include <locale>
#include <string>
#include <ctime>
#include <cwchar>

int main()
{
  // Set up locale stuff...
  std::locale::global(std::locale(""));
  std::cout.imbue(std::locale());
  std::wcout.imbue(std::locale());

  // Get current time
  std::time_t simpletime = std::time(0);

  // Break down time (localtime_r is POSIX, not ISO C++).
  std::tm brokentime;
  localtime_r(&simpletime, &brokentime);

  // Normalise.
  std::mktime(&brokentime);

  std::cout << "asctime: " << std::asctime(&brokentime);

  // Print with strftime(3)
  char buffer[40];
  std::strftime(&buffer[0], 40, "%c", &brokentime);
  std::cout << "strftime: " << &buffer[0] << '\n';

  // Print with wcsftime(3)
  wchar_t wbuffer[40];
  std::wcsftime(&wbuffer[0], 40, L"%c", &brokentime);
  std::wcout << "wcsftime: " << &wbuffer[0] << '\n';

  // Try again, but use proper locale facets...
  const std::time_put<char>& tp =
    std::use_facet<std::time_put<char> >(std::cout.getloc());
  std::string pattern("std::time_put<char>: %c\n");
  // Form the end pointer from data() + size(); dereferencing end() is undefined.
  tp.put(std::cout, std::cout, std::cout.fill(),
         &brokentime, pattern.data(), pattern.data() + pattern.size());

  // And again, but using wchar_t...
  const std::time_put<wchar_t>& wtp =
    std::use_facet<std::time_put<wchar_t> >(std::wcout.getloc());
  std::wstring wpattern(L"std::time_put<wchar_t>: %c\n");
  wtp.put(std::wcout, std::wcout, std::wcout.fill(),
          &brokentime, wpattern.data(), wpattern.data() + wpattern.size());

  return 0;
}
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18678