The following code works:

#include <iostream>
#include <clocale>
#include <string>

int main()
{
    using namespace std;

    char *p= setlocale( LC_ALL, "greek" );

    if (!p)
        cerr<< "NULL returned!\n";

    wstring ws;

    wcin>> ws;

    wcout<< ws<< endl;
}

[john@localhost src]$ ./foobar-cpp
Δοκιμαστικό
Δοκιμαστικό
[john@localhost src]$

The following code DOES NOT work:

#include <iostream>
#include <locale>
#include <string>

int main()
{
    using namespace std;

    wcout.imbue(locale("greek"));

    wstring ws;

    wcin>> ws;

    wcout<< ws<< endl;
}

[john@localhost src]$ ./foobar-cpp
Δοκιμαστικό
[john@localhost src]$

For the code that does not work:

[john@localhost src]$ g++ -v -save-temps -ansi -pedantic-errors -Wall main.cc -o foobar-cpp
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)
 /usr/libexec/gcc/i386-redhat-linux/4.1.2/cc1plus -E -quiet -v -D_GNU_SOURCE main.cc -mtune=generic -ansi -pedantic-errors -Wall -fpch-preprocess -o main.ii
ignoring nonexistent directory "/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../i386-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../include/c++/4.1.2
 /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../include/c++/4.1.2/i386-redhat-linux
 /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../include/c++/4.1.2/backward
 /usr/local/include
 /usr/lib/gcc/i386-redhat-linux/4.1.2/include
 /usr/include
End of search list.
 /usr/libexec/gcc/i386-redhat-linux/4.1.2/cc1plus -fpreprocessed main.ii -quiet -dumpbase main.cc -mtune=generic -ansi -auxbase main -pedantic-errors -Wall -ansi -version -o main.s
GNU C++ version 4.1.2 20070626 (Red Hat 4.1.2-14) (i386-redhat-linux)
 compiled by GNU C version 4.1.2 20070626 (Red Hat 4.1.2-14).
GGC heuristics: --param ggc-min-expand=99 --param ggc-min-heapsize=129413
Compiler executable checksum: a9d7d7ea3146608fff5ae7eec9c8ae61
 as -V -Qy -o main.o main.s
GNU assembler version 2.17.50.0.6-5.el5 (i386-redhat-linux) using BFD version 2.17.50.0.6-5.el5 20061020
 /usr/libexec/gcc/i386-redhat-linux/4.1.2/collect2 --eh-frame-hdr -m elf_i386 --hash-style=gnu -dynamic-linker /lib/ld-linux.so.2 -o foobar-cpp /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crt1.o /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crti.o /usr/lib/gcc/i386-redhat-linux/4.1.2/crtbegin.o -L/usr/lib/gcc/i386-redhat-linux/4.1.2 -L/usr/lib/gcc/i386-redhat-linux/4.1.2 -L/usr/lib/gcc/i386-redhat-linux/4.1.2/../../.. main.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/i386-redhat-linux/4.1.2/crtend.o /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crtn.o
[john@localhost src]$
Created attachment 15217
The main.ii file produced by -save-temps option

This is the file created by the "g++ -v -save-temps -ansi -pedantic-errors -Wall main.cc -o foobar-cpp" command, on the non-working code.
Created attachment 15218
The produced main.s file

The main.s file produced by "g++ -v -save-temps -ansi -pedantic-errors -Wall main.cc -o foobar-cpp"
Not a bug, given our implementation-defined behavior: the standard streams (cin / wcin and friends) are by default synced with stdio (per the standard requirements) and thus do not convert. You can either call sync_with_stdio(false) before any I/O or use a converting stream, like fstreams.
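The "converting stream" alternative mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the helper names write_wide_as_utf8 and read_bytes are hypothetical, and UTF-8 via std::codecvt_utf8 (C++11, deprecated since C++17) is used as the external encoding so that the example does not depend on a "greek" locale being installed. Imbuing a named locale on the file stream instead should behave analogously.

```cpp
#include <cassert>
#include <codecvt>   // std::codecvt_utf8: C++11, deprecated since C++17
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: write a wide string through a converting wide file stream.
// Assumption: UTF-8 is the wanted external encoding.
bool write_wide_as_utf8(const std::string& path, const std::wstring& text)
{
    assert(!path.empty());

    std::wofstream out;
    // Imbue before opening so the file buffer uses the conversion facet from the start.
    out.imbue(std::locale(std::locale::classic(), new std::codecvt_utf8<wchar_t>));
    out.open(path.c_str(), std::ios::out | std::ios::trunc | std::ios::binary);
    if (!out)
        return false;

    out << text;   // each wchar_t is converted to external bytes by the file buffer
    out.flush();
    return out.good();
}

// Hypothetical helper: read the file back as raw bytes through a narrow stream.
std::string read_bytes(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Unlike the synced wcout, a wide file stream always goes through the codecvt facet of its imbued locale, which is why this route converts without any call to sync_with_stdio.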
Created attachment 15219
Screenshot of the standard I/O of the working code and of the non-working code.

This screenshot shows the I/O of the working code and of the non-working code respectively.
sync_with_stdio (false) doesn't work. Actually it crashes the code. Check the screenshot I have attached in the latest attachment, to see the difference between the C++ working code and the C++ non-working code.
sync_with_stdio(false) works, and is tested dozens of times a day in our testsuites. And that is only half of my answer. Please understand what I said, study the details of the ISO C++ Standard and then come back.
I am sorry for insisting on this, but I think there is an issue, and I want the best for GCC. So please have a look at the messages of this link: http://tinyurl.com/384u3n and use Unicode (UTF-8) character encoding in your browser, to see the issues. Thanks.
Summary of the case:

What doesn't work:

#include <iostream>
#include <locale>
#include <string>

int main()
{
    using namespace std;

    wcin.imbue(locale("greek"));
    wcout.imbue(locale("greek"));

    wstring ws;

    wcin>> ws;

    wcout<< ws<< endl;
}

What works (under 2 conditions):

1. Only when "locale::global()" statement is used:

#include <iostream>
#include <locale>
#include <string>

int main()
{
    using namespace std;

    locale::global(locale("en_US"));

    wcin.imbue(locale("greek"));
    wcout.imbue(locale("greek"));

    wstring ws;

    wcin>> ws;

    wcout<< ws<< endl;
}

2. Only when "ios_base::sync_with_stdio(false)" statement is used:

#include <iostream>
#include <locale>
#include <string>

int main()
{
    using namespace std;

    ios_base::sync_with_stdio(false);

    wcin.imbue(locale("greek"));
    wcout.imbue(locale("greek"));

    wstring ws;

    wcin>> ws;

    wcout<< ws<< endl;
}
Maybe we can improve the behavior when the streams are synced with stdio: that is, we could transcode each wchar_t and sync after each conversion. Very likely, you can also simulate that behavior right now by using sync_with_stdio(false) + a custom single-char I/O buffer. In any case, any enhancement will be implemented only when binary compatibility is next broken.
Note, anyway, that there is a serious blocker to any enhancement in this area (and of course it explains the current behavior): if wcin & co are converting, they deal with the underlying stream as a narrow-character oriented stream. But when the stream is synced it must be possible to mix char-by-char with wchar_t C stdio operations, which require a wide-character orientation of the stream, whereas, per C99 7.19.2, the orientation of a stream cannot be changed after opening.
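The orientation rule referred to above can be observed directly with the C function fwide. The following is a minimal sketch, not part of libstdc++: the helper names orientation and byte_orientation_sticks are hypothetical, and the check assumes a C library that follows C99 7.19.2 (fwide does not alter the orientation of a stream that already has one).

```cpp
#include <cassert>
#include <cstdio>
#include <cwchar>

// Hypothetical helper: -1 for byte-oriented, 0 for no orientation yet, +1 for wide-oriented.
int orientation(std::FILE* f)
{
    assert(f != 0);
    const int r = std::fwide(f, 0);   // mode 0 only queries, it never sets
    return (r > 0) - (r < 0);
}

// Hypothetical helper: true if a stream that became byte-oriented
// refuses a later request to become wide-oriented.
bool byte_orientation_sticks()
{
    std::FILE* f = std::tmpfile();
    if (!f)
        return false;

    bool ok = (orientation(f) == 0);   // freshly opened: no orientation
    std::fputc('x', f);                // first byte operation fixes the orientation
    ok = ok && (orientation(f) < 0);
    std::fwide(f, 1);                  // ask for wide orientation: should have no effect
    ok = ok && (orientation(f) < 0);

    std::fclose(f);
    return ok;
}
```

This is the constraint on a synced wcout: once stdout has been used for byte output, wide-character stdio calls on it are no longer valid, and the reverse holds as well.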
About my last reply: I checked, and within the current implementation of the underlying I/O the last issue (per libstdc++/9662) doesn't exist anymore; in other words, when sync_with_stdio(false), C++ I/O on wcin/wcout doesn't change the orientation of the stream to byte (i.e., fwide < 0). Good. We have to re-investigate all the other reasons that led to the separate non-converting synced (default) implementation of wcin & co...
*** Bug 37298 has been marked as a duplicate of this bug. ***
*** Bug 37673 has been marked as a duplicate of this bug. ***
*** Bug 33852 has been marked as a duplicate of this bug. ***
Why can't wcout simply convert to the selected encoding, and append the results to the cout buffer, as if the converted string had been directly output to cout? I'm not sure about the implementation details, but I fail to see how anything could prevent adopting this rather obvious solution. Of course, if cout is in the middle of the byte sequence of a character, this will not result in sensible output, but that is user error and I fail to see how such use could be made meaningful. BTW, doesn't cout share the stdout buffer via the GNU libio FILE/iostream sharing mechanism, making sync_with_stdio do nothing anyway?
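The idea in this comment (convert with the selected encoding, then append the bytes to the narrow stream) can be sketched in user code as below. This is only an illustration of the proposal, not how libstdc++ implements wcout: the names put_converted, utf8_locale and converted_bytes are hypothetical, and utf8_locale assumes UTF-8 via std::codecvt_utf8 (deprecated since C++17) so that the sketch does not need an installed "greek" locale.

```cpp
#include <cassert>
#include <codecvt>   // std::codecvt_utf8: C++11, deprecated since C++17
#include <cstddef>
#include <cwchar>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: convert `ws` with the codecvt facet of `loc`
// and append the resulting bytes to the narrow stream `out`.
bool put_converted(std::ostream& out, const std::wstring& ws, const std::locale& loc)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> Cvt;
    const Cvt& cvt = std::use_facet<Cvt>(loc);
    assert(cvt.max_length() > 0);
    if (ws.empty())
        return true;

    // Worst case: every wide character expands to max_length() bytes.
    std::vector<char> buf(ws.size() * static_cast<std::size_t>(cvt.max_length()));
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* from_next = 0;
    char* to_next = 0;
    const std::codecvt_base::result r =
        cvt.out(state, ws.data(), ws.data() + ws.size(), from_next,
                buf.data(), buf.data() + buf.size(), to_next);
    if (r != std::codecvt_base::ok)
        return false;   // error, partial or noconv: nothing is written

    // Note: a stateful encoding would also need cvt.unshift() at the end.
    out.write(buf.data(), to_next - buf.data());
    return out.good();
}

// Hypothetical helper, assumption: UTF-8 as the external encoding.
std::locale utf8_locale()
{
    return std::locale(std::locale::classic(), new std::codecvt_utf8<wchar_t>);
}

// Hypothetical convenience wrapper: return the converted bytes as a string.
std::string converted_bytes(const std::wstring& ws, const std::locale& loc)
{
    std::ostringstream os;
    put_converted(os, ws, loc);
    return os.str();
}
```

Calling put_converted(std::cout, ws, loc) appends the converted bytes to the cout buffer. As the comment notes, interleaving this with partial multibyte output on cout would still produce garbage.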
We may make progress on this for 4.6.0, but I don't make promises. After having studied the relevant bits of the Standard and the current implementation of these features (I remind you that this is Free Software, thus no mysteries, no need for black-box thinking), I would recommend going ahead and proposing a patch (after having filed the required Copyright Assignment). Thanks.