This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Localization: problems with toupper/tolower transformations of latin characters.
- From: Jan Pfeifer <pfjan at yahoo dot com dot br>
- To: gcc-help at gcc dot gnu dot org
- Date: Sun, 11 Dec 2005 12:45:19 -0200
- Subject: Localization: problems with toupper/tolower transformations of latin characters.
hi all,
I was trying to get proper upper/lower case conversion of characters
in a generic way, at first in C++, but I noticed that libc apparently
does not convert the Latin characters correctly.
Running the attached code, which tries both the C++ and the C functions
(I later found out that libstdc++ uses libc for these), I get:
$ ./test
C++ version:
Órfão (original string)
ÓRFãO (should be upper case string)
Órfão (should be lower case string)
C version:
Órfão
ÓRFãO
Órfão
Any ideas about what I could be missing? Or is something missing from
the library? (I tried different locales, as commented in the code, and
used UTF-8 for encoding.)
thanks in advance for any help/pointers!
- jan
ps.:
$ gcc --version
gcc (GCC) 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#include <iostream>
#include <iterator>  // for back_inserter
#include <locale>
#include <string>
#include <algorithm>
#include <cctype>    // old <ctype.h>
#include <clocale>   // for setlocale
#include <cstring>   // for strcpy, strlen
using namespace std;

// Functor that upper-cases a single char using the given locale.
struct MyToUpper
{
    MyToUpper(std::locale const& l) : loc(l) {}
    char operator() (char c) const { return std::toupper(c, loc); }
private:
    std::locale const& loc;
};

// Functor that lower-cases a single char using the given locale.
struct MyToLower
{
    MyToLower(std::locale const& l) : loc(l) {}
    char operator() (char c) const { return std::tolower(c, loc); }
private:
    std::locale const& loc;
};

int main(int argc, char** argv)
{
    // locale loc( "" );
    // locale loc( "en_US.UTF-8" );
    locale loc( "pt_BR.UTF-8" );
    MyToUpper up( loc );
    MyToLower down( loc );

    // C++ version: transform the std::string char by char
    cout << "C++ version: " << endl;
    const char *reference = "Órfão";
    string normal = reference;
    cout << normal << endl;
    transform( normal.begin(), normal.end(), normal.begin(), up );
    cout << normal << endl;
    transform( normal.begin(), normal.end(), normal.begin(), down );
    cout << normal << endl;

    // C version: transform a char buffer with ::toupper / ::tolower
    //setlocale(LC_ALL, "");
    //setlocale(LC_ALL, "pt_BR.UTF-8");
    setlocale(LC_ALL, "en_US.UTF-8");
    cout << endl << "C version: " << endl;
    char buffer[256];
    strcpy( buffer, reference );
    char *buffer_end = buffer + strlen(buffer);
    cout << buffer << endl;
    transform( buffer, buffer_end, buffer, ::toupper );
    cout << buffer << endl;
    transform( buffer, buffer_end, buffer, ::tolower );
    cout << buffer << endl;
    return 0;
}
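
One possible reading of the output above, offered as a sketch rather than
a confirmed diagnosis: both toupper variants in the test work on one char
at a time, while in UTF-8 a character such as "ã" is stored as two bytes,
so neither byte on its own is a letter that can be mapped. The code below
decodes the string to wide characters first, converts those with
std::towupper / std::towlower, and encodes the result again. The helper
names (use_utf8_ctype, map_case, to_upper_multibyte, to_lower_multibyte)
and the list of locale names are assumptions made for illustration; they
are not part of libc or libstdc++.

```cpp
#include <clocale>   // std::setlocale
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::mbstowcs, std::wcstombs, MB_CUR_MAX
#include <cwctype>   // std::towupper, std::towlower
#include <string>
#include <vector>

// Hypothetical helper: try a few UTF-8 locale names for LC_CTYPE and report
// whether one of them could be activated. Which names exist depends on the
// system, so the list is only an assumption.
bool use_utf8_ctype()
{
    const char* candidates[] = { "pt_BR.UTF-8", "en_US.UTF-8", "C.UTF-8" };
    for (const char* name : candidates)
        if (std::setlocale(LC_CTYPE, name) != nullptr)
            return true;
    return false;
}

// Hypothetical helper: decode a multibyte string in the current C locale,
// change the case of every wide character, and encode it again.
// Returns the input unchanged if it cannot be decoded or re-encoded.
std::string map_case(const std::string& in, bool upper)
{
    // A wide string never has more elements than the input has bytes.
    std::vector<wchar_t> wide(in.size() + 1);
    std::size_t n = std::mbstowcs(wide.data(), in.c_str(), wide.size());
    if (n == static_cast<std::size_t>(-1))
        return in;

    for (std::size_t i = 0; i < n; ++i) {
        std::wint_t c = static_cast<std::wint_t>(wide[i]);
        wide[i] = static_cast<wchar_t>(upper ? std::towupper(c)
                                             : std::towlower(c));
    }

    // MB_CUR_MAX bytes per character is the upper bound for the output.
    std::vector<char> out(n * MB_CUR_MAX + 1);
    std::size_t m = std::wcstombs(out.data(), wide.data(), out.size());
    if (m == static_cast<std::size_t>(-1))
        return in;
    return std::string(out.data(), m);
}

std::string to_upper_multibyte(const std::string& s) { return map_case(s, true); }
std::string to_lower_multibyte(const std::string& s) { return map_case(s, false); }
```

Used in place of the transform() calls in the test, this would mean calling
use_utf8_ctype() once at start-up and then to_upper_multibyte(reference).
Whether accented letters are actually converted still depends on the case
tables of the locale that was selected.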