This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Localization: problems with toupper/tolower transformations of latin characters.


hi all,

I was trying to get proper upper/lower case conversion of characters
in a generic way, at first in C++, but I noticed that libc apparently
does not convert Latin characters correctly.

Running the attached code, which tries both the C++ and the C functions
(I later found out that libstdc++ uses libc for these), I get:

$ ./test
C++ version:
Órfão       (original string)
ÓRFãO   (should be upper case string)
Órfão    (should be lower case string)

C version:
Órfão
ÓRFãO
Órfão


Any ideas about what I could be missing? Or is something missing in the
library?

(I tried different locales, as shown in the commented-out lines of the
code, with UTF-8 as the encoding.)
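
One possible reading of the output, offered as an assumption rather than
a confirmed diagnosis: in UTF-8, 'Ó' and 'ã' each take two bytes, while
toupper/tolower on char see one byte at a time, so those bytes would
typically be left alone and only the ASCII letters change. The small
sketch below only inspects the byte layout of the test string; the
helper names are hypothetical and not part of the attached program.

```cpp
#include <cstddef>
#include <string>

// True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

// Counts code points by counting every byte that is not a continuation
// byte. Assumes the input is valid UTF-8.
inline std::size_t count_code_points(const std::string& s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (!is_utf8_continuation(static_cast<unsigned char>(s[i])))
            ++n;
    return n;
}

// Counts bytes outside the ASCII range. These are the bytes that a
// char-by-char case conversion has no single-byte mapping for in UTF-8.
inline std::size_t count_non_ascii_bytes(const std::string& s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (static_cast<unsigned char>(s[i]) >= 0x80)
            ++n;
    return n;
}
```

Called on the UTF-8 bytes of "Órfão" (written with explicit escapes as
"\xC3\x93rf\xC3\xA3o"), the string has more bytes than characters, and
the extra bytes all belong to the two accented letters.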


thanks in advance for any help/pointers!

- jan

ps.:
$ gcc --version
gcc (GCC) 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.



#include <iostream>
#include <iterator>    // for back_inserter
#include <locale>
#include <string>
#include <algorithm>
#include <cctype>      // old <ctype.h>
#include <clocale>     // for setlocale
#include <cstring>     // for strcpy, strlen

using namespace std;

struct MyToUpper
{
	MyToUpper(std::locale const& l) : loc(l) {;}
	char operator() (char c) const  { return std::toupper(c,loc); }
	private:
		std::locale const& loc;
};
   
struct MyToLower
{
	MyToLower(std::locale const& l) : loc(l) {;}
	char operator() (char c) const  { return std::tolower(c,loc); }
	private:
		std::locale const& loc;
};


int main(int argc, char** argv)
{
	// locale loc( "" );
	// locale loc( "en_US.UTF-8" );
	locale loc( "pt_BR.UTF-8" );
	MyToUpper up( loc );
	MyToLower down( loc );

	cout << "C++ version: " << endl;
	const char *reference = "Órfão";
	string normal = reference; 
	cout << normal << endl;
	transform( normal.begin(), normal.end(), normal.begin(), up );
	cout << normal << endl;
	transform( normal.begin(), normal.end(), normal.begin(), down );
	cout << normal << endl;


	// C version
	//setlocale(LC_ALL, "");
	//setlocale(LC_ALL, "pt_BR.UTF-8");
	setlocale(LC_ALL, "en_US.UTF-8");
	cout << endl << "C version: " << endl;
	char buffer[256];
	strcpy( buffer, reference );
	char *buffer_end = buffer + strlen(buffer);
	cout << buffer << endl;
	transform( buffer, buffer_end, buffer, ::toupper );
	cout << buffer << endl;
	transform( buffer, buffer_end, buffer, ::tolower );
	cout << buffer << endl;

   return 0;
}
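
A minimal sketch of one alternative, assuming the trouble is that the
conversion runs on single bytes of a multi-byte encoding: decode the
UTF-8 string to wide characters, apply the std::ctype<wchar_t> facet of
the chosen locale, and encode the result back. The helpers utf8_to_wide,
wide_to_utf8 and to_upper_utf8 are hypothetical names introduced here;
they assume valid UTF-8 input and a wchar_t wide enough for the code
points involved. Whether the accented letters actually change case still
depends on the locale passed in and on the C library's wide-character
tables.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Decodes UTF-8 into wide characters (no validation of the input).
inline std::wstring utf8_to_wide(const std::string& in)
{
    std::wstring out;
    for (std::size_t i = 0; i < in.size(); ) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        unsigned long cp;
        std::size_t len;
        if (b < 0x80)      { cp = b;        len = 1; }
        else if (b < 0xE0) { cp = b & 0x1F; len = 2; }
        else if (b < 0xF0) { cp = b & 0x0F; len = 3; }
        else               { cp = b & 0x07; len = 4; }
        // Fold in the low six bits of each continuation byte.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        out += static_cast<wchar_t>(cp);
        i += len;
    }
    return out;
}

// Encodes wide characters back into UTF-8.
inline std::string wide_to_utf8(const std::wstring& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(in[i]);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Upper-cases whole characters, not bytes, using the wide ctype facet
// of the given locale.
inline std::string to_upper_utf8(const std::string& s, const std::locale& loc)
{
    std::wstring w = utf8_to_wide(s);
    const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
    if (!w.empty())
        ct.toupper(&w[0], &w[0] + w.size());
    return wide_to_utf8(w);
}
```

With the test program's locale, the call would be
to_upper_utf8(normal, loc). A lower-case variant would look the same
with ct.tolower in place of ct.toupper.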
