This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libstdc++/14970] New: isalpha() inconsistent on 8-bit LATIN-1 chars
- From: "svoboda at cs dot cmu dot edu" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 15 Apr 2004 20:06:41 -0000
- Subject: [Bug libstdc++/14970] New: isalpha() inconsistent on 8-bit LATIN-1 chars
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
The following C++ code demonstrates a bug in the isalpha() function
(but more likely in general l10n in libstdc++).
Here is the C++ source code to tickle the bug. To compile it, type:
g++ foo.C
or
g++ foo.i
and to run it, type:
./a.out
foo.C:
#include <locale>
#include <iostream>

void test_ctype() {
  unsigned char c = '\363';
  int i = 0;
  std::cerr << i << ": Locale: " << setlocale( LC_CTYPE, 0)
            << " Isalpha " << c << " ? "
            << (isalpha( c) ? "yes" : "no") << std::endl;
}

int main(int argc, char** argv) {
  test_ctype();
  test_ctype();
  setlocale(LC_CTYPE, "en_US");
  test_ctype();
  test_ctype();
}
BAD output:
0: Locale: C Isalpha ó ? no
0: Locale: C Isalpha ó ? no
0: Locale: en_US Isalpha ó ? yes
0: Locale: en_US Isalpha ó ? no
GOOD output:
0: Locale: C Isalpha ó ? no
0: Locale: C Isalpha ó ? no
0: Locale: en_US Isalpha ó ? yes
0: Locale: en_US Isalpha ó ? yes
I have tested this program on several systems, with the following results:
Host     OS        Compiler    Result
---------------------------------------------------------------
mandal   RedHat 9  g++ 3.2.2   BAD
sevilla  RedHat 7  g++ 3.2.3   GOOD
iim      RedHat 6  g++ 2.95.3  GOOD
testsis  AIX 5.1   xlC 6       GOOD
testsis  AIX 5.1   g++ 3.2.1   BAD
Clues:
-> The program always returns GOOD results if I stop printing the
integer 'i' on line 7; that is, if line 7 is changed to:
std::cerr << ": Locale: " << setlocale( LC_CTYPE, 0)
Ergo, printing an integer is somehow crucial to tickling the bug. This
happens on both RH9 and AIX5.
-> I can't reproduce this bug using C and gcc; it seems to happen only
under C++/g++.
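
To narrow down the first clue, a reduced probe along the following lines
might help. It is only a sketch: the names isalpha_around_int_output and
isalpha_is_stable are hypothetical and not part of the original test, and
it has not been confirmed to reproduce the BAD behaviour on the affected
systems. It classifies one byte with the C-library isalpha() immediately
before and after an integer is written to a stream, so any change in the
answer can be attributed to the integer output alone.

```cpp
#include <cctype>
#include <clocale>
#include <ostream>
#include <sstream>
#include <utility>

// Hypothetical helper (not from the original report): classify one byte
// with the C-library isalpha() immediately before and after an integer is
// formatted on `out`. If the two results differ, the integer output
// changed the ctype state in between.
std::pair<bool, bool> isalpha_around_int_output(unsigned char c,
                                                std::ostream& out) {
    const bool before = std::isalpha(c) != 0;
    out << 0;  // the integer insertion the report identifies as the trigger
    const bool after = std::isalpha(c) != 0;
    return std::make_pair(before, after);
}

// Hypothetical convenience wrapper: run the probe against a throwaway
// string stream so that nothing is printed.
bool isalpha_is_stable(unsigned char c) {
    std::ostringstream sink;
    const std::pair<bool, bool> r = isalpha_around_int_output(c, sink);
    return r.first == r.second;
}
```

Calling isalpha_around_int_output('\363', std::cerr) right after
setlocale(LC_CTYPE, "en_US") would mirror the failing sequence in foo.C;
whether a string stream triggers the same behaviour as std::cerr is an
assumption that would need checking on RH9 or AIX5.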
Interpretation:
At this point, I believe the bug lies in g++ 3.2.1 and 3.2.2, and
does not exist in g++ 3.2.3, in older g++ versions such as 2.95.3, or in
other compilers such as xlC 6.
Questions:
-> What *IS* the correct result of isalpha('\363') in the en_US
locale? (\363 is an 'o' with an acute accent: 'ó'.)
-> I presume it is not my fault if isalpha() is being
inconsistent...so how do I get it to stay consistent (assuming I
can't dictate the compiler or OS)?
-> Does AIX 5 support g++ 3.2.3? Would upgrading to g++ 3.2.3 on AIX5
and RH9 solve the problem?
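
On the consistency question, one thing that might be worth trying is the
two-argument std::isalpha() from <locale>, which classifies against an
explicit std::locale object instead of the global C locale installed by
setlocale(). The sketch below is an assumption-laden illustration, not a
confirmed workaround: the helper names isalpha_in and locale_or_classic
are hypothetical, it has not been tested on g++ 3.2.1 or 3.2.2, and
whether "en_US" is installed and treats \363 as a letter depends on the
system.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: classify a byte against an explicit std::locale
// object rather than the global C locale set via setlocale().
bool isalpha_in(char c, const std::locale& loc) {
    // Two-argument overload from <locale>; it consults the ctype<char>
    // facet of `loc`.
    return std::isalpha(c, loc);
}

// Hypothetical helper: build the named locale, falling back to the
// classic "C" locale if the name is not available on this system.
std::locale locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

A call such as isalpha_in('\363', locale_or_classic("en_US")) keeps the
locale object under the caller's control for the lifetime of the check,
so the answer should not depend on what other library code does to the
global locale in the meantime.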
--
Summary: isalpha() inconsistent on 8-bit LATIN-1 chars
Product: gcc
Version: 3.2.2
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: libstdc++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: svoboda at cs dot cmu dot edu
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14970