--- Begin Message ---
- From: James W.McKelvey <mckelvey at maskull dot com>
- To: gcc-help at gcc dot gnu dot org
- Date: Sat, 1 May 2004 20:04:43 -0700
- Subject: Erroneous Comparisons of Negative Characters
- Reply-to: mckelvey at maskull dot com
- Xref: uniton.integrable-solutions.net gcc.help:1181
When chars are implemented as signed, characters with negative values compare
properly as individual chars, but improperly when part of a char array or
std::string -- they compare as unsigned. The problem appears to be that the
specialization of std::char_traits<char> uses memcmp. This is observed on an
Alpha running RH 7.1 and gcc version 3.5.0 20040207.
Specifically, std::char_traits<char>::compare is inconsistent with
std::char_traits<char>::lt, which affects std::string (which is just
std::basic_string<char>). The same unsigned ordering also shows up in
strcmp, strncmp, and strcoll.
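As a minimal sketch of the inconsistency described above: the same pair of bytes orders differently depending on whether each element is read as signed char or as unsigned char. The helpers compare_as and orderings_disagree are hypothetical names used only for illustration, not standard library functions. std::memcmp is specified by the C standard to compare its arguments as unsigned char.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a standard function): compares n chars
// element by element after converting each one to T.
// Returns <0, 0, or >0, like std::memcmp.
template <typename T>
int compare_as(const char* a, const char* b, std::size_t n)
{
    assert(a != 0 && b != 0);
    for (std::size_t i = 0; i < n; ++i) {
        const T x = static_cast<T>(a[i]);
        const T y = static_cast<T>(b[i]);
        if (x < y) return -1;
        if (y < x) return 1;
    }
    return 0;
}

// True when the signed reading and the std::memcmp reading of the same
// two buffers disagree about which one comes first.
inline bool orderings_disagree(const char* a, const char* b, std::size_t n)
{
    const int as_signed = compare_as<signed char>(a, b, n);
    const int as_bytes = std::memcmp(a, b, n);
    return (as_signed > 0) != (as_bytes > 0)
        || (as_signed < 0) != (as_bytes < 0);
}
```

With the values from this message, compare_as<signed char>("z", "\300", 1) is positive, while compare_as<unsigned char>("z", "\300", 1) and std::memcmp("z", "\300", 1) are negative.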
The attached program demonstrates the problem.
I post this because I want to be sure that there isn't some bizarre reason
that this behavior is intended before I file a bug report; or maybe I am
doing something wrong.
Result of running the test program, with my analysis:
122 Expected character value of 'z'.
-64 Expected value of signed character '\300'
192 Expected value of unsigned character '\300'
-64 Shows that chars are signed
SC 1 Expected character comparison as signed char
UC 0 Expected character comparison as unsigned char
CH 1 Expected character comparison as char (signed)
SV 1 Demonstrates that std::string::value_type is signed
ST 0 Error: std::string of length 1 does not compare the same as CH
BS 1 std::basic_string<signed char> compares as expected
BU 0 std::basic_string<unsigned char> compares as expected
BC 0 Error: Same as ST, as expected (std::basic_string<char>)
TS 1 std::char_traits<signed char> compares signed char * as expected
TU 0 std::char_traits<unsigned char> compares unsigned char * as expected
TC 0 Error: std::char_traits<char>, char * does not compare properly
LS 1 std::char_traits<signed char> compares signed char as expected
LU 0 std::char_traits<unsigned char> compares unsigned char as expected
LC 1 std::char_traits<char> compares char as expected
(Note: inconsistent with TC)
MC 0 std::memcmp is comparing as unsigned (I guess that's OK)
SC 0 Error: std::strcmp is comparing as unsigned
SN 0 Error: std::strncmp is comparing as unsigned
SL 0 Error: std::strcoll is comparing as unsigned
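The first four values in the output above can be reproduced without relying on how a given platform treats plain char. The following is only a sketch; the function names are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <limits>

// Reports whether plain char is a signed type on this platform.
inline bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

// Value of c when its bits are read as a signed char.
inline int as_signed_value(char c)
{
    return static_cast<signed char>(c);
}

// Value of c when its bits are read as an unsigned char.
inline int as_unsigned_value(char c)
{
    const int v = static_cast<unsigned char>(c);
    assert(v >= 0);  // an unsigned char never converts to a negative int
    return v;
}

// Value of c as plain char; the sign depends on the platform.
inline int as_plain_value(char c)
{
    return c;
}
```

On the platform described in this message, plain_char_is_signed() would return true, so as_plain_value('\300') matches as_signed_value('\300').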
Comments?
#include <cstring>   // std::memcmp, std::strcmp, std::strncmp, std::strcoll
#include <iostream>
#include <string>

int
main(int,
     char **)
{
    typedef signed char sc;
    typedef unsigned char uc;

    // Raw values of 'z' and '\300' as char, signed char, and unsigned char.
    std::cout << int('z') << std::endl;
    std::cout << int(sc('\300')) << std::endl;
    std::cout << int(uc('\300')) << std::endl;
    std::cout << int(char('\300')) << std::endl << std::endl;

    // Comparisons of individual characters.
    std::cout << "SC " << (sc('z') > sc('\300')) << std::endl;
    std::cout << "UC " << (uc('z') > uc('\300')) << std::endl;
    std::cout << "CH " << (char('z') > char('\300')) << std::endl;
    std::cout << "SV " << (std::string::value_type('z') >
                           std::string::value_type('\300'))
              << std::endl;

    // Comparisons of one-character strings.
    std::cout << "ST " << (std::string(1, 'z').compare(
                               std::string(1, '\300')) > 0)
              << std::endl;
    std::cout << "BS " << (std::basic_string<sc>(1, 'z').compare(
                               std::basic_string<sc>(1, '\300')) > 0)
              << std::endl;
    std::cout << "BU " << (std::basic_string<uc>(1, 'z').compare(
                               std::basic_string<uc>(1, '\300')) > 0)
              << std::endl;
    std::cout << "BC " << (std::basic_string<char>(1, 'z').compare(
                               std::basic_string<char>(1, '\300')) > 0)
              << std::endl;

    // std::char_traits<...>::compare on arrays of length 1.
    std::cout << "TS " << (std::char_traits<sc>::compare(
                               reinterpret_cast<const sc *>("z"),
                               reinterpret_cast<const sc *>("\300"), 1) > 0)
              << std::endl;
    std::cout << "TU " << (std::char_traits<uc>::compare(
                               reinterpret_cast<const uc *>("z"),
                               reinterpret_cast<const uc *>("\300"), 1) > 0)
              << std::endl;
    std::cout << "TC " << (std::char_traits<char>::compare("z", "\300", 1) > 0)
              << std::endl;

    // std::char_traits<...>::lt on single characters.
    std::cout << "LS " << (! std::char_traits<sc>::lt(static_cast<sc>('z'),
                                                      static_cast<sc>('\300')))
              << std::endl;
    std::cout << "LU " << (! std::char_traits<uc>::lt(static_cast<uc>('z'),
                                                      static_cast<uc>('\300')))
              << std::endl;
    std::cout << "LC " << (! std::char_traits<char>::lt('z', '\300'))
              << std::endl;

    // C library comparison functions.
    std::cout << "MC " << (std::memcmp("z", "\300", 1) > 0)
              << std::endl;
    std::cout << "SC " << (std::strcmp("z", "\300") > 0)
              << std::endl;
    std::cout << "SN " << (std::strncmp("z", "\300", 1) > 0)
              << std::endl;
    std::cout << "SL " << (std::strcoll("z", "\300") > 0)
              << std::endl;

    return 0;
}
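If a string type whose compare() agrees with the built-in operator< on char is what is wanted, one possible approach is to supply a custom traits class. This is only a sketch under that assumption, not a statement about what the library should do; native_char_traits and native_string are hypothetical names, not library types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical traits class: orders characters with the built-in
// operator< on char, and defines compare() in terms of lt() so the
// two can never disagree. Everything else is inherited unchanged.
struct native_char_traits : std::char_traits<char>
{
    static bool lt(char a, char b)
    {
        return a < b;
    }

    static int compare(const char* a, const char* b, std::size_t n)
    {
        assert(n == 0 || (a != 0 && b != 0));
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
};

// String type whose ordering follows native_char_traits.
typedef std::basic_string<char, native_char_traits> native_string;
```

With this type, native_string(1, 'z').compare(native_string(1, '\300')) > 0 gives the same answer as char('z') > char('\300'), whichever way plain char is implemented. The trade-off is that native_string is a distinct type from std::string, so the two do not convert to each other implicitly.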
--- End Message ---