This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Patch for 8-bit character problem in stdc++-v2


This patch relates to a bug first reported on the MinGW (Minimalist GNU
for Windows) mailing list at

http://www.geocrawler.com/archives/3/6013/2002/1/100/7620462/

and is discussed in detail in the following thread:

http://www.geocrawler.com/mail/thread.php3?subject=%5BMingw-users%5D+g%2B%2B+problem+%28BUG+IDENTIFIED%21%29&list=6013

I'll summarize the problem as follows:

1) The Microsoft Visual C Runtime (MSVCRT) does not accept a negative
value (i.e. a signed char with the eighth bit set), other than EOF, as
an argument to isspace and the other is... routines. This is documented
in MSDN at

http://msdn.microsoft.com/library/en-us/vccore98/HTML/_crt_is.2c_.isw_routines.asp

2) The string traits header of stdc++-v2 uses isspace to decide whether
an input character is whitespace, without checking its range. The call
is on line 120 of std/straits.h, in the specialization
string_char_traits<char>::is_del.

3) If a character such as 245 is passed to is_del on a platform where
char is signed, it turns into a call like isspace(245 - 256), i.e.
isspace(-11), whose result is undefined, at least with the Microsoft
runtime. This behavior of the runtime conforms to the C99 standard,
which says:

"The header <ctype.h> declares several functions useful for testing and
mapping characters. In all cases the argument is an int, the value of
which shall be representable as an unsigned char or shall equal the
value of the macro EOF. If the argument has any other value, the
behavior is undefined."
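
To make the arithmetic in point 3 concrete, here is a minimal sketch. The
function names are hypothetical and not part of any library, and it assumes
an 8-bit two's-complement signed char (as with MinGW on x86). It only
computes the int value that would reach isspace; it never makes the
undefined call itself.

```cpp
#include <cassert>

// Hypothetical helper: the int that isspace() would receive if a signed
// char holding the given byte were passed directly, with no cast.
// Assumption: 8-bit two's-complement signed char.
inline int value_without_cast(unsigned char byte)
{
    signed char c = static_cast<signed char>(byte);  // 245 becomes -11
    return c;                     // integer promotion keeps the sign: -11
}

// Hypothetical helper: the int that isspace() would receive if the same
// signed char were first converted to unsigned char.
inline int value_with_cast(unsigned char byte)
{
    signed char c = static_cast<signed char>(byte);  // 245 becomes -11
    return static_cast<unsigned char>(c);            // back to 245
}
```

Under those assumptions, value_without_cast(245) is -11, which is outside
the range the C99 text allows, while value_with_cast(245) is 245, which is
representable as an unsigned char.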

The patch is therefore as follows:

--------------------- Beginning of patch ---------------------
--- straits.h.orig      Tue Nov 06 08:34:48 2001
+++ straits.h   Sun Feb 03 15:09:09 2002
@@ -119,3 +119,4 @@
   static char_type eos () { return 0; }
-  static bool is_del(char_type a) { return isspace(a); }
+  static bool is_del(char_type a)
+    { return isspace (static_cast<unsigned char>(a)); }

------------------------ End of patch ------------------------
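
For anyone who wants to try the corrected expression outside the library,
the following is a small standalone sketch. The names is_del_checked and
count_delimiters are hypothetical and do not appear in straits.h, and the
default "C" locale is assumed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical standalone counterpart of the patched is_del: the cast
// guarantees isspace() only sees values in the unsigned char range.
inline bool is_del_checked(char a)
{
    return std::isspace(static_cast<unsigned char>(a)) != 0;
}

// Hypothetical helper: counts the delimiter characters in a string,
// including strings that contain bytes with the eighth bit set.
inline std::size_t count_delimiters(const std::string& s)
{
    std::size_t n = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (is_del_checked(s[i]))
            ++n;
    return n;
}
```

A call such as count_delimiters("\xF5 \xF5") then stays within defined
behavior even though the string holds two bytes of value 245.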

I have not found this problem to affect the wchar_t-related classes.

Libstdc++-v3 already uses a static_cast<unsigned char> in similar
circumstances, so it is not affected either.

This is the first time I have submitted a patch to this list, so I
apologize in advance for any mistakes or non-conformance.

Best regards,

Wu Yongwei

