This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Char_traits (part 2+)


Benjamin,

I have now had a chance to review the changes for char_traits. You are correct, I do have a problem (still). On the other hand, you might find my recommendations closer to where you are headed – or further out in left field – than you think.
Note: the rest of this is a philosophical point-of-view statement. If you waste your time reading it, that is on your own conscience.

To summarize my interpretation of the Standard:
1. 21.1.1/1 is not a definition of template class char_traits<>. It is a statement of requirements – just like numerous similar tables in the Standard.
2. 21.1.1/2 requires a declaration of template class char_traits<>. It is not a definition either. It is also all we get.
3. 21.1.3.1 and 21.1.3.2 provide complete definitions of the explicit specializations template<> struct char_traits<char> and template<> struct char_traits<wchar_t> respectively. This suggests that the absence of a complete definition in (2) is deliberate.
4. It is not possible to provide a generic implementation of char_traits that meets the requirements of 21.1.1/1 without imposing additional requirements on the character type that are not mandated by the Standard.
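To make points 2 and 3 concrete, here is a minimal sketch of that shape: a primary template that is declared but never defined, plus one complete explicit specialization. The namespace `sketch` is hypothetical and stands in for std, and the specialization is abridged to three members, so this is an illustration of the reading above and not the library's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {  // hypothetical namespace standing in for std

// Point 2: the primary template is declared, and that is all we get.
template <class CharT> struct char_traits;

// Point 3: a complete explicit specialization (abridged for illustration).
template <> struct char_traits<char> {
    typedef char char_type;
    typedef int  int_type;

    static bool eq(const char_type& a, const char_type& b) { return a == b; }
    static bool lt(const char_type& a, const char_type& b) { return a < b; }
    static std::size_t length(const char_type* s) { return std::strlen(s); }
};

}  // namespace sketch

// Naming a member of sketch::char_traits<unsigned char> would fail to
// compile, because the primary template is an incomplete type.
```

With only this much provided, char_traits<char> is usable and every other character type is rejected at compile time.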

Given these points, I consider any definition of template struct char_traits<> to be an implementation-defined extension. Since an extension can be and do whatever it wants, we could just draw the line there and leave it at that.

Nevertheless, I had problems with the original char_traits implementation because it was incorrect for most cases, and because of what I call the “like a duck” problem: if it looks like a ‘character traits’ class, and calls itself a ‘character traits’ class, and is provided by the implementation, then reasonable users can be excused for thinking it is a ‘character traits’ class, and for getting annoyed when they discover that it is not correct.

The changes address only part of this problem – now attempts to use char_traits<MyChar> will compile, but not link unless the user provides the definitions. First, the Standard is clear that adding explicit specializations of templates for built-in types to namespace std results in undefined behavior (17.4.3.1/1), so this doesn’t work for the expected common cases. Additionally, the solution provides no useful functionality for the user-defined types, so I do not see any point. In fact, it seems to promise functionality that is NOT provided, so unless the requirement that the users provide the function definitions is clearly spelled out somewhere, I think it is actually more confusing than just going with only what the Standard requires.
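For contrast, the case the Standard does permit is an explicit specialization that depends on a user-defined type. The sketch below assumes a hypothetical POD type `MyChar` and an `int_type` of unsigned long; both are my choices for illustration. It shows roughly how much the user has to supply before basic_string<MyChar> can be expected to work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <ios>
#include <string>

// Hypothetical user-defined POD character type.
struct MyChar { unsigned short code; };

namespace std {
// Permitted: the specialization depends on a user-defined type.
template <> struct char_traits<MyChar> {
    typedef MyChar        char_type;
    typedef unsigned long int_type;   // assumed wider than code, so eof() is distinct
    typedef streamoff     off_type;
    typedef streampos     pos_type;
    typedef mbstate_t     state_type;

    static void assign(char_type& d, const char_type& s) { d = s; }
    static bool eq(const char_type& a, const char_type& b) { return a.code == b.code; }
    static bool lt(const char_type& a, const char_type& b) { return a.code < b.code; }

    static int compare(const char_type* a, const char_type* b, size_t n) {
        for (size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static size_t length(const char_type* s) {
        size_t i = 0;
        while (s[i].code != 0) ++i;
        return i;
    }
    static const char_type* find(const char_type* s, size_t n, const char_type& c) {
        for (size_t i = 0; i < n; ++i)
            if (eq(s[i], c)) return s + i;
        return 0;
    }
    static char_type* move(char_type* d, const char_type* s, size_t n) {
        if (n != 0) memmove(d, s, n * sizeof(char_type));
        return d;
    }
    static char_type* copy(char_type* d, const char_type* s, size_t n) {
        if (n != 0) memcpy(d, s, n * sizeof(char_type));
        return d;
    }
    static char_type* assign(char_type* s, size_t n, char_type c) {
        for (size_t i = 0; i < n; ++i) s[i] = c;
        return s;
    }
    static char_type to_char_type(const int_type& i) {
        char_type c = { static_cast<unsigned short>(i) };
        return c;
    }
    static int_type to_int_type(const char_type& c) { return c.code; }
    static bool eq_int_type(const int_type& a, const int_type& b) { return a == b; }
    static int_type eof() { return static_cast<int_type>(-1); }
    static int_type not_eof(const int_type& i) { return i == eof() ? 0 : i; }
};
}  // namespace std

typedef std::basic_string<MyChar> MyString;  // hypothetical convenience typedef
```

None of this is available to someone who just wants basic_string<unsigned char>, since the same specialization for a built-in type is undefined behavior.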

What would I do?
This question is complicated by 3 issues:
1. char_traits is declared as a template.
2. Some of the char_traits requirements are only applicable to iostreams usage.
3. The Standard requirement that a ‘character’ must be a POD type (added between CD1 and CD2) makes some ‘character traits’ functions redundant.

Tackling these in reverse order:
(3) The ‘character traits’ functions assign (both versions), move, and copy are redundant in the sense that no conforming program can tell if they are used or not. Even length can be implemented generically. While tangential to this discussion, I would recommend that the library implementation actually NOT use them. The point for this discussion is that there is no reason NOT to provide a generic definition of char_traits that provides these functions.
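As a rough sketch of what such generic definitions could look like for a POD character type: the name `pod_char_ops` is hypothetical, and `length` here assumes the type supports operator== against a value-initialized character, which is an assumption of this example and not something the Standard guarantees for every POD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: the traits functions that need nothing beyond POD-ness.
template <class CharT>
struct pod_char_ops {
    typedef CharT char_type;

    // Plain assignment is all a POD needs.
    static void assign(char_type& dst, const char_type& src) { dst = src; }

    static char_type* assign(char_type* s, std::size_t n, char_type a) {
        for (std::size_t i = 0; i < n; ++i) s[i] = a;
        return s;
    }

    // Ranges may overlap, hence memmove.
    static char_type* move(char_type* dst, const char_type* src, std::size_t n) {
        if (n != 0) std::memmove(dst, src, n * sizeof(char_type));
        return dst;
    }

    // Ranges must not overlap, so memcpy is enough.
    static char_type* copy(char_type* dst, const char_type* src, std::size_t n) {
        if (n != 0) std::memcpy(dst, src, n * sizeof(char_type));
        return dst;
    }

    // Assumption: char_type has operator==; the terminator is char_type().
    static std::size_t length(const char_type* s) {
        std::size_t i = 0;
        while (!(s[i] == char_type())) ++i;
        return i;
    }
};
```

Because the character type is a POD, a library that did the byte copies directly instead of calling these functions would be indistinguishable to a conforming program.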

(2) Let’s face it: the real issue is basic_string<>. Weirdos like me who mess with clause 27 (iostreams) are rare and presumably should know what they are doing.

(1) We know that the language specifically forbids the instantiation of ordinary (non-virtual) member functions of a class template that are not used. This makes it possible to provide functions that have more specific requirements than those imposed on the template arguments in general.
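A small sketch of that rule in action; the names `NoCompare`, `ops`, and `demo` are hypothetical. The class template below has a member that requires operator==, yet it can still be used with a type that lacks the operator, as long as that member is never called.

```cpp
#include <cassert>

// Hypothetical POD type with no operator== and no operator<.
struct NoCompare { int v; };

template <class T>
struct ops {
    // Usable with any assignable T.
    static void assign(T& d, const T& s) { d = s; }

    // Requires operator== on T, but the body is only instantiated if called.
    static bool eq(const T& a, const T& b) { return a == b; }
};

// Compiles even though ops<NoCompare>::eq could not: eq is never used here.
inline int demo() {
    NoCompare a = {1};
    NoCompare b = {2};
    ops<NoCompare>::assign(a, b);
    return a.v;
}
```

Adding a call to ops<NoCompare>::eq anywhere in the program would turn this into a compile error, which is the "usual caveat" behavior.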

Taking these things into account causes me to offer the following possibilities:
A. Provide only what the Standard specifies without extensions. This has the advantage that code will be more likely to be portable. It has the major disadvantage that useful things like basic_string<unsigned char> will not work.
B. Provide A plus implementation-defined explicit specializations for the other built-in character types. This is my preference. I think it is in keeping with both the letter and the spirit of the Standard. It allows useful things like declaring a basic_string<unsigned char>, but prevents accidental usage of more esoteric types.
C. Provide a partial definition and implementation of char_traits<> that would allow common cases such as basic_string<unsigned char>. Specifically do not provide the declarations for those char_traits<> types and functions that are intended only for iostreams support. Define eq, lt, compare, and find in terms of operator== and operator< with the usual caveat that if these operators are not available then those functions will not compile. Strictly speaking, this is not correct, but this IS an extension. Obviously it works for the built-in types. This is my second preferred choice. I think it is weaker than (B) but only a little. It has the advantage that it allows basic_string to be used as a more efficient container than vector for any POD type.
D. Provide a complete definition of template char_traits<> based on the table in 21.1.1/1 but only provide implementations for those functions which can be provided correctly and allow/force the user to provide the others. For specializing basic_string this is almost equivalent to (C), but perhaps slightly more useful. I would note that you can NOT define int_type to be an unsigned int and have it work correctly. This is primarily why I prefer (C) to (D). If you define int_type to be a std::pair<char_type, bool> where the bool flag indicates EOF or not, you can make it work, but ideally the user needs to provide his/her own type for int_type. This means that specializing iostreams probably always requires a user defined char_traits specialization.
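The int_type remark in (D) can be sketched as follows. The name `pair_int_ops` is hypothetical, the code covers only the int_type-related members, and it assumes char_type has operator==. The point is that the bool flag carries EOF, so no character value has to be sacrificed to represent it.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: int_type as (character, is-EOF flag).
template <class CharT>
struct pair_int_ops {
    typedef CharT char_type;
    typedef std::pair<char_type, bool> int_type;  // second == true means EOF

    static int_type to_int_type(const char_type& c) { return int_type(c, false); }
    static char_type to_char_type(const int_type& i) { return i.first; }
    static int_type eof() { return int_type(char_type(), true); }

    // Two EOFs are equal; EOF never equals a character; otherwise compare
    // the characters (assumes operator== on char_type).
    static bool eq_int_type(const int_type& a, const int_type& b) {
        if (a.second || b.second) return a.second == b.second;
        return a.first == b.first;
    }

    // Maps EOF to some non-EOF value and leaves everything else alone.
    static int_type not_eof(const int_type& i) {
        return i.second ? int_type(char_type(), false) : i;
    }
};
```

With an unsigned int character type and an unsigned int int_type, every bit pattern is already a valid character, so any value chosen for eof() would collide with one of them; the pair avoids that at the cost of a larger int_type.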

Jack


