This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: basic_streambuf / locale::ctype problems


> I have finally decided to stop lurking (and sniveling about bugs) and 
> contribute to this effort (why does that phrase immediately bring to
> mind the image of Gollum – oh well). To that end, I am going to submit
> several patches in separate emails. To “prime the pump” for them
> however, I want to reopen a can of worms that has not been anywhere
> near properly addressed.

Ok. Welcome.

> Back in December Richard Burkert submitted a question about problems
> with basic_ios::init. I should have gotten involved in the discussion
> then because I was intimately aware of the problem, but other
> commitments intervened. Better late than never, I hope.
> 
> 1. The original problem was based on the following:
> -----
> #include <sstream>
> int main() {
> 	std::basic_stringstream<int> test;
> }
> ------
> As written, this program has undefined behavior.
> basic_stringstream<int> depends upon char_traits<int>. The Standard
> requires only a declaration for the template char_traits, and the two
> explicit specializations char_traits<char> and char_traits<wchar_t>.
> The Standard does not provide a definition of the template
> char_traits<>, only requirements for what a traits-type class must
> provide. The current library does provide a definition for template
> char_traits<>, along with default implementations of the functions,
> but this has to be considered an implementation defined enhancement
> and is non-portable. On other conforming platforms this program will
> probably not compile. While undefined behavior is allowed to do
> anything it wants, I would still recommend that the definition of
> char_traits<> be removed.

Agreed. Note this logic, as I'll apply it below to the non-required
facet issue.

If the char_traits generic definitions are removed, then at least the
library will be consistent about this issue. I'm in support of
consistency.

> 2. Adding an explicit specialization for char_traits<int> yields:
> ------
> #include <string>
> #include <sstream>
> template<> struct std::char_traits<int> { /* … */ };
> int main() {
> 	std::basic_stringstream<int> test;
> }
> -----
> This should compile, but may still result in undefined behavior. In 
> particular, the specialization is likely to contain
> 	typedef  unsigned long	int_type;
> After all, this is what the default implementation provided.
> Unfortunately, the Standard is clear that char_traits<>::int_type is
> supposed to be something “wider” than the char_type. In particular, it
> must be able to contain all values of char_type, plus a unique eof
> value. On platforms where the underlying representation of ‘int’ and
> ‘unsigned long’ are the same (most of them these days) I do not
> believe that this requirement can be met. This has implications for
> char_traits<wchar_t>. The sizeof wchar_t is implementation defined,
> but must be the same as one of the underlying integral types. On the
> other hand, the Standard requires
> 	char_traits<wchar_t>::int_type 	== wint_t
> I do not know the Standard C-library requirements, but I do know they
> are independent of the Standard C++ library requirements and I find
> nothing in the C++ Standard requiring that the sizeof wchar_t be
> smaller than sizeof wint_t. I suspect that this is a defect in the
> Standard. I did not find it in the current issues list (nobody uses
> wchar_t very much I guess), so I will submit it unless somebody can
> correct my understanding.

This is indeed an issue. The library committee is waffling on what
wchar_t is, and thus what wint_t is.
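
To make the int_type requirement concrete, here is a minimal sketch of the
check being described: every char_type value, once converted with
to_int_type, must compare unequal to eof(). The names narrow_int_traits,
eof_is_distinct and check_eof_examples are hypothetical and exist only for
this illustration; narrow_int_traits is deliberately incomplete and
deliberately broken, with an int_type no wider than its char_type.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Hypothetical traits fragment whose int_type is no wider than char_type.
// Only the members needed for the check below are provided.
struct narrow_int_traits {
    typedef char char_type;
    typedef char int_type;  // same width as char_type: no spare value for eof
    static int_type to_int_type(char_type c) { return c; }
    static bool eq_int_type(int_type a, int_type b) { return a == b; }
    static int_type eof() { return static_cast<int_type>(-1); }
};

// Returns true when no char value converts to something equal to eof().
template <class Traits>
bool eof_is_distinct() {
    for (int v = CHAR_MIN; v <= CHAR_MAX; ++v) {
        const char c = static_cast<char>(v);
        if (Traits::eq_int_type(Traits::to_int_type(c), Traits::eof()))
            return false;
    }
    return true;
}

// Usage: the required specialization passes, the same-width traits do not.
void check_eof_examples() {
    assert(eof_is_distinct<std::char_traits<char> >());
    assert(!eof_is_distinct<narrow_int_traits>());
}
```

The same collision is what the quoted text worries about when char_type is
int and int_type is an unsigned long of identical width.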


> 3. The Standard allows char_traits<>::int_type to be a class, so
> suppose we have –
> -----
> #include <string>
> #include <sstream>
> template <> struct std::char_traits<int> {
> 	typedef int	char_type;
> 	struct int_type {
> 		char_type	_c;
> 		bool		_eof;
> 		// … etc
> 	};
> 	// … etc
> };
> int main() {
> 	std::basic_stringstream<int> test;
> }
> -----
> The Standard does not require any operations be supported for this
> int_type other than what are provided by char_traits<> itself. This
> will not compile with the current library – it uses
> basic_streambuf<int>::int_type, which is a typedef for
> char_traits<int>::int_type, incorrectly in several places. I will
> submit a patch for this in a follow-on email.

Thanks. I'd not thought of this, actually. I'd tested with non-standard
char_traits types but not with a POD int_type.

Bummer for me.
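
As an illustration of the class-type int_type case, the sketch below uses a
stand-alone traits-like struct rather than a specialization of
std::char_traits, and shows only the int_type-related members. The names
int_char_traits, is_eof and check_class_int_type are hypothetical. The point
is that generic code has to go through eq_int_type and eof() instead of
applying == to int_type values directly.

```cpp
#include <cassert>

// Hypothetical traits fragment with a class-type int_type.
// No operator== is declared for int_type, on purpose.
struct int_char_traits {
    typedef int char_type;
    struct int_type {
        char_type c;
        bool      is_eof;
    };
    static int_type to_int_type(char_type c) {
        int_type r = { c, false };
        return r;
    }
    static char_type to_char_type(const int_type& i) { return i.c; }
    static int_type eof() {
        int_type r = { 0, true };
        return r;
    }
    static bool eq_int_type(const int_type& a, const int_type& b) {
        if (a.is_eof || b.is_eof) return a.is_eof == b.is_eof;
        return a.c == b.c;
    }
    static int_type not_eof(const int_type& i) {
        return i.is_eof ? to_int_type(char_type()) : i;
    }
};

// Generic helper in the style stream-buffer code would need.
// Writing `v == Traits::eof()` here would not compile for int_char_traits.
template <class Traits>
bool is_eof(const typename Traits::int_type& v) {
    return Traits::eq_int_type(v, Traits::eof());
}

// Usage: round-trip a character and test the eof handling.
void check_class_int_type() {
    typedef int_char_traits T;
    assert(is_eof<T>(T::eof()));
    assert(!is_eof<T>(T::to_int_type(0)));
    assert(T::to_char_type(T::to_int_type(42)) == 42);
    assert(!is_eof<T>(T::not_eof(T::eof())));
}
```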

> 4. Assuming that basic_streambuf is patched as in (3), then the above
> will compile. Now we are back to a correct version of Mr. Burkert’s
> original problem. Since I did not like the idea of having to install a
> ctype<int> facet in the global locale just to get the constructor to
> run correctly any more than anyone else, I am happy to see that this
> problem has been fixed in the current release. But, for the sake of
> argument, let us assume that someone actually wanted to use
> basic_ios::widen() and basic_ios::narrow(). These are not virtual
> functions but a derived class could obviously provide its own
> versions. Still, the default behavior of these functions uses the
> ctype<> facet and that facet does provide virtual functions, so an
> equally valid approach seems to be to override the virtual functions
> in ctype<>. So now assume we have:
> -----
> #include <string>
> #include <locale>
> #include <sstream>
> template <> struct std::char_traits<int> { /* … */ };  // as necessary
> class MyFacet : public std::ctype<int> {
> 	virtual int do_widen(char) const;
> 	virtual char do_narrow(int x, char dfault) const;
> };
> int main() {
> 	std::locale::global(std::locale(std::locale(), new MyFacet));
> 	std::basic_stringstream<int> test;
> }
> -----
> Assuming that char_traits<int> is correctly specialized, and that 
> definitions for MyFacet::do_widen and do_narrow are provided, I would
> say this was a well defined program.
> 
> Unfortunately, this will compile, but will not link with the current 
> library. The linker reports missing symbols for all of the virtual
> functions in the ctype<> facet. It was reported over a year ago that
> the library is missing definitions for these functions. The argument
> then was that the Standard does not specify their behavior so they are
> not required to be defined. I disagree. Unlike the char_traits<>
> template, the Standard provides a complete definition for the ctype<>
> template. In that definition are a number of virtual functions which
> are not marked pure virtual. I argue that the language – not the
> library – requires that definitions for those functions have to be
> provided somewhere. I agree that the default behavior of these
> functions is not well defined – so they can do whatever we want -- but
> they have to be there or programs like this will not build and I think

I think you might want to re-examine this issue, and think about the
ramifications of this decision as the library evolves over time, setting
convenience aside.

You say that ctype generic functions are defined in the standard. Where?
22.2.1.1.2? This is not enough to implement a generic ctype, at least I
don't see how. Then you say that the default behavior is not well
defined. Well? Which one is it?

I think the latter. 

As I see it, ctype<char> and ctype<wchar_t> are required. The rest are
optional. 

I think the link errors, and current ctype (and other locale facets)
behavior, at least allows end-users to do what they want to do without
derivation or otherwise hacking around generic definitions that don't
work for custom character types. If there is no clear behavior specified
in the standard, I think the best thing to do is for the library to get
out of the way.

Reasonable people may differ, of course. There have been some long
threads about this in the past, as you've noticed. I have yet to be
convinced by your approach, but am willing to listen to arguments.

(Something that I had thought might be a good idea, which I consider
orthogonal, is to actually come up with optional "generic" code for all
the facets that are needed for custom char_types to do io. These could
be in ext. This is work that I plan on doing when I get time to actually
document the locale implementation.)
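
The facet-override approach from point 4 can be sketched with a character
type whose ctype facet is required, so the example does not depend on any
generic ctype definitions. This is only an illustration under stated
assumptions: UpperWiden and widen_through_stream are hypothetical names, the
facet derives from ctype<wchar_t> rather than ctype<int>, and the override
assumes an ASCII-like character set in which 'a'..'z' are contiguous.

```cpp
#include <cassert>
#include <locale>
#include <sstream>

// Hypothetical facet: widens 'a'..'z' to upper-case wide characters and
// defers to the base class for every other character.
class UpperWiden : public std::ctype<wchar_t> {
protected:
    // Keep the range overload of do_widen visible in this class.
    using std::ctype<wchar_t>::do_widen;

    virtual wchar_t do_widen(char c) const {
        if (c >= 'a' && c <= 'z')
            return static_cast<wchar_t>(L'A' + (c - 'a'));
        return std::ctype<wchar_t>::do_widen(c);
    }
};

// basic_ios::widen() is not virtual, but it consults the ctype facet of the
// stream's locale, so installing the facet changes what it returns.
wchar_t widen_through_stream(char c) {
    std::wostringstream os;
    // The locale takes ownership of the facet (default refs == 0).
    os.imbue(std::locale(std::locale::classic(), new UpperWiden));
    return os.widen(c);
}

// Usage: lower-case letters are remapped, other characters pass through.
void check_widen_examples() {
    assert(widen_through_stream('a') == L'A');
    assert(widen_through_stream('7') == L'7');
}
```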

-benjamin

