This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: [Patch] Fix the narrow/widen problem in money_get::do_get


Nathan Myers wrote:
...

I don't know the details of this implementation. Does it assume that only one wide-character value maps to each digit? I.e., suppose I supply a string 1X3Y, where X and Y narrow to 2 and 4, respectively. Does your optimization work sensibly? Does it extract 1, or 1234, or do something strange? The standard as published specifies, deliberately, that it extract 1.
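The two possible outcomes for the 1X3Y input can be made concrete with a small sketch. Everything below is illustrative rather than taken from any real implementation: dual_digit_ctype is a hypothetical facet in which L'X' and L'Y' narrow to '2' and '4', and the two helper functions stand in for a widen-based and a narrow-based digit scanner.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical facet (an assumption for this example): in addition to the
// usual mappings, L'X' narrows to '2' and L'Y' narrows to '4'.
struct dual_digit_ctype : std::ctype<wchar_t> {
protected:
    using std::ctype<wchar_t>::do_narrow;
    char do_narrow(wchar_t c, char dfault) const override {
        if (c == L'X') return '2';
        if (c == L'Y') return '4';
        return std::ctype<wchar_t>::do_narrow(c, dfault);
    }
};

// Widen-based scan: widen the ten digits once, then accept only input
// characters that equal one of the widened digits.
std::string extract_by_widen(const std::ctype<wchar_t>& ct,
                             const std::wstring& in) {
    const char digits[] = "0123456789";
    wchar_t wide[10];
    ct.widen(digits, digits + 10, wide);
    std::string out;
    for (wchar_t c : in) {
        const wchar_t* p = std::find(wide, wide + 10, c);
        if (p == wide + 10) break;   // first non-digit ends the scan
        out += digits[p - wide];
    }
    return out;
}

// Narrow-based scan: narrow every input character and accept it if the
// result is a digit.
std::string extract_by_narrow(const std::ctype<wchar_t>& ct,
                              const std::wstring& in) {
    std::string out;
    for (wchar_t c : in) {
        const char n = ct.narrow(c, '\0');
        if (n < '0' || n > '9') break;  // first non-digit ends the scan
        out += n;
    }
    return out;
}
```

With a dual_digit_ctype object ct, extract_by_widen(ct, L"1X3Y") yields "1" and extract_by_narrow(ct, L"1X3Y") yields "1234", which are the two behaviors the question distinguishes.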

I think that recent versions of our implementation may behave as you expect, although not necessarily thanks to anything in the xxx_get facets.

But I'm not sure whether the required behavior is sensible or not.
I have only a very vague recollection of a discussion surrounding
the decision to call widen() (in the unrelated context of DR 221:
http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#221)
and the decision to widen() vs narrow() seemed to me rather
arbitrary. Why don't you think it makes sense to extract 1234 in
your example above? Do you know how most implementations of scanf()
behave? According to my reading of C99, scanf() is permitted but
not required to accept sequences of non-digit characters in non-C
locales. More importantly, do you know of any locales where there
are two or more characters that ctype narrows to the same digit?

Regardless of what the answer is, however, given the loosely
specified semantics of ctype<charT>::narrow() in cases where the
value of the first argument exceeds UCHAR_MAX, I claim it's
conforming for narrow() not to convert such "dual" characters
into digits (our ctype<wchar_t> won't because it converts such
characters to a multibyte sequence, as if by wctomb()). If I'm
right, the requirement to call widen() (or even narrow(), for
that matter) might actually be entirely unnecessary.
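One way to see what a given library does with such characters is to probe narrow() directly. This is only a sketch: narrow_or and narrows_to_digit are hypothetical helper names, and the classic locale is used purely for convenience.

```cpp
#include <locale>

// Hypothetical helper: ask the classic locale's ctype<wchar_t> to narrow c,
// returning fallback if the facet reports no single-char equivalent.
char narrow_or(wchar_t c, char fallback) {
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(std::locale::classic());
    return ct.narrow(c, fallback);
}

// Hypothetical helper: does c narrow to one of '0'..'9'?
bool narrows_to_digit(wchar_t c) {
    const char n = narrow_or(c, '\0');
    return n >= '0' && n <= '9';
}
```

narrows_to_digit(L'7') is true on any implementation. Whether it is also true for a wide digit outside the basic character set is the implementation latitude discussed above, so no particular result should be assumed there.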


...

Refusal by implementers to conform is not prima facie an argument that the requirement is bogus. In all scrupulousness you need to show that the requirement is also harmful.

If implementations simply chose to disregard a reasonable requirement in the standard and provided an alternative reasonable behavior, the document is irrelevant and existing practice is what matters. Since changing the implementations would break programs that rely on that behavior, I think the burden of proof should be on those advocating a change to existing implementations to demonstrate that their behavior is harmful. If neither is, existing practice wins :)

...

The much simpler approach, which I thought was the consensus,
was that many of the char_traits members -- particularly move,
copy, and comparison -- were leftovers from a failed experiment.
I.e. early drafts didn't require charT to be a POD. Since in the
published standard it must be a POD, those members serve no real
purpose. (IIRC the need for them to be PODs was realized at the
Stockholm meeting.) Therefore, implementations should be encouraged
to use ordinary assignment and operator==, and let the unused traits
members be deprecated. I don't see any value in combing through the
sources finding places where the traits could be used; instead, I'd
like to see the standard changed to acknowledge that the simpler code
necessarily has the same effect.

Again, by following this (IMHO perfectly reasonable) suggestion before it has been codified in the standard, an implementation will risk breaking programs (even if they are just tests) that rely on the existing requirement that the Traits be used (string is littered with such uses).
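As a sketch of the kind of program that could be affected, consider a user-supplied traits class whose eq() differs from operator== on char. The names ci_traits, ci_string, equal_via_traits and equal_via_operators are hypothetical and chosen only for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical case-insensitive traits: eq/lt/compare fold case first.
struct ci_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
};

typedef std::basic_string<char, ci_traits> ci_string;

// Comparison as currently required: goes through Traits::compare.
bool equal_via_traits(const ci_string& a, const ci_string& b) {
    return a.compare(b) == 0;
}

// Comparison as it would behave if the library used plain operator==
// on the characters and ignored the traits.
bool equal_via_operators(const ci_string& a, const ci_string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (!(a[i] == b[i])) return false;
    return true;
}
```

For ci_string("Hello") and ci_string("HELLO"), the traits-based comparison reports equality while the operator-based one does not, so a program written against the current requirement would observe a change.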

Martin

