[PATCH] PR libstdc++/80624 satisfy invariant for char_traits<char16_t>::eof()

Jonathan Wakely jwakely@redhat.com
Fri May 5 17:19:00 GMT 2017


As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda
hinted at by http://wg21.link/lwg1200) there's a problem with
char_traits<char16_t>::eof() because it returns int_type(-1), which is
the same value as u'\uFFFF', a valid UTF-16 code unit.
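
To make the collision concrete, here is a minimal sketch. The
naive_char16_traits struct and the eof_collides_with helper are
hypothetical stand-ins written for illustration, not the libstdc++
source, and the sketch assumes uint_least16_t is exactly 16 bits wide
(as it is on common targets).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the unpatched behaviour described above.
struct naive_char16_traits {
    using char_type = char16_t;
    using int_type  = std::uint_least16_t;

    // Plain widening conversion, with no special cases.
    static constexpr int_type to_int_type(char_type c) noexcept {
        return int_type(c);
    }
    // int_type(-1); this is 0xFFFF when int_type is 16 bits wide.
    static constexpr int_type eof() noexcept {
        return static_cast<int_type>(-1);
    }
    static constexpr bool eq_int_type(int_type a, int_type b) noexcept {
        return a == b;
    }
};

// True when eof() cannot be told apart from the code unit c.
constexpr bool eof_collides_with(char16_t c) noexcept {
    using T = naive_char16_traits;
    return T::eq_int_type(T::eof(), T::to_int_type(c));
}
```

With this model, eof_collides_with(u'\uFFFF') is true, while it is
false for every other code unit.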

That is, because all values of int_type are also valid values of
char_type, we cannot meet the requirement that:

"The member eof() shall return an implementation-defined constant
that cannot appear as a valid UTF-16 code unit."

I've reported this as a defect, suggesting that the wording above
needs to change.

One consequence is that basic_streambuf<char16_t>::sputc(u'\uFFFF')
always returns the same value, whether it succeeds or not. On success
it returns to_int_type(u'\uFFFF') and on failure it returns eof(),
which is the same value. I think that can be solved with the attached
change, which preserves the invariant in [char.traits.require] that
eof() returns:

"a value e such that X::eq_int_type(e,X::to_int_type(c)) is false for
all values c."

This can be true if we ensure that to_int_type never returns the eof()
value. http://www.unicode.org/faq/private_use.html#nonchar10 suggests
doing something like this.
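
A rough sketch of the idea follows. The attached patch is not
reproduced here, so patched_char16_traits and invariant_holds are
hypothetical names chosen for illustration, and the sketch again
assumes a 16-bit uint_least16_t. The only difference from a plain
widening conversion is that to_int_type maps the one colliding code
unit to U+FFFD.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the proposed behaviour; not the actual patch.
struct patched_char16_traits {
    using char_type = char16_t;
    using int_type  = std::uint_least16_t;

    static constexpr int_type eof() noexcept {
        return static_cast<int_type>(-1);
    }
    // Never yield eof(): the colliding code unit becomes U+FFFD.
    static constexpr int_type to_int_type(char_type c) noexcept {
        return int_type(c) == eof() ? int_type(0xFFFD) : int_type(c);
    }
    static constexpr bool eq_int_type(int_type a, int_type b) noexcept {
        return a == b;
    }
};

// Checks eq_int_type(eof(), to_int_type(c)) == false for every
// 16-bit code unit c.
inline bool invariant_holds() {
    using T = patched_char16_traits;
    for (std::uint32_t v = 0; v <= 0xFFFF; ++v) {
        const char16_t c = static_cast<char16_t>(v);
        if (T::eq_int_type(T::eof(), T::to_int_type(c)))
            return false;
    }
    return true;
}
```

The cost is that to_int_type is no longer injective: u'\uFFFF' and
u'\uFFFD' both map to the same int_type value.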

It means that when writing u'\uFFFF' to a streambuf we write that
character successfully, but the returned int_type value is that of
u'\uFFFD'; and when reading u'\uFFFF' from a streambuf we likewise
return u'\uFFFD' instead. This is asymmetrical, as we can write that
character but not read it back. It might be better to refuse to write
u'\uFFFF' and write the replacement character in its place, but I
think I prefer to write the right character when possible. That
approach also doesn't require any extra changes.
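
The asymmetry can be modelled with a toy buffer. This is only a
sketch: toy_buffer, put and get are hypothetical names, not
basic_streambuf itself. put stands in for a successful sputc and get
for sbumpc, both of which report their result through to_int_type.
As before, a 16-bit uint_least16_t is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using int_type = std::uint_least16_t;

constexpr int_type eof_value() noexcept {
    return static_cast<int_type>(-1);
}
// Same mapping as proposed: never return the eof value.
constexpr int_type to_int_type(char16_t c) noexcept {
    return int_type(c) == eof_value() ? int_type(0xFFFD) : int_type(c);
}

// Hypothetical stand-in for a char16_t stream buffer.
struct toy_buffer {
    std::u16string data;
    std::size_t pos = 0;

    // Models a successful sputc: stores c unchanged, reports to_int_type(c).
    int_type put(char16_t c) {
        data.push_back(c);
        return to_int_type(c);
    }
    // Models sbumpc: reports to_int_type of the next unit, or eof at the end.
    int_type get() {
        return pos < data.size() ? to_int_type(data[pos++]) : eof_value();
    }
};
```

After b.put(u'\uFFFF') the stored unit is still 0xFFFF, but both the
put result and the following b.get() are 0xFFFD, so a caller sees the
replacement character while a real end of data remains
distinguishable.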

All tests pass with this. Does anybody see any problems with this
approach?


-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.txt
Type: text/x-patch
Size: 7405 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170505/7945020e/attachment.bin>

