This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits<char16_t>::eof()


On 08/05/17 17:41 +0200, Stephan Bergmann wrote:
On 05/08/2017 04:59 PM, Jonathan Wakely wrote:
On 08/05/17 13:37 +0200, Stephan Bergmann via libstdc++ wrote:
To me, it feels like this change is not an improvement over keeping living with the current defect.

So rather than transforming U+FFFF into U+FFFD you'd prefer to write
nothing instead?

#include <sstream>
#include <cassert>

int main()
{
 std::basic_ostringstream<char16_t> s;
 s.put(u'\uFFFF');
 assert( s.str().length() == 1 );
}

a.out: ex.cc:8: int main(): Assertion `s.str().length() == 1' failed.
Aborted (core dumped)

With my patch it inserts U+FFFD.

Actually that's not true, with my patch it inserts U+FFFF exactly as
the code requested, but basic_streambuf::sputc returns U+FFFD. Since
basic_ostream::put doesn't really care about the precise value
returned by sputc (only if it equals eof) then everything Just Works.

(But if you tried to read that U+FFFF back you'd get U+FFFD.)

Oh, from what I'd read in this thread I'd naively assumed that today reading and writing U+FFFF actually succeeds, just some functions return values so that client code can't distinguish success from failure.

std::basic_ostream and std::basic_streambuf are both clients of
basic_streambuf's member functions, so they're affected by that
problem distinguishing success from failure. We can't just add special
hacks using some side channel between ostream and streambuf, because
it wouldn't work for non-standard streambufs that override the virtual
members.

Here's another example of how it can go wrong:

#include <sstream>

int main()
{
 std::basic_ostringstream<char16_t> s(u"foo");
 s.exceptions(std::ios_base::badbit);
 s.put(u'\uFFFF');
}

terminate called after throwing an instance of 'std::ios_base::failure'
 what():  basic_ios::clear
Aborted (core dumped)

I always say that using exceptions with iostreams is bonkers, but the
point is that inserting a valid UTF-16 character sets the badbit on
the stream, even though in this case it has actually been correctly
inserted!

In the first example the put(u'\uFFFF') call triggered a call to
overflow(c) which doesn't insert anything when c == eof(). In the
second example there's no overflow, so the character is inserted, but
the return value equals eof so is treated as failure and sets badbit.

With my patch both examples behave as you'd expect.

My patch makes more significant changes for input though:

#include <sstream>
#include <cassert>

int main()
{
 const char16_t ffff = u'\uFFFF';
 std::basic_stringstream<char16_t> s(u"\uFFFF\u006F\u006F");
 s.exceptions(std::ios_base::eofbit);
 assert( s.str()[0] == ffff );
 auto c = s.get();
 assert( c == ffff );
}

With the current code the get() call returns U+FFFF which equals eof,
so eofbit gets set (even though there are more characters in the
streambuf's get area):

terminate called after throwing an instance of 'std::ios_base::failure'
 what():  basic_ios::clear
Aborted (core dumped)

With my patch there's no exception, but the second assertion fails,
because when U+FFFF is read it gets transformed by to_int_type to
U+FFFD:

a.out: u16.cc:11: int main(): Assertion `c == ffff' failed.
Aborted (core dumped)

So with the current code the behaviour is just broken. Inserting or
extracting U+FFFF can set badbit or eofbit spuriously, and may or may
not actually insert anything depending on the state of the streambuf
(whether there is space in the put area).

With my patch you get a slight change on insertion (which will go
unnoticed by most code) and a larger change on extraction, but you
always insert or extract the right number of characters, and don't set
any iostate flags spuriously.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]