The following program hangs: #include <iostream> #include <sstream> int main() { std::stringstream str; str.put('a'); // 1 str.put('\x80'); // 2 str.put('a'); // 3 str.ignore(32, '\x80'); // hangs std::cout << str.tellg() << "\n"; } Removing any of the numbered lines (1,2,3) makes the problem go away. Using std::basic_stringstream<unsigned char> does not exhibit the issue. Passing unsigned delim (that does not get sign extended) also fixes it. ASAN or UBSAN is silent. `-D_GLIBCXX_DEBUG` does not make a difference. GCC 5.4, 6.3, 8.2, 9.2 are affected.
Looks like I missed this one when it was filed. The fix is: --- a/libstdc++-v3/src/c++98/istream.cc +++ b/libstdc++-v3/src/c++98/istream.cc @@ -126,6 +126,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION const int_type __eof = traits_type::eof(); __streambuf_type* __sb = this->rdbuf(); int_type __c = __sb->sgetc(); + __delim = traits_type::to_int_type(__cdelim); bool __large_ignore = false; while (true)
On second thoughts, I don't think that fix is right. istream::ignore takes an int_type for the delimiter, so passing it a char_type with a negative value will confuse it. For example, str.ignore(n, '\xff`) will treat the delimiter as EOF and ignore up to n chars, ignoring any '\xff` characters that might be in the stream buffer's get area. That means it's wrong to call ignore(n, '\xff') on a platform where char is signed, because it won't do what you expect (unless you're really intending to treat \xff as EOF). This case is similar, it ignores up to n characters, or until sgetc() returns a character equal to '\x80' ... but that can never happen because sgetc() never returns a negative value unless it reaches EOF. So there is a gcc bug here, because we should not loop forever. But the problem is that we use inconsistent conditions for deciding whether we've found the delimiter.
So maybe: --- a/libstdc++-v3/src/c++98/istream.cc +++ b/libstdc++-v3/src/c++98/istream.cc @@ -112,7 +112,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION basic_istream<char>:: ignore(streamsize __n, int_type __delim) { - if (traits_type::eq_int_type(__delim, traits_type::eof())) + // If __delim is eof() we ignore up to __n chars, and for any other + // negative value using eq_int_type(sgetc(), __delim) will never be true, + // so just treat all negative __delim values as eof(). + if (__delim < 0) return ignore(__n); _M_gcount = 0;
The incorrect behaviour was introduced with r0-59139-g80dddedcaf316e
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>: https://gcc.gnu.org/g:2d694414ada8e3b58f504c1b175d31088529632e commit r14-9978-g2d694414ada8e3b58f504c1b175d31088529632e Author: Jonathan Wakely <jwakely@redhat.com> Date: Thu Apr 4 10:33:33 2024 +0100 libstdc++: Fix infinite loop in std::istream::ignore(n, delim) [PR93672] A negative delim value passed to std::istream::ignore can never match any character in the stream, because the comparison is done using traits_type::eq_int_type(sb->sgetc(), delim) and sgetc() never returns negative values (except at EOF). The optimized version of ignore for the std::istream specialization uses traits_type::find to locate the delim character in the streambuf, which _can_ match a negative delim on platforms where char is signed, but then we do another comparison using eq_int_type which fails. The code then keeps looping forever, with traits_type::find locating the character and traits_type::eq_int_type saying it's not a match, so traits_type::find is used again and finds the same character again. A possible fix would be to check with eq_int_type after a successful find, to see whether we really have a match. However, that would be suboptimal since we know that a negative delimiter will never match using eq_int_type. So a better fix is to adjust the check at the top of the function that handles delim==eof(), so that we treat all negative delim values as equivalent to EOF. That way we don't bother using find to search for something that will never match with eq_int_type. The version of ignore in the primary template doesn't need a change, because it doesn't use traits_type::find, instead characters are extracted one-by-one and always matched using eq_int_type. That avoids the inconsistency between find and eq_int_type. The specialization for std::wistream does use traits_type::find, but traits_type::to_int_type is equivalent to an implicit conversion from wchar_t to wint_t, so passing a wchar_t directly to ignore without using to_int_type works. libstdc++-v3/ChangeLog: PR libstdc++/93672 * src/c++98/istream.cc (istream::ignore(streamsize, int_type)): Treat all negative delimiter values as eof(). * testsuite/27_io/basic_istream/ignore/char/93672.cc: New test. * testsuite/27_io/basic_istream/ignore/wchar_t/93672.cc: New test.
Fixed on trunk so far.
The releases/gcc-13 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>: https://gcc.gnu.org/g:fcf60d0baafa1245f768ac375dc60a07e92e9673 commit r13-8675-gfcf60d0baafa1245f768ac375dc60a07e92e9673 Author: Jonathan Wakely <jwakely@redhat.com> Date: Thu Apr 4 10:33:33 2024 +0100 libstdc++: Fix infinite loop in std::istream::ignore(n, delim) [PR93672] A negative delim value passed to std::istream::ignore can never match any character in the stream, because the comparison is done using traits_type::eq_int_type(sb->sgetc(), delim) and sgetc() never returns negative values (except at EOF). The optimized version of ignore for the std::istream specialization uses traits_type::find to locate the delim character in the streambuf, which _can_ match a negative delim on platforms where char is signed, but then we do another comparison using eq_int_type which fails. The code then keeps looping forever, with traits_type::find locating the character and traits_type::eq_int_type saying it's not a match, so traits_type::find is used again and finds the same character again. A possible fix would be to check with eq_int_type after a successful find, to see whether we really have a match. However, that would be suboptimal since we know that a negative delimiter will never match using eq_int_type. So a better fix is to adjust the check at the top of the function that handles delim==eof(), so that we treat all negative delim values as equivalent to EOF. That way we don't bother using find to search for something that will never match with eq_int_type. The version of ignore in the primary template doesn't need a change, because it doesn't use traits_type::find, instead characters are extracted one-by-one and always matched using eq_int_type. That avoids the inconsistency between find and eq_int_type. The specialization for std::wistream does use traits_type::find, but traits_type::to_int_type is equivalent to an implicit conversion from wchar_t to wint_t, so passing a wchar_t directly to ignore without using to_int_type works. libstdc++-v3/ChangeLog: PR libstdc++/93672 * src/c++98/istream.cc (istream::ignore(streamsize, int_type)): Treat all negative delimiter values as eof(). * testsuite/27_io/basic_istream/ignore/char/93672.cc: New test. * testsuite/27_io/basic_istream/ignore/wchar_t/93672.cc: New test. (cherry picked from commit 2d694414ada8e3b58f504c1b175d31088529632e)
Fixed for 13.3 and 14.1 so far.