Bug 94749 - std::istream::ignore discards too many characters
Summary: std::istream::ignore discards too many characters
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++
Version: 9.3.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-24 17:12 UTC by serpent7776
Modified: 2021-04-19 10:40 UTC

See Also:
Host:
Target:
Build:
Known to work: 11.0
Known to fail:
Last reconfirmed: 2020-04-25 00:00:00


Attachments
full source example (134 bytes, text/plain)
2020-04-24 17:12 UTC, serpent7776

Description serpent7776 2020-04-24 17:12:38 UTC
Created attachment 48369: full source example

The following code results in '=' being assigned to c when using libstdc++, but '+' is assigned when using libc++. I believe libc++ is correct here:

	std::stringstream s(" +=");
	char c;
	s.ignore(1, '+');
	s >> c;

Tested this on godbolt with the newest gcc and locally on FreeBSD with gcc7 (command: g++7 -Wall -Wextra -pedantic test.cpp).

https://godbolt.org/z/yS5je8
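
For reference, the snippet above can be wrapped into a small self-contained helper. This is only a sketch and is not the attached file; the function name first_char_after_ignore is made up for illustration. It discards input with ignore(n, delim) and returns the next character read with operator>>.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not the attachment from this report): discard input
// with ignore(n, delim), then return the next non-whitespace character
// extracted with operator>>.
char first_char_after_ignore(const std::string& input, std::streamsize n, char delim)
{
    std::istringstream s(input);
    s.ignore(n, delim);   // should stop after n characters, or after delim
    char c = '\0';
    s >> c;               // reads the first character that was not discarded
    return c;
}
```

Calling first_char_after_ignore(" +=", 1, '+') should return '+' on a conforming library, since only the leading space is discarded. An affected libstdc++ returns '=' instead.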
Comment 1 Jonathan Wakely 2020-04-27 09:53:53 UTC
It looks like the bug is that when we have finished ignoring we assume that if the next char is the delimiter, then that must be why we stopped, and so we ignore that as well. But if we stopped because we reached the specified number of characters, then whether the next char is the delimiter is irrelevant.

Shouldn't be hard to fix.
Comment 2 serpent7776 2020-05-29 20:14:05 UTC
any update?
Comment 3 Jonathan Wakely 2020-05-29 20:29:49 UTC
I have a patch but was waiting until after the GCC 10 release.

I'll look into it next week.
Comment 4 GCC Commits 2020-06-11 17:41:49 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:b32eea9c0c25a03e77170675abc4e4bcab6d2b3b

commit r11-1238-gb32eea9c0c25a03e77170675abc4e4bcab6d2b3b
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Jun 11 18:41:37 2020 +0100

    libstdc++: Fix istream::ignore discarding too many chars (PR 94749)
    
    The current code assumes that if the next character in the stream is
    equal to the delimiter then we stopped because we saw that delimiter,
    and so discards it.  But in the testcase for the PR we stop because we
    reached the maximum number of characters, and it's coincidence that the
    next character equals the delimiter. We should not discard the next
    character in that case.
    
    The fix is to check that we haven't discarded __n characters already,
    instead of checking whether the next character equals __delim. Because
    we've already checked for EOF, if we haven't discarded __n yet then we
    know we stopped because we saw the delimiter. On the other hand, if the
    next character is the delimiter we don't know if that's why we stopped.
    
            PR libstdc++/94749
            * include/bits/istream.tcc (basic_istream::ignore(streamsize, CharT)):
            Only discard an extra character if we didn't already reach the
            maximum number.
            * src/c++98/istream.cc (istream::ignore(streamsize, char))
            (wistream::ignore(streamsize, wchar_t)): Likewise.
            * testsuite/27_io/basic_istream/ignore/char/94749.cc: New test.
            * testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc: New test.
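
The difference between the two exit checks described in the commit message can be sketched with a simplified model. This is not the libstdc++ source: it works on a std::string instead of a stream buffer, ignores the numeric_limits<streamsize>::max() special case, and the names ignore_buggy and ignore_fixed are invented for illustration. Each function returns how many characters ignore(n, delim) would consume.

```cpp
#include <cstddef>
#include <string>

// Simplified model of the old check (assumption: not the real implementation).
std::size_t ignore_buggy(const std::string& in, std::size_t n, char delim)
{
    std::size_t pos = 0;
    // Skip up to n characters, stopping in front of the delimiter or at EOF.
    while (pos < n && pos < in.size() && in[pos] != delim)
        ++pos;
    // Old check: if the next character equals the delimiter, assume that is
    // why the loop stopped and discard it, even if the limit n was reached.
    if (pos < in.size() && in[pos] == delim)
        ++pos;
    return pos;
}

// Simplified model of the check after the first fix.
std::size_t ignore_fixed(const std::string& in, std::size_t n, char delim)
{
    std::size_t pos = 0;
    while (pos < n && pos < in.size() && in[pos] != delim)
        ++pos;
    // New check: EOF is ruled out and the limit was not reached, so the loop
    // can only have stopped at the delimiter. Discard it.
    if (pos < n && pos < in.size())
        ++pos;
    return pos;
}
```

For the input " +=" with n = 1 and delimiter '+', the old check consumes two characters (the space and the '+'), while the new check consumes only the space. When the delimiter is found before the limit, as for "a+=" with n = 5, both versions consume the same two characters.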
Comment 5 Jonathan Wakely 2020-06-11 17:43:35 UTC
Fixed in master. I'll keep the bug open as I will probably backport the fix.
Comment 6 serpent7776 2020-06-11 20:03:41 UTC
thanks
Comment 7 Jonathan Wakely 2020-07-09 21:34:07 UTC
The fix is actually not right: it fails to discard the delimiter if it occurs after ignoring numeric_limits<streamsize>::max() or more characters.

I have a fix for that though.
Comment 8 GCC Commits 2020-07-13 11:26:35 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:ba8fe4b4832e30277f2e4a73b5d35b2e55074d07

commit r11-2054-gba8fe4b4832e30277f2e4a73b5d35b2e55074d07
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Mon Jul 13 10:26:39 2020 +0100

    libstdc++: Fix istream::ignore exit conditions (PR 94749, PR 96161)
    
    My previous fix for PR 94749 did fix the reported case, so that the next
    character is not discarded if it happens to equal the delimiter when __n
    characters have already been read. But it introduced a new bug, which is
    that the delimiter character would *not* be discarded if the number of
    characters discarded is numeric_limits<streamsize>::max() or more before
    reaching the delimiter.
    
    The new bug happens because I changed the code to check _M_gcount < __n.
    But when __n == numeric_limits<streamsize>::max() that is false, and so
    we don't discard the delimiter. It's not sufficient to check for the
    delimiter when the __large_ignore condition is true, because there's an
    edge case where the delimiter is reached when _M_gcount == __n and so
    we break out of the loop without setting __large_ignore.
    
    PR 96161 is a similar bug to the original PR 94749 report, where eofbit
    is set after discarding __n characters if there happen to be no more
    characters in the stream.
    
    This patch fixes both cases (and the regression) by checking different
    conditions for the __n == max case and the __n < max case. For the
    former case, we know that we must have either reached the delimiter or
    EOF, and the value of _M_gcount doesn't matter (except to avoid integer
    overflow). For the latter case we need to check _M_gcount first and only
    set eofbit or discard the delimiter if it didn't reach __n. For the
    latter case overflow can't happen because _M_gcount <= __n < max.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/94749
            PR libstdc++/96161
            * include/bits/istream.tcc (basic_istream::ignore(streamsize))
            [n == max]: Check overflow conditions on _M_gcount. Rely on
            the fact that either EOF or the delimiter was reached.
            [n < max]: Check _M_gcount < n before checking for EOF or
            delimiter.
            (basic_istream::ignore(streamsize, char_type)): Likewise.
            * src/c++98/compatibility.cc (istream::ignore(streamsize))
            (wistream::ignore(streamsize)): Likewise.
            * src/c++98/istream.cc (istream::ignore(streamsize, char_type))
            (wistream::ignore(streamsize, char_type)): Likewise.
            * testsuite/27_io/basic_istream/ignore/char/94749.cc: Check that
            delimiter is discarded if the number of characters ignored
            doesn't fit in streamsize.
            * testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc:
            Likewise.
            * testsuite/27_io/basic_istream/ignore/char/96161.cc: New test.
            * testsuite/27_io/basic_istream/ignore/wchar_t/96161.cc: New test.
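
The two user-visible behaviours covered by this commit can be checked with a short sketch. These helpers are not the testsuite files listed above; their names are made up for illustration, and the expected results assume a library that contains this fix.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: reports whether eofbit is set after ignore(n, delim).
// For PR 96161, discarding exactly n characters should not set eofbit, even
// if nothing is left in the stream afterwards.
bool eof_after_ignore(const std::string& input, std::streamsize n, char delim)
{
    std::istringstream s(input);
    s.ignore(n, delim);
    return s.eof();
}

// Hypothetical helper: with n == numeric_limits<streamsize>::max() the call
// can only stop at the delimiter or at EOF, so the delimiter is discarded
// and the next get() returns the character after it.
// Assumes input contains at least one character after the delimiter.
char next_after_unbounded_ignore(const std::string& input, char delim)
{
    std::istringstream s(input);
    s.ignore(std::numeric_limits<std::streamsize>::max(), delim);
    return static_cast<char>(s.get());
}
```

With the fix, eof_after_ignore("abc", 3, '+') should be false because the limit was reached first, while eof_after_ignore("abc", 4, '+') should be true because the stream ran out before the limit. next_after_unbounded_ignore("abc+def", '+') should return 'd'.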