[committed] libstdc++: Replace operator>>(istream&, char*) [LWG 2499]

Martin Sebor msebor@redhat.com
Wed Aug 5 22:31:08 GMT 2020

On 8/5/20 3:25 PM, Jonathan Wakely wrote:
> P0487R1 resolved LWG 2499 for C++20 by removing the operator>> overloads
> that have high risk of buffer overflows. They were replaced by
> equivalents that only accept a reference to an array, and so can
> guarantee not to write past the end of the array.
> In order to support both the old and new functionality, this patch
> introduces a new overloaded __istream_extract function which takes a
> maximum length. The new operator>> overloads use the array size as the
> maximum length. The old overloads now use __builtin_object_size to
> determine the available buffer size if available (which requires -O2) or
> use numeric_limits<streamsize>::max()/sizeof(char_type) otherwise. This
> is a change in behaviour, as the old overloads previously always used
> numeric_limits<streamsize>::max(), without considering sizeof(char_type)
> and without attempting to prevent overflows.
> Because they now do little more than call __istream_extract, the old
> operator>> overloads are very small inline functions. This means there
> is no advantage to explicitly instantiating them in the library (in fact
> that would prevent the __builtin_object_size checks from ever working).
> As a result, the explicit instantiation declarations can be removed from
> the header. The explicit instantiation definitions are still needed, for
> backwards compatibility with existing code that expects to link to the
> definitions in the library.
> While working on this change I noticed that src/c++11/istream-inst.cc
> has the following explicit instantiation definition:
>    template istream& operator>>(istream&, char*);
> This had no effect (and so should not have been present in that file),
> because there was an explicit specialization declared in <istream> and
> defined in src/++98/istream.cc. However, this change removes the
> explicit specialization, and now the explicit instantiation definition
> is necessary to ensure the symbol gets defined in the library.
> libstdc++-v3/ChangeLog:
> 	* config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols.
> 	* include/bits/istream.tcc (__istream_extract): New function
> 	template implementing both of operator>>(istream&, char*) and
> 	operator>>(istream&, char(&)[N]). Add explicit instantiation
> 	declaration for it. Remove explicit instantiation declarations
> 	for old function templates.
> 	* include/std/istream (__istream_extract): Declare.
> 	(operator>>(basic_istream<C,T>&, C*)): Define inline and simply
> 	call __istream_extract.
> 	(operator>>(basic_istream<char,T>&, signed char*)): Likewise.
> 	(operator>>(basic_istream<char,T>&, unsigned char*)): Likewise.
> 	(operator>>(basic_istream<C,T>&, C(7)[N])): Define for LWG 2499.
> 	(operator>>(basic_istream<char,T>&, signed char(&)[N])):
> 	Likewise.
> 	(operator>>(basic_istream<char,T>&, unsigned char(&)[N])):
> 	Likewise.
> 	* include/std/streambuf (basic_streambuf): Declare char overload
> 	of __istream_extract as a friend.
> 	* src/c++11/istream-inst.cc: Add explicit instantiation
> 	definition for wchar_t overload of __istream_extract. Remove
> 	explicit instantiation definitions of old operator>> overloads
> 	for versioned-namespace build.
> 	* src/c++98/istream.cc (operator>>(istream&, char*)): Replace
> 	with __istream_extract(istream&, char*, streamsize).
> 	* testsuite/27_io/basic_istream/extractors_character/char/3.cc:
> 	Do not use variable-length array.
> 	* testsuite/27_io/basic_istream/extractors_character/char/4.cc:
> 	Do not run test for C++20.
> 	* testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc:
> 	Do not test writing to pointers for C++20.
> 	* testsuite/27_io/basic_istream/extractors_character/char/9826.cc:
> 	Use array instead of pointer.
> 	* testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc:
> 	Do not use variable-length array.
> 	* testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc:
> 	Do not run test for C++20.
> 	* testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc:
> 	Do not test writing to pointers for C++20.
> 	* testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc:
> 	New test.
> 	* testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
> 	New test.
> 	* testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
> 	New test.
> 	* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc:
> 	New test.
> 	* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc:
> 	New test.
> Tested powerpc64le-linux. Committed to trunk.
> Martin, Jakub, could you please double-check the usage of
> __builtin_object_size? (line 805 in libstdc++-v3/include/std/istream)
> Do you see any problems with using it here? If it can't tell the size
> then we just assume it's larger than the string to be extracted, which
> is what the old code did anyway. I do have an idea for stronger
> checking in debug mode, which I'll post in a minute.

I've always found the second argument to __builtin_object_size
confusing for types above 1.  I don't see anything wrong in
the diff but I believe the most useful results are with type 1
for string functions and type 0 for raw memory functions like
memcpy (that's what _FORTIFY_SOURCE uses for the two sets of
functions).  In type 2 when the result is zero it means one of
two things: either the size of the array couldn't be determined
or it really is zero.  That's less than helpful in cases like:

   char a[8];
   strcpy (a + 8, s);

where it prevents detecting the buffer overflow.

I would suggest to use type 1 in the iostream functions instead.

Adding attribute access to the __istream_extract overloads should
also let GCC check for out-of-bounds accesses by the function and
issue warnings.  This goes beyond what __builtin_object_size makes
possible (it considers also variable sizes in known ranges).


More information about the Gcc-patches mailing list