[committed] libstdc++: Replace operator>>(istream&, char*) [LWG 2499]

Jonathan Wakely jwakely@redhat.com
Thu Aug 6 13:14:48 GMT 2020


On 05/08/20 16:31 -0600, Martin Sebor via Libstdc++ wrote:
>On 8/5/20 3:25 PM, Jonathan Wakely wrote:
>>P0487R1 resolved LWG 2499 for C++20 by removing the operator>> overloads
>>that have high risk of buffer overflows. They were replaced by
>>equivalents that only accept a reference to an array, and so can
>>guarantee not to write past the end of the array.
>>
>>In order to support both the old and new functionality, this patch
>>introduces a new overloaded __istream_extract function which takes a
>>maximum length. The new operator>> overloads use the array size as the
>>maximum length. The old overloads now use __builtin_object_size to
>>determine the available buffer size if available (which requires -O2) or
>>use numeric_limits<streamsize>::max()/sizeof(char_type) otherwise. This
>>is a change in behaviour, as the old overloads previously always used
>>numeric_limits<streamsize>::max(), without considering sizeof(char_type)
>>and without attempting to prevent overflows.
>>
>>Because they now do little more than call __istream_extract, the old
>>operator>> overloads are very small inline functions. This means there
>>is no advantage to explicitly instantiating them in the library (in fact
>>that would prevent the __builtin_object_size checks from ever working).
>>As a result, the explicit instantiation declarations can be removed from
>>the header. The explicit instantiation definitions are still needed, for
>>backwards compatibility with existing code that expects to link to the
>>definitions in the library.
>>
>>While working on this change I noticed that src/c++11/istream-inst.cc
>>has the following explicit instantiation definition:
>>   template istream& operator>>(istream&, char*);
>>This had no effect (and so should not have been present in that file),
>>because there was an explicit specialization declared in <istream> and
>>defined in src/++98/istream.cc. However, this change removes the
>>explicit specialization, and now the explicit instantiation definition
>>is necessary to ensure the symbol gets defined in the library.
>>
>>libstdc++-v3/ChangeLog:
>>
>>	* config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols.
>>	* include/bits/istream.tcc (__istream_extract): New function
>>	template implementing both of operator>>(istream&, char*) and
>>	operator>>(istream&, char(&)[N]). Add explicit instantiation
>>	declaration for it. Remove explicit instantiation declarations
>>	for old function templates.
>>	* include/std/istream (__istream_extract): Declare.
>>	(operator>>(basic_istream<C,T>&, C*)): Define inline and simply
>>	call __istream_extract.
>>	(operator>>(basic_istream<char,T>&, signed char*)): Likewise.
>>	(operator>>(basic_istream<char,T>&, unsigned char*)): Likewise.
>>	(operator>>(basic_istream<C,T>&, C(7)[N])): Define for LWG 2499.
>>	(operator>>(basic_istream<char,T>&, signed char(&)[N])):
>>	Likewise.
>>	(operator>>(basic_istream<char,T>&, unsigned char(&)[N])):
>>	Likewise.
>>	* include/std/streambuf (basic_streambuf): Declare char overload
>>	of __istream_extract as a friend.
>>	* src/c++11/istream-inst.cc: Add explicit instantiation
>>	definition for wchar_t overload of __istream_extract. Remove
>>	explicit instantiation definitions of old operator>> overloads
>>	for versioned-namespace build.
>>	* src/c++98/istream.cc (operator>>(istream&, char*)): Replace
>>	with __istream_extract(istream&, char*, streamsize).
>>	* testsuite/27_io/basic_istream/extractors_character/char/3.cc:
>>	Do not use variable-length array.
>>	* testsuite/27_io/basic_istream/extractors_character/char/4.cc:
>>	Do not run test for C++20.
>>	* testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc:
>>	Do not test writing to pointers for C++20.
>>	* testsuite/27_io/basic_istream/extractors_character/char/9826.cc:
>>	Use array instead of pointer.
>>	* testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc:
>>	Do not use variable-length array.
>>	* testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc:
>>	Do not run test for C++20.
>>	* testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc:
>>	Do not test writing to pointers for C++20.
>>	* testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc:
>>	New test.
>>	* testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
>>	New test.
>>	* testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
>>	New test.
>>	* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc:
>>	New test.
>>	* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc:
>>	New test.
>>
>>Tested powerpc64le-linux. Committed to trunk.
>>
>>Martin, Jakub, could you please double-check the usage of
>>__builtin_object_size? (line 805 in libstdc++-v3/include/std/istream)
>>Do you see any problems with using it here? If it can't tell the size
>>then we just assume it's larger than the string to be extracted, which
>>is what the old code did anyway. I do have an idea for stronger
>>checking in debug mode, which I'll post in a minute.
>
>I've always found the second argument to __builtin_object_size
>confusing for types above 1.  I don't see anything wrong in
>the diff but I believe the most useful results are with type 1
>for string functions and type 0 for raw memory functions like
>memcpy (that's what _FORTIFY_SOURCE uses for the two sets of
>functions).  In type 2 when the result is zero it means one of
>two things: either the size of the array couldn't be determined
>or it really is zero.  That's less than helpful in cases like:
>
>  char a[8];
>  strcpy (a + 8, s);
>
>where it prevents detecting the buffer overflow.
>
>I would suggest to use type 1 in the iostream functions instead.
>
>Adding attribute access to the __istream_extract overloads should
>also let GCC check for out-of-bounds accesses by the function and
>issue warnings.  This goes beyond what __builtin_object_size makes
>possible (it considers also variable sizes in known ranges).

I tried the attached patch with a testcase like:

#include <istream>

void
test01(std::istream& in)
{
   char pc[1];
   in >> (pc + 1); // { dg-warning "writing 1 byte into a region of size 0" }
}

I get a -Wstringop-overflow warning with -O0, but not at -O2. Is that
because the call to operator>> gets inlined, and __builtin_object_size
returns 0, and we call __istream_extract(in, s, 0) which says we won't
write anything, so it's now considered safe?

That's unfortunate, because actually __istream_extract will always
write at least one byte as a null terminator.

So I tried various ways toi try and get the warning back at -O2 and
nothing worked e.g. when __builtin_object_size(__s, 0) returns 0 I
tried calling __istream_extract(__in, __s, 1) which should say I'm
going to write a character, and so should warn if the size is known to
be zero. But no wanring.

So now I'm considering this:

   template<typename _CharT, typename _Traits>
     __attribute__((__nonnull__(2), __access__(__write_only__, 2)))
     inline basic_istream<_CharT, _Traits>&
     operator>>(basic_istream<_CharT, _Traits>& __in, _CharT* __s)
     {
       size_t __n = __builtin_object_size(__s, 0);
       if (__builtin_expect(__n < sizeof(_CharT), false))
	{
	  // not even space for null terminator
	  __glibcxx_assert(__n >= sizeof(_CharT));
	  __in.width(0);
	  __in.setstate(ios_base::failbit);
	}
       else
	{
	  if (__n == (size_t)-1)
	    __n = __gnu_cxx::__numeric_traits<streamsize>::__max;
	  std::__istream_extract(__in, __s, __n / sizeof(_CharT));
	}
       return __in;
     }

This will give a -Wstringop-overflow warning at -O0 and then overflow
the buffer, with undefined behaviour. And it will give no warning but
avoid the overflow when optimising. This isn't my preferred outcome,
I'd prefer to always get a warning, *and* be able to avoid the
overflow when optimising and the size is known.




More information about the Gcc-patches mailing list