[committed] libstdc++: Replace operator>>(istream&, char*) [LWG 2499]
Jonathan Wakely
jwakely@redhat.com
Thu Aug 6 13:14:48 GMT 2020
On 05/08/20 16:31 -0600, Martin Sebor via Libstdc++ wrote:
>On 8/5/20 3:25 PM, Jonathan Wakely wrote:
>>P0487R1 resolved LWG 2499 for C++20 by removing the operator>> overloads
>>that have high risk of buffer overflows. They were replaced by
>>equivalents that only accept a reference to an array, and so can
>>guarantee not to write past the end of the array.
>>
>>In order to support both the old and new functionality, this patch
>>introduces a new overloaded __istream_extract function which takes a
>>maximum length. The new operator>> overloads use the array size as the
>>maximum length. The old overloads now use __builtin_object_size to
>>determine the available buffer size if available (which requires -O2) or
>>use numeric_limits<streamsize>::max()/sizeof(char_type) otherwise. This
>>is a change in behaviour, as the old overloads previously always used
>>numeric_limits<streamsize>::max(), without considering sizeof(char_type)
>>and without attempting to prevent overflows.
>>
>>Because they now do little more than call __istream_extract, the old
>>operator>> overloads are very small inline functions. This means there
>>is no advantage to explicitly instantiating them in the library (in fact
>>that would prevent the __builtin_object_size checks from ever working).
>>As a result, the explicit instantiation declarations can be removed from
>>the header. The explicit instantiation definitions are still needed, for
>>backwards compatibility with existing code that expects to link to the
>>definitions in the library.
>>
>>While working on this change I noticed that src/c++11/istream-inst.cc
>>has the following explicit instantiation definition:
>> template istream& operator>>(istream&, char*);
>>This had no effect (and so should not have been present in that file),
>>because there was an explicit specialization declared in <istream> and
>>defined in src/++98/istream.cc. However, this change removes the
>>explicit specialization, and now the explicit instantiation definition
>>is necessary to ensure the symbol gets defined in the library.
>>
>>libstdc++-v3/ChangeLog:
>>
>> * config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols.
>> * include/bits/istream.tcc (__istream_extract): New function
>> template implementing both of operator>>(istream&, char*) and
>> operator>>(istream&, char(&)[N]). Add explicit instantiation
>> declaration for it. Remove explicit instantiation declarations
>> for old function templates.
>> * include/std/istream (__istream_extract): Declare.
>> (operator>>(basic_istream<C,T>&, C*)): Define inline and simply
>> call __istream_extract.
>> (operator>>(basic_istream<char,T>&, signed char*)): Likewise.
>> (operator>>(basic_istream<char,T>&, unsigned char*)): Likewise.
>> (operator>>(basic_istream<C,T>&, C(7)[N])): Define for LWG 2499.
>> (operator>>(basic_istream<char,T>&, signed char(&)[N])):
>> Likewise.
>> (operator>>(basic_istream<char,T>&, unsigned char(&)[N])):
>> Likewise.
>> * include/std/streambuf (basic_streambuf): Declare char overload
>> of __istream_extract as a friend.
>> * src/c++11/istream-inst.cc: Add explicit instantiation
>> definition for wchar_t overload of __istream_extract. Remove
>> explicit instantiation definitions of old operator>> overloads
>> for versioned-namespace build.
>> * src/c++98/istream.cc (operator>>(istream&, char*)): Replace
>> with __istream_extract(istream&, char*, streamsize).
>> * testsuite/27_io/basic_istream/extractors_character/char/3.cc:
>> Do not use variable-length array.
>> * testsuite/27_io/basic_istream/extractors_character/char/4.cc:
>> Do not run test for C++20.
>> * testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc:
>> Do not test writing to pointers for C++20.
>> * testsuite/27_io/basic_istream/extractors_character/char/9826.cc:
>> Use array instead of pointer.
>> * testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc:
>> Do not use variable-length array.
>> * testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc:
>> Do not run test for C++20.
>> * testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc:
>> Do not test writing to pointers for C++20.
>> * testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc:
>> New test.
>> * testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
>> New test.
>> * testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
>> New test.
>> * testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc:
>> New test.
>> * testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc:
>> New test.
>>
>>Tested powerpc64le-linux. Committed to trunk.
>>
>>Martin, Jakub, could you please double-check the usage of
>>__builtin_object_size? (line 805 in libstdc++-v3/include/std/istream)
>>Do you see any problems with using it here? If it can't tell the size
>>then we just assume it's larger than the string to be extracted, which
>>is what the old code did anyway. I do have an idea for stronger
>>checking in debug mode, which I'll post in a minute.
>
>I've always found the second argument to __builtin_object_size
>confusing for types above 1. I don't see anything wrong in
>the diff but I believe the most useful results are with type 1
>for string functions and type 0 for raw memory functions like
>memcpy (that's what _FORTIFY_SOURCE uses for the two sets of
>functions). In type 2 when the result is zero it means one of
>two things: either the size of the array couldn't be determined
>or it really is zero. That's less than helpful in cases like:
>
> char a[8];
> strcpy (a + 8, s);
>
>where it prevents detecting the buffer overflow.
>
>I would suggest to use type 1 in the iostream functions instead.
>
>Adding attribute access to the __istream_extract overloads should
>also let GCC check for out-of-bounds accesses by the function and
>issue warnings. This goes beyond what __builtin_object_size makes
>possible (it considers also variable sizes in known ranges).
I tried the attached patch with a testcase like:
#include <istream>
void
test01(std::istream& in)
{
char pc[1];
in >> (pc + 1); // { dg-warning "writing 1 byte into a region of size 0" }
}
I get a -Wstringop-overflow warning with -O0, but not at -O2. Is that
because the call to operator>> gets inlined, and __builtin_object_size
returns 0, and we call __istream_extract(in, s, 0) which says we won't
write anything, so it's now considered safe?
That's unfortunate, because actually __istream_extract will always
write at least one byte as a null terminator.
So I tried various ways toi try and get the warning back at -O2 and
nothing worked e.g. when __builtin_object_size(__s, 0) returns 0 I
tried calling __istream_extract(__in, __s, 1) which should say I'm
going to write a character, and so should warn if the size is known to
be zero. But no wanring.
So now I'm considering this:
template<typename _CharT, typename _Traits>
__attribute__((__nonnull__(2), __access__(__write_only__, 2)))
inline basic_istream<_CharT, _Traits>&
operator>>(basic_istream<_CharT, _Traits>& __in, _CharT* __s)
{
size_t __n = __builtin_object_size(__s, 0);
if (__builtin_expect(__n < sizeof(_CharT), false))
{
// not even space for null terminator
__glibcxx_assert(__n >= sizeof(_CharT));
__in.width(0);
__in.setstate(ios_base::failbit);
}
else
{
if (__n == (size_t)-1)
__n = __gnu_cxx::__numeric_traits<streamsize>::__max;
std::__istream_extract(__in, __s, __n / sizeof(_CharT));
}
return __in;
}
This will give a -Wstringop-overflow warning at -O0 and then overflow
the buffer, with undefined behaviour. And it will give no warning but
avoid the overflow when optimising. This isn't my preferred outcome,
I'd prefer to always get a warning, *and* be able to avoid the
overflow when optimising and the size is known.
More information about the Gcc-patches
mailing list