[committed] libstdc++: Replace operator>>(istream&, char*) [LWG 2499]

Martin Sebor msebor@gmail.com
Thu Aug 6 14:56:59 GMT 2020


On 8/6/20 7:30 AM, Jonathan Wakely via Libstdc++ wrote:
> On 06/08/20 14:14 +0100, Jonathan Wakely via Libstdc++ wrote:
>> On 05/08/20 16:31 -0600, Martin Sebor via Libstdc++ wrote:
>>> On 8/5/20 3:25 PM, Jonathan Wakely wrote:
>>>> P0487R1 resolved LWG 2499 for C++20 by removing the operator>> 
>>>> overloads
>>>> that have high risk of buffer overflows. They were replaced by
>>>> equivalents that only accept a reference to an array, and so can
>>>> guarantee not to write past the end of the array.
>>>>
>>>> In order to support both the old and new functionality, this patch
>>>> introduces a new overloaded __istream_extract function which takes a
>>>> maximum length. The new operator>> overloads use the array size as the
>>>> maximum length. The old overloads now use __builtin_object_size to
>>>> determine the available buffer size if available (which requires 
>>>> -O2) or
>>>> use numeric_limits<streamsize>::max()/sizeof(char_type) otherwise. This
>>>> is a change in behaviour, as the old overloads previously always used
>>>> numeric_limits<streamsize>::max(), without considering 
>>>> sizeof(char_type)
>>>> and without attempting to prevent overflows.
>>>>
>>>> Because they now do little more than call __istream_extract, the old
>>>> operator>> overloads are very small inline functions. This means there
>>>> is no advantage to explicitly instantiating them in the library (in 
>>>> fact
>>>> that would prevent the __builtin_object_size checks from ever working).
>>>> As a result, the explicit instantiation declarations can be removed 
>>>> from
>>>> the header. The explicit instantiation definitions are still needed, 
>>>> for
>>>> backwards compatibility with existing code that expects to link to the
>>>> definitions in the library.
>>>>
>>>> While working on this change I noticed that src/c++11/istream-inst.cc
>>>> has the following explicit instantiation definition:
>>>>  template istream& operator>>(istream&, char*);
>>>> This had no effect (and so should not have been present in that file),
>>>> because there was an explicit specialization declared in <istream> and
>>>> defined in src/++98/istream.cc. However, this change removes the
>>>> explicit specialization, and now the explicit instantiation definition
>>>> is necessary to ensure the symbol gets defined in the library.
>>>>
>>>> libstdc++-v3/ChangeLog:
>>>>
>>>>     * config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols.
>>>>     * include/bits/istream.tcc (__istream_extract): New function
>>>>     template implementing both of operator>>(istream&, char*) and
>>>>     operator>>(istream&, char(&)[N]). Add explicit instantiation
>>>>     declaration for it. Remove explicit instantiation declarations
>>>>     for old function templates.
>>>>     * include/std/istream (__istream_extract): Declare.
>>>>     (operator>>(basic_istream<C,T>&, C*)): Define inline and simply
>>>>     call __istream_extract.
>>>>     (operator>>(basic_istream<char,T>&, signed char*)): Likewise.
>>>>     (operator>>(basic_istream<char,T>&, unsigned char*)): Likewise.
>>>>     (operator>>(basic_istream<C,T>&, C(7)[N])): Define for LWG 2499.
>>>>     (operator>>(basic_istream<char,T>&, signed char(&)[N])):
>>>>     Likewise.
>>>>     (operator>>(basic_istream<char,T>&, unsigned char(&)[N])):
>>>>     Likewise.
>>>>     * include/std/streambuf (basic_streambuf): Declare char overload
>>>>     of __istream_extract as a friend.
>>>>     * src/c++11/istream-inst.cc: Add explicit instantiation
>>>>     definition for wchar_t overload of __istream_extract. Remove
>>>>     explicit instantiation definitions of old operator>> overloads
>>>>     for versioned-namespace build.
>>>>     * src/c++98/istream.cc (operator>>(istream&, char*)): Replace
>>>>     with __istream_extract(istream&, char*, streamsize).
>>>>     * testsuite/27_io/basic_istream/extractors_character/char/3.cc:
>>>>     Do not use variable-length array.
>>>>     * testsuite/27_io/basic_istream/extractors_character/char/4.cc:
>>>>     Do not run test for C++20.
>>>>     * 
>>>> testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc:
>>>>     Do not test writing to pointers for C++20.
>>>>     * testsuite/27_io/basic_istream/extractors_character/char/9826.cc:
>>>>     Use array instead of pointer.
>>>>     * testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc:
>>>>     Do not use variable-length array.
>>>>     * testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc:
>>>>     Do not run test for C++20.
>>>>     * 
>>>> testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc:
>>>>     Do not test writing to pointers for C++20.
>>>>     * 
>>>> testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc:
>>>>     New test.
>>>>     * 
>>>> testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
>>>>     New test.
>>>>     * 
>>>> testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
>>>>     New test.
>>>>     * 
>>>> testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc:
>>>>     New test.
>>>>     * 
>>>> testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc: 
>>>>
>>>>     New test.
>>>>
>>>> Tested powerpc64le-linux. Committed to trunk.
>>>>
>>>> Martin, Jakub, could you please double-check the usage of
>>>> __builtin_object_size? (line 805 in libstdc++-v3/include/std/istream)
>>>> Do you see any problems with using it here? If it can't tell the size
>>>> then we just assume it's larger than the string to be extracted, which
>>>> is what the old code did anyway. I do have an idea for stronger
>>>> checking in debug mode, which I'll post in a minute.
>>>
>>> I've always found the second argument to __builtin_object_size
>>> confusing for types above 1.  I don't see anything wrong in
>>> the diff but I believe the most useful results are with type 1
>>> for string functions and type 0 for raw memory functions like
>>> memcpy (that's what _FORTIFY_SOURCE uses for the two sets of
>>> functions).  In type 2 when the result is zero it means one of
>>> two things: either the size of the array couldn't be determined
>>> or it really is zero.  That's less than helpful in cases like:
>>>
>>> char a[8];
>>> strcpy (a + 8, s);
>>>
>>> where it prevents detecting the buffer overflow.
>>>
>>> I would suggest to use type 1 in the iostream functions instead.
>>>
>>> Adding attribute access to the __istream_extract overloads should
>>> also let GCC check for out-of-bounds accesses by the function and
>>> issue warnings.  This goes beyond what __builtin_object_size makes
>>> possible (it considers also variable sizes in known ranges).
>>
>> I tried the attached patch with a testcase like:
>>
>> #include <istream>
>>
>> void
>> test01(std::istream& in)
>> {
>>  char pc[1];
>>  in >> (pc + 1); // { dg-warning "writing 1 byte into a region of size 
>> 0" }
>> }
>>
>> I get a -Wstringop-overflow warning with -O0, but not at -O2. Is that
>> because the call to operator>> gets inlined, and __builtin_object_size
>> returns 0, and we call __istream_extract(in, s, 0) which says we won't
>> write anything, so it's now considered safe?
>>
>> That's unfortunate, because actually __istream_extract will always
>> write at least one byte as a null terminator.
>>
>> So I tried various ways toi try and get the warning back at -O2 and
>> nothing worked e.g. when __builtin_object_size(__s, 0) returns 0 I
>> tried calling __istream_extract(__in, __s, 1) which should say I'm
>> going to write a character, and so should warn if the size is known to
>> be zero. But no wanring.
> 
> There is a warning, when calling __istream_extract, but it's
> suppressed because it comes from within the library itself.
> 
> So optimisation means we inline the user's erroneous call, and
> suppress the warning from the (still erroneous) call to the library
> internals.
> 
> That sucks.
> 

Well, yes, I suppose it does.  Function attributes and inlining
don't mesh very well.  Once a function is inlined, the effects
of its attributes, are gone, not just on warnings but also on
subsequent optimizations.  The exceptions are attributes that
the inliner knows about.  For instance, attributes alloc_size
and malloc suffer from the same problem as access.

For the effect of attribute access in particular to persist
the inliner would need to record it somewhere.  For instance,
it could emit a call to a synthesized intrinsic function with
the same attribute and arguments that the expander would then
check as it does ordinary calls but then discard.

Other attributes would need their own special treatment.

FWIW, there are a whole number of contexts in the middle end
where attribute access could be used to good effect.  I have
been working to enhance a number of these areas but I confess
that the interaction with inlining hasn't been on my radar.
It's something to think about.

For this specific use case, I saw __istream_extract defined
as an ordinary (non-template) function in a .tcc file in
the patch so I thought it was out of line.  If it's inline
or if it's a template the only workaround I can think of
to retain the warning is to have it make a call to (no-op)
function with the attribute that is not inlined.  It's too
bad there is no attribute to tell the expander to avoid
emitting such a function (which would be the equivalent of
the idea I outlined in my second paragraph above).

Martin


More information about the Gcc-patches mailing list