Bug 97659 - Invalid pointer subtraction in vector::insert() (reported by pointer-subtract AddressSanitizer)
Summary: Invalid pointer subtraction in vector::insert() (reported by pointer-subtract...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 10.2.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-31 17:36 UTC by Paweł Bylica
Modified: 2021-02-03 17:11 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-11-01 00:00:00


Attachments
Minimal test case source code (156 bytes, text/plain)
2020-10-31 21:01 UTC, Paweł Bylica
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Paweł Bylica 2020-10-31 17:36:01 UTC
When vector<uint8_t>::insert(iterator pos, InputIt first, InputIt last) is used the AddressSanitizer additional check "pointer-subtract" reports invalid pointer pair in c++/10/bits/vector.tcc:729.

The relevant code is this:

  template<typename _Tp, typename _Alloc>
    template<typename _ForwardIterator>
      void
      vector<_Tp, _Alloc>::
      _M_range_insert(iterator __position, _ForwardIterator __first,
		      _ForwardIterator __last, std::forward_iterator_tag)
      {
	if (__first != __last)
	  {
	    const size_type __n = std::distance(__first, __last);
	    if (size_type(this->_M_impl._M_end_of_storage
			  - this->_M_impl._M_finish) >= __n)  // FAILS HERE!
	      {


My core code causing the problem is this:

void push(std::vector<uint8_t>& b, uint32_t value)
{
    uint8_t storage[sizeof(value)];
    __builtin_memcpy(storage, &value, sizeof(value));
    b.insert(b.end(), std::begin(storage), std::end(storage));
}


My program is pushing single bytes and uint32_t value using the above helper to a vector, without preallocation. But I was not able to reproduce this issues on a side. I will need more time to reduce my code to a proper regression test.

gcc-10 (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0
export ASAN_OPTIONS=detect_invalid_pointer_pairs=1 

=================================================================
==3327279==ERROR: AddressSanitizer: invalid-pointer-pair: 0x602000006e5c 0x602000006e5a
    #0 0x556e32bfecbf in void std::vector<unsigned char, std::allocator<unsigned char> >::_M_range_insert<unsigned char*>(__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*, std::forward_iterator_tag) /usr/include/c++/10/bits/vector.tcc:729
    #1 0x556e32bfecbf in void std::vector<unsigned char, std::allocator<unsigned char> >::_M_insert_dispatch<unsigned char*>(__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*, std::__false_type) /usr/include/c++/10/bits/stl_vector.h:1665
    #2 0x556e32bfecbf in __gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > > std::vector<unsigned char, std::allocator<unsigned char> >::insert<unsigned char*, void>(__gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_vector.h:1383
    #3 0x556e32bfecbf in push /home/chfast/Projects/wasmx/fizzy/lib/fizzy/parser_expr.cpp:26
...

0x602000006e5c is located 0 bytes to the right of 12-byte region [0x602000006e50,0x602000006e5c)
allocated by thread T0 here:
    #0 0x7f0bfa861f17 in operator new(unsigned long) (/lib/x86_64-linux-gnu/libasan.so.6+0xb1f17)
    #1 0x556e32bff1e1 in __gnu_cxx::new_allocator<unsigned char>::allocate(unsigned long, void const*) /usr/include/c++/10/ext/new_allocator.h:115
    #2 0x556e32bff1e1 in std::allocator_traits<std::allocator<unsigned char> >::allocate(std::allocator<unsigned char>&, unsigned long) /usr/include/c++/10/bits/alloc_traits.h:460
    #3 0x556e32bff1e1 in std::_Vector_base<unsigned char, std::allocator<unsigned char> >::_M_allocate(unsigned long) /usr/include/c++/10/bits/stl_vector.h:346
    #4 0x556e32bff1e1 in void std::vector<unsigned char, std::allocator<unsigned char> >::_M_range_insert<unsigned char*>(__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*, std::forward_iterator_tag) /usr/include/c++/10/bits/vector.tcc:769
    #5 0x556e32bff1e1 in void std::vector<unsigned char, std::allocator<unsigned char> >::_M_insert_dispatch<unsigned char*>(__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*, std::__false_type) /usr/include/c++/10/bits/stl_vector.h:1665
    #6 0x556e32bff1e1 in __gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > > std::vector<unsigned char, std::allocator<unsigned char> >::insert<unsigned char*, void>(__gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_vector.h:1383
    #7 0x556e32bff1e1 in push /home/chfast/Projects/wasmx/fizzy/lib/fizzy/parser_expr.cpp:26
...

0x602000006e5a is located 10 bytes inside of 12-byte region [0x602000006e50,0x602000006e5c)
allocated by thread T0 here:
    #0 0x7f0bfa861f17 in operator new(unsigned long) (/lib/x86_64-linux-gnu/libasan.so.6+0xb1f17)
    #1 0x556e32bff1e1 in __gnu_cxx::new_allocator<unsigned char>::allocate(unsigned long, void const*) /usr/include/c++/10/ext/new_allocator.h:115
    #2 0x556e32bff1e1 in std::allocator_traits<std::allocator<unsigned char> >::allocate(std::allocator<unsigned char>&, unsigned long) /usr/include/c++/10/bits/alloc_traits.h:460
    #3 0x556e32bff1e1 in std::_Vector_base<unsigned char, std::allocator<unsigned char> >::_M_allocate(unsigned long) /usr/include/c++/10/bits/stl_vector.h:346
    #4 0x556e32bff1e1 in void std::vector<unsigned char, std::allocator<unsigned char> >::_M_range_insert<unsigned char*>(__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*, std::forward_iterator_tag) /usr/include/c++/10/bits/vector.tcc:769
    #5 0x556e32bff1e1 in void std::vector<unsigned char, std::allocator<unsigned char> >::_M_insert_dispatch<unsigned char*>(__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*, std::__false_type) /usr/include/c++/10/bits/stl_vector.h:1665
    #6 0x556e32bff1e1 in __gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > > std::vector<unsigned char, std::allocator<unsigned char> >::insert<unsigned char*, void>(__gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_vector.h:1383
    #7 0x556e32bff1e1 in push /home/chfast/Projects/wasmx/fizzy/lib/fizzy/parser_expr.cpp:26
Comment 1 Jonathan Wakely 2020-10-31 18:13:25 UTC
This looks like a bug in the sanitizer. I assume it's triggering because the memory returned by the allocator doesn't refer to an array, so the two addresses are not pointing to subobjects of a single object.

But that's just how std::vector works, and the C++ standard has been changed to make it valid, so the sanitizer needs to cope with the real world.
Comment 2 Paweł Bylica 2020-10-31 21:01:52 UTC
Created attachment 49482 [details]
Minimal test case source code

It turned out the problem is related to vector's internal instrumentation _GLIBCXX_SANITIZE_VECTOR.

The minimal test case is the following:

#define _GLIBCXX_SANITIZE_VECTOR 1
#include <vector>

int main()
{
    std::vector<char> v;
    v.reserve(1);

    char in[1] = {};
    v.insert(v.end(), in, in + 1);

    return 0;
}


export ASAN_OPTIONS=detect_invalid_pointer_pairs=1
g++ pointer_subtract_bug.cpp -fsanitize=address,pointer-subtract
./a.out
Comment 3 Jakub Jelinek 2020-10-31 23:11:50 UTC
That sanitizer diagnoses
http://eel.is/c++draft/expr.add#5.3
which still seems UB.
Of course there can be bugs on the sanitizer side too; the sanitizer generally works by scanning the shadow memory in between the two pointers and if it finds an unaccessible byte in there (memory not part of an object, e.g. the inter-object redzone), it shall diagnose it.
Comment 4 Paweł Bylica 2020-11-01 10:56:41 UTC
I'd like to explain some things here (to my best knowledge):

1. The "pointer-subtract" checks is ASan extension, not enabled by default. When running with this check enabled in my application I have not detected any issues in std::vector.

2. The "pointer-subtract" checks if you pointer subtraction operands are from the same memory allocation. Allowed values are all pointers from the memory region plus the "end" pointer one element outside of the region. Other subtractions are UB in C to my information.

3. The issue shows up only when "pointer-subtract" is combined with _GLIBCXX_SANITIZE_VECTOR. Moreover, the report looks like false positive because the subtraction is between the "end" pointer and a pointer from inside of a memory region.
Comment 5 Jonathan Wakely 2020-11-01 11:33:38 UTC
(In reply to Jakub Jelinek from comment #3)
> That sanitizer diagnoses
> http://eel.is/c++draft/expr.add#5.3
> which still seems UB.

Not since http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html said that an array of T[n] can be implicitly created in the storage returned by the allocator.

> Of course there can be bugs on the sanitizer side too; the sanitizer
> generally works by scanning the shadow memory in between the two pointers
> and if it finds an unaccessible byte in there (memory not part of an object,
> e.g. the inter-object redzone), it shall diagnose it.

I think the problem is that the unused capacity at the end of the vector is marked as inaccessible. We need to flip it to accessible again before doing that subtraction, then flip it back to inaccessible. Similarly in the vector::capacity() member function. Maybe it would be simpler to add the instrumentation in capacity() and then in the _M_range_insert function shown in comment 0, use (capacity() - size()) >= n
Comment 6 Jonathan Wakely 2020-11-01 11:38:36 UTC
Alternatively, we could reinterpret_cast<uintptr_t> before doing the subtractions (and divide the result by sizeof(_Tp)), which would avoid the overhead of adjusting the ASan tracking.

Either way, if my assumption about the cause is right, it's not a real bug in std::vector, it's caused by incorrect (or insufficient) ASan instrumentation.