Bug 108846 - std::copy, std::copy_n and std::copy_backward on potentially overlapping subobjects
Summary: std::copy, std::copy_n and std::copy_backward on potentially overlapping sub...
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 13.0
: P3 normal
Target Milestone: ---
Assignee: Jonathan Wakely
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2023-02-18 21:22 UTC by Arthur O'Dwyer
Modified: 2024-03-18 14:13 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work: 8.1.0, 8.5.0
Known to fail: 9.1.0
Last reconfirmed: 2023-02-23 00:00:00


Attachments
libstdc++: Do not use memmove for 1-element ranges [PR108846] (1.04 KB, patch)
2023-02-25 14:29 UTC, Jonathan Wakely
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Arthur O'Dwyer 2023-02-18 21:22:45 UTC
// https://gcc.godbolt.org/z/EfdG4nzv9
#include <algorithm>
#include <cassert>
struct B { 
    B(int i, short j) : i(i), j(j) {}
    int i; 
    short j; /* 2 byte padding */ 
};
struct D : B { 
    D(int i, short j, short x) : B(i, j), x(x) {}    
    short x;
};
int main() {
    D ddst(1, 2, 3);
    D dsrc(4, 5, 6);
    B *dst = &ddst;
    B *src = &dsrc;
    std::copy_n(src, 1, dst);  // should do `*dst = *src`
    assert(ddst.x == 3);  // FAILS
}

Similarly if you use `std::copy(src, src+1, dst)`.
The problem here is that `std::copy` and `std::copy_n` (and presumably some other algorithms too, like `std::move`) believe that if a type is trivially copyable, then it's safe to use `memcpy` or `memmove` on it. But in fact that's safe only if either
- the type's "sizeof" is equal to its "data size, without trailing padding"; or
- you happen to have external information proving that the object being copied is not a potentially overlapping subobject.

I think it's safe for `std::copy` and `std::copy_n` to use memcpy/memmove if they are given a destination range of 2-or-more elements; but if it's just a single object, then (by the letter of the law) they must assume the destination object might be a potentially overlapping subobject, and not memcpy into it.

(Full disclosure: I will be THRILLED if you close this as "not a bug," because that will be ammunition to go to LWG and say "look, vendors think this behavior is fine, we should actually standardize this behavior and remove the expectation that STL algorithms can ever handle potentially overlapping subobjects.")
Comment 1 Andrew Pinski 2023-02-18 21:55:19 UTC
The memcpy/memmove change was done in r9-1943-g20a0c4e3dc9948
Comment 2 Andrew Pinski 2023-02-18 21:58:26 UTC
Related to https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2403 but instead of language, it is the library methods.
Comment 3 Andrew Pinski 2023-02-18 22:00:28 UTC
Even https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2544 is very much related.
Comment 4 Jiang An 2023-02-20 08:38:19 UTC
(In reply to Andrew Pinski from comment #2)
> Related to https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2403
> but instead of language, it is the library methods.

CWG2403 looks like an unfortunate result of ABI design.

If any non-padding byte of a potentially-overlapping subobject X is actually overlapping with a padding byte of subobject Y, Y should be initialized before X, because all bytes of Y may be written in initialization. However, on Itanium ABI virtual base class subobjects violate such intuitive rule.

But I think it's only closedly related to uninitialized memory algorithms (which perform initialization instead of assignment), and there's already a note saying that it's UB to use them to initialize potentially-overlapping objects ([specialized.algorithms.general]/3).

(In reply to Andrew Pinski from comment #3)
> Even https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2544 is
> very much related.

(As the submitter) I think this issue only legitimates current implementation of pointer arithmetic.

For example, assuming both sizeof(B) and sizeof(D) are 8, which means the last 2 bytes of B are padding (true for common implementations on Itanium ABI):

struct B { int32_t x; int16_t y; };
struct D : B { int16_t z; };

The current wording in [basic.compound]/3 requires that given `B b{};` and `D d{};`, `static_cast<B*>(&d) + 1` represents the address of the byte at offset 6, which is almost unimplementable.

However, subtraction between pointers are performed as usual for even for potentially-overlapping subobjects, which is correctly implemented.

In the case of copy family algorithms, I believe it's OK to specially handle cases where last - first == 1.


BTW, there's CWG2527 that is closed as NAD, which seemingly means that an array subobject can be potentially-overlapping, even if element subobjects can't.
Comment 5 Giuseppe D'Angelo 2023-02-20 23:42:04 UTC
> In the case of copy family algorithms, I believe it's OK to specially handle cases where last - first == 1.

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L417-L437

Is the extent of the fix just to add another branch to the if (_Num) check here? I've just tried locally, and it seems to work. It also doesn't seem that libstdc++ uses memcpy/memmove elsewhere (e.g. in std::fill).
Comment 6 Jiang An 2023-02-21 01:58:55 UTC
> For example, assuming both sizeof(B) and sizeof(D) are 8, which means the last
> 2 bytes of B are padding (true for common implementations on Itanium ABI):

Oh, I forgot the strange design caused by CWG43 - this requires some non-C++98-POD mechanism for B (e.g. an explicitly default default constructor)...
Comment 7 Andrew Pinski 2023-02-21 07:21:06 UTC
I suspect std::move has the same issue too. The ability to use memmove with trivial copyable subobjects ...
Comment 8 Jonathan Wakely 2023-02-23 17:20:39 UTC
I don't think we need the ABI keyword here, the library can just avoid the optimization for distance == 1. That has no ABI impact.
Comment 9 Jonathan Wakely 2023-02-23 17:25:57 UTC
(In reply to Giuseppe D'Angelo from comment #5)
> https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/
> stl_algobase.h#L417-L437
> 
> Is the extent of the fix just to add another branch to the if (_Num) check
> here?

Yes, I think that's the simplest place to fix it.

And we can even get rid of the static assertion, because the Num==1 branch will require assignability now.

--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -737,16 +737,11 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
        static _Tp*
        __copy_move_b(const _Tp* __first, const _Tp* __last, _Tp* __result)
        {
-#if __cplusplus >= 201103L
-         using __assignable = __conditional_t<_IsMove,
-                                              is_move_assignable<_Tp>,
-                                              is_copy_assignable<_Tp>>;
-         // trivial types can have deleted assignment
-         static_assert( __assignable::value, "type must be assignable" );
-#endif
          const ptrdiff_t _Num = __last - __first;
-         if (_Num)
+         if (__builtin_expect(_Num > 1, true))
            __builtin_memmove(__result - _Num, __first, sizeof(_Tp) * _Num);
+         else if (_Num == 1)
+           *__result = *__first;
          return __result - _Num;
        }
     };
Comment 10 Jonathan Wakely 2023-02-23 17:27:03 UTC
(In reply to Andrew Pinski from comment #7)
> I suspect std::move has the same issue too. The ability to use memmove with
> trivial copyable subobjects ...

std::move(x,y,z) and std::copy(z,y,z) use the same underlying implementation, so it does have the same issue, but will be fixed by the same change.
Comment 11 Arthur O'Dwyer 2023-02-23 17:29:03 UTC
(In reply to Jonathan Wakely from comment #10)
> std::move(x,y,z) and std::copy(z,y,z) use the same underlying
> implementation, so it does have the same issue, but will be fixed by the
> same change.

Right; also std::rotate, std::set_union, etc., I'd expect would all be fixed by the same change to std::copy.

You might want to peek at std::swap_ranges, though. Microsoft's STL has trouble there; I don't *think* libstdc++ does, but you might as well check me.
Comment 12 Giuseppe D'Angelo 2023-02-24 19:31:49 UTC
(In reply to Jonathan Wakely from comment #9)
> (In reply to Giuseppe D'Angelo from comment #5)
> > https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/
> > stl_algobase.h#L417-L437
> > 
> > Is the extent of the fix just to add another branch to the if (_Num) check
> > here?
> 
> Yes, I think that's the simplest place to fix it.
> 
> And we can even get rid of the static assertion, because the Num==1 branch
> will require assignability now.
> 
> --- a/libstdc++-v3/include/bits/stl_algobase.h
> +++ b/libstdc++-v3/include/bits/stl_algobase.h
> @@ -737,16 +737,11 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>         static _Tp*
>         __copy_move_b(const _Tp* __first, const _Tp* __last, _Tp* __result)
>         {
> -#if __cplusplus >= 201103L
> -         using __assignable = __conditional_t<_IsMove,
> -                                              is_move_assignable<_Tp>,
> -                                              is_copy_assignable<_Tp>>;
> -         // trivial types can have deleted assignment
> -         static_assert( __assignable::value, "type must be assignable" );
> -#endif
>           const ptrdiff_t _Num = __last - __first;
> -         if (_Num)
> +         if (__builtin_expect(_Num > 1, true))
>             __builtin_memmove(__result - _Num, __first, sizeof(_Tp) * _Num);
> +         else if (_Num == 1)
> +           *__result = *__first;
>           return __result - _Num;
>         }
>      };


If this code path also takes care of std::move(b,e,o), then this doesn't sound correct -- for _Num==1 and _IsMove==true, then it should use a move assignment (std::move(*__first)). But then that std::move doesn't actually work as __first is a pointer to const...
Comment 13 Jonathan Wakely 2023-02-25 09:06:49 UTC
That specialization is only used for trivial types, so moving is equivalent to copying (otherwise we couldn't use memmove anyway). So it doesn't make any difference whether you move an lvalue or rvalue, it's just a bitwise copy anyway.
Comment 14 Jonathan Wakely 2023-02-25 09:07:36 UTC
s/move an lvalue or rvalue/assign from an lvalue or rvalue/
Comment 15 Giuseppe D'Angelo 2023-02-25 11:40:47 UTC
That's not what I meant; a type can be trivial(ly copyable) and move-only.

Here's a modification of Arthur's example:

// move-only
struct B {
    B(int i, short j) : i(i), j(j) {}
    B(B &&) = default;
    B &operator=(B &&) = default;
    int i;
    short j; 
};
struct D : B {
    D(int i, short j, short x) : B(i, j), x(x) {}
    D(D &&) = default;
    D &operator=(D &&) = default;
    short x;
};
int main() {
    D ddst(1, 2, 3);
    D dsrc(4, 5, 6);
    B *dst = &ddst;
    B *src = &dsrc;
    static_assert(std::is_trivially_copyable_v<B>);

    std::move(src, src+1, dst);
    assert(ddst.x == 3); 
}


The call to std::move ends up in the same memmove codepath as std::copy_n (B is trivially copyable), but with the proposed patch it will fail to compile because it's not actually copy assignable.
Comment 16 Jonathan Wakely 2023-02-25 14:13:47 UTC
Ah gotcha.

So then something like this (untested):

--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -422,16 +422,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
        static _Tp*
        __copy_m(const _Tp* __first, const _Tp* __last, _Tp* __result)
        {
-#if __cplusplus >= 201103L
-         using __assignable = __conditional_t<_IsMove,
-                                              is_move_assignable<_Tp>,
-                                              is_copy_assignable<_Tp>>;
-         // trivial types can have deleted assignment
-         static_assert( __assignable::value, "type must be assignable" );
-#endif
          const ptrdiff_t _Num = __last - __first;
-         if (_Num)
+         if (__builtin_expect(_Num > 1, true))
            __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
+         else if (_Num == 1)
+           return std::__copy_move<_IsMove, false,
+                                   random_access_iterator_tag>::
+                    __copy_m(__first, __last, __result);
          return __result + _Num;
        }
     };
@@ -737,16 +734,13 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
        static _Tp*
        __copy_move_b(const _Tp* __first, const _Tp* __last, _Tp* __result)
        {
-#if __cplusplus >= 201103L
-         using __assignable = __conditional_t<_IsMove,
-                                              is_move_assignable<_Tp>,
-                                              is_copy_assignable<_Tp>>;
-         // trivial types can have deleted assignment
-         static_assert( __assignable::value, "type must be assignable" );
-#endif
          const ptrdiff_t _Num = __last - __first;
-         if (_Num)
+         if (__builtin_expect(_Num > 1, true))
            __builtin_memmove(__result - _Num, __first, sizeof(_Tp) * _Num);
+         else if (_Num == 1)
+           return std::__copy_move_backward<_IsMove, false,
+                                            random_access_iterator_tag>::
+                    __copy_move_b(__first, __last, __result);
          return __result - _Num;
        }
     };
Comment 17 Jonathan Wakely 2023-02-25 14:16:04 UTC
We need to change the arguments too, as that specialization takes a const _Tp* as the input range.
Comment 18 Jonathan Wakely 2023-02-25 14:29:42 UTC
Created attachment 54537 [details]
libstdc++: Do not use memmove for 1-element ranges [PR108846]

This is a more complete patch that passes Arthur's test in comment 0.
Comment 19 GCC Commits 2023-02-28 09:50:04 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:822a11a1e642e0abe92a996e7033a5066905a447

commit r13-6372-g822a11a1e642e0abe92a996e7033a5066905a447
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Sat Feb 25 14:28:36 2023 +0000

    libstdc++: Do not use memmove for 1-element ranges [PR108846]
    
    This avoids overwriting tail padding when algorithms like std::copy are
    used to write a single value through a pointer to a base subobject.
    
    The pointer arithmetic on a Base* is valid for N==1, but the copy/move
    operation needs to be done using assignment, not a memmove or memcpy of
    sizeof(Base) bytes.
    
    Instead of putting a check for N==1 in all of copy, copy_n, move etc.
    this adds it to the __copy_move and __copy_move_backward partial
    specializations used for trivially copyable types. When N==1 those
    partial specializations dispatch to new static member functions of the
    partial specializations for non-trivial types, so that a copy/move
    assignment is done appropriately for the _IsMove constant.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/108846
            * include/bits/stl_algobase.h (__copy_move<false, false, RA>)
            Add __assign_one static member function.
            (__copy_move<true, false, RA>): Likewise.
            (__copy_move<IsMove, true, RA>): Do not use memmove for a single
            value.
            (__copy_move_backward<IsMove, true, RA>): Likewise.
            * testsuite/25_algorithms/copy/108846.cc: New test.
            * testsuite/25_algorithms/copy_backward/108846.cc: New test.
            * testsuite/25_algorithms/copy_n/108846.cc: New test.
            * testsuite/25_algorithms/move/108846.cc: New test.
            * testsuite/25_algorithms/move_backward/108846.cc: New test.
Comment 20 Giuseppe D'Angelo 2023-03-02 10:49:56 UTC
Thanks for the patch!

Should ranges_algobase.h also be similarly changed? There's a memmove in its copy/move code as well:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/bits/ranges_algobase.h;hb=HEAD#l263
Comment 21 Jonathan Wakely 2023-03-02 11:19:45 UTC
Indeed it should.
Comment 22 Jonathan Wakely 2023-04-20 13:50:35 UTC
(In reply to Jonathan Wakely from comment #16)
>           const ptrdiff_t _Num = __last - __first;
> -         if (_Num)
> +         if (__builtin_expect(_Num > 1, true))
>             __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
> +         else if (_Num == 1)

Richi suggested that we could avoid these runtime branches (which hurt optimization, see PR 109445) if we knew how many bytes of tail padding there are in _Tp, e.g.,

  __builtin_memmove(__result, __first, sizeof(_Tp) * _Num - __padding_bytes);

This would mean we copy whole objects for the first Num - 1 elements, and then only copy the non-padding bytes for the last element. This would be conforming, because there's no guarantee that padding bits are copied by assignment anyway, and std::copy is specified in terms of assignment.

When copying a single base-class object (the subject of this bug) the last element is the only element, so we solve the problem.

We don't have a built-in to tell us the number of padding bytes, but we might be able to use something like this:

template<typename T>
constexpr size_t
potentially_overlapping_tail_padding()
{
  if constexpr (is_final<T>::value)
    return 0;
  else
  {
    struct D1 : T
    {
      char c1;
    };
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Winvalid-offsetof"
    return sizeof(T) - offsetof(D1, c1);
#pragma GCC diagnostic pop
  }
}

For pre-C++17 we would need to use tag dispatching and __is_final instead of if-constexpr, but that's doable. For pre-C++14 it can't be constexpr, but that's OK too.
Comment 23 Arthur O'Dwyer 2023-04-20 14:12:14 UTC
(In reply to Jonathan Wakely from comment #22)
> 
> Richi suggested that we could avoid these runtime branches (which hurt
> optimization, see PR 109445) if we knew how many bytes of tail padding there
> are in _Tp [...]
> We don't have a built-in to tell us the number of [trailing] padding bytes, but we
> might be able to use something like this

Three thoughts that might be helpful:
- There may be padding in the middle of an object, and I'm not confident that the Standard actually forbids it from being used. Of course your approach works fine on the Itanium ABI, and probably works everywhere that actually exists. If you've got chapter+verse proving that it must work *everywhere*, I'd appreciate seeing it, just for my own information.
- If GCC were ever to add a builtin for this notion, IMO the proper name would be `__datasizeof(T)`. See https://danakj.github.io/2023/01/15/trivially-relocatable.html#data-size
- You can implement your library trait like this; see https://quuxplusone.github.io/blog/2023/03/04/trivial-swap-prize-update/#step-1.-std-swap-can-t-assume-it

    template <class _Tp>
    struct __libcpp_datasizeof {
        struct _Up {
            [[__no_unique_address__]] _Tp __t_;
            char __c_;
        };
        static const size_t value = __builtin_offsetof(_Up, __c_);
    };

Unfortunately it looks like GCC doesn't support `__attribute__((__no_unique_address__))` in C++03 mode. (Neither does Clang. What is up with that!)

Your suggested trait implementation is slightly wrong for `is_final` types: you return 0 but really a final type _can_ have usable tail padding. See https://godbolt.org/z/P6x459MEq
Comment 24 Jonathan Wakely 2023-04-20 17:10:27 UTC
(In reply to Arthur O'Dwyer from comment #23)
> - There may be padding in the middle of an object, and I'm not confident
> that the Standard actually forbids it from being used. Of course your
> approach works fine on the Itanium ABI, and probably works everywhere that
> actually exists. If you've got chapter+verse proving that it must work
> *everywhere*, I'd appreciate seeing it, just for my own information.

I don't care about everywhere, only GCC-like compilers on the targets they support.

> - If GCC were ever to add a builtin for this notion, IMO the proper name
> would be `__datasizeof(T)`. See
> https://danakj.github.io/2023/01/15/trivially-relocatable.html#data-size

I think it should have a __builtin_ prefix unless it directly implements a standard trait like __is_object.

> - You can implement your library trait like this; see
> https://quuxplusone.github.io/blog/2023/03/04/trivial-swap-prize-update/
> #step-1.-std-swap-can-t-assume-it
> 
>     template <class _Tp>
>     struct __libcpp_datasizeof {
>         struct _Up {
>             [[__no_unique_address__]] _Tp __t_;
>             char __c_;
>         };
>         static const size_t value = __builtin_offsetof(_Up, __c_);
>     };
> 
> Unfortunately it looks like GCC doesn't support
> `__attribute__((__no_unique_address__))` in C++03 mode. (Neither does Clang.
> What is up with that!)

Which is OK, because you can't have final types in C++03 either, so testing it using inheritance is good enough for C++03.

> Your suggested trait implementation is slightly wrong for `is_final` types:
> you return 0 but really a final type _can_ have usable tail padding. See
> https://godbolt.org/z/P6x459MEq

Ah, I didn't think that padding would actually be used by an adjacent member subobject. But of course it can be for an empty class, where it consists of a single byte of padding which can be reused by another member of a different type.
Comment 25 Jonathan Wakely 2023-04-20 17:11:01 UTC
... and non-empty types, I'm just saying that's the obvious case that proves it's possible.
Comment 26 GCC Commits 2024-03-18 14:05:56 UTC
The releases/gcc-12 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:8ec265c1464dec74f98e6914cd164af5090a39ff

commit r12-10250-g8ec265c1464dec74f98e6914cd164af5090a39ff
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Sat Feb 25 14:28:36 2023 +0000

    libstdc++: Do not use memmove for 1-element ranges [PR108846]
    
    This avoids overwriting tail padding when algorithms like std::copy are
    used to write a single value through a pointer to a base subobject.
    
    The pointer arithmetic on a Base* is valid for N==1, but the copy/move
    operation needs to be done using assignment, not a memmove or memcpy of
    sizeof(Base) bytes.
    
    Instead of putting a check for N==1 in all of copy, copy_n, move etc.
    this adds it to the __copy_move and __copy_move_backward partial
    specializations used for trivially copyable types. When N==1 those
    partial specializations dispatch to new static member functions of the
    partial specializations for non-trivial types, so that a copy/move
    assignment is done appropriately for the _IsMove constant.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/108846
            * include/bits/stl_algobase.h (__copy_move<false, false, RA>)
            Add __assign_one static member function.
            (__copy_move<true, false, RA>): Likewise.
            (__copy_move<IsMove, true, RA>): Do not use memmove for a single
            value.
            (__copy_move_backward<IsMove, true, RA>): Likewise.
            * testsuite/25_algorithms/copy/108846.cc: New test.
            * testsuite/25_algorithms/copy_backward/108846.cc: New test.
            * testsuite/25_algorithms/copy_n/108846.cc: New test.
            * testsuite/25_algorithms/move/108846.cc: New test.
            * testsuite/25_algorithms/move_backward/108846.cc: New test.
    
    (cherry picked from commit 822a11a1e642e0abe92a996e7033a5066905a447)
Comment 27 Jonathan Wakely 2024-03-18 14:13:13 UTC
Backported for 12.4