I'm seeing this with both gcc 11.1 (https://godbolt.org/z/1Wv9jj3r1) and gcc (GCC) 11.0.1 20210324 (Red Hat 11.0.1-0). Compiling the following with -O2 -Wall -Werror:

  #include <vector>
  #include <memory>

  static char UTC[4];

  void func(std::vector<char> &vec)
  {
      vec.clear();
      vec.insert(vec.end(), UTC, UTC+4);
  }

generates a massive complaint that boils down to:

  /usr/include/c++/11/bits/stl_algobase.h:431:30: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ writing 1 or more bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=]
    431 |           __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
        |           ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  In file included from /usr/include/c++/11/x86_64-redhat-linux/bits/c++allocator.
  ...skipping 1 line
                   from /usr/include/c++/11/bits/allocator.h:46,
                   from /usr/include/c++/11/vector:64,
                   from t.C:1:

Removing vec.clear() makes this go away. Adding vec.reserve(4) after vec.clear() also makes this go away.
Assuming the warning happens during the strlen pass, we are still missing a lot of optimizations at that point:

  if (_6 != _7)
    goto <bb 4>; [70.00%]
  else
    goto <bb 3>; [30.00%]

  <bb 3> [local count: 322122544]:
  _158 = _7 - _6;

Once VRP2 (2 passes after strlen) replaces _158 with 0 and propagates it, maybe the code becomes nice enough to avoid confusing this fragile warning (I didn't check).

Before FRE3, we have:

  _6 = vec_2(D)->D.33506._M_impl.D.32819._M_start;
  _7 = vec_2(D)->D.33506._M_impl.D.32819._M_finish;
  if (_6 != _7)
    goto <bb 3>; [70.00%]
  else
    goto <bb 4>; [30.00%]

  <bb 4> [local count: 1073741824]:
  _5 = MEM[(char * const &)vec_2(D) + 8];
  MEM[(struct __normal_iterator *)&D.33862] ={v} {CLOBBER};
  MEM[(struct __normal_iterator *)&D.33862]._M_current = _5;
  __position = D.33862;
  _12 = MEM[(const char * const &)vec_2(D)];
  _13 = MEM[(const char * const &)&__position];
  _14 = _13 - _12;

and after FRE3:

  <bb 4> [local count: 1073741824]:
  _5 = MEM[(char * const &)vec_2(D) + 8];
  MEM[(struct __normal_iterator *)&D.33862] ={v} {CLOBBER};
  MEM[(struct __normal_iterator *)&D.33862]._M_current = _5;
  __position = D.33862;
  _14 = _5 - _6;

Only PRE manages to notice that _5 is the same as _7, which is already late. And it then takes until VRP2 to realize that _7 - _6 must be 0 in the else branch of _6 != _7.

* I am not sure why FRE manages to optimize _12 and not _5; that seems like the first thing to check (maybe the +8 means it is obviously "partial").
* I don't know if some pass other than VRP could learn that b - a is 0 if !(a != b).
The warning is issued during expansion of the memcpy call. It seems fishy that although it mentions __builtin_memcpy it points to __builtin_memmove:

  error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ writing 1 or more bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=]
    431 |           __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
        |           ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The IL looks like the warning is justified:

  void func (struct vector & vec)
  {
    const ptrdiff_t _Num;
    char * _6;
    char * _7;
    char * prephitmp_16;
    char * _21;
    long int _23;
    long unsigned int _24;
    sizetype prephitmp_31;
    char * _36;
    char * _38;
    char * prephitmp_42;
    char * _50;
    long unsigned int _Num.10_52;
    char * _54;
    char * _61;
    char * pretmp_67;
    long int _69;
    char * pretmp_70;
    long int _71;
    unsigned int pretmp_72;
    unsigned int _89;
    char * _90;
    char * _98;
    char * _101;
    long unsigned int _112;
    char * _116;
    long unsigned int _144;
    long int _146;
    char * _147;
    char * _155;
    sizetype _157;
    char * _158;

    <bb 2> [local count: 1073741824]:
    _6 = vec_2(D)->D.33449._M_impl.D.32762._M_start;
    _7 = vec_2(D)->D.33449._M_impl.D.32762._M_finish;
    if (_6 != _7)
      goto <bb 4>; [70.00%]
    else
      goto <bb 3>; [30.00%]

    <bb 3> [local count: 322122544]:
    _21 = vec_2(D)->D.33449._M_impl.D.32762._M_end_of_storage;
    _23 = _21 - _7;
    _24 = (long unsigned int) _23;
    if (_24 > 3)
      goto <bb 5>; [67.00%]
    else
      goto <bb 13>; [33.00%]

    <bb 4> [local count: 751619281]:
    vec_2(D)->D.33449._M_impl.D.32762._M_finish = _6;
    _147 = vec_2(D)->D.33449._M_impl.D.32762._M_end_of_storage;
    _146 = _147 - _6;
    _144 = (long unsigned int) _146;
    if (_144 > 3)
      goto <bb 5>; [67.00%]
    else
      goto <bb 13>; [33.00%]

    <bb 5> [local count: 237404317]:
    # prephitmp_16 = PHI <_7(3), _6(4)>
    _89 = MEM <unsigned int> [(char * {ref-all})&UTC];
    MEM <unsigned int> [(char * {ref-all})prephitmp_16] = _89;
    _36 = vec_2(D)->D.33449._M_impl.D.32762._M_finish;
    _38 = _36 + 4;
    vec_2(D)->D.33449._M_impl.D.32762._M_finish = _38;
    goto <bb 12>; [100.00%]

    <bb 6> [local count: 108584968]:
    __builtin_memmove (_90, pretmp_67, _157);
    MEM <unsigned int> [(char * {ref-all})_155] = pretmp_72;
    _116 = vec_2(D)->D.33449._M_impl.D.32762._M_finish;
    _Num_117 = _116 - prephitmp_42;
    if (_Num_117 != 0)
      goto <bb 8>; [33.00%]                  >>> _Num_117 != 0
    else
      goto <bb 10>; [67.00%]

    <bb 7> [local count: 220460391]:
    MEM <unsigned int> [(char * {ref-all})_155] = pretmp_72;
    _50 = vec_2(D)->D.33449._M_impl.D.32762._M_finish;
    _Num_51 = _50 - prephitmp_42;
    if (_Num_51 != 0)
      goto <bb 8>; [33.00%]                  >>> _Num_51 != 0
    else
      goto <bb 9>; [67.00%]

    <bb 8> [local count: 108584968]:
    # _Num_119 = PHI <_Num_51(7), _Num_117(6)>        <<< _Num_119 != 0
    _Num.10_52 = (long unsigned int) _Num_119;
    __builtin_memcpy (_61, prephitmp_42, _Num.10_52); <<< -Wstringop-overflow

    <bb 9> [local count: 256293430]:
    # prephitmp_31 = PHI <0(7), _Num.10_52(8)>
    _54 = _61 + prephitmp_31;
    if (pretmp_67 != 0B)
      goto <bb 10>; [40.26%]
    else
      goto <bb 11>; [59.74%]

    <bb 10> [local count: 175940553]:
    # _98 = PHI <_54(9), _61(6)>
    _71 = pretmp_70 - pretmp_67;
    _112 = (long unsigned int) _71;
    operator delete (pretmp_67, _112);

    <bb 11> [local count: 329045359]:
    # _101 = PHI <_54(9), _98(10)>
    vec_2(D)->D.33449._M_impl.D.32762._M_start = _90;
    vec_2(D)->D.33449._M_impl.D.32762._M_finish = _101;
    vec_2(D)->D.33449._M_impl.D.32762._M_end_of_storage = _158;

    <bb 12> [local count: 1048452384]:
    return;

    <bb 13> [local count: 230225493]:
    # prephitmp_42 = PHI <_6(4), _7(3)>
    _90 = operator new (4);                  <<< _90 is 4 bytes
    pretmp_67 = vec_2(D)->D.33449._M_impl.D.32762._M_start;
    _69 = prephitmp_42 - pretmp_67;
    _157 = (sizetype) _69;
    _158 = _90 + 4;
    pretmp_70 = vec_2(D)->D.33449._M_impl.D.32762._M_end_of_storage;
    _155 = _90 + _157;                       <<< _155 is in [_90, _90 + 4]
    _61 = _155 + 4;                          <<< _61 = _90 + 4
    pretmp_72 = MEM <unsigned int> [(char * {ref-all})&UTC];
    if (_69 != 0)
      goto <bb 6>; [58.89%]
    else
      goto <bb 7>; [41.11%]
  }

In the translation unit the warning is triggered by a call to the __copy_m() function below:
  template<bool _IsMove>
    struct __copy_move<_IsMove, true, random_access_iterator_tag>
    {
      template<typename _Tp>
        static _Tp*
        __copy_m(const _Tp* __first, const _Tp* __last, _Tp* __result)
        {
          using __assignable = conditional<_IsMove,
                                           is_move_assignable<_Tp>,
                                           is_copy_assignable<_Tp>>;
          static_assert( __assignable::type::value, "type is not assignable" );
          const ptrdiff_t _Num = __last - __first;
          if (_Num)
            __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
          return __result + _Num;
        }
    };

The test for _Num is what GCC uses to determine that the number of bytes to copy is nonzero.
If the warning is justified then something else isn't adding up. I double-checked (with cppreference.com) something that I was pretty sure of: an insert() at the end() iterator is valid, and the insert()ed range is valid. If the C++ code is UB, or even potentially UB, I don't see it.
The issue is that FRE sees

  <bb 2> [local count: 1073741824]:
  _6 = vec_2(D)->D.33436._M_impl.D.32749._M_start;
  _7 = vec_2(D)->D.33436._M_impl.D.32749._M_finish;
  if (_6 != _7)
    goto <bb 3>; [70.00%]
  else
    goto <bb 4>; [30.00%]

  <bb 3> [local count: 751619281]:
  vec_2(D)->D.33436._M_impl.D.32749._M_finish = _6;

  <bb 4> [local count: 1073741824]:
  _5 = MEM[(char * const &)vec_2(D) + 8];

so clearly _5 is _not_ equal to _7, but on the path 2 -> 3 -> 4 it is equal to _6. Only PRE transforms the load in _5 to a select of both (a PHI node):

  <bb 2> [local count: 1073741824]:
  _6 = vec_2(D)->D.33436._M_impl.D.32749._M_start;
  _7 = vec_2(D)->D.33436._M_impl.D.32749._M_finish;
  if (_6 != _7)
    goto <bb 4>; [70.00%]
  else
    goto <bb 3>; [30.00%]

  <bb 3> [local count: 322122544]:
  _86 = _7 - _6;
  _125 = (sizetype) _86;
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 751619281]:
  vec_2(D)->D.33436._M_impl.D.32749._M_finish = _6;

  <bb 5> [local count: 1073741824]:
  # prephitmp_141 = PHI <_7(3), _6(4)>
  # prephitmp_83 = PHI <_86(3), 0(4)>
  # prephitmp_65 = PHI <_125(3), 0(4)>
  _21 = vec_2(D)->D.33436._M_impl.D.32749._M_end_of_storage;

PRE also removes some partial redundancies here (the other two PHIs).
(In reply to Martin Sebor from comment #2)
> The IL looks like the warning is justified:

The memcpy call is dead code, we just fail to notice it.

> <bb 13> [local count: 230225493]:
> # prephitmp_42 = PHI <_6(4), _7(3)>

This is always _6, because in bb 3 we have _6 == _7.

> pretmp_67 = vec_2(D)->D.33449._M_impl.D.32762._M_start;
> _69 = prephitmp_42 - pretmp_67;

Always 0.

> <bb 7> [local count: 220460391]:
> MEM <unsigned int> [(char * {ref-all})_155] = pretmp_72;
> _50 = vec_2(D)->D.33449._M_impl.D.32762._M_finish;
> _Num_51 = _50 - prephitmp_42;

Always 0: in bb 4 we copy _M_start into _M_finish if they are not already equal. (Sorry for the wrong FRE comment earlier.)

Note that if I replace operator new/delete with malloc/free

  inline void* operator new(std::size_t n){return __builtin_malloc(n);}
  inline void operator delete(void*p)noexcept{__builtin_free(p);}
  inline void operator delete(void*p,std::size_t)noexcept{__builtin_free(p);}

we optimize quite a bit more and the warning disappears.
So, apart from the small missed PHI optimization, this is probably the common issue that, since operator new is replaceable, we can't really assume it does not clobber anything, and that hurts optimizations :-( Not sure if there is any convenient workaround for this specific case.
It seems to help if we save the values before the allocation in vector.tcc, although I cannot promise it won't pessimize something else... And that's just a workaround, not a solution.

@@ -766,13 +766,16 @@
     {
       const size_type __len = _M_check_len(__n, "vector::_M_range_insert");
+      pointer __old_start(this->_M_impl._M_start);
+      pointer __old_finish(this->_M_impl._M_finish);
+      pointer __old_end_of_storage(this->_M_impl._M_end_of_storage);
       pointer __new_start(this->_M_allocate(__len));
       pointer __new_finish(__new_start);
       __try
         {
           __new_finish
             = std::__uninitialized_move_if_noexcept_a
-            (this->_M_impl._M_start, __position.base(),
+            (__old_start, __position.base(),
             __new_start, _M_get_Tp_allocator());
           __new_finish
             = std::__uninitialized_copy_a(__first, __last,
@@ -780,7 +783,7 @@
                                           _M_get_Tp_allocator());
           __new_finish
             = std::__uninitialized_move_if_noexcept_a
-            (__position.base(), this->_M_impl._M_finish,
+            (__position.base(), __old_finish,
             __new_finish, _M_get_Tp_allocator());
         }
       __catch(...)
@@ -790,12 +793,12 @@
           _M_deallocate(__new_start, __len);
           __throw_exception_again;
         }
-      std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
+      std::_Destroy(__old_start, __old_finish,
                     _M_get_Tp_allocator());
       _GLIBCXX_ASAN_ANNOTATE_REINIT;
-      _M_deallocate(this->_M_impl._M_start,
-                    this->_M_impl._M_end_of_storage
-                    - this->_M_impl._M_start);
+      _M_deallocate(__old_start,
+                    __old_end_of_storage
+                    - __old_start);
       this->_M_impl._M_start = __new_start;
       this->_M_impl._M_finish = __new_finish;
       this->_M_impl._M_end_of_storage = __new_start + __len;
If the warning is spurious, then changing vector.tcc won't, of course, keep it from popping up elsewhere when the same sequence of events gets triggered. Here, it drew my attention to the missed micro-optimization of using reserve(), which had the benefit of avoiding the warning and allowing me to keep the -Werror option.
Just to clarify: when I said the warning is justified I meant that there is nothing in the IL to keep the warning from triggering. I.e., it works as designed. The problem may be solved by optimizing the code better, which might be possible by enhancing the optimizers or by changing or annotating the libstdc++ code to enable existing optimizations. In the latter case it will of course only help libstdc++ and not other code like it. A third possibility is to make the warning itself smarter than the optimizer and figure out the code is unreachable without its help.
Unchanged in GCC 12.
This can even be triggered via std::vector::operator=, at least in gcc 11.2. My own code is slightly more involved than this, but I can boil it down to:

  std::vector<int> src(2,1), dest(2);
  dest = src;

and that's enough to trigger this warning in a -O2 -Wall -Werror build. Oddly, if I write dest(2,0) rather than dest(2), I don't see any warning. That's a fine workaround for my own non-performance-critical code path, but I wonder if it also indicates that the underlying optimization is no longer enabled here for code that would benefit.
GCC 11.3 is being released, retargeting bugs to GCC 11.4.
*** Bug 106199 has been marked as a duplicate of this bug. ***
I also ran into this with:

> auto data = std::make_shared<std::vector<uint8_t>>();
> data->insert(data->end(), {0xAA, 0xBB, 0xCC});

I do not hit it with:

> std::vector<uint8_t> data;
> data.insert(data.end(), {0xAA, 0xBB, 0xCC});

Bisecting shows it was introduced with 81d6cdd335ffc60c216a020d5c99306f659377a2. In gcc/gimple-ssa-warn-access.cc, pass_waccess::check() now calls the new pass_waccess::check_builtin(), which issues this warning (BUILT_IN_MEMMOVE, check_memop_access()). The old behavior had these checks scattered throughout builtins.c, but now that they are part of pass_waccess::check() they run for each bb as part of pass_waccess::execute().
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:cca06f0d6d76b08ed4ddb7667eda93e2e9f2589e

commit r13-4393-gcca06f0d6d76b08ed4ddb7667eda93e2e9f2589e
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Nov 29 15:50:06 2022 +0000

    libstdc++: Avoid bogus warning in std::vector::insert [PR107852]

    GCC assumes that any global variable might be modified by operator new,
    and so in the testcase for this PR all data members get reloaded after
    allocating new storage. By making local copies of the _M_start and
    _M_finish members we avoid that, and then the compiler has enough info
    to remove the dead branches that trigger bogus -Warray-bounds warnings.

    libstdc++-v3/ChangeLog:

            PR libstdc++/107852
            PR libstdc++/106199
            PR libstdc++/100366
            * include/bits/vector.tcc (vector::_M_fill_insert): Copy
            _M_start and _M_finish members before allocating.
            (vector::_M_default_append): Likewise.
            (vector::_M_range_insert): Likewise.
The releases/gcc-12 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:2e4210698c644e44f9e0645dc7bc49710fd60ce8

commit r12-9457-g2e4210698c644e44f9e0645dc7bc49710fd60ce8
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Nov 29 15:50:06 2022 +0000

    libstdc++: Avoid bogus warning in std::vector::insert [PR107852]

    GCC assumes that any global variable might be modified by operator new,
    and so in the testcase for this PR all data members get reloaded after
    allocating new storage. By making local copies of the _M_start and
    _M_finish members we avoid that, and then the compiler has enough info
    to remove the dead branches that trigger bogus -Warray-bounds warnings.

    libstdc++-v3/ChangeLog:

            PR libstdc++/107852
            PR libstdc++/106199
            PR libstdc++/100366
            * include/bits/vector.tcc (vector::_M_fill_insert): Copy
            _M_start and _M_finish members before allocating.
            (vector::_M_default_append): Likewise.
            (vector::_M_range_insert): Likewise.

    (cherry picked from commit cca06f0d6d76b08ed4ddb7667eda93e2e9f2589e)
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
GCC 11 branch is being closed.