This (https://github.com/gcc-mirror/gcc/commit/bf78b43873b0b7e8f9a430df38749b8b61f9c9b8#diff-c8a656ef205ec6452ef0bed111a387dc9e7eb2404fb3222a48f9f93b2460bd55R278) __builtin_unreachable() is reached using this code -------------------------------------- #include <boost/algorithm/string/case_conv.hpp> #include <string> int main() { auto lower = std::string{"0123456789ABCDEF"}; auto upper = boost::algorithm::to_upper_copy(lower); } -------------------------------------- with Boost 1.82. Doing "g++ -O1 -fsanitize=undefined -o main main.cpp && ./main" results in /usr/include/c++/13/bits/basic_string.h:278:29: runtime error: execution reached an unreachable program point This is with gcc-13.1.1-1.fc38.x86_64 from Fedora 38. AFAICT the bug is in libstdc++. During the first iteration, is not until https://github.com/gcc-mirror/gcc/blob/releases/gcc-13.1.0/libstdc%2B%2B-v3/include/bits/basic_string.tcc#L201 that the data pointer is replaced from the SSO to the heap one; but one line before, in https://github.com/gcc-mirror/gcc/blob/releases/gcc-13.1.0/libstdc%2B%2B-v3/include/bits/basic_string.tcc#L200 https://github.com/gcc-mirror/gcc/blob/releases/gcc-13.1.0/libstdc%2B%2B-v3/include/bits/basic_string.h#L293 https://github.com/gcc-mirror/gcc/blob/releases/gcc-13.1.0/libstdc%2B%2B-v3/include/bits/basic_string.h#L275 , it checks whether it should destroy the data based on whether _M_data() == _M_local_data().
See https://gcc.gnu.org/pipermail/libstdc++/2023-May/055903.html
*** Bug 109706 has been marked as a duplicate of this bug. ***
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>: https://gcc.gnu.org/g:cbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2 commit r14-430-gcbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2 Author: Kefu Chai <kefu.chai@scylladb.com> Date: Mon May 1 21:24:26 2023 +0100 libstdc++: Set _M_string_length before calling _M_dispose() [PR109703] This always sets _M_string_length in the constructor for ranges of input iterators, such as stream iterators. We copy from the source range to the local buffer, and then repeatedly reallocate a larger one if necessary. When disposing the old buffer, _M_is_local() is used to tell if the buffer is the local one or not (and so must be deallocated). In addition to comparing the buffer address with the local buffer, _M_is_local() has an optimization hint so that the compiler knows that for a string using the local buffer, there is an invariant that _M_string_length <= _S_local_capacity (added for PR109299 via r13-6915-gbf78b43873b0b7). But we failed to set _M_string_length in the constructor taking a pair of iterators, so the invariant might not hold, and __builtin_unreachable() is reached. This causes UBsan errors, and potentially misoptimization. To ensure the invariant holds, _M_string_length is initialized to zero before doing anything else, so that _M_is_local() doesn't see an uninitialized value. This issue only surfaces when constructing a string with a range of input iterator, and the uninitialized _M_string_length happens to be greater than _S_local_capacity, i.e., 15 for the std::string specialization. libstdc++-v3/ChangeLog: PR libstdc++/109703 * include/bits/basic_string.h (basic_string(Iter, Iter, Alloc)): Initialize _M_string_length. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
The releases/gcc-13 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>: https://gcc.gnu.org/g:d50f2599d7b23bdba05a9102645d082ed9bcb05f commit r13-7282-gd50f2599d7b23bdba05a9102645d082ed9bcb05f Author: Kefu Chai <kefu.chai@scylladb.com> Date: Mon May 1 21:24:26 2023 +0100 libstdc++: Set _M_string_length before calling _M_dispose() [PR109703] This always sets _M_string_length in the constructor for ranges of input iterators, such as stream iterators. We copy from the source range to the local buffer, and then repeatedly reallocate a larger one if necessary. When disposing the old buffer, _M_is_local() is used to tell if the buffer is the local one or not (and so must be deallocated). In addition to comparing the buffer address with the local buffer, _M_is_local() has an optimization hint so that the compiler knows that for a string using the local buffer, there is an invariant that _M_string_length <= _S_local_capacity (added for PR109299 via r13-6915-gbf78b43873b0b7). But we failed to set _M_string_length in the constructor taking a pair of iterators, so the invariant might not hold, and __builtin_unreachable() is reached. This causes UBsan errors, and potentially misoptimization. To ensure the invariant holds, _M_string_length is initialized to zero before doing anything else, so that _M_is_local() doesn't see an uninitialized value. This issue only surfaces when constructing a string with a range of input iterator, and the uninitialized _M_string_length happens to be greater than _S_local_capacity, i.e., 15 for the std::string specialization. libstdc++-v3/ChangeLog: PR libstdc++/109703 * include/bits/basic_string.h (basic_string(Iter, Iter, Alloc)): Initialize _M_string_length. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Co-authored-by: Jonathan Wakely <jwakely@redhat.com> (cherry picked from commit cbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2)
The releases/gcc-12 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>: https://gcc.gnu.org/g:04fbfafbd9657a36e8c3e80708a18fba08136854 commit r12-9508-g04fbfafbd9657a36e8c3e80708a18fba08136854 Author: Kefu Chai <kefu.chai@scylladb.com> Date: Mon May 1 21:24:26 2023 +0100 libstdc++: Set _M_string_length before calling _M_dispose() [PR109703] This always sets _M_string_length in the constructor for ranges of input iterators, such as stream iterators. We copy from the source range to the local buffer, and then repeatedly reallocate a larger one if necessary. When disposing the old buffer, _M_is_local() is used to tell if the buffer is the local one or not (and so must be deallocated). In addition to comparing the buffer address with the local buffer, _M_is_local() has an optimization hint so that the compiler knows that for a string using the local buffer, there is an invariant that _M_string_length <= _S_local_capacity (added for PR109299 via r13-6915-gbf78b43873b0b7). But we failed to set _M_string_length in the constructor taking a pair of iterators, so the invariant might not hold, and __builtin_unreachable() is reached. This causes UBsan errors, and potentially misoptimization. To ensure the invariant holds, _M_string_length is initialized to zero before doing anything else, so that _M_is_local() doesn't see an uninitialized value. This issue only surfaces when constructing a string with a range of input iterator, and the uninitialized _M_string_length happens to be greater than _S_local_capacity, i.e., 15 for the std::string specialization. libstdc++-v3/ChangeLog: PR libstdc++/109703 * include/bits/basic_string.h (basic_string(Iter, Iter, Alloc)): Initialize _M_string_length. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Co-authored-by: Jonathan Wakely <jwakely@redhat.com> (cherry picked from commit cbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2)
Fixed for 12.3 and 13.2, thanks for the report.
*** Bug 109737 has been marked as a duplicate of this bug. ***
*** Bug 109786 has been marked as a duplicate of this bug. ***
*** Bug 110549 has been marked as a duplicate of this bug. ***