Bug 102048 - __gnu_cxx::rope.erase(size_t __p) implementation seems to be wrong
Summary: __gnu_cxx::rope.erase(size_t __p) implementation seems to be wrong
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 12.0
: P3 normal
Target Milestone: 11.3
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-24 18:46 UTC by Zonghan Yang
Modified: 2022-05-09 16:39 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-08-24 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Zonghan Yang 2021-08-24 18:46:56 UTC
In ext/rope from all versions of gcc I could refer to (see https://github.com/gcc-mirror/gcc/blame/master/libstdc%2B%2B-v3/include/ext/rope#L2407, the last change is made 17 years ago), the implementation of __gnu_cxx::rope.erase(size_t) is:

      // Erase, single character
      void
      erase(size_type __p)
      { erase(__p, __p + 1); }

The comment said it will erase a single character from rope. However, the function actually erase (__p + 1) characters starting from position __p by calling erase(size_t, size_t) commented as "Erase, (position, size) variant.", which is just defined above the erase(size_t) version.

You can test the function by the following code:

#include <bits/stdc++.h>
#include <ext/rope>
int main() {
	__gnu_cxx::rope<int> R;
	for (int i = 0; i < 9; i++) R.push_back(i);
	R.erase(2);
	for (int i = 0; i < 4; i++) std::cout << R[i] << std::endl; 
}

If erase functions normally or replace erase(2) with erase(2, 1), it's expected to see 0 1 3 4. It turned out to be 0 1 5 6, that erases 3 elements starting from 2.
Comment 1 Jonathan Wakely 2021-08-24 19:07:48 UTC
We should just delete this class, so I don't have to keep fixing bugs in code that nobody uses or cares about.
Comment 2 Jonathan Wakely 2021-08-24 19:12:58 UTC
This would fix it, and provide the API that was apparently intended:

--- a/libstdc++-v3/include/ext/rope
+++ b/libstdc++-v3/include/ext/rope
@@ -2401,11 +2401,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
        this->_M_tree_ptr = __result;
       }
 
-      // Erase, single character
-      void
-      erase(size_type __p)
-      { erase(__p, __p + 1); }
-
       // Insert, iterator variants.
       iterator
       insert(const iterator& __p, const rope& __r)


If we want to retain an erase function that erases a single character, we could do:

@@ -2393,7 +2393,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       // Erase, (position, size) variant.
       void
-      erase(size_type __p, size_type __n)
+      erase(size_type __p, size_type __n = 1)
       {
        _RopeRep* __result = replace(this->_M_tree_ptr, __p,
                                     __p + __n, 0);


But to be consistent with std::string it should erase from that position to the end of the string:

@@ -2393,7 +2393,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       // Erase, (position, size) variant.
       void
-      erase(size_type __p, size_type __n)
+      erase(size_type __p, size_type __n = npos)
       {
        _RopeRep* __result = replace(this->_M_tree_ptr, __p,
                                     __p + __n, 0);

So maybe we should just remove it, since it's currently broken and "fixing" it would be inconsistent with std::string.
Comment 3 Zonghan Yang 2021-08-24 19:34:14 UTC
I couldn't find document explicitly states how the function works but only the line of comment above it shows the idea. So I guess it's just a small typo but not found for years since no one really document it.

In my opinion, it'd be better if it is consistent with std::string. Since current implementation is totally wrong, it's very unlikely that somebody really use this function historically (except unlucky wretches like me), it would be acceptable if we change the API.

Or, deleting the function erase(size_t) itself instead of whole rope is also a good choice. From my point of view, rope is somehow useful in some specific tasks, so it'd be awesome if it can be kept it in libstdc++.
Comment 4 Jonathan Wakely 2021-08-24 21:11:33 UTC
That function was not part of the SGI rope so I think we should just remove it:
https://www.boost.org/sgi/stl/Rope.html
Comment 5 Zonghan Yang 2021-08-25 15:20:18 UTC
I do agree this function should be deleted if SGI rope doesn't contain it. Also, it's the easiest way to fix the problem.
Comment 6 Jonathan Wakely 2021-08-25 15:50:33 UTC
(In reply to Jonathan Wakely from comment #4)
> That function was not part of the SGI rope so I think we should just remove
> it:
> https://www.boost.org/sgi/stl/Rope.html

Correction: it's present in the <stl_rope.h> header, but not documented in the API reference. So I still think its existence is a mistake, but a mistake that started with the original implementation.
Comment 7 GCC Commits 2021-08-25 21:29:02 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:2cd229dec8d6716938de5052479d059d306969da

commit r12-3146-g2cd229dec8d6716938de5052479d059d306969da
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Aug 25 16:42:49 2021 +0100

    libstdc++: Remove __gnu_cxx::rope::erase(size_type) [PR102048]
    
    This function claims to remove a single character at index p, but it
    actually removes p+1 characters beginning at p. So r.erase(0) removes
    the first character, but r.erase(1) removes the second and third, and
    r.erase(2) removes the second, third and fourth. This is not a useful
    API.
    
    The overload is present in the SGI STL <stl_rope.h> header that we
    imported, but it isn't documented in the API reference. The erase
    overloads that are documented are:
    
    erase(const iterator& p)
    erase(const iterator& f, const iterator& l)
    erase(size_type i, size_type n);
    
    Having an erase(size_type p) overload that erases a single character (as
    the comment says it does) might be useful, but would be inconsistent
    with std::basic_string::erase(size_type p = 0, size_type n = npos),
    which erases from p to the end of the string when called with a single
    argument.
    
    Since the function isn't part of the documented API, doesn't do what it
    claims to do (or anything useful) and "fixing" it would leave it
    inconsistent with basic_string, I'm just removing that overload.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/102048
            * include/ext/rope (rope::erase(size_type)): Remove broken
            function.
Comment 8 Jonathan Wakely 2021-08-25 22:12:19 UTC
Removed for GCC 12. We shouldn't remove it from the release branches, but I'll add a deprecated warning to it.
Comment 9 GCC Commits 2021-10-12 19:41:03 UTC
The releases/gcc-11 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:a1dc688940ffade63452c8f9d80fd4b3204e5f40

commit r11-9134-ga1dc688940ffade63452c8f9d80fd4b3204e5f40
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Aug 25 16:42:49 2021 +0100

    libstdc++: Remove __gnu_cxx::rope::erase(size_type) [PR102048]
    
    This function claims to remove a single character at index p, but it
    actually removes p+1 characters beginning at p. So r.erase(0) removes
    the first character, but r.erase(1) removes the second and third, and
    r.erase(2) removes the second, third and fourth. This is not a useful
    API.
    
    The overload is present in the SGI STL <stl_rope.h> header that we
    imported, but it isn't documented in the API reference. The erase
    overloads that are documented are:
    
    erase(const iterator& p)
    erase(const iterator& f, const iterator& l)
    erase(size_type i, size_type n);
    
    Having an erase(size_type p) overload that erases a single character (as
    the comment says it does) might be useful, but would be inconsistent
    with std::basic_string::erase(size_type p = 0, size_type n = npos),
    which erases from p to the end of the string when called with a single
    argument.
    
    Since the function isn't part of the documented API, doesn't do what it
    claims to do (or anything useful) and "fixing" it would leave it
    inconsistent with basic_string, I'm just removing that overload.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/102048
            * include/ext/rope (rope::erase(size_type)): Remove broken
            function.
    
    (cherry picked from commit 2cd229dec8d6716938de5052479d059d306969da)
Comment 10 GCC Commits 2022-04-26 13:13:13 UTC
The releases/gcc-10 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:dcd1fc4900fd101f8fa22883b6ac31e6f7601373

commit r10-10576-gdcd1fc4900fd101f8fa22883b6ac31e6f7601373
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Apr 21 14:12:25 2022 +0100

    libstdc++: Deprecate __gnu_cxx::rope::erase(size_type) [PR102048]
    
    This function is broken, and has been removed for GCC 11 and 12.
    Deprecate it for GCC 10.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/102048
            * include/ext/rope (rope::erase(size_type)): Deprecate broken
            function.
Comment 11 Jakub Jelinek 2022-05-06 08:30:54 UTC
GCC 12.1 is being released, retargeting bugs to GCC 12.2.
Comment 12 Jonathan Wakely 2022-05-06 12:35:06 UTC
This was already fixed for 12.1 (and 11.3). The broken function is also deprecated in the gcc-10 branch.
Comment 13 GCC Commits 2022-05-09 16:39:30 UTC
The releases/gcc-9 branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:44afdb5f23ed752fb7b3ac095831ba65901c45f2

commit r9-10050-g44afdb5f23ed752fb7b3ac095831ba65901c45f2
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Apr 21 14:12:25 2022 +0100

    libstdc++: Deprecate __gnu_cxx::rope::erase(size_type) [PR102048]
    
    This function is broken, and has been removed for GCC 11 and 12.
    Deprecate it for GCC 10 and GCC 9.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/102048
            * include/ext/rope (rope::erase(size_type)): Deprecate broken
            function.
    
    (cherry picked from commit 05ad54a8e67b31126351b9f15e1e42ac67650d6d)