This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



[comp.lang.c++.moderated] reference counted std::basic_string: even 'const' operations may unshare!



This was just sent by Dietmar.  

--- Begin Message ---
Hi,

Consider the following program:

   #include <string>
   #include <iostream>

   int main()
   {
     std::string                 s1("ab");
     std::string const&          cs1 = s1;
     std::string const           cs2(cs1);

     char const&                 c1 = cs1[0];
     std::string::const_iterator i1 = cs1.begin() + 1;

     char const&                 c2 = cs2[0];
     std::string::const_iterator i2 = cs2.begin() + 1;

     s1[0] = 'A';
     s1[1] = 'B';

     std::cout << cs1 << "/" << c1 << "/" << *i1 << "\n";
     std::cout << cs2 << "/" << c2 << "/" << *i2 << "\n";

     return 0;
   }

This program is pretty simple and there is nothing really interesting
about it: It just uses strings and iterators. When compiled with
gcc-3.0.2, it produces the following output:

   AB/a/b
   ab/a/b

Of course, this result wasn't a surprise since I know that this version
of the compiler/library uses a shared representation for strings. The
correct output would, of course, be

   AB/A/B
   ab/a/b

In the context of shared representations for strings, however, this
program is not that uninteresting after all: It may be surprising, but
there are constant operations where the string has to give up sharing
of the representation immediately! More precisely, *all* operations
obtaining an iterator or a reference have to unshare the representation
if a reference or an iterator to the shared representation may already
have been obtained through a different string object - even if the
reference or the iterator provides just read-only access!

For the given program this means that the shared representation has to
be unshared at the latest when 'c2' is initialized. Since I have not seen
this effect mentioned when people discuss reference counted strings
(except that I mentioned it already in a discussion way down in a thread
in this forum), I thought it would be interesting to point this little
detail out, especially as I would expect that not only the gcc-3.0.2
implementation gets this wrong: I would expect that all implementations
using shared strings get this wrong...

Since the underlying problem, i.e. avoiding copying a string when it is
returned from a function or passed as a temporary to a function
(in all other contexts I think using 'const&' instead avoids this problem
in the first place), is pretty narrow, there are some options for how
this could be addressed. Here is what I came up with immediately:

- Some form of "move construction" which basically moves the resource
   directly to the destination would be a general approach to address
   this problem, i.e. this approach would not be limited to strings.
- A constant string (my favorite name would be 'std::basic_string'
   using a specialization for the case where the character type is of
   the form 'cT const') which would not allow any modification where
   iterators or references stay valid. Most of 'std::basic_string's
   members are suitable. There are only a few offenders: 'operator[]()',
   'begin()', and 'end()'.
- Optimizer guarantees would also help. How often is the copy-ctor
   called in the following program?

     #include <iostream>

     struct S {
       S() { std::cout << "S::S()\n"; }
       S(S const&) { std::cout << "S::S(S const&)\n"; }
       ~S() { std::cout << "S::~S()\n"; }
     };

     S a() { return S(); }
     S b() { return a(); }
     S c() { return b(); }
     S d() { return c(); }
     S e() { return d(); }
     S f() { return e(); }

     int main()
     {
       S const& s = f();
       return 0;
     }

   With all compilers I tried, it is not called at all, even if the
   functions are distributed over different object files. If things
   become more complex, compilers start to use copy constructors,
   however.
-- 
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>


--- End Message ---
