--- Begin Message ---
- From: Dietmar Kuehl <dietmar_kuehl at yahoo dot com>
- Date: 16 Dec 2001 06:39:31 -0500
- Subject: reference counted std::basic_string: even 'const' operations may unshare!
- Newsgroups: comp.lang.c++.moderated
- Organization: Phaidros Software AG
- Reply-to: dietmar_kuehl at yahoo dot com
Hi,
Consider the following program:
    #include <string>
    #include <iostream>

    int main()
    {
        std::string s1("ab");
        std::string const& cs1 = s1;
        std::string const cs2(cs1);

        char const& c1 = cs1[0];
        std::string::const_iterator i1 = cs1.begin() + 1;
        char const& c2 = cs2[0];
        std::string::const_iterator i2 = cs2.begin() + 1;

        s1[0] = 'A';
        s1[1] = 'B';

        std::cout << cs1 << "/" << c1 << "/" << *i1 << "\n";
        std::cout << cs2 << "/" << c2 << "/" << *i2 << "\n";
        return 0;
    }
This program is pretty simple and there is nothing really interesting
about it: it just uses strings and iterators and, when compiled with
gcc-3.0.2, produces the following output:
AB/a/b
ab/a/b
Of course, this result wasn't a surprise since I know that this version
of compiler/library uses a shared representation for strings. The
correct output would, of course, be
AB/A/B
ab/a/b
In the context of shared representations for strings, however, this
program is quite interesting: it may be surprising, but there are
constant operations where the string has to give up sharing of the
representation immediately! More precisely, *all* operations obtaining
an iterator or a reference have to unshare the representation if a
reference or an iterator to the shared representation may have been
obtained before through a different string object - even if the
reference or the iterator provides just read-only access!
For the given program this means that the shared representation has to
be unshared at the latest when 'c2' is initialized. Since I have not
seen this effect mentioned when people discuss reference counted strings
(except that I mentioned it already in a discussion way down in a thread
in this forum), I thought it would be interesting to point this little
detail out, especially as I would expect that not only the gcc-3.0.2
implementation gets this wrong: I would expect that all implementations
using shared strings get this wrong...
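To make the rule concrete, here is a minimal sketch of a reference
counted string that unshares even on 'const' element access. The class
'CowString', its 'leaked_' flag and the helper 'demo()' are hypothetical
names invented for this illustration (using C++11's 'std::shared_ptr'
for the counting); it is not how gcc's library or any other real
implementation is written. It is also more conservative than strictly
necessary: it unshares on every element access to a shared
representation, without tracking whether another string object has
actually handed out a reference before.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical toy class for illustration only.
class CowString {
public:
    explicit CowString(char const* s)
        : rep_(std::make_shared<std::string>(s)), leaked_(false) {}

    // Share the representation unless 'other' has already handed out a
    // reference; in that case a deep copy is the only safe choice.
    CowString(CowString const& other)
        : rep_(other.leaked_ ? std::make_shared<std::string>(*other.rep_)
                             : other.rep_),
          leaked_(false) {}

    // Assignment is left out to keep the sketch short.
    CowString& operator=(CowString const&) = delete;

    // Both overloads unshare first: even the read-only reference must
    // stay tied to *this* string, not to a representation that another
    // string object may end up owning.
    char const& operator[](std::size_t i) const { return (*unshare())[i]; }
    char& operator[](std::size_t i) { return (*unshare())[i]; }

    bool shares_with(CowString const& other) const {
        return rep_ == other.rep_;
    }

private:
    // Take a private copy if the representation is shared, and remember
    // that a reference into it has escaped.
    std::string* unshare() const {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);
        leaked_ = true;
        return rep_.get();
    }

    mutable std::shared_ptr<std::string> rep_;
    mutable bool leaked_;
};

// Mirrors the reference part of the program above; returns "c1/c2".
std::string demo() {
    CowString s1("ab");
    CowString const& cs1 = s1;
    CowString const cs2(cs1);
    assert(s1.shares_with(cs2));   // the copy shares the representation

    char const& c1 = cs1[0];       // const access: s1 unshares here
    assert(!s1.shares_with(cs2));
    char const& c2 = cs2[0];

    s1[0] = 'A';                   // visible through c1, not through c2

    std::string result;
    result += c1;
    result += '/';
    result += c2;
    return result;
}
```

With this sketch 'demo()' returns "A/a": 'c1' follows the modification of
's1', while 'c2' keeps referring to the untouched representation of
'cs2'. The price is the one described above: the copy that sharing was
supposed to avoid happens as soon as an element is accessed.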
Since the underlying problem, i.e. avoiding the copy of a string when
it is returned from a function or passed as a temporary to a function
(in all other contexts I think using 'const&' avoids this problem in
the first place), is pretty narrow, there are some options for
addressing it. Here is what I came up with immediately:
- Some form of "move construction", which basically moves the resource
  directly to the destination, would be a general approach to address
  this problem, i.e. this approach would not be limited to strings.
- A constant string (my favorite name would be 'std::basic_string'
  using a specialization for the case where the character type is of
  the form 'cT const') which would not allow any modification where
  iterators or references stay valid. Most of 'std::basic_string's
  members are suitable. There are only a few offenders: 'operator[]()',
  'begin()', and 'end()'.
- Optimizer guarantees would also help. How often is the copy-ctor
  called in the following program?
    #include <iostream>

    struct S {
        S() { std::cout << "S::S()\n"; }
        S(S const&) { std::cout << "S::S(S const&)\n"; }
        ~S() { std::cout << "S::~S()\n"; }
    };

    S a() { return S(); }
    S b() { return a(); }
    S c() { return b(); }
    S d() { return c(); }
    S e() { return d(); }
    S f() { return e(); }

    int main()
    {
        S const& s = f();
        return 0;
    }
  With all compilers I tried, it is not called at all, even if the
  functions are distributed over different object files. Once things
  become more complex, however, compilers start to call copy
  constructors.
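The first option in the list above, "move construction", can be sketched
with the move constructors that C++11 later added to the language (they
did not exist when this was posted). The class 'Buffer' and the helper
'move_keeps_storage()' are hypothetical names used only for this
illustration: the moved-to object takes over the storage of the source
without copying any characters, and the source is left empty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical owner of a character array, for illustration only.
class Buffer {
public:
    explicit Buffer(char const* s)
        : size_(std::strlen(s)), data_(new char[size_ + 1]) {
        std::memcpy(data_, s, size_ + 1);
    }

    // Copy construction allocates new storage and copies the characters.
    Buffer(Buffer const& other)
        : size_(other.size_), data_(new char[other.size_ + 1]) {
        std::memcpy(data_, other.data_, size_ + 1);
    }

    // Move construction takes over the storage of the source directly;
    // the source no longer owns anything afterwards.
    Buffer(Buffer&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Assignment is left out to keep the sketch short.
    Buffer& operator=(Buffer const&) = delete;

    ~Buffer() { delete[] data_; }

    char const* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;   // declared first: data_ is sized from it
    char* data_;
};

// True if the moved-to object reuses the original storage.
bool move_keeps_storage() {
    Buffer a("ab");
    char const* storage = a.data();
    Buffer b(std::move(a));
    return b.data() == storage && a.data() == nullptr;
}
```

Because nothing is shared after the move, there is no representation
that a later 'const' access would have to unshare.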
--
<mailto:dietmar_kuehl@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>
--- End Message ---