This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Papers comparing SSO and COW strings


Hello,

Some time ago libstdc++ switched from the Copy-On-Write
implementation of std::string to the Small-String-Optimized one [1].
It's widely accepted that SSO implementations deliver better
performance in multi-threaded applications. Speed considerations lie
at the core of N2668, which effectively banned COW implementations
in C++11 [2].

The thing is that N2668 doesn't reference any particular research on
the speed and downsides of COW string implementations, and I'm having a
hard time finding any. So far I've seen the well-known article by Herb
Sutter [3] and one more paper [4], but both are built around a few
synthetic benchmarks and are 10+ years old. Unfortunately I can't find
any benchmarks featuring real-world applications and measured on
modern hardware, which has changed a lot since then. For instance,
atomics have in some sense become both cheaper (with improvements in
SMP systems) and more expensive (with a wider spread of NUMA and a
constantly growing number of cores that increases contention).

In theory I see two different kinds of speed-up that may come from
non-COW strings:
1) Improvements that make the existing code run faster. Possible reasons are:
    a) No need for atomic reference counters
    b) Improved data locality on NUMA systems for threads that
maintain their own copies of their strings
    c) Short string optimization (which could technically co-exist
with COW but normally doesn't; a notable exception is fbstring [5])
2) Improvements that allow writing better code. By limiting the
number of cases where pointers and iterators may be invalidated, the
C++11 standard allows wider use of non-owning references to strings.
This goes well with string_view in C++17.
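
To make points 1c and 2 concrete, here is a minimal sketch of three
probes one could run against a given std::string implementation. The
function names are hypothetical and not part of any library; the code
only inspects addresses returned by data() and makes no claim about
how libstdc++ is implemented internally.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// True if s.data() points inside the std::string object itself,
// i.e. the characters live in an in-object (SSO) buffer.
bool stored_in_object(const std::string& s) {
    const auto obj = reinterpret_cast<std::uintptr_t>(&s);
    const auto buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}

// True if a fresh copy shares the character buffer with its source,
// which is what a COW implementation does until one side is written.
bool copy_shares_buffer(const std::string& s) {
    const std::string copy = s;
    return copy.data() == s.data();
}

// True if the buffer address observed before a copy-then-write
// sequence is still the string's buffer afterwards. A COW string has
// to unshare on the write, so the address changes; C++11 forbids
// non-const operator[] from invalidating pointers.
bool pointer_survives_write(std::string s) {
    assert(!s.empty());  // writing s[0] needs at least one character
    const auto before = reinterpret_cast<std::uintptr_t>(s.data());
    const std::string copy = s;  // may share the buffer under COW
    s[0] = '#';                  // a shared COW buffer is unshared here
    (void)copy;
    return before == reinterpret_cast<std::uintptr_t>(s.data());
}
```

On an SSO implementation one would expect stored_in_object() to hold
for a short string such as "hi" and pointer_survives_write() to hold
for any non-empty string, while on a COW implementation
copy_shares_buffer() would hold for a long string instead. How long
"short" is depends on the implementation.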

At the same time, code that relies heavily on the COW-ness of
strings may face a performance degradation with the non-COW
implementation. I wonder if anyone has reported seeing this in
practice.
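
As an illustration of the kind of code I mean, here is a rough sketch
of a copy-heavy pattern with a timer around it. The CopyTiming struct
and time_copies() function are hypothetical names made up for this
example. It is just another synthetic micro-benchmark, so it is no
substitute for the real-world measurements I'm asking about, but it
shows where the difference would come from: under COW each copy is a
reference count increment, whereas without COW each copy of a long
string allocates and copies the characters.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Result of one run: the total number of characters held by the
// copies (so the work stays observable) and the time spent copying.
struct CopyTiming {
    std::size_t total_chars;
    std::chrono::nanoseconds elapsed;
};

// Fill a vector with `count` copies of `value` and time the copying.
CopyTiming time_copies(const std::string& value, std::size_t count) {
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::string> copies;
    copies.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        copies.push_back(value);  // the operation being measured
    const auto stop = std::chrono::steady_clock::now();
    assert(copies.size() == count);

    std::size_t total = 0;
    for (const auto& c : copies)
        total += c.size();
    return {total,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

Comparing time_copies(std::string(1000, 'a'), 100000).elapsed between
builds that use the old and the new string implementation would give
a first impression, with all the usual caveats about
micro-benchmarks.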

I'm looking for papers and articles that cover these topics: anything
from a documented and analyzed speed-up of a given application with
GCC 5.1 to comprehensive research. Regarding hardware, I'm
primarily interested in x86_64, but data on other architectures would
also be useful.

Does anyone have relevant links?

Alexey

[1] https://gcc.gnu.org/ml/libstdc%2B%2B/2014-11/msg00111.html
[2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2668.htm
[3] http://www.gotw.ca/publications/optimizations.htm
[4] http://complement.sourceforge.net/compare.pdf
[5] https://github.com/facebook/folly/blob/master/folly/docs/FBString.md

