This is the mail archive of the mailing list for the libstdc++ project.
Re: Dead code (was: Re: Good numbers from Ritter's new string allocator)
Loren James Rittle wrote:
> For my part, I prefer something like:
> * include/bits/basic_string.tcc (basic_string::_Rep::_S_create):
> Enforce allocation size blocking policy to reduce fragmentation and
> enhance performance with common malloc implementations.
Ok, I completely agree with you.
> > * include/bits/c++config: Remove comment.
> Only under the observation that no one has taken an interest in doing
> the whole job, I could support using pagesize as the primary blocking
> factor and checking in my old patch. However, as Ulrich observed,
> malloc implementations vary in a way that the test is subtly more
> complex than just taking pagesize. To my shock, most of the modern
> malloc implementations I studied appeared to have no minimal block
> size related to VM pagesize nor do they, in general, align "large"
> requests to that pagesize.
> And, from a study of various malloc implementations, ``4 *
> sizeof(void*)'' was seen to be the worst case malloc header size thus
> I suppose we could use it without a test.
Good: in my investigations I arrived at the very same conclusion.
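For illustration, the worst-case-header observation could be folded into a blocking helper along these lines. This is only a sketch, not the actual _S_create code; the helper name __block_size and the fixed 4096 page size are my assumptions:

```cpp
#include <cstddef>

// Assumed worst-case malloc bookkeeping overhead, per the discussion above.
const std::size_t __malloc_header_size = 4 * sizeof(void*);
const std::size_t __pagesize = 4096; // the magic constant from the patch

// Hypothetical helper: size a request so that request + malloc header
// fills whole pages, reducing fragmentation with common mallocs.
std::size_t __block_size(std::size_t __req)
{
    std::size_t __total = __req + __malloc_header_size;
    std::size_t __rounded
        = ((__total + __pagesize - 1) / __pagesize) * __pagesize;
    return __rounded - __malloc_header_size;
}
```

With this shape, whatever the platform's actual header size up to `4 * sizeof(void*)`, the underlying malloc request never spills just past a page boundary.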
> Given that many platforms'
> malloc did no alignment whatsoever to VM pagesize (at least when the
> allocation request was exactly one or two pages in size), I see little
> point in using one default over another. If anyone with configure
> magic wishes to revisit this issue, see:
> as the basis for a better test. Anyone that wants to improve the
> exponential shaping of progression of allocation sizes, should
> consider this policy: first apply exponential growth, then round up to
> the nearest size contained by this blocking algorithm. Of course,
> this is only my best guess for general use without tuning it anywhere.
In fact, there is at least one well-known, modern implementation of
the ISO stdlib which neither grows exponentially nor rounds to the
block. But I agree with you (and with Nathan) that some form of
exponential shaping should be implemented, sooner or later.
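The policy suggested above (first apply exponential growth, then round up to the nearest size the blocking algorithm would hand out) could be sketched as follows. The helper name, the doubling factor, and the page-multiple rounding are assumptions for illustration, not tuned values:

```cpp
#include <cstddef>

const std::size_t __pagesize = 4096; // assumed blocking granularity

// Hypothetical capacity policy: exponential growth first, then round
// up to the nearest size contained by the blocking algorithm.
std::size_t __new_capacity(std::size_t __old, std::size_t __needed)
{
    std::size_t __cap = 2 * __old;      // exponential growth (doubling)
    if (__cap < __needed)
        __cap = __needed;               // never less than what is asked for
    // round up to a multiple of the blocking granularity
    return ((__cap + __pagesize - 1) / __pagesize) * __pagesize;
}
```

The point of the ordering is that rounding last means the exponential shape is preserved for large strings while small growths still land on block-friendly sizes.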
As regards your current approach, lately I have carried out many
additional benchmarks, and indeed, in every circumstance (varying the
size of each concatenated string in the range from 1 (Stefan's original
benchmark) to 1000) it definitely improves on the current allocator. Of
course, the gain is large (up to 100-200 times) when many allocations
are carried out and each concatenation is small.
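For reference, the kind of micro-benchmark described above (appending a fixed-size chunk many times) can be reconstructed roughly like this; the function name and iteration scheme are mine, and the timings it returns are of course machine-dependent:

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Rough reconstruction of the benchmark described above: append a
// fixed-size chunk to a string many times, return elapsed seconds.
double time_concat(std::size_t chunk_size, std::size_t iterations)
{
    const std::string chunk(chunk_size, 'x');
    const auto start = std::chrono::steady_clock::now();
    std::string s;
    for (std::size_t i = 0; i < iterations; ++i)
        s += chunk;   // each concatenation may trigger a reallocation
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing runs with chunk_size from 1 up to 1000 under the old and new allocators is what produces figures like the 100-200x gain quoted above.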
> Also, this line needs to be updated before check-in:
> > const size_t __pagesize = 4096; // This magic constant, from OS.
> As you observed, there appears to be an autoconf test for
> getpagesize() already. I am not sure that we like the idea of a
> function call on that code path.
> Since it has been 4-5 months since this issue was last raised, I would
> like to sleep on it until Paolo is available again (since he is
> willing to be a co-author of the patch perhaps he can do some more
> tuning before we add it to the mainline ;-).
Indeed I slept on it a few hours :-)
Seriously, I believe that at this point we (i.e., me, you, Benjamin and
Nathan) all agree that the current patch (with the amended ChangeLog)
could be a good starting point for the mainline. I think your approach
is very simple yet guarantees a big improvement on the current
situation. In my tests it's beaten in speed only by truly exponential
schemes.
As regards __pagesize, I volunteer to take care of it in the next few
weeks, of course, but someone should introduce me to the configury
machinery needed. For instance, something that troubles me is that the
pagesize is a TARGET parameter, so presently I have no idea how I could
test for it at build time. Also, many systems don't have the getpagesize
call at all, AFAIK...
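Until the configury question is settled, a purely runtime sketch could cover both worries, the per-allocation call cost and missing getpagesize: query POSIX sysconf(_SC_PAGESIZE) once, cache it in a local static, and fall back to an assumed 4096 where even that is unavailable. The function name and the fallback value are assumptions:

```cpp
#include <unistd.h>
#include <cstddef>

// Hypothetical runtime alternative to a configure-time test: the page
// size is queried once, cached in a function-local static, so the hot
// allocation path pays no system call after the first use.
std::size_t __cached_pagesize()
{
    static const std::size_t __pagesize = []() -> std::size_t {
#ifdef _SC_PAGESIZE
        long __ps = sysconf(_SC_PAGESIZE); // POSIX, wider than getpagesize
        if (__ps > 0)
            return static_cast<std::size_t>(__ps);
#endif
        return 4096; // conservative assumed default
    }();
    return __pagesize;
}
```

This sidesteps the TARGET-parameter problem entirely, at the price of one sysconf call per process and a static initialization.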
For the time being, I think we agree that a real pagesize of 8192,
e.g., would only slightly worsen the performance of your scheme and
would not cause a regression w.r.t. the current one, right?