This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: Tackling library IO performance


Jerry Quinn wrote:
>
...
Next, I tried to bypass the call to grouping() in
num_put<char>::_M_widen_int.  It turns out that the virtual call to
numpunct<char>::grouping() is expensive.  Bypassing this call brings
the runtime from 22s down to 11s!

...
The third thing I did was try to further improve _M_widen_int().
For the test case, ctype<char>.widen translates into a useless memcpy.
Removing the call reduces the runtime by another 1s.  I saw a
reference in the archives and bug database that said that
ctype<char>.do_is() may bypass the virtual call for efficiency.  If
the same is true of widen, then we could safely skip the call to
ctype<char>.widen.

I don't think these are safe assumptions, at least not according
to my understanding of Stage 2. Both grouping() and widen() must
be called even for the narrow char case unless the num_put facet
can determine that the numpunct and ctype facets it uses are the
default ones. But since facet member functions are required to
return the same value for the same arguments, caching the values
is a safe way to eliminate all but the first expensive virtual
call.

Assuming that it is possible to safely skip the widen call and we have
the format cache, we can skip making a local copy of the locale.  This
tweak bought me another 1s or so.

Finally, I wanted to improve the num_put<char>::_M_insert function.
This function inserts using the iterator by default, which translates
into repeated calls to streambuf::sputc.  I replaced this with a
single call to streambuf::sputn.  This decreased the runtime by
another 1s.  To do this, I had to add an _M_xxx accessor to the
ostreambuf_iterator so that I could access the underlying streambuf.
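
The difference between the two insertion strategies can be sketched
as follows. This is an illustration only, not the patch described
above: counting_buf, insert_via_iterator and insert_via_sputn are
hypothetical names, and the streambuf is deliberately unbuffered so
that every sputc reaches overflow().

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <streambuf>
#include <string>

// Hypothetical unbuffered streambuf that records how it is driven.
struct counting_buf : std::streambuf {
    std::string data;
    int overflow_calls = 0;
    int xsputn_calls = 0;
protected:
    // Reached by sputc() on every character because there is no put area.
    int_type overflow(int_type ch) override {
        ++overflow_calls;
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            data.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }
    // Reached once per sputn() call.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        ++xsputn_calls;
        data.append(s, static_cast<std::size_t>(n));
        return n;
    }
};

// Character-at-a-time insertion through the iterator: one sputc per char.
void insert_via_iterator(std::streambuf& sb, const std::string& digits) {
    std::ostreambuf_iterator<char> out(&sb);
    for (char c : digits)
        *out++ = c;
}

// Bulk insertion straight into the streambuf: one sputn for the whole run.
void insert_via_sputn(std::streambuf& sb, const std::string& digits) {
    const std::streamsize n = static_cast<std::streamsize>(digits.size());
    const std::streamsize written = sb.sputn(digits.data(), n);
    assert(written == n);  // this sketch does not handle short writes
    (void)written;
}
```

The bulk version needs direct access to the streambuf, which the
standard ostreambuf_iterator interface does not hand out; that is
why an accessor had to be added.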

Relying on extensions that are exposed in general (i.e., by the
primary template) may be unsafe. If the template (ostreambuf_iterator)
is specialized for a user-defined type, the part of the library
that relies on the extension (num_put) will either have to be
specialized as well, or the extension will have to be duplicated
in the specialization. Either way, the extension presents
a portability problem.

Of course, the extension is perfectly safe if it is only exposed
and relied upon by the specializations of the templates provided
by the library, e.g., ostreambuf_iterator<char> and num_put<char>,
and not by the primary templates themselves.
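
A simplified sketch of that arrangement follows. The templates
sink_iterator and formatter are hypothetical stand-ins for
ostreambuf_iterator and num_put, and put_n is a made-up extension;
none of this is the real library interface. The primary templates
use only the portable interface, while the char specializations,
which the library itself provides, expose and use the extension.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Primary template: portable interface only, no extension.
template <class CharT>
class sink_iterator {
public:
    explicit sink_iterator(std::basic_string<CharT>& s) : out_(&s) {}
    void put(CharT c) { out_->push_back(c); }
private:
    std::basic_string<CharT>* out_;
};

// Library-provided specialization: adds the bulk-write extension.
template <>
class sink_iterator<char> {
public:
    explicit sink_iterator(std::string& s) : out_(&s) {}
    void put(char c) { out_->push_back(c); }
    void put_n(const char* p, std::size_t n) { out_->append(p, n); }  // extension
private:
    std::string* out_;
};

// Primary formatter: works with any sink_iterator, including one that a
// user specializes, because it relies on put() alone.
template <class CharT>
struct formatter {
    static void write(sink_iterator<CharT> it, const CharT* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            it.put(p[i]);
    }
};

// Specialization for char: may use put_n, since sink_iterator<char> is
// also supplied by the library and is known to have it.
template <>
struct formatter<char> {
    static void write(sink_iterator<char> it, const char* p, std::size_t n) {
        it.put_n(p, n);
    }
};
```

With this split, a user specialization of the iterator for some
other character type never needs to know about put_n.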

Regards
Martin

