This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: [patch] Preprocessorised tr1::tuple
- From: Chris Jefferson <caj at cs dot york dot ac dot uk>
- To: Paolo Carlini <pcarlini at suse dot de>
- Cc: libstdc++ <libstdc++ at gcc dot gnu dot org>
- Date: Wed, 23 Feb 2005 09:12:27 +0000
- Subject: Re: [patch] Preprocessorised tr1::tuple
- References: <421BAF7B.7030709@cs.york.ac.uk> <421BC863.4080102@suse.de>
Paolo Carlini wrote:
Chris Jefferson wrote:
Any comments are welcome. Apologies again that this is much less
neat than I would like, but I found it very hard to make it neat and
have been busy recently ¬_¬
First, thanks Chris for your contributions, always very stimulating.
I have a quick question (before going to sleep), not completely on
topic, maybe: compile time performance? Are you seeing any noticeable
degradation wrt your first implementation? I'm asking because often,
when I compare the behavior of Boost's type_traits to mine, I get the
impression that the former are (much ;) slower to compile... I wondered
whether that might be because of the pre-processor library? To be clear:
the pre-processor library is one of those *very* smart pieces of code
that you can find in Boost - I barely understand it - but the only time
I looked briefly at it, I suspected it couldn't be very fast...
Yep, I actually meant to attach this to my previous mail. The
compile-time performance is somewhat painful...
As a comparison, compiling a file with just a main and <functional> with
"g++-cvs file.cc" takes 0.7 seconds on my computer.
If I limit the maximum tuple size to 10 (minimum by the standard), then
compiling the new header takes 1.3 seconds, compiling the old one (or
the new one already preprocessed, they are basically identical) takes
about 1.1 seconds, so there isn't much in it.
If I now try upping the maximum allowed tuple size to 20, then compiling
the old generated tuple header (or the new one after I've
preprocessed it) takes about 1.3 seconds. Compiling the new
macroised header takes about 2.0 seconds, so things are starting to look
bad...
Unfortunately, unlike Douglas's function code, I feel that there are just
too many things in the tuple header that have to change depending on the
length, so writing the code in the style he used, while much faster to
compile, would be much harder to read / maintain...
At the moment the header is designed to allow unrolling up to a depth of
40. If this were cut back, a little extra speed (roughly 30%) would
ensue, at the cost of limiting the maximum size the tuple header
could support.
One thing I don't know much about is precompiled header support. If, by
the time TR1 is considered stable and generally usable, precompiled
header support is on by default and is capable of handling this
unrolled macro monstrosity, then that might "fix" the problem?
Chris