This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2

From: François Dumont <frs dot dumont at gmail dot com>
To: Jonathan Wakely <jwakely dot gcc at gmail dot com>
Cc: Paolo Carlini <paolo dot carlini at oracle dot com>, "libstdc++ at gcc dot gnu dot org" <libstdc++ at gcc dot gnu dot org>
Date: Sun, 18 Nov 2012 21:26:53 +0100
Subject: Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
References: <bug-54075-19885@http.gcc.gnu.org/bugzilla/> <bug-54075-19885-8bKDxmpStr@http.gcc.gnu.org/bugzilla/> <509ADA7E.7050504@gmail.com> <CAH6eHdRHCefZq1FQF64nJ04-OMgMBb8TeUa5y9Lvi-2H=xE0Ww@mail.gmail.com> <509B1130.7050401@oracle.com> <509B1834.5080106@oracle.com> <509C21B9.3070703@gmail.com> <509CE00A.1040704@oracle.com> <50A2BE5B.3030809@gmail.com> <50A2CF71.5060804@oracle.com> <50A40CB4.6040306@gmail.com> <50A41FA7.30301@oracle.com> <50A557EB.1020709@gmail.com> <CAH6eHdRS58LUaRTD2ienwMc05sE4mXZrbW4rYGSUPdnuQ_YzUg@mail.gmail.com>

On 11/16/2012 02:51 AM, Jonathan Wakely wrote:

On 15 November 2012 21:00, François Dumont wrote:

On 11/14/2012 11:48 PM, Paolo Carlini wrote:

Can somebody remind me why *exactly* we have a condition having to do with
the empty-ness of the functor?


     Because when hash code is not cache we need to embed the hash functor
into the local iterator implementation. So to minimize local implementation
memory footprint we prefered to cache the hash code as soon as it is not
empty.

What if sizeof(functor) == 4, it will still be smaller than the size_t
hash code, right? (At least for common 64-bit platforms.) And if
sizeof(functor) == sizeof(size_t) it doesn't increase the footprint to
store the functor.

I meant I wanted to minimize sizeof(local_iterator). iterators are most of the time passed by value to it is better to make them small, no ?

   And if sizeof(functor) > sizeof(size_t) then could
we store a single copy of the functor somewhere and store a pointer to
it in each node? That would have the same size as the hash code, so we
never increase the size of a node by more than sizeof(functor*), and
the memory footprint is bounded. Is that possible?

Do we have any performance numbers showing the impact of different
sizes of functor stored in the nodes?

No cause the functor is not stored in hashtable nodes but in local_iterator when hash codes are not cached.


But anyway, surely non-empty functors are not the common case, so are
unlikely to be a performance problem for most users.

Agree


Do we have any idea why caching the hash code seems to be slower than
recalculating it?

Martin Cracauer in comment #31 is saying: "As you can see there also is a reasonably catastrophic increase in minor page faults". I think it is the source of the peformance issue.

     Isn't gcc going to consider functor as noexcept even without the
noexcept qualifier ? Analyzing the code gcc could guess it, no ?

No.

You seem to be confusing what the optimisers can do and the
language-level `noexcept' operator.  The optimisers can do clever
analysis to change code generation, but the result of the noexcept
operator does not depend on such clever analysis. Try

void f() { }
static_assert( noexcept(f()), "" );

The assertion fails. Obviously this function cannot throw, but it is
not "noexcept"

Traits such as is_nothrow_constructible and move_if_noexcept and
__is_noexcept_hash depend *only* on the statically provable "noexcept"
property, *not* on how smart the optimisers are.

Ok, good to know.

Users might not being force to forget about optimization just to cache hash
code or not.

I don't understand what you mean here, sorry.

If not caching always(?) makes things faster then it seems sensible to
only cache when we really need to (i.e. when the functor can throw)

No, not caching do not always make things faster. performance/23_containers/insert_erase/41975.cc is testing behavior of unordered containers using a string key. It shows that with string it is better to cache hash code. So if adding noexcept might improve generated code but is also making behavior of unordered container worst users might complains of not being able to have best performance in both cases. But I guess impact of caching hash code is higher than the one of noexcept on code generation so it is surely not a big problem.

François

Follow-Ups:
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Jonathan Wakely

References:
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: FranÃois Dumont
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Jonathan Wakely
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Paolo Carlini
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Paolo Carlini
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: François Dumont
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Paolo Carlini
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: François Dumont
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Paolo Carlini
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: François Dumont
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Paolo Carlini
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: François Dumont
- Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
  - From: Jonathan Wakely

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]