This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: hashtable local iterator

From: Paolo Carlini <paolo dot carlini at oracle dot com>
To: François Dumont <frs dot dumont at gmail dot com>
Cc: "libstdc++ at gcc dot gnu dot org" <libstdc++ at gcc dot gnu dot org>
Date: Fri, 16 Dec 2011 12:06:08 +0100
Subject: Re: hashtable local iterator
References: <4EE7C11F.7030109@gmail.com> <4EE7F286.6000203@oracle.com> <4EEA5F3E.2010707@gmail.com>

Hi,

On 12/14/2011 01:49 AM, Paolo Carlini wrote:
Hi,
Hi

Since modification of hashtable implementation I was quite unhappy of the impact on local_iterator and especially on the end(size_type bkt) method that required to loop through all bucket nodes to access the past-the-end one. This is why I would like to propose, if it is still time, this new local iterator implementation that allow to implement begin(size_t)/end(size_t) in constant time. Note that:
well, either we do this or, alternately, I'm afraid we have to revert the whole thing because the current behavior makes the new implementation non-conforming, right? Actually, in my opinion you should have pointed out the issue much more explicitly much earlier. If you are aware of similar issues, breaking the conformance of the implementation, please say it now, I would hate large last minute reversions.
Sorry, I hadn't noticed that it was a non Standard conformity, I though that not being O(N) with N the number of elements in the container or even the number of bucket was enough but you are right the C++11 Standard is really clear on operation complexity and current implementation is wrong.

Well, yes, I think it's wrong. I'm concerned not just because of the possible impact on the typical user experience, but also because the problem may point to a more fundamental issue with the new implementation which we may want to catch as early as possible. And in terms of regressions, since we are so aggressively redesigning the whole thing we should be rather strict about regressions too, in my opinion.

I have one other point that do not make me happy, it is in the management of the iterator hint in insert and emplace_hint methods. libstdc++ do not make anything of it while the Standard say that insertion should be achieved as close as possible to this hint. This is not however a regression of the new implementation, former one was also ignoring this hint information.

Good point. But I just had a look and things like Table 103 don't seem that different compared to TR1?! I think we are safe to ignore the hint, for now. But definitely something worth further exploring in terms of QoI. Note, the associative containers changed quite a bit in this area between C++98 and C++11, via a DR, which however we already implemented (the emplace methods would do the same).

- this new iterator type needs a pointer to its parent container to work which is quite unusual on normal iterators.
On the spot, I cannot see anything wrong with this. Normally, when such kinds of links are present one must be careful about what happens when one of the two changes without telling the other. Are we risking, I don't know, local iterators becoming invalid when they shouldn't?
- I had to make an overload of _M_bucket_index public to use it in local iterator implem', do you prefer to introduce a friend declaration ?
I guess friendship is much better.
I have done the modification to use friendship but with the issue raised by Christopher and you it is finally not so important. This swap issue in indeed rude. With current implementation I need at least the hash functor and bucket count to get the bucket index of the nodes on incrementation. I will study this problem but for the moment the only solution I see is to re-introduce the 2 nodes per bucket I had done in one of my propositions.

Ok good. Let's keep in touch, privately too, as usual, we don't want to make the release managers too much nervous. About your last observation, I would recommend you to seriously explore the idea: the last time I discusses these designs with other implementers, a few months ago, I'm pretty sure that designs using a "forward_list" for the nodes, then had 2 nodes per bucket (ie, two pointers to local begin and local end essentially). I guess that at this point we really want to keep the "forward_list" nature of the links between the nodes stable but for the buckets we can be more flexible.

Ah, one last unrelated note: while you review the implementation, please double check that all the data members are correctly ordered in the layouts, roughly from the larger to the smaller, if in doubt, assuming 64 bits for the pointers. Remember that we don't have any optimization pass reordering those (the last time I checked) and we don't want to stupidly waste space because one day, in a rush, we write (stupid example):

    bool b1member;
    void* pmember;
    bool b2member;

instead of
    ...
    void* pmember;
    bool b1member;
    bool b2member;
    ...

Ok, stupid things, but please keep an eye.

Thanks!
Paolo.

Follow-Ups:
- Re: hashtable local iterator
  - From: François Dumont

References:
- hashtable local iterator
  - From: François Dumont
- Re: hashtable local iterator
  - From: Paolo Carlini
- Re: hashtable local iterator
  - From: François Dumont

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]