This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: hashtable local iterator
Hi,
On 12/14/2011 01:49 AM, Paolo Carlini wrote:
Hi,
Hi
Since modification of hashtable implementation I was quite
unhappy of the impact on local_iterator and especially on the
end(size_type bkt) method that required to loop through all bucket
nodes to access the past-the-end one. This is why I would like to
propose, if it is still time, this new local iterator implementation
that allow to implement begin(size_t)/end(size_t) in constant time.
Note that:
well, either we do this or, alternately, I'm afraid we have to revert
the whole thing because the current behavior makes the new
implementation non-conforming, right? Actually, in my opinion you
should have pointed out the issue much more explicitly much earlier.
If you are aware of similar issues, breaking the conformance of the
implementation, please say it now, I would hate large last minute
reversions.
Sorry, I hadn't noticed that it was a non Standard conformity, I
though that not being O(N) with N the number of elements in the
container or even the number of bucket was enough but you are right
the C++11 Standard is really clear on operation complexity and current
implementation is wrong.
Well, yes, I think it's wrong. I'm concerned not just because of the
possible impact on the typical user experience, but also because the
problem may point to a more fundamental issue with the new
implementation which we may want to catch as early as possible. And in
terms of regressions, since we are so aggressively redesigning the whole
thing we should be rather strict about regressions too, in my opinion.
I have one other point that do not make me happy, it is in the
management of the iterator hint in insert and emplace_hint methods.
libstdc++ do not make anything of it while the Standard say that
insertion should be achieved as close as possible to this hint. This
is not however a regression of the new implementation, former one was
also ignoring this hint information.
Good point. But I just had a look and things like Table 103 don't seem
that different compared to TR1?! I think we are safe to ignore the hint,
for now. But definitely something worth further exploring in terms of
QoI. Note, the associative containers changed quite a bit in this area
between C++98 and C++11, via a DR, which however we already implemented
(the emplace methods would do the same).
- this new iterator type needs a pointer to its parent container to
work which is quite unusual on normal iterators.
On the spot, I cannot see anything wrong with this. Normally, when
such kinds of links are present one must be careful about what
happens when one of the two changes without telling the other. Are we
risking, I don't know, local iterators becoming invalid when they
shouldn't?
- I had to make an overload of _M_bucket_index public to use it in
local iterator implem', do you prefer to introduce a friend
declaration ?
I guess friendship is much better.
I have done the modification to use friendship but with the issue
raised by Christopher and you it is finally not so important. This
swap issue in indeed rude. With current implementation I need at least
the hash functor and bucket count to get the bucket index of the nodes
on incrementation. I will study this problem but for the moment the
only solution I see is to re-introduce the 2 nodes per bucket I had
done in one of my propositions.
Ok good. Let's keep in touch, privately too, as usual, we don't want to
make the release managers too much nervous. About your last observation,
I would recommend you to seriously explore the idea: the last time I
discusses these designs with other implementers, a few months ago, I'm
pretty sure that designs using a "forward_list" for the nodes, then had
2 nodes per bucket (ie, two pointers to local begin and local end
essentially). I guess that at this point we really want to keep the
"forward_list" nature of the links between the nodes stable but for the
buckets we can be more flexible.
Ah, one last unrelated note: while you review the implementation, please
double check that all the data members are correctly ordered in the
layouts, roughly from the larger to the smaller, if in doubt, assuming
64 bits for the pointers. Remember that we don't have any optimization
pass reordering those (the last time I checked) and we don't want to
stupidly waste space because one day, in a rush, we write (stupid example):
bool b1member;
void* pmember;
bool b2member;
instead of
...
void* pmember;
bool b1member;
bool b2member;
...
Ok, stupid things, but please keep an eye.
Thanks!
Paolo.