Bug 50160 - vector<bool> comparison very slow (no overload)
Summary: vector<bool> comparison very slow (no overload)
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.6.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 50481
Blocks:
  Show dependency treegraph
 
Reported: 2011-08-22 23:46 UTC by Jonathan Schmidt-Dominé
Modified: 2015-08-30 08:27 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
bits/stl_vector.h patch (633 bytes, patch)
2011-08-30 23:25 UTC, Jonathan Schmidt-Dominé
Details | Diff
b (550 bytes, patch)
2011-08-30 23:30 UTC, Jonathan Schmidt-Dominé
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Schmidt-Dominé 2011-08-22 23:46:19 UTC
Hi!

Comparison (operator<) on vector<bool> seems to be extremely slow, because it uses a generic implementation, i.e. bitwise iteration. It should be specialised and use integer-comparisons provided by the CPU.

Regards
Comment 1 Paolo Carlini 2011-08-23 08:14:06 UTC
Do you have some numbers? Of course we can make progress on this, but we'd like to know when it's good enough.
Comment 2 Jonathan Schmidt-Dominé 2011-08-23 13:50:09 UTC
Well, it made my specific application very slow, when using vector<bool> as key_type for (ordered) map.  What kind of numbers would be usefull? I think it should simply have a specialised operator< implementation comparing chunks of 32/64 bits, there is also a specialised hash implementation, it should be like that one.
Comment 3 Paolo Carlini 2011-08-23 16:31:40 UTC
I meant: which improvement do you expect, in practice? I supposed you are aware if other implementations of the library already including the optimization. I'm looking for some concrete evidence, because you know, vector<bool> is deprecated, isn't a real container, and all the well known bad things about it, thus we normally don't do much development work on it beyond ghe basic maintainance. Of course, if you have already a patch, that would definitely speed things up ;)
Comment 4 Jonathan Schmidt-Dominé 2011-08-23 18:21:36 UTC
In libc++ it is the same, specialised hash, but no specialised operator<.
Comment 5 Jonathan Schmidt-Dominé 2011-08-23 18:26:48 UTC
In libc++ it seems to be the same, specialised hashing, but no specialised comparison. Ok, I can write a patch and test if it is actually faster (it should). Is there any coding-guide line (style, variables etc.)?
Comment 6 Jonathan Schmidt-Dominé 2011-08-23 18:29:19 UTC
Ok, I can write a patch and test if it is actually faster (it
should). Is there any coding-guide line (style, variables etc.)?
Comment 7 Marc Glisse 2011-08-23 18:35:57 UTC
If I understand correctly, operator< is supposed to give a lexicographic order, and vector<bool> stores {true,false,false} as 1 and {false,false,true} as 4, so we can't just make operator< compare chunks. Reversing the order of the bits in an unsigned long is a bit complicated, although it would still lead to code faster than the current (easiest would be to compare byte by byte, and mirror each byte using a table).

But you don't seem particularly interested in using the lexicographic order, is that true? In that case you could try and specify some other order to std::map, say comparing hashes ;-) I understand that the lexicographic order of the chunks would be a fine order, but I don't think it can be exposed. I haven't thought about the potential drawbacks of implementing vector<bool> backwards.
Comment 8 Jonathan Schmidt-Dominé 2011-08-23 19:04:25 UTC
For my application I should simply use an unordered_map and all the overhead has gone. :D But operator< simply should not be that slow.

“I haven't thought about the potential drawbacks of implementing vector<bool> backwards.”
What do you mean? Chunk wise backwards?

“but I don't think it can be exposed.”
Well, that would not be portable.
Comment 9 Jonathan Schmidt-Dominé 2011-08-23 19:10:06 UTC
Hmm, reversing really is not that nice.
Comment 10 Jonathan Schmidt-Dominé 2011-08-23 19:15:20 UTC
There seem to be a lot of tricks to achieve that:
http://graphics.stanford.edu/~seander/bithacks.html#ReverseByteWith64BitsDiv
Comment 11 Marc Glisse 2011-08-23 19:16:40 UTC
(In reply to comment #8)
> For my application I should simply use an unordered_map and all the overhead
> has gone. :D

Great.

> “I haven't thought about the potential drawbacks of implementing vector<bool>
> backwards.”
> What do you mean? Chunk wise backwards?

Er, yes, start filling each chunk from the most significant bit to the least significant, but otherwise keep things the same.

> Hmm, reversing really is not that nice.

Any particular reason? Not that the current implementation is likely to change anytime soon, but if you found drawbacks it would be nice to have them written somewhere in case the same question ever arises again.
Comment 12 Jonathan Schmidt-Dominé 2011-08-23 19:31:37 UTC
“Any particular reason?”
No particular one, but a small specialisation would be nicer than changing everything or doing bitfiddling.
Comment 13 Paolo Carlini 2011-08-30 10:29:14 UTC
To be precise, we are talking about overloading not specializing. The issue, in general, reminds me the way we overload std::fill and std::copy for ranges of deque<>::iterator, but then in detail, there are the difficulties pointed out by Marc. I'm certainly available for patch reviews, anyway.
Comment 14 Jonathan Schmidt-Dominé 2011-08-30 23:25:25 UTC
Created attachment 25143 [details]
bits/stl_vector.h patch

moved operator== and operator< inside class, because I want to overload them
Comment 15 Jonathan Schmidt-Dominé 2011-08-30 23:30:00 UTC
Created attachment 25144 [details]
b

More efficient (non representative benchmark!) implementation of operator< and operator== for vector<bool>
Comment 16 Jonathan Schmidt-Dominé 2011-08-30 23:30:36 UTC
Added basic patch…
Comment 17 Jonathan Wakely 2011-08-30 23:34:42 UTC
(In reply to comment #14)
> moved operator== and operator< inside class, because I want to overload them

huh, why is that needed?  it's not acceptable anyway, it needs to be a non-member
Comment 18 Jonathan Schmidt-Dominé 2011-08-30 23:56:38 UTC
(In reply to comment #17)
> (In reply to comment #14)
> > moved operator== and operator< inside class, because I want to overload them
> 
> huh, why is that needed?  it's not acceptable anyway, it needs to be a
> non-member

Correct me, if I am wrong, isn't it considered as partial specialisation when writing it straight-forward as non-member?
Comment 19 Paolo Carlini 2011-08-31 00:04:36 UTC
I'm wondering if processing an unsigned long at a time wouldn't be a step in the right direction. Then, a compiler intrinsics would be the right place for that, would naturally fit in the set of builtins fiddling with bits of integer types. On top of that the library-proper code would become very simple, in particular no ugly reinterpret_cast & co (by the way, Ch. 7 of "Hacker's Delight" has some info about such operations). Assuming indeed, we *do* have to revert the order, I'm not 101% sure.
Comment 20 Paolo Carlini 2011-08-31 00:26:49 UTC
Largely irrelevant here, but partial specialization of function templates simply does not exist. We have been talking about adding an overload like:

 template<typename _Alloc>
   bool
   operator<(const vector<bool, _Alloc>&, const vector<bool, _Alloc>&)

or overloading std::lexicographical_compare, seems also an option.
Comment 21 Jonathan Schmidt-Dominé 2011-08-31 00:30:58 UTC
It would indeed be nice to have such a builtin function (8, 16, 32, 64 bit reversing), currently there is none in gcc, only bytewise reversing iirc. Should that be put as wishlist-item in the bugzilla or something like that?

Reversing is necessary because the ordering should be lexicographic, but the internal representation uses least-significant-first, reversing the internal representation would probably not be that nice.
Comment 22 Andrew Pinski 2011-08-31 00:33:52 UTC
Can't you do ~a < ~b ?
Comment 23 Jonathan Schmidt-Dominé 2011-08-31 00:34:28 UTC
@Paolo

Okay, I am sometimes overcautious with function-templates, because I often had a lot of errors because of partial specialisation when it was indeed necessary (function templates with explicit parameters).
Comment 24 Jonathan Schmidt-Dominé 2011-08-31 00:39:57 UTC
@Andrew

Nope:
10000001 > 00000001 (lexicographically)
10000001 > 00000001 (as little-endian)
01111110 < 11111110 (as little-endian)
Comment 25 Paolo Carlini 2011-08-31 00:56:42 UTC
I think a separate Bugzilla requesting as an enhancement such intrinsics would be certainly appropriate. I'm sure other code could exploit those.

Note, in the meanwhile we could as well do the exercise of separating the operation to a C function - for unsigned long, the 32-bit and 64-bit cases would be good enough for now - coded perhaps along the lines explained at the beginning of Ch. 7 of "Hacker's Delight" and use the function for operator< (please, not inline). See which performance come out.
Comment 26 jsm-csl@polyomino.org.uk 2011-08-31 15:23:56 UTC
Various processors have an instruction to reverse the bit order in a word 
(ARMv6T2 and later have RBIT, for example, and C6X has BITR on C64X and 
above).  I think a generic built-in function (variants for different type 
sizes) with associated generic RTL representation and lowering for 
processors without such an instruction makes sense.
Comment 27 Marc Glisse 2011-09-22 09:25:44 UTC
(In reply to comment #25)
[builtins to reverse the bit order]
> I think a separate Bugzilla requesting as an enhancement such intrinsics would
> be certainly appropriate.

Has this RFE already been filed?
Comment 28 Paolo Carlini 2011-09-22 09:49:18 UTC
I think we should. Can you do that? Thanks!
Comment 29 Marc Glisse 2011-09-22 10:29:07 UTC
See Bug 50481 about bit-reversal builtins (and feel free to add details there).
Comment 30 Jonathan Schmidt-Dominé 2011-09-22 13:39:22 UTC
Sorry, thank you for creating the feature-request.
Comment 31 DANISH 2012-03-13 08:30:00 UTC Comment hidden (spam)
Comment 32 Tim Parker 2012-03-29 10:42:52 UTC Comment hidden (spam)
Comment 33 albcl111 2012-10-31 11:09:32 UTC Comment hidden (spam)
Comment 34 albcl111 2012-12-21 06:46:07 UTC Comment hidden (spam)
Comment 35 albcl111 2012-12-21 11:58:37 UTC Comment hidden (spam)
Comment 36 brett davis 2013-04-30 12:30:06 UTC Comment hidden (spam)
Comment 37 jandyu rata 2013-05-17 05:58:39 UTC Comment hidden (spam)
Comment 38 tysonroy 2013-12-14 11:06:15 UTC Comment hidden (spam)
Comment 39 Jmaescraig 2014-02-25 12:19:20 UTC Comment hidden (spam)
Comment 40 Elizbath Martin 2014-04-09 05:41:48 UTC Comment hidden (spam)
Comment 41 Wellamjames 2014-05-21 09:06:38 UTC Comment hidden (spam)
Comment 42 steve 2015-08-30 08:26:57 UTC Comment hidden (spam)
Comment 43 steve 2015-08-30 08:27:50 UTC Comment hidden (spam)