This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Search algorithms in __gnu_cxx::


Dhruv Matani wrote:
 I have with me, presently a working implementation of Boyer Moore.
However, the only problem is that the lookup using hash_map is very
slow, since it needs to be done at every stage. If I use std::map,
it's even slower. However, while testing with strings(char arrays), I
tried using a static array of 256 entries(one for each char), and that
made whole process faster by about 70%, which is a LOT!!!! So, I'm
contemplating providing specialization for char, unsigned char, and
string for which this algo will probably be used the most, and have
that implementation use the array logic, while the other
implementations can use __gnu_cxx::hash_map<>. Again, I hope no one
uses this algo for anything other than string searching, else they
will be in for a big performance hit.
A couple of quick comments.
1/ Please, dont'use the legacy __gnu_cxx stuff for new projects. That code has small and large bugs which we don't want to spend time on and now we have got much better and soon-to-be-standardized replacement in tr1 (also performing quite a bit better in practice, as far as I know).
2/ If we are really going for benchmarks, I'd like to see something more, for instance also the case of a string which actually does occur and, in case of strings, a comparison to the current string::find, which is very simple, still often so much better than the previous best relying on std::search. That said, I think we should concentrate first on a solid, neat, generic implementation, only at a later stage provide specializations. For sure, that kind of algorithm is not going to beat in every possible circumstance the simplistic ones (we knew that at the outset) but we are anyway confident that a good implementation can be useful to the users, that's why there is that much research going on pattern matching, BM variants and so on and so on...


Paolo.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]