Search algorithms in __gnu_cxx::

Dhruv Matani dhruvbird@gmail.com
Fri Sep 15 07:09:00 GMT 2006


On 9/15/06, Paolo Carlini <pcarlini@suse.de> wrote:
> Dhruv Matani wrote:
> > Let me know if there is any problem, or if I can go ahead and make a
> > patch.
> Lots of problems ;)

> Besides trivial stylistic issues,

This will be changed in the final patch.

> __is_char which we
> already have, and so on,

Where???? I did a complete grep -r "__is_char" . and didn't find
anything. Is it under some other name?

> frankly I don't like much where you say that
> you are deviating from the real Boyer-Moore algorithm for simplicity: as
> I tried to explain already, I'd rather start with a clean, neat,
> implementation of the original algorithm or, as a second choice, of one
> among the published, well known variants (many in the already mentioned
> web page, for example). Otherwise, we cannot call the algorithm
> Boyer-Moore, we have to analyze from scratch its complexity, convince
> people that it always works correctly, and so on, and so on...

Yes, I understand your concerns fully. However, implementing the
Boyer-Moore in all it's entirety would imply that we would need to
maintain extra tables, and code complexity, etc.... which is not worth
the effort. The algorithm that I have implemented is a slight variant,
and is also sometimes called the Turbo Boyer Moore(a simpler version
of Boyer Moore, which captures the essence of the concept). The
complexity remains the same as that of Boyer Moore, since Boyer Moore
is not very easy to analyse when we get suffix matches, and depends
highly upon the input. The complexity for the normal case is the same
as the original concept.


Regards,
-Dhruv.


ps. bug in test case. Change:
tmp1 = __bm_search(tmp, src.end(),
TO:
tmp1 = __bm_search(tmp1, src.end(),


-- 
   -Dhruv Matani.
http://www.geocities.com/dhruvbird/

"Life is wasted on the living"
  - Zaphod Beeblebrox the Fourth.



More information about the Libstdc++ mailing list