RFC: basic regex implementation
Stephen M. Webb
stephenw.webb@bregmasoft.ca
Thu Jun 10 12:53:00 GMT 2010
On 09/06/10 17:59, Jonathan Wakely wrote:
> On 9 June 2010 21:10, Stephen M. Webb wrote:
> > This is a first pass for a C++0x regex implementation. Very basic
> > functionality only (BREs only, no backrefs, no character classes), not
> > production code yet, still not caught up to n3090. Looking for feedback
> > before I put more effort in.
>
> I like the debugging stuff guarded by SMW_NDEBUG, it would be a shame
> to lose that. Maybe it could be polished so it could be kept in the
> final version, at least when _GLIBCXX_DEBUG is defined.
Yes, I think a big chunk of this implementation predates _GLIBCXX_DEBUG so I
wasn't tracking it. I will add this to the to-do list.
> AFAICT there is no fix needed for threading issues w.r.t
> __unmatched_sub, initialisation of local statics is reentrant with GCC
> (as required by C++0x). That's why I used a local static not a global
> static.
Ok. I am not completely up to speed on whole swathes of C++0x, and certainly
this was an issue with C++03.
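For reference, a minimal sketch of the local-static pattern under discussion. The names unmatched_stub, unmatched_sub() and check_unmatched_sub() are hypothetical stand-ins, not the identifiers from the patch. The sketch only shows that the object is constructed once, on first use. The guarantee that concurrent first calls are serialised comes from the C++0x rules (and from GCC's implementation), and is not exercised here.

```cpp
#include <cassert>

// Hypothetical stand-in for the library-internal "unmatched" sub_match.
struct unmatched_stub {
  bool matched;

  unmatched_stub() : matched(false) { ++constructions(); }

  // Counts how many times the constructor has run (illustration only).
  static int& constructions() {
    static int n = 0;
    return n;
  }
};

// Function-local static: initialised the first time control reaches the
// declaration. C++0x requires other threads arriving during that
// initialisation to wait, which was not guaranteed by C++03.
inline const unmatched_stub& unmatched_sub() {
  static const unmatched_stub s;
  return s;
}

// Every call returns the same object, and it is constructed exactly once.
inline void check_unmatched_sub() {
  const unmatched_stub& a = unmatched_sub();
  const unmatched_stub& b = unmatched_sub();
  assert(&a == &b);
  assert(!a.matched);
  assert(unmatched_stub::constructions() == 1);
}
```

A namespace-scope static would instead be initialised before main() in an order that is unspecified across translation units, which is a separate reason to prefer the local static.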
> The std:: qualifiers could be removed, the code is already in namespace
> std.
Did we not go through an exercise a few years ago to explicitly qualify names
with std:: to avoid a whole class of problems?
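One member of that class of problems, as far as I recall, is argument-dependent lookup: an unqualified call inside a library template can pick up a same-named function from the user's namespace. A minimal sketch follows, with namespace lib standing in for std and every name in it hypothetical.

```cpp
#include <cassert>

namespace lib {  // stands in for namespace std
  template<typename T>
  int helper(const T&) { return 1; }

  // Unqualified call: ADL also searches the namespaces of T.
  template<typename T>
  int call_unqualified(const T& t) { return helper(t); }

  // Qualified call: only lib::helper is considered.
  template<typename T>
  int call_qualified(const T& t) { return lib::helper(t); }
}

namespace user {
  struct widget {};

  // Same name as the library helper. ADL finds it, and as a non-template
  // exact match it beats the lib::helper template in overload resolution.
  inline int helper(const widget&) { return 2; }
}

inline void check_lookup() {
  user::widget w;
  assert(lib::call_unqualified(w) == 2);  // hijacked by user::helper
  assert(lib::call_qualified(w) == 1);    // stays inside the library
}
```

Being inside the namespace already does not protect the unqualified call, since ADL adds the argument's namespaces to the lookup regardless of where the call is written.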
> A comment on the C++0x regex spec, rather than your implementation:
> It's my understanding that users are not supposed to instantiate
> sub_match objects, or at least shouldn't need to, as doing so results
> in an uninitialised "matched" member. I intend to file an NB comment
> suggesting a deleted default constructor. There could be a private
> constructor which is used internally by the library. If you have any
> comments on that point I'd be glad to hear them.
The problem with making sub_match constructors private is that the
implementation of the match engine(s) and token_iterator engine(s) could get
pretty hairy, or else sub_match would have to have so many friends it should
have its own page on Facebook. A sub_match is effectively a POD, and if a
user creates a POD without initializing it he or she can keep both halves.
I think a safer design might have been to make sub_match.matched a member
function and provide non-trivial constructors. Then again, keeping it
POD-like simplifies implementing the rest of the regex library.
It would be interesting to hear what the Committee has to say on the matter.
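A rough sketch of the "matched as a member function" alternative mentioned above. This is not the C++0x sub_match interface; checked_sub_match and its members are hypothetical names used purely to illustrate the idea that a default-constructed object always starts in a defined "not matched" state.

```cpp
#include <cassert>
#include <iterator>
#include <string>

// Hypothetical alternative to sub_match: the flag is private, so it can
// never be observed uninitialised.
template<typename BidirIt>
class checked_sub_match {
public:
  typedef typename std::iterator_traits<BidirIt>::value_type value_type;
  typedef std::basic_string<value_type> string_type;

  // Default construction yields a well-defined unmatched state.
  checked_sub_match() : first_(), second_(), matched_(false) {}

  // Constructing from a range marks the sub-expression as matched.
  checked_sub_match(BidirIt first, BidirIt second)
    : first_(first), second_(second), matched_(true) {}

  bool matched() const { return matched_; }

  // Returns the matched text, or an empty string if nothing matched.
  string_type str() const {
    return matched_ ? string_type(first_, second_) : string_type();
  }

private:
  BidirIt first_;
  BidirIt second_;
  bool matched_;
};

inline void check_checked_sub_match() {
  const std::string s("hello world");
  checked_sub_match<std::string::const_iterator> none;
  assert(!none.matched());
  assert(none.str().empty());

  checked_sub_match<std::string::const_iterator> hit(s.begin(), s.begin() + 5);
  assert(hit.matched());
  assert(hit.str() == "hello");
}
```

The cost is the one noted above: the type is no longer POD-like, so the match engine can no longer fill in the members directly and has to go through constructors or friends.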
--
Stephen M. Webb
stephen.webb@bregmasoft.ca