[patch] regex_traits implementation

Jonathan Wakely jwakely.gcc@gmail.com
Tue Jul 2 17:11:00 GMT 2013

On 30 June 2013 08:55, Tim Shen wrote:
> ChangeLog entry fixed.

Some comments on the latest patch (it's a long-ish email but most of
my comments are very minor things regarding the _RegexMask type) ...

Defining struct _RegexMask in the typedef declaration makes it public,
I think I would prefer if the struct definition was a private member
of regex_traits and only the char_class_type typedef was public.
Either that or just rename the type to char_class_type and do away
with the typedef.

I'm not sure it's necessarily required for bitmask types, but I would
expect to be able to default-construct a char_class_type, is that left
out intentionally?

I think the _RegexMask constructor should be explicit.  Its parameters
should be named __base and __extended, to conform to the libstdc++
coding guidelines.

It looks like _RegexMask could be a literal type, with a constexpr
constructor and constexpr operators.  I think we might as well make
use of those features if we can.

The operator functions take const parameters by value, is that a
conscious decision to prefer that to either non-const parameters or
pass by reference?

The definition of operator== considers all bits of _M_extened to be
part of the object's value representation, so:

    char_class_type{{},4} != char_class_type{{},0}

even though those two values have the same flags set and behave
identically when passed to regex_traits::isctype. It would be better
to define operator== to use _M_extended&3 so only the bits we care
about contribute to the value.

In theory we shouldn't need a special case for the "blank" character
class, I got that added to std::ctype_base via
http://cplusplus.github.io/LWG/lwg-defects.html#2019, but we don't
support it yet in libstdc++ (I forgot about it.)

Your isctype implementation tests __f._M_extended ==
_RegexMask::_S_word, shouldn't that use & not ==, since __f could have
both the _S_word and _S_blank bits set?   Similarly for the _S_blank

The _S_word case tests against ctype_base::alpha, but it should be
alnum not alpha.  I'd suggest getting rid of that test entirely
though:  The "w" class is represented by _RegexMask{0, _S_word},
whereas in the uncommitted patch I posted to
http://gcc.gnu.org/ml/libstdc++/2010-10/msg00049.html I represented it
by the equivalent of _RegexMask{ctype_base::alnum,
_S_char_class_under}, which means you don't need to check the alnum
case because the earlier __fctyp.is(__f._M_base, __c) test will
already have handled it. That means instead of a _S_word bit you only
need _S_under instead.

More information about the Libstdc++ mailing list