This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.
Our match_results has some problems:

size() returns N+1 where N is the number of elements, which already includes the 0th sub-match (i.e. the entire match) so the returned value is off-by-one. It should be n+1 where n is the number of marked sub-expressions.

operator[] does not check for sub >= size() so allows out-of-range accesses.

position() can result in undefined behaviour if called with an out-of-range argument because it will try to compare a potentially singular iterator. This is really a problem with the draft standard, which doesn't make it clear that a match_results can be singular and there are implied preconditions on some member functions. I plan to file an NB comment about this.

_M_prefix and _M_suffix could be stored in the vector, reducing the size of an unmatched match_results. This also makes _M_matched unnecessary because !_M_matched is implied by the vector being empty. This reduces the size of match_results by 4 pointers and 3 bools and simplifies copying, moving and swapping because the only state is in the vector.

Finally, it's missing move operations and other changes since TR1.

The patch below addresses all these issues, so I would like to check it in. It changes the ABI of match_results, but as that class is completely non-functional it can't break any programs unless they explicitly make use of sizeof(match_results) but don't actually instantiate match_results.

My suggested implementation is documented in the code:

  The vector base is empty if this does not represent a successful match.
  Otherwise it contains n+3 elements where n is the number of marked
  sub-expressions:
  [0] entire match
  [1] 1st marked subexpression
  ...
  [n] nth marked subexpression
  [n+1] prefix
  [n+2] suffix

This means size() == n+1 == (N ? N-2 : 0)

Copying, moving and swapping, as well as empty(), are now simply forwarded to the vector base.

I made position() return -1 for out-of-range, consistent with boost::regex (and with e.g. string::find). operator[] checks its argument and returns a static object representing an unmatched subexpression (as required).

Does anyone have any objections or improvements to this change?
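
For readers without the attachment, the layout described above can be sketched roughly as follows. This is only an illustration, not the patch itself: SubMatch and MatchResultsSketch are hypothetical names, offsets stand in for the iterators the real sub_match holds, and the rule used here for the prefix/suffix matched flags (non-empty range) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for sub_match: a half-open range of offsets into
// the searched text plus a "matched" flag. The real class stores iterators.
struct SubMatch {
    std::ptrdiff_t first;
    std::ptrdiff_t second;
    bool matched;
};

// Hypothetical sketch of the proposed layout; all state lives in one vector.
class MatchResultsSketch {
    // Empty if there was no successful match, otherwise n+3 elements:
    // [0] entire match, [1..n] marked sub-expressions, [n+1] prefix, [n+2] suffix.
    std::vector<SubMatch> elems_;

public:
    // Default state: no successful match, so the vector is empty.
    MatchResultsSketch() {}

    // Build a successful match. subs[0] must be the entire match and
    // subs[1..n] the marked sub-expressions; text_len is the length of
    // the searched text (an assumption needed to compute the suffix).
    MatchResultsSketch(std::ptrdiff_t text_len, const std::vector<SubMatch>& subs)
        : elems_(subs) {
        assert(!subs.empty());
        const SubMatch whole = subs[0];
        elems_.push_back(SubMatch{0, whole.first, whole.first != 0});                    // prefix
        elems_.push_back(SubMatch{whole.second, text_len, whole.second != text_len});    // suffix
    }

    // empty() is simply forwarded to the vector.
    bool empty() const { return elems_.empty(); }

    // size() == n+1 == (N ? N-2 : 0), where N is the vector size.
    std::size_t size() const {
        const std::size_t N = elems_.size();
        return N ? N - 2 : 0;
    }

    // Checked access: out-of-range indices yield a static unmatched object.
    const SubMatch& operator[](std::size_t sub) const {
        static const SubMatch unmatched = {0, 0, false};
        return sub < size() ? elems_[sub] : unmatched;
    }

    // Returns -1 for an out-of-range index instead of touching invalid state.
    std::ptrdiff_t position(std::size_t sub = 0) const {
        return sub < size() ? elems_[sub].first : -1;
    }

    // Preconditions: the object represents a successful match.
    const SubMatch& prefix() const { assert(!empty()); return elems_[elems_.size() - 2]; }
    const SubMatch& suffix() const { assert(!empty()); return elems_[elems_.size() - 1]; }
};
```

With a 7-character text, an entire match at [2,5) and one marked sub-expression at [3,4), the vector in this sketch holds four elements and size() reports two, while a default-constructed object reports empty() and a size of zero.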
Attachment:
regex.txt
Description: Text document