This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
As metioned in this[1] email, I here propose a refactored version of regex, and show the status: Modules: 1) Automaton module(regex_automaton.h, regex_automaton.tcc), now consists of an NFA implementation; In the future, the NFA should be able to be optimized to a DFA. 2) Regex compiler module(regex_compiler.h, regex_compiler.tcc), which take a string and then emit a NFA. Now it provides a regex token scanner and parser(NFA generator). It's not fully implemented, and will be main task in the near future. It should includes various regex flavors(ECMAScript, basic, extended, awk, grep and egrep), but they are not supported. Since the ECMAScript style is the default one, I'll start my work here. 3) Executor(or matcher) module, which is constructed with a NFA and a string, then match and group it. We have a DFS and BFS(Thompson matcher) approach now; the BFS approach can effectively prevent the matching from being exponential, but cannot handle back-references. The current strategy is checking whether there're back-referneces in the pattern; If there are, a DFS matcher will be generated, else the BFS one. Matching options are not supported now. It shouldn't be a great deal since the executors are there. This patch eliminate _SpecializedCursor and _SpecializedResults. Instead, I use _BiIter and match_results, the actual data structure, directly. It should enabled the easy and obvious approch to implement back-reference. 'make bootstrap && make -k check' tested under x86_64. Thanks! [1] http://gcc.gnu.org/ml/libstdc++/2013-07/msg00140.html -- Tim Shen
Attachment:
changelog
Description: Binary data
Attachment:
regex.patch
Description: Binary data
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |