This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch 2/3] Make regex's infinite loop detection accurate


On 25/04/14 18:03 -0400, Tim Shen wrote:

   	* include/bits/regex_executor.h: Add _M_rep_count to track how
   	many times this repeat node are visited.

"is visited" not "are visited"

@@ -151,6 +156,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      // character.
      std::unique_ptr<vector<pair<_StateIdT, _ResultsVec>>> _M_match_queue;
      // Used in BFS, indicating that which state is already visited.
+      std::vector<pair<_BiIter, int>>                       _M_rep_count;

We don't need the std:: qualifiers here, but it doesn't do any harm.

      std::unique_ptr<vector<bool>>                         _M_visited;
      _FlagT                                                _M_flags;
      // To record current solution.
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 68a5e04..bb10a72 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -161,6 +161,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      return false;
    }

+  // __rep_count records how many times (__rep_count.second)
+  // this node are visited under certain input iterator
+  // (__rep_count.first). This prevent the executor from entering
+  // infinite loop by refusing to continue when it's already been
+  // visited more than twice. It's `twice` instead of `once` because
+  // we need to spare one more time for potential group capture.
+  template<typename _BiIter, typename _Alloc, typename _TraitsT,
+    bool __dfs_mode>
+  template<bool __match_mode>
+    void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
+    _M_rep_once_more(_StateIdT __i)
+    {
+      const auto& __state = _M_nfa[__i];
+      auto& __rep_count = _M_rep_count[__i];
+      if (__rep_count.second == 0 || __rep_count.first != _M_current)
+	{
+	  auto back = __rep_count;

This variable should be uglified.

+	  __rep_count.first = _M_current;
+	  __rep_count.second = 1;

Maybe a dumb question (I don't understand a lot of the regex code!)
but is it correct to set this to 1 in the case where
__rep_count.first != _M_current ?  Could that result in the count
going downwards from 2 to 1? (Maybe that's correct?)



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]