This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch 3/N] std::regex refactoring - _Executor DFS / BFS
- From: Jonathan Wakely <jwakely at redhat dot com>
- To: Tim Shen <timshen91 at gmail dot com>
- Cc: libstdc++ <libstdc++ at gcc dot gnu dot org>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 28 Apr 2014 17:51:21 +0100
- Subject: Re: [patch 3/N] std::regex refactoring - _Executor DFS / BFS
- Authentication-results: sourceware.org; auth=none
- References: <20140428124003 dot GW928 at redhat dot com> <20140428144659 dot GY928 at redhat dot com> <CAPrifD=Bc4D63KJONG5d7YzX1t73HDm276AKPY5mEOZi942_EA at mail dot gmail dot com>
On 28/04/14 11:40 -0400, Tim Shen wrote:
On Mon, Apr 28, 2014 at 10:46 AM, Jonathan Wakely <jwakely@redhat.com> wrote:
This change splits _Executor::_M_main() into two overloaded
_M_main_dispatch() functions, choosing which to run based on the
__dfs_mode template parameter.
I think this gives a (very) small improvement in compilation time when
using regexes.
Splitting _M_main() allows the _M_match_queue and _M_visited members
(which are only used in BFS mode) to be replaced with an instantiation
of the new _States class template. _States<__dfs> only contains the
start state (so that it's not empty and doesn't waste space) but
_States<__bfs> contains the start state and also the match queue and
visited list.
This is great! I keep worrying about the ugly all-in-one _Executor
class; I've tried to specialize two versions of it, but that
introduced much duplicated code. Your abstract is nice!
Thanks. I'll clean up that patch and commit it soon too.
The next thing I plan to look at, which I haven't done yet, is to see
if passing the __match_mode template parameter as a runtime function
parameter makes any difference to the way the code is structuted. Do
you have any thoughts in that, before I waste time doing something
that won't work?
As the visited list never changes size after construction I changed
its type from unique_ptr<vector<bool>> to unique_ptr<bool[]>, which
avoids an indirection, although now that that member is only present
when actually required it could just be vector<bool>. An array of bool
is simpler to access, but takes more heap memory. vector<bool> uses
less space but the compiler has to do all the masking to address
individual bits.
What about bitset? Anyway we should get rid of vector<bool>.
The size of a bitset is fixed at compile-time. If we could use
dynarray<bool> that might be nice, but we can't :-)