This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [patch 3/N] std::regex refactoring - _Executor DFS / BFS
- From: Jonathan Wakely <jwakely at redhat dot com>
- To: Tim Shen <timshen91 at gmail dot com>
- Cc: libstdc++ <libstdc++ at gcc dot gnu dot org>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 29 Apr 2014 11:17:36 +0100
- Subject: Re: [patch 3/N] std::regex refactoring - _Executor DFS / BFS
On 28/04/14 17:14 -0400, Tim Shen wrote:
> On Mon, Apr 28, 2014 at 4:18 PM, Jonathan Wakely <jwakely@redhat.com> wrote:
>> I thought I'd made a 5x speedup to the run-time of the regex matching,
>> but I was comparing the wrong version and the improvement actually
>> came from one of your patches yesterday - maybe this one:
>> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01725.html
>> Nice work!
>
> That's surprising. May I ask for the performance testcase?

See below. These are just microbenchmarks, so not very good or
reliable, but the <regex> code is large enough and complicated enough
that the results are fairly consistent and reproducible.
(I use __builtin_printf and __builtin_puts when I'm too lazy to
include a file, there's no other reason for that!)
I was using something like this to test basic_regex<char>:

#include <regex>

int main()
{
  auto re = std::regex("[abc]{3}d*x?");
  int count = 0;
  for (int i = 0; i < 100000; ++i)
    if (std::regex_match("bab", re)
        && std::regex_search("aacdddddddddddddddddddddddxaa", re))
      ++count;
  __builtin_printf("%d\n", count);
}

This runs much faster on trunk than with 4.9.0, so we might want to
backport your recent patches to the gcc-4_9-branch.
I was using examples like this for testing _BracketMatcher with
basic_regex<wchar_t>:

#include <regex>

int main()
{
  auto re = std::wregex(L'[' + std::wstring(300, L'a') + L"bc"
                        + std::wstring(1000, 'a') + L"d]");
  int count = 0;
  for (int i = 0; i < 100000; ++i)
    if (std::regex_match(L"b", re) && std::regex_match(L"d", re))
      ++count;
  __builtin_printf("%d\n", count);
}

This runs faster with my patch to use std::sort and std::unique on the
_M_char_set vector, because with the duplicate characters removed the
regex compiles to be equivalent to

  std::wregex(L"[abcd]")