[PATCH] improve string find algorithm

Jonathan Wakely jwakely@redhat.com
Mon Jan 9 12:33:00 GMT 2017


On 06/01/17 22:19 +0000, Aditya K wrote:
>> Could you try the corrected patch on your benchmarks?
>
>For the test-case you gave there is a regression.
>
>Benchmark                                                                    Time           CPU      Iterations
>-------------------------------------------------------------------
>Without the patch: BM_StringRegression                  81 ns         81 ns    8503740
>With the patch: BM_StringRegression                      109 ns      109 ns    6346500
>
>
>The real advantage is when there are fewer matches as seen in BM_StringFindNoMatch. The code for the benchmark  can be found in https://github.com/llvm-mirror/libcxx/blob/master/benchmarks/string.bench.cpp
>However, I have written an independent std-benchmark that can be used just by exporting the CC, CXX, LD_LIBRARY_FLAGS: https://github.com/hiraditya/std-benchmark

I think the large improvements are worth the smaller regression, so
I'm committing the patch I sent last week.



>Following are the results:
>----------------------------------------------------------------------------------------------------------------------------------
>Without the patch:
>
>
>Run on (8 X 3403.85 MHz CPU s)
>2017-01-06 15:47:30
>***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
>Benchmark                             Time           CPU Iterations
>-------------------------------------------------------------------
>BM_StringFindNoMatch/10               6 ns          6 ns  114499418
>BM_StringFindNoMatch/64              34 ns         34 ns   20578576
>BM_StringFindNoMatch/512            222 ns        222 ns    3136787
>BM_StringFindNoMatch/4096          1728 ns       1729 ns     401913
>BM_StringFindNoMatch/32768        13679 ns      13684 ns      50680
>BM_StringFindNoMatch/131072       54570 ns      54591 ns      12506
>BM_StringFindAllMatch/1               4 ns          4 ns  180640260
>BM_StringFindAllMatch/8               6 ns          6 ns  119682220
>BM_StringFindAllMatch/64              7 ns          7 ns   97679753
>BM_StringFindAllMatch/512            19 ns         19 ns   36035174
>BM_StringFindAllMatch/4096           92 ns         92 ns    7516363
>BM_StringFindAllMatch/32768         849 ns        849 ns     809284
>BM_StringFindAllMatch/131072       3610 ns       3611 ns     193894
>BM_StringFindMatch1/1             27273 ns      27283 ns      25579
>BM_StringFindMatch1/8             27289 ns      27300 ns      25516
>BM_StringFindMatch1/64            27297 ns      27307 ns      25561
>BM_StringFindMatch1/512           27303 ns      27314 ns      25579
>BM_StringFindMatch1/4096          27488 ns      27499 ns      25366
>BM_StringFindMatch1/32768         28157 ns      28168 ns      24750
>BM_StringFindMatch2/1             27273 ns      27284 ns      25562
>BM_StringFindMatch2/8             27296 ns      27306 ns      25555
>BM_StringFindMatch2/64            27312 ns      27323 ns      25549
>BM_StringFindMatch2/512           27327 ns      27338 ns      25558
>BM_StringFindMatch2/4096          27513 ns      27524 ns      25349
>BM_StringFindMatch2/32768         28161 ns      28172 ns      24788
>BM_StringRegression                  81 ns         81 ns    8503740
>
>
>
>With the patch
>
>
>Run on (8 X 1071.8 MHz CPU s)
>2017-01-06 16:06:29
>***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
>Benchmark                             Time           CPU Iterations
>-------------------------------------------------------------------
>BM_StringFindNoMatch/10               6 ns          6 ns  121302159
>BM_StringFindNoMatch/64               7 ns          7 ns  102003502
>BM_StringFindNoMatch/512             15 ns         15 ns   44820639
>BM_StringFindNoMatch/4096            77 ns         77 ns    9016958
>BM_StringFindNoMatch/32768          555 ns        555 ns    1227219
>BM_StringFindNoMatch/131072        2688 ns       2689 ns     259488
>BM_StringFindAllMatch/1               8 ns          8 ns   85893410
>BM_StringFindAllMatch/8               9 ns          9 ns   80811804
>BM_StringFindAllMatch/64              9 ns          9 ns   74237599
>BM_StringFindAllMatch/512            23 ns         23 ns   31163379
>BM_StringFindAllMatch/4096           94 ns         94 ns    7317385
>BM_StringFindAllMatch/32768         847 ns        848 ns     803901
>BM_StringFindAllMatch/131072       3551 ns       3552 ns     196844
>BM_StringFindMatch1/1              1337 ns       1337 ns     518042
>BM_StringFindMatch1/8              1338 ns       1338 ns     519431
>BM_StringFindMatch1/64             1340 ns       1341 ns     513974
>BM_StringFindMatch1/512            1355 ns       1356 ns     511857
>BM_StringFindMatch1/4096           1489 ns       1489 ns     465629
>BM_StringFindMatch1/32768          2203 ns       2204 ns     316044
>BM_StringFindMatch2/1              1337 ns       1338 ns     519057
>BM_StringFindMatch2/8              1337 ns       1337 ns     517745
>BM_StringFindMatch2/64             1347 ns       1347 ns     514813
>BM_StringFindMatch2/512            1362 ns       1363 ns     507408
>BM_StringFindMatch2/4096           1491 ns       1492 ns     465979
>BM_StringFindMatch2/32768          2204 ns       2205 ns     315887
>BM_StringRegression                 109 ns        109 ns    6346500
>
>
>
>-Aditya
>
>
>From: Jonathan Wakely <jwakely@redhat.com>
>Sent: Friday, January 6, 2017 2:37 PM
>To: Aditya Kumar
>Cc: libstdc++@gcc.gnu.org; gcc-patches@gcc.gnu.org; hiraditya@msn.com
>Subject: Re: [PATCH] improve string find algorithm
> 
>On 06/01/17 14:34 -0600, Aditya Kumar wrote:
>>Thanks for the correction and updating the comments.
>
>Could you try the corrected patch on your benchmarks?
>
>The performance with the patch is worse for the first string::find
>benchmark in our testsuite:
>
>  s = "aabbaabbaaxd adbffdadgaxaabbbddhatyaaaabbbaabbaabbcsy";
>  f = "aabbaabbc";
>  s.find(f);
>
>



More information about the Gcc-patches mailing list