This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: [PATCH] improve string find algorithm
- From: Aditya K <hiraditya at msn dot com>
- To: Jonathan Wakely <jwakely at redhat dot com>, Manfred <mx2927 at gmail dot com>
- Cc: "libstdc++ at gcc dot gnu dot org" <libstdc++ at gcc dot gnu dot org>, Aditya Kumar <aditya dot k7 at samsung dot com>, Sebastian Pop <sebpop at gmail dot com>
- Date: Wed, 11 Jan 2017 05:32:53 +0000
- Subject: Re: [PATCH] improve string find algorithm
- Authentication-results: sourceware.org; auth=none
- References: <1481132816-31162-1-git-send-email-aditya.k7@samsung.com> <CGME20170106133507epcas2p15eaabe5ca279349c9f3603a6c2bb61d8@epcas2p1.samsung.com> <20170106133502.GB2966@redhat.com> <016101d2682b$136dc890$3a4959b0$@samsung.com> <20170106202058.GH2966@redhat.com> <a2f93eec-d72d-96e2-5736-23ee8f35b709@gmail.com>,<20170110173755.GN13348@redhat.com>
- Spamdiagnosticmetadata: NSPM
- Spamdiagnosticoutput: 1:99
So there is a minor performance gain (maybe noise) when alignment is taken into account.
There is a regression when we start matching from the next character. The code of benchmark can be found here: https://github.com/hiraditya/std-benchmark/blob/master/cxx/string.libcxx.bench.cpp
Run on (8 X 3400 MHz CPU s) build-gcc(with the patch) Run on (8 X 3400 MHz CPU s) build-gcc-base(gcc-trunk with latest string::find)
2017-01-10 22:38:47 2017-01-10 22:39:52
name iterations real_time cpu_time time_unit name iterations real_time cpu_time time_unit Iteration-ratio(build-gcc:iteration/build-gcc-base:iteration)
BM_StringFindNoMatch/16 115921611 5.66729 5.66947 ns BM_StringFindNoMatch/16 118885145 5.68854 5.69073 ns 0.9750722935
BM_StringFindNoMatch/64 98420464 6.95582 6.95853 ns BM_StringFindNoMatch/64 101991434 6.71586 6.71845 ns 0.9649875498
BM_StringFindNoMatch/512 43358580 16.0791 16.0854 ns BM_StringFindNoMatch/512 43987907 15.7514 15.7576 ns 0.9856931815
BM_StringFindNoMatch/4096 8822423 77.7055 77.7353 ns BM_StringFindNoMatch/4096 8823762 77.9274 77.9578 ns 0.9998482507
BM_StringFindNoMatch/32768 1229953 555.193 555.407 ns BM_StringFindNoMatch/32768 1225945 556.461 556.673 ns 1.0032693147
BM_StringFindNoMatch/131072 259822 2676.78 2677.81 ns BM_StringFindNoMatch/131072 259911 2683.54 2684.58 ns 0.9996575751
BM_StringFindAllMatch/16 83381758 8.24367 8.24685 ns BM_StringFindAllMatch/16 80490602 8.4969 8.50019 ns 1.0359191748
BM_StringFindAllMatch/64 71160678 9.71789 9.72167 ns BM_StringFindAllMatch/64 75021615 9.1945 9.19809 ns 0.948535672
BM_StringFindAllMatch/512 30684934 22.6952 22.704 ns BM_StringFindAllMatch/512 31445558 22.1342 22.1428 ns 0.9758114008
BM_StringFindAllMatch/4096 7286619 94.6387 94.6751 ns BM_StringFindAllMatch/4096 7344032 94.1863 94.2228 ns 0.9921823598
BM_StringFindAllMatch/32768 811537 848.437 848.764 ns BM_StringFindAllMatch/32768 810791 847.399 847.725 ns 1.0009200891
BM_StringFindAllMatch/131072 198005 3534.13 3535.5 ns BM_StringFindAllMatch/131072 198264 3523.62 3525 ns 0.998693661
BM_StringFindMatch1/16 513431 1342.31 1342.83 ns BM_StringFindMatch1/16 516938 1337.14 1337.65 ns 0.9932158209
BM_StringFindMatch1/64 515083 1344.78 1345.31 ns BM_StringFindMatch1/64 515701 1338.27 1338.78 ns 0.9988016312
BM_StringFindMatch1/512 509857 1364.49 1365.03 ns BM_StringFindMatch1/512 510406 1355.81 1356.34 ns 0.9989243857
BM_StringFindMatch1/4096 464784 1488.54 1489.12 ns BM_StringFindMatch1/4096 464401 1490.19 1490.78 ns 1.0008247183
BM_StringFindMatch1/32768 312190 2229.9 2230.77 ns BM_StringFindMatch1/32768 319294 2174.86 2175.71 ns 0.977750913
BM_StringFindMatch2/16 514506 1341.99 1342.51 ns BM_StringFindMatch2/16 515915 1336.93 1337.45 ns 0.99726893
BM_StringFindMatch2/64 513383 1350.05 1350.58 ns BM_StringFindMatch2/64 515230 1343.71 1344.23 ns 0.9964151932
BM_StringFindMatch2/512 507300 1369.33 1369.87 ns BM_StringFindMatch2/512 508966 1361.42 1361.95 ns 0.9967266969
BM_StringFindMatch2/4096 465193 1493.35 1493.93 ns BM_StringFindMatch2/4096 464890 1491.98 1492.57 ns 1.0006517671
BM_StringFindMatch2/32768 311876 2235.61 2236.48 ns BM_StringFindMatch2/32768 319926 2179.43 2180.27 ns 0.9748379313
BM_StringRegression 6332697 108.944 108.986 ns BM_StringRegression 6524234 106.26 106.301 ns 0.9706422241
Geomean 0.9905757402
The patch used to test the benchmark.
~/work/gcc/gcc/build$ git diff
diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index 41b7fa1..55e9209 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -1211,7 +1211,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// Compare the full strings from the first occurrence of __elem0.
// We already know that __first[0] == __s[0] but compare them again
// anyway because __s is probably aligned, which helps memcmp.
- if (traits_type::compare(__first, __s, __n) == 0)
+ if (traits_type::compare(__first+1, __s+1, __n-1) == 0)
return __first - __data;
__len = __last - ++__first;
}
Thanks,
-Aditya
From: Jonathan Wakely <jwakely@redhat.com>
Sent: Tuesday, January 10, 2017 11:37 AM
To: Manfred
Cc: libstdc++@gcc.gnu.org; Aditya K; Aditya Kumar; Sebastian Pop
Subject: Re: [PATCH] improve string find algorithm
On 10/01/17 17:12 +0100, Manfred wrote:
>On 1/6/2017 9:20 PM, Jonathan Wakely wrote:
>>+ // Compare the full strings from the first occurrence of __elem0.
>>+ // We already know that __first[0] == __s[0] but compare them again
>>+ // anyway because __s is probably aligned, which helps memcmp.
>>+ if (traits_type::compare(__first, __s, __n) == 0)
>>+ return __first - __data;
>I am not sure if __s being aligned really helps here, given that
>__first in principle is not. Would it be worth testing the performance
>with traits_type::compare running from 1-after __s and __first?
Yes, it's probably worth checking. I assume Aditya or Sebastian did
verify that, because there's a comment in their libc++ patch about it,
but we might get diffeernt results with GCC, or with GNU libc.