This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Optimize strchr (s, 0) to strlen
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
- Date: Wed, 20 Apr 2016 14:18:04 +0000
- Subject: Re: [PATCH] Optimize strchr (s, 0) to strlen
- References: <AM3PR08MB0088CA61259F65FAAB4D8196836B0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com> <CAFiYyc1rGd2KWOaN4RTG45Y1uUp6O0A5qOm=i5ma0BZSK5CrXw at mail dot gmail dot com> <AM3PR08MB00881BE3867DF3FC5B5B7530836C0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com> <CAFiYyc0cUOs19FV-2PnYxfba4_N8Qwox2tkgXZJEw2obe20zgg at mail dot gmail dot com> <CAFiYyc0FvFEUibja6ObDWw9rYf5Cu6puU3gaRrneNzpgkcEgtg at mail dot gmail dot com> <20160420103325 dot GD2850 at laptop dot zalov dot cz> <AM3PR08MB0088C250B9D4A322ED86350E836D0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com>,<20160420122418 dot GF2850 at laptop dot zalov dot cz>
Jakub Jelinek wrote:
> On Wed, Apr 20, 2016 at 11:17:06AM +0000, Wilco Dijkstra wrote:
>> Can you quantify "don't like"? I benchmarked rawmemchr on a few targets
>> and it's slower than strlen, so it's hard to guess what you don't like about it.
>
> This is the same stuff as has been discussed for mempcpy, rawmemchr is the
> API meant to use for getting pointer to the terminating '\0', if there are
> deficiencies on the library side, they should be fixed.
About mempcpy, GLIBC nowadays expands it into memcpy (p, q, n) + n by default
in string.h.
After a lot of discussion on this last year, the general consensus is that these
functions don't provide a useful gain and, because of I-cache pressure, are often
detrimental to performance even when optimized assembly implementations happen to
be available.
Emitting rawmemchr/mempcpy/stpcpy automatically as a result of optimization is
a bad idea for most targets, given that libraries often have inefficient default implementations.
I fixed the GLIBC mempcpy and stpcpy C implementations to use memcpy and strlen
so at least for these performance is no longer absolutely terrible.
Requiring all C libraries to provide highly optimized assembler versions of these
functions is onerous, since they are rarely used in code (a quick grep of SPEC
found one use of mempcpy and 0 uses of rawmemchr, strchrnul and stpcpy).
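As a rough illustration of the expansions mentioned above, here is a minimal sketch of mempcpy and stpcpy written in terms of memcpy and strlen. The my_ names are hypothetical, chosen to avoid clashing with libc, and this is not the actual GLIBC source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: mempcpy expressed as memcpy (p, q, n) + n. */
static void *my_mempcpy(void *dst, const void *src, size_t n)
{
    return (char *)memcpy(dst, src, n) + n;
}

/* Hypothetical name: stpcpy built from strlen and memcpy.
   Copies the terminating '\0' too and returns a pointer to it in dst. */
static char *my_stpcpy(char *dst, const char *src)
{
    size_t len = strlen(src);
    return (char *)memcpy(dst, src, len + 1) + len;
}
```

Both helpers only need a good memcpy and strlen, which most libraries and GCC itself already optimize well.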
> If you hardcode in
> GCC emitting worse sequence at the caller (which s + strlen (s) is), then
> even once the library deficiency is fixed, you still don't get benefit from
> it.
What benefit exactly? Rawmemchr cannot ever beat strlen. There is a trick that can
make a good strlen faster than rawmemchr, but even ignoring that, an integer-based
rawmemchr needs to do extra operations in its inner loop. A SIMD version could use
similar inner loops, although rawmemchr still has a higher cost. You could special-case
searching for '\0' and jump to strlen (I have patches for that), but that also adds cost...
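One way to picture the extra inner-loop work is the usual word-at-a-time byte test. The sketch below is an assumption about what an integer-based loop might look like, not code from any particular library, and the names has_zero_byte and has_byte are hypothetical. A strlen-style loop only has to test each word for a zero byte, while a rawmemchr-style loop first has to XOR the word with the search character replicated into every byte.

```c
#include <stdint.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* strlen-style test: nonzero iff at least one byte of w is zero. */
static inline uint64_t has_zero_byte(uint64_t w)
{
    return (w - ONES) & ~w & HIGHS;
}

/* rawmemchr-style test: nonzero iff at least one byte of w equals c.
   Costs one extra XOR per word on top of the zero-byte test. */
static inline uint64_t has_byte(uint64_t w, unsigned char c)
{
    return has_zero_byte(w ^ (ONES * c));
}
```

When c is 0 the XOR is a no-op, which is why a search for '\0' reduces to the strlen test.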
> I wonder how you work around the
> #define strchr(s, c) \
> ..
> in glibc headers anyway.
That should either be removed or changed to use strlen (I have patches for both
options out for review).
> Another thing is for the cases where strlen is desirable to be expanded
> inline, in that case rawmemchr (x, 0) or strchr (x, 0) is likely useful to be
> expanded inline as well and then this decision should be done at expansion
> time.
I'm not sure I'm following you here - that's an argument to expand into strlen early
as strlen is better optimized in GCC...
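For reference, a minimal sketch of the equivalence the patch relies on: strchr (s, 0) returns a pointer to the terminating '\0', which is the same address as s + strlen (s). The helper name end_via_strlen is hypothetical and only illustrates the result of the folding; it is not the GCC implementation.

```c
#include <string.h>

/* Hypothetical helper: the form strchr (s, 0) would be folded into. */
static char *end_via_strlen(const char *s)
{
    return (char *)s + strlen(s);
}
```

Once the call is in this form, any inline expansion or folding GCC already performs for strlen applies to it as well.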
Wilco