This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Builtin/headers: Constant arguments and adding extra entry points.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Richard Henderson <rth at twiddle dot net>
- Cc: gcc at gcc dot gnu dot org, libc-alpha at sourceware dot org
- Date: Tue, 9 Jun 2015 08:30:00 +0200
- Subject: Re: Builtin/headers: Constant arguments and adding extra entry points.
- Authentication-results: sourceware.org; auth=none
- References: <20150604193107 dot GA24282 at domone> <55760153 dot 40406 at twiddle dot net>
On Mon, Jun 08, 2015 at 01:55:47PM -0700, Richard Henderson wrote:
> On 06/04/2015 12:35 PM, Ondřej Bílka wrote:
> >char *strchr_c(char *x, unsigned long u);
> >#define strchr(x,c) \
> >(__builtin_constant_p(c) ? strchr_c (x, c * (~0ULL / 255)) : strchr (x,c))
> >
>
> Certainly not a universal win, especially for 64-bit RISC. This
> constant can be just as expensive to construct as the original
> multiplication.
>
> Consider PPC64, where 4 insns are required to form this kind of
> replicated 64-bit constant, and 3 insns are required to replicate C.
>
> Then there's other RISC for which replicating C is easily done in
> parallel with the initial alignment checks.
>
That's another problem: these transformations depend on the platform, so
you need to maintain a table somewhere of what is profitable and what is
not. For these functions the picture is better than you describe: users
frequently call strchr in a loop, so there is potential for savings. For
example, 75% of strchr calls happened within 128 cycles of the previous
one, which is evidence of that use case.
A second saving would be in the header checks. Unless the function needs
to write, the best approach looks to be an initial check that
s % 4096 < 4096 - 32, so that reading 32 bytes cannot cause a page
fault. If gcc could prove that at least 32 more bytes are allocated
after s, it could call a separate entry point instead.
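
A minimal sketch of that header check, assuming 4096-byte pages and a
32-byte first read as in the text. The helper name can_read_chunk and
the two macros are hypothetical, used only for illustration; they are
not part of glibc or gcc.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed page size, taken from the check above */
#define CHUNK 32u       /* bytes the header wants to read unconditionally */

/* Hypothetical helper: true when the offset of s within its page leaves
   room for a CHUNK-byte read that stays inside the same page, so the
   read cannot fault on a neighbouring unmapped page. */
static bool can_read_chunk(const char *s)
{
    return ((uintptr_t)s % PAGE_SIZE) < PAGE_SIZE - CHUNK;
}
```

When the check fails, an implementation would fall back to a slower
path that handles the bytes up to the page boundary separately.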
I have a to-do project to add an interface which transforms
while(s=strchr(s+1,'c')) into something like
struct *strchrp = strchr_init (s,'c');
while (s = strchr_next (strchrp))
to avoid the overhead of repeated calls. The inline strchr_next will
first check a mask against the values in, say, the 16 current bytes,
and if the character isn't there it will do a libcall.
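
A rough sketch of how such an interface could look, not the actual
design. The state layout, the fill_window helper and the WINDOW
constant are assumptions; unlike the pseudocode above, the state is
passed in by the caller to avoid an allocation. A byte loop stands in
for the vector compare a real implementation would use, s must be a
non-empty string, and c is assumed not to be '\0'.

```c
#include <stddef.h>
#include <string.h>

#define WINDOW 16 /* bytes covered by the inline mask ("say 16" above) */

/* Hypothetical state; the mail does not specify a layout. */
struct strchr_state {
    const char *base; /* start of the current window */
    unsigned mask;    /* bit i set: base[i] == c and not yet returned */
    int done;         /* terminating NUL was seen, no more libcalls */
    char c;
};

/* Record matches in up to WINDOW bytes starting at p, stopping at NUL. */
static void fill_window(struct strchr_state *st, const char *p)
{
    st->base = p;
    st->mask = 0;
    st->done = 0;
    for (unsigned i = 0; i < WINDOW; i++) {
        if (p[i] == '\0') { st->done = 1; break; }
        if (p[i] == st->c) st->mask |= 1u << i;
    }
}

/* Start at s + 1, matching while (s = strchr (s + 1, c)). */
static inline void strchr_init(struct strchr_state *st, const char *s, char c)
{
    st->c = c;
    fill_window(st, s + 1);
}

static inline char *strchr_next(struct strchr_state *st)
{
    if (st->mask == 0) {
        if (st->done)
            return NULL;
        /* Slow path: nothing left in the window, so do the libcall. */
        const char *hit = strchr(st->base + WINDOW, st->c);
        if (hit == NULL) { st->done = 1; return NULL; }
        fill_window(st, hit); /* bit 0 is set because *hit == c */
    }
    /* Fast path: return the lowest pending match and clear its bit. */
    unsigned i = 0;
    while (((st->mask >> i) & 1u) == 0)
        i++;
    st->mask &= st->mask - 1;
    return (char *)(st->base + i);
}
```

With this sketch the loop becomes
struct strchr_state st; strchr_init (&st, s, 'c');
while ((s = strchr_next (&st)))
and only matches outside the current window cost a call into libc.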