This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: speeding up parts of gcc by using count_leading_zero (long)
- From: Zack Weinberg <zack at codesourcery dot com>
- To: Andrew Pinski <pinskia at physics dot uc dot edu>
- Cc: gcc at gcc dot gnu dot org, apinski at apple dot com
- Date: Sat, 04 Jan 2003 22:51:57 -0800
- Subject: Re: speeding up parts of gcc by using count_leading_zero (long)
- References: <E034ACE0-2073-11D7-AB6F-000393A6D2F2@physics.uc.edu>
Andrew Pinski <pinskia@physics.uc.edu> writes:
> There are parts of gcc which can be sped up on PPC (and other
> architectures which have something similar) by using the instruction
> `cntlz{w,d}' (count leading zero word, double word [PPC64 only]).
The right thing here is to change this code to use ffs(), which GCC
and other compilers already recognize and optimize. Put a generic
implementation of this primitive in libiberty, since it's not part
of C89.
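For illustration only, a portable fallback of the sort that could live
in libiberty might look like the sketch below. The name generic_ffs is
hypothetical, not an existing libiberty symbol. It follows the usual
ffs() convention: the result is the 1-based position of the least
significant set bit, or 0 when no bit is set.

```c
/* Hypothetical portable fallback for ffs(); the name generic_ffs is
   an assumption made for this sketch.  Returns the 1-based index of
   the least significant set bit, or 0 if VALUE is zero.  */
int
generic_ffs (int value)
{
  /* Work on an unsigned copy so that right shifts are well defined
     for negative inputs.  */
  unsigned int bits = (unsigned int) value;
  int position = 1;

  if (bits == 0)
    return 0;

  /* Shift right until the lowest bit is set, counting the steps.  */
  while ((bits & 1u) == 0)
    {
      bits >>= 1;
      position++;
    }

  return position;
}
```

A host whose compiler recognizes ffs() would never need this loop; it
only serves as the slow path for hosts without one.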
Also, you missed ggc_alloc in ggc-page.c (which is a much more
important routine to optimize than compute_inverse).
Tangentially, ffs takes an int, which is 32 bits on all supported
hosts. It would make sense to define __builtin_ffs32() and
__builtin_ffs64() to nail down the sizes. ffs64 can be implemented
efficiently on machines with only a 32-bit ffs instruction, as
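    ffs low, r      ; r = 1-based index of lowest set bit, 0 if none
    test r
    bnz 0f          ; found a bit in the low word, done
    ffs high, r
    test r
    bz 0f           ; both words are zero, result stays 0
    add 32, r       ; bit is in the high word
0: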
so it is useful to provide both of them always.
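
In C, the same idea might look like the sketch below. The names
sketch_ffs32 and sketch_ffs64 are hypothetical stand-ins, not the
proposed builtins, and the loop in sketch_ffs32 merely takes the place
of a 32-bit ffs instruction. Since ffs numbers bits from the least
significant end, this version tests the low word first and adds 32 to
a hit in the high word.

```c
#include <stdint.h>

/* Stand-in for a 32-bit ffs instruction (hypothetical name).
   Returns the 1-based index of the lowest set bit, or 0 if none.  */
static int
sketch_ffs32 (uint32_t word)
{
  int position = 1;

  if (word == 0)
    return 0;

  while ((word & 1u) == 0)
    {
      word >>= 1;
      position++;
    }

  return position;
}

/* 64-bit ffs built from two 32-bit operations (hypothetical name).  */
int
sketch_ffs64 (uint64_t value)
{
  uint32_t low = (uint32_t) (value & 0xffffffffu);
  uint32_t high = (uint32_t) (value >> 32);
  int result = sketch_ffs32 (low);

  /* A set bit in the low word always wins.  */
  if (result != 0)
    return result;

  /* Otherwise look in the high word and offset the index by 32.  */
  result = sketch_ffs32 (high);
  return result != 0 ? result + 32 : 0;
}
```

Only one extra test and branch are needed beyond the two 32-bit
operations, which is why the 64-bit form stays cheap on such machines.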
zw