Bit twiddling builtins
Geoff Keating
geoffk@geoffk.org
Tue Feb 4 00:12:00 GMT 2003
> Date: Mon, 3 Feb 2003 15:48:50 -0800
> From: Richard Henderson <rth@redhat.com>
> Cc: falk.hueffner@student.uni-tuebingen.de, gcc-patches@gcc.gnu.org
> On Mon, Feb 03, 2003 at 02:58:10PM -0800, Geoff Keating wrote:
> > I don't really care about the user program; I'm much more interested
> > in communicating with the middle-end of the compiler so it can apply
> > the above transformation itself and the user program doesn't have to know.
>
> Ah. Well in that case, I guess we could come up with something.
> I guess I agree that when we do constant folding we should wind
> up with something that matches what the target instruction would do.
>
> I have no idea what you have in mind wrt ffs. Since (x == 0) can
> be computed in parallel with ctz(x), I'm not sure why (ctz >> 5)
> is interesting at all.
This is an optimisation in itself; it has no particular relation to
ffs. On ppc, the fastest way to implement
    int zero_p (int x) {
      return x == 0;
    }
is
    cntlzw %r3,%r3
    srwi %r3,%r3,5
    blr
rather than the next best alternative using add-with-carry, which is
bad because anything touching the carry flag tends to synchronize
newer processors:
    li %r4,0
    subfic %r3,%r3,0
    addze %r3,%r4
    blr
or using branches (well, conditional return in this simple case):
    cmpwi cr0,%r3,0
    li %r3,1
    beqlr
    li %r3,0
    blr
(The last two vary in relative speed depending on the processor, but
the first is uniformly good.)
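
For anyone who wants to check the arithmetic behind the first two
sequences, here is a rough C model. It is only a sketch: clz32,
zero_p_clz, zero_p_carry and check_zero_p are names made up for this
illustration, and clz32 is a plain software loop standing in for
cntlzw, assuming (as on ppc) that the count for an input of 0 is 32.
GCC's __builtin_clz leaves the zero case undefined, so it is
deliberately not used here.

```c
#include <assert.h>
#include <stdint.h>

/* Software stand-in for cntlzw: number of leading zero bits in a
   32-bit word, with the result defined as 32 when x is 0. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Models cntlzw + srwi 5.  The count lies in 0..32, and 32 is the
   only value in that range with bit 5 set, so the shift yields 1
   exactly when x is 0. */
static int zero_p_clz(uint32_t x)
{
    return (int)(clz32(x) >> 5);
}

/* Models subfic + addze.  subfic with an immediate of 0 computes
   ~x + 0 + 1 and records the carry out of bit 31; that carry is 1
   only when x is 0.  addze then adds the carry to a zero register. */
static int zero_p_carry(uint32_t x)
{
    uint64_t sum = (uint64_t)(uint32_t)~x + 1u;

    return (int)(sum >> 32);
}

/* Hypothetical helper: both models must agree with (x == 0). */
static void check_zero_p(uint32_t x)
{
    assert(zero_p_clz(x) == (x == 0));
    assert(zero_p_carry(x) == (x == 0));
}
```

Calling check_zero_p on 0, 1 and 0xFFFFFFFF exercises the interesting
cases. The model says nothing about speed; it only shows why the
results match x == 0.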
--
- Geoffrey Keating <geoffk@geoffk.org>